A Tutorial
Introduction
Impure is an online application which empowers people to be part of the information revolution. It is a powerful tool to gather, combine, analyze and deeply understand data on the Internet. You can work with your own data or with many sources available online, such as news feeds, social media streams, real-time or historical financial information, search results, images and many more. Impure's modular interface lets you design information flows with ease, linking data sources to operators, controls and visualization methods within a graphical interface that clearly displays the structure of your process. In this way, it helps even non-programmers to work with information in a professional way and to explore complex bodies of data. Among other possibilities, Impure allows you to: easily read data from diverse sources and repositories; load your own data locally or remotely; visualize it in a wide range of ways (more than 100 visualization methods so far); process, compare, mix and filter it (more than 300 controls and operations so far); and publish and share your projects.
Using Impure is easy and intuitive; you don't need to type any code. Everything is done by linking modules together to set up information flows that begin with feeds or other data inputs and end with processed data or visualizations. In between, you can set up interactive controls to let users choose or modify parameters dynamically and see results change in real time. Often the visualization modules themselves can be used as controls and feed the whole process back to enable exploration. Impure has been conceived as a flexible tool that contributes to the democratization of the information age for all Internet citizens, and turns the Web into an unlimited resource for the generation of insights and knowledge in your preferred area of interest.
If you are used to a text-based programming environment, you can think of the inlets as the arguments of a function or method, and of the outlet as its return value.
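For readers coming from code, the analogy can be sketched in Python. The function `add_numbers` and its parameters are hypothetical stand-ins chosen for illustration, not actual Impure modules: each parameter plays the role of an inlet, and the return value plays the role of the outlet.

```python
def add_numbers(a: float, b: float) -> float:
    """Hypothetical operator: the parameters 'a' and 'b' act as inlets."""
    return a + b  # the return value acts as the single outlet


# Linking two data sources to the inlets is like passing arguments.
result = add_numbers(2.0, 3.5)
print(result)
```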
Direct input
Some types of data structures give you the possibility of typing the values you wish directly into the inlet itself, saving you the need to place other modules on the stage for that purpose. These inlets are modules identified by a small arrow in the bottom right of the icon. There is also a gray rectangle to the left of the inlet: just click on it and start typing the input value.
After you click in any other part of the space, or do nothing for a few seconds, your input will automatically be converted into a Data Structure module of the appropriate type.
Connections
Modules can be very powerful, but they will do nothing by themselves. Impure comes to life only when you link them together to define an information flow that begins with some data source and typically ends with one or more visualizations or processed outputs. Defining connections is easy. You just need to hover over an outlet, click on the purple circle that appears, drag the connecting line to your destination module, and release it on the appropriate inlet.
After the connection is established, it will be visible as a line with an arrow tip at the middle that shows the flow direction. If data is flowing through that channel, the line will be red. Otherwise (if the source has a void or null value) it will be yellow.
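The colour rule for connections can be pictured with a small Python sketch. It is only an analogy under stated assumptions: `link_color` and `run_flow` are hypothetical helpers, and Python's `None` stands in for a void or null value.

```python
def link_color(value):
    """Colour of a connection as described above: red when data flows, yellow when empty."""
    return "yellow" if value is None else "red"


def run_flow(source_value, operator):
    """Push a source value through one operator and report the colour of both links."""
    first_link = link_color(source_value)
    # An operator with nothing on its inlet has nothing to return.
    output = None if source_value is None else operator(source_value)
    second_link = link_color(output)
    return output, [first_link, second_link]


print(run_flow([3, 1, 2], sorted))
print(run_flow(None, sorted))
```

In this sketch an empty source leaves every downstream link yellow, which is a quick way to spot where a flow stops carrying data.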
Module types
Modules in Impure can do many different things. They are organized into five broad categories according to the function they perform.
Knowing which type of module you need in a particular situation is the first step toward finding your way around Impure's library and building spaces quickly and easily. In this section you will find an explanation of each of these types and what they are useful for.
Data Structures
Data Structures are identified by the color RED. Data Structures hold a piece of information of a given type. They have no inlets and only one outlet, from which you can read the data contained inside that "box". There are many different kinds of Data Structures, but only some of the most basic ones can be placed as modules in an Impure space. You will be able to recognize them in the Library by the small arrow in the bottom right corner of their icons. Those are modules you can drag into the space and type information directly into.
Draggable (aka typable) Data Structures are the ones that can be defined by typing or pasting text, such as a Number or a NumberList. Many Data Structure modules cannot be placed on stage because it is not possible (so far) to define their content only by typing. In the future more Data Structures will be typable (once we define a text code for them). Untypable Data Structures exist only as inlets and outlets. Why are they in the Library? Because Data Structures are the basic pieces, and it is very important to always have access to the entire list and its documentation. If you want to place a Data Structure that is not typable on the space, there are ways to do so. For instance, StringList is not typable, but you can place a String, write a text using a separator character, and then use splitString to build the desired StringList.
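The String-plus-separator workaround is comparable to splitting a string in a text-based language. The Python sketch below is an analogy only: `split_string` is a hypothetical function, and trimming the whitespace around each piece is a choice made here, not a documented behaviour of Impure's splitString.

```python
from typing import List


def split_string(text: str, separator: str = ",") -> List[str]:
    """Split text on a separator and trim whitespace around each piece."""
    return [piece.strip() for piece in text.split(separator)]


# One typed String becomes a list of strings.
string_list = split_string("apple, banana, cherry")
print(string_list)
```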
Operators
Operators are identified by the color CYAN
Operators always have at least one inlet. They perform some kind of operation on the data that is fed to them, and return the result in the outlet.
The Operators library is the most populated: there are more than 300 operators so far.
Visualizators
Visualizators are identified by the color MAGENTA. Visualizators build some kind of visual representation from the data they receive. There are many options available for any kind of data you can manipulate within Impure. Different visualizations can reveal different aspects of a certain data set. Many visualizators also allow interaction, giving users the possibility to dynamically explore the data. Some also have an outlet that makes the result of that interaction available to feed other processes in the space.
Controls
Controls are identified by the color ORANGE. Controls let users interact with the space or perform some complex task (such as downloading data).
Apis
Apis are identified by the color GREEN. Apis allow communication with many sources of information on the Internet. Some of the most frequently used ones are: Google search, Twitter search, Twitter word historical behavior, Market data, Flickr search, Flickr sets loader, Delicious account data load, Ebay items information, Dictionary definitions, Semantic expansion, etc.
Library
The colored tabs on the left border are the access point for Impure's library, which is a set of lists containing all the modules that are currently part of the application. There are many of them! Finding exactly what you need might not be easy when you are just getting started, but there are several resources to help you with that task, and you will soon get used to finding what you are looking for quickly and effectively. There's one broad classification to start with: according to the type of process they perform, modules belong to one of five module types. Each type is identified by a distinctive color, which you can see on the tabs. Click on any of them to unfold the corresponding list. For example, these are the Operators:
How To
Quickly bring data to impure
There are three main ways to bring data to an impure space:
There are many api modules in Impure and we plan to keep adding new ones all the time.
For the specific (and very common) case of .csv, you could also use csvLoader, which does everything in a single step. CSV is a text format that encodes tables; database and spreadsheet software, such as Excel, can usually export .csv with ease.
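To make the idea of "a text format that encodes tables" concrete, here is a minimal Python sketch using the standard `csv` module. The sample text and its column names are made up for illustration, and the code says nothing about how csvLoader works internally.

```python
import csv
import io

# Made-up sample: the first line holds column names, each later line is one row.
csv_text = "name,score\nana,3\nben,5\n"

rows = list(csv.reader(io.StringIO(csv_text)))
header, records = rows[0], rows[1:]

# Regroup the rows by column; every cell is read as a string.
columns = {name: [record[i] for record in records] for i, name in enumerate(header)}

print(header)
print(columns["score"])
```

Note that the values arrive as text, so a numeric column still needs converting before it can be treated as a list of numbers.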
NumberList Histogram
NumberTable SimpleNumberTableVisualizator
Network Oracle
[browser_name] compared with [browser_name]" or: [browser_name] is faster than [browser_name] Quotation marks are important because they guarantee the search will be strict, meaning Google will only return pages in which the complete sentences are found. Once your StringList is ready, just connect it to InternetMultiNSearchResults. Watch as the module performs the searches one after another and populates a NumberList with the number of results for each of them.
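As a rough sketch of how such a list of strict search phrases could be prepared outside Impure, the Python below fills the two sentence patterns for every ordered pair of names and wraps each phrase in quotation marks. The names `BrowserA` and `BrowserB` and the helper `strict_queries` are placeholders invented for this example; enumerating every ordered pair is just one possible way to build the list.

```python
from typing import List


def strict_queries(names: List[str], templates: List[str]) -> List[str]:
    """Build one quoted phrase per ordered pair of distinct names and per template."""
    queries = []
    for first in names:
        for second in names:
            if first == second:
                continue  # comparing a name with itself is pointless
            for template in templates:
                phrase = template.format(a=first, b=second)
                queries.append('"' + phrase + '"')  # quotes request a strict match
    return queries


templates = ["{a} compared with {b}", "{a} is faster than {b}"]
queries = strict_queries(["BrowserA", "BrowserB"], templates)
for query in queries:
    print(query)
```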
As soon as you have entered a valid path or url, csvLoader will start loading it and giving you feedback on the progress, in case it is a large file and takes a while to load. It will turn green when it is finished.
Voila! Your Excel data is already available within the Impure space to do whatever you want with it. Pay attention to the List and Table operators: you will find many ways of filtering, sorting, analyzing and combining tabular data. Before processing the data in any way, though, you will probably want to see it and check that everything is all right. Just plug in the TableVisualizator, and you will have a convenient representation with scrollbars for panning around large tables. This module also lets you click on a cell to select it, making its contents available in the outlet.
Draw maps
Drawing a Simple Map Around the Empire State Building
First let's see how we can draw a simple map of a location using the Google API in Impure. For this we will use the GeocoderGoogleMaps api and the GoogleMapVisor. First we need to find the geo coordinates of the center of our map. For example, let's say that we want to draw a map around the Empire State Building in New York. We pass this address as a String to the GeocoderGoogleMaps. In order to check the output we will use a TableVisualizator. (Note: we set the optional parameter "column's width" of the TableVisualizator to 300 to see the entire string of the columns.)
Now we need to use the GoogleMapVisor to draw the map. As input, we need to pass the first cell of the Table returned by GeocoderGoogleMaps; we do this using a getElementFromTable (with input 0,0). We use a zoom level of 16 and a map type of 1, which draws the buildings in 3D.
If you now move the new visualizator exactly over the map, and then go to the view (eye) menu (top left of Impure) and tick off "visualizator panels", you will be able to see the map, with the marker superimposed!
Notice that if you move the map (dragging with the mouse) left and right, the marker moves correctly. However, if you move it up and down, the marker gets misaligned. This is because the GoogleMapVisor uses Mercator projected coordinates, whereas the Polygon2DSimpleVisualizator uses standard geometric coordinates. We will fix this in the next section.
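The vertical drift can be illustrated numerically. The sketch below uses the standard spherical Mercator formulas on a unit sphere; the function names are illustrative and are not Impure modules. Longitude maps linearly to x, but the vertical space taken by one degree of latitude grows with distance from the equator, which is why a marker placed with plain geometric coordinates only stays aligned horizontally.

```python
import math


def mercator_x(lon_deg: float) -> float:
    """Horizontal Mercator coordinate on a unit sphere: linear in longitude."""
    return math.radians(lon_deg)


def mercator_y(lat_deg: float) -> float:
    """Vertical Mercator coordinate on a unit sphere for a latitude in degrees."""
    lat = math.radians(lat_deg)
    return math.log(math.tan(math.pi / 4 + lat / 2))


# Vertical stretch of one degree of latitude, relative to one degree of longitude.
for lat in (0, 20, 40, 60):
    step = mercator_y(lat + 1) - mercator_y(lat)
    print(lat, round(step / math.radians(1), 3))
```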
Transformations
Geo coordinates in Google are returned using the Mercator coordinate system, which is different from the normal Impure coordinate system for several reasons: they cover the surface of a sphere (the Earth) rather than a plane, and zero designates the equator. In order to properly represent in Impure a coordinate given in Mercator, we need to use a universalProjectionOnTransformationGeo module. Here is an example:
We will need to use this technique to transform both the Geo coordinate of the place being marked and the coordinates of the Rectangle given by the GoogleMapVisor before passing them to the Polygon2DSimpleVisualizator. If we do this we will finally obtain a properly marked map which can be moved:
(We cheated a little bit in the image above: we painted the marker red instead of the default grey. You can do this by providing a color to the 'color' parameter of the Polygon2DSimpleVisualizator. To learn more about colors you can see color
Multiple Markers
Once we have this schema for drawing a single marker, it is extremely easy to extend it to multiple markers. All it takes is to change the GeocoderGoogleMaps to a GeocoderMultiGoogleMaps and to provide multiple addresses! Here is the final schema (we highlighted in blue the changes from the last one, to emphasize how little we changed!).