Large-scale, high-dimensional data visualization is highly valuable for scientific discovery in many fields of data mining and information retrieval. PlotViz is a 3D data point browser that visualizes large volumes of 2- or 3-dimensional data as points in a virtual space on a computer screen and enables users to explore that space interactively. PlotViz was initially designed to consume the output of dimension reduction algorithms, such as Multi-dimensional Scaling (MDS) and Generative Topographic Mapping (GTM), which map high-dimensional data into a lower-dimensional space. Used together with such algorithms, PlotViz can help users discover the intrinsic structure of high-dimensional data and browse large volumes of data points interactively and efficiently in a virtual 3D space.
Key Features and Functions
PlotViz has the following main features:
- 3D data point visualization in a virtual space on a Windows desktop client, implemented in C# with the Microsoft XNA library
- Overlaying metadata, such as cluster information, names, or images, on data points to give users richer information
- User-friendly interactive GUI, including zooming and rotating
- A SPARQL query interface, a standard semantic web technology, for interacting with external systems. Currently PlotViz can interact with Chem2Bio2RDF through a WS interface.
Download and installation
PlotViz is available as a binary distribution. We provide PlotViz installation binaries and sample input files. At the moment only Windows is supported. We are working on the next version of PlotViz, which is intended to run on Linux, Mac, and Windows.
Detailed installation instructions are as follows:
- Download and install Microsoft XNA 3.1 runtime library from here
- Download the most recent version of PlotViz from here
- Unzip the archive PlotViz.Setup.zip
- Run setup.exe to install PlotViz
Once installed, you can use PlotViz immediately with our sample input files, which are available for download from here. To view an input file in PlotViz, double-click a *.pviz file (e.g., PlotViz_xml_input.pviz), or use the File > Open menu in PlotViz, which accepts both *.txt (e.g., PlotViz_simple_input.txt) and *.pviz files. The input file formats are described in the next section.
Input File Format
To view 3D points in PlotViz, you must first create an input file. Currently PlotViz can read two types of input files: a simple text file and an XML file. Sample files are also available from here for your reference.
Simple text file
PlotViz can read a simple text file in which each line contains the following 6 columns, separated by spaces or tabs:
- Column 1 : ID number. Must be unique
- Column 2-4 : 3D XYZ positions
- Column 5 : Group ID
- Column 6 (optional) : Label (string to be displayed as an annotation). No spaces are allowed.
An example is shown as follows:
1 -0.99 0.05 -0.26 1 VT-00594413
2 -0.59 0.09 0.08 1 DAH1522401
3 0.15 0.46 -0.99 2 CID
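The column layout above can be checked before a file is loaded into PlotViz. The following Python sketch is not part of PlotViz; the names `Point`, `parse_line`, `parse_points`, and `format_point` are hypothetical helpers introduced here for illustration. It assumes that the ID number and the group ID are integers (as in the example lines) and that blank lines may be skipped; neither assumption is stated by the format description.

```python
from typing import Iterable, List, NamedTuple, Optional


class Point(NamedTuple):
    point_id: int
    x: float
    y: float
    z: float
    group_id: int
    label: Optional[str] = None  # column 6 is optional


def parse_line(line: str) -> Point:
    """Parse one space- or tab-separated line into a Point."""
    fields = line.split()  # splits on any run of spaces or tabs
    if len(fields) not in (5, 6):
        raise ValueError(f"expected 5 or 6 columns, got {len(fields)}: {line!r}")
    label = fields[5] if len(fields) == 6 else None
    return Point(int(fields[0]), float(fields[1]), float(fields[2]),
                 float(fields[3]), int(fields[4]), label)


def parse_points(lines: Iterable[str]) -> List[Point]:
    """Parse all non-blank lines and enforce unique ID numbers."""
    points, seen = [], set()
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines are ignored
        p = parse_line(line)
        if p.point_id in seen:
            raise ValueError(f"duplicate ID number: {p.point_id}")
        seen.add(p.point_id)
        points.append(p)
    return points


def format_point(p: Point) -> str:
    """Render a Point as one space-separated line."""
    fields = [str(p.point_id), repr(p.x), repr(p.y), repr(p.z), str(p.group_id)]
    if p.label is not None:
        if not p.label or any(ch.isspace() for ch in p.label):
            raise ValueError(f"label must be non-empty without spaces: {p.label!r}")
        fields.append(p.label)
    return " ".join(fields)
```

A typical use would be to call `parse_points(open("PlotViz_simple_input.txt"))` to validate an existing file, or to write the output of a dimension reduction run with `"\n".join(format_point(p) for p in points)`.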
XML file
PlotViz can also read an XML file containing the following elements:
- Cluster <cluster> ... </cluster> : each cluster contains the following elements
  - <key>number</key> : group ID
  - <label>string</label> : group name
  - <color r="num" g="num" b="num" /> : point color
  - <size>number</size> : point size
- All clusters should be enclosed between <clusters> ... </clusters>
- Point <point> ... </point> : each data point contains the following elements
  - <key>number</key> : point ID
  - <clusterkey>number</clusterkey> : group ID
  - <location x="num" y="num" z="num" /> : 3D XYZ position
- All points should be enclosed between <points> ... </points>
- All clusters and points should be enclosed between <plotviz> ... </plotviz>
An example is shown below:
<?xml version='1.0' encoding='utf-8'?>
<color r="55" g="126" b="184" a="255" />
<location x='-0.99' y='0.05' z='-0.26' />
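As an illustration of how the elements listed above nest, the following Python sketch builds such a document with the standard `xml.etree.ElementTree` module. It is not part of PlotViz: the function name `build_plotviz_xml` and the dictionary keys it accepts are hypothetical, and the order of the `<clusters>` and `<points>` blocks is an assumption. Only the attributes named in the element list are written; the optional alpha attribute `a` that appears on the sample `<color>` line is left out.

```python
import xml.etree.ElementTree as ET


def build_plotviz_xml(clusters, points):
    """Build a <plotviz> tree from plain dictionaries.

    clusters: dicts with keys "key", "label", "color" (r, g, b), "size"
    points:   dicts with keys "key", "clusterkey", "location" (x, y, z)
    """
    root = ET.Element("plotviz")

    clusters_el = ET.SubElement(root, "clusters")
    for c in clusters:
        cluster_el = ET.SubElement(clusters_el, "cluster")
        ET.SubElement(cluster_el, "key").text = str(c["key"])
        ET.SubElement(cluster_el, "label").text = c["label"]
        r, g, b = c["color"]
        ET.SubElement(cluster_el, "color", r=str(r), g=str(g), b=str(b))
        ET.SubElement(cluster_el, "size").text = str(c["size"])

    known_groups = {c["key"] for c in clusters}
    points_el = ET.SubElement(root, "points")
    for p in points:
        # Sanity check (an assumption): every point refers to a defined cluster.
        if p["clusterkey"] not in known_groups:
            raise ValueError(f"point {p['key']} uses unknown group {p['clusterkey']}")
        point_el = ET.SubElement(points_el, "point")
        ET.SubElement(point_el, "key").text = str(p["key"])
        ET.SubElement(point_el, "clusterkey").text = str(p["clusterkey"])
        x, y, z = p["location"]
        ET.SubElement(point_el, "location", x=repr(x), y=repr(y), z=repr(z))

    return ET.ElementTree(root)
```

The resulting tree can be saved with `tree.write("my_points.pviz", encoding="utf-8", xml_declaration=True)`, where the file name is only an example.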
Screenshots and Samples
A child window displaying attributes of data, including images and metadata
Displaying a GTM result to visualize chemical compounds with disease information, using the PubChem and CTD databases. (An animated version is available on YouTube)
Displaying an MDS result for the same data set as above (PubChem and CTD data)
A SPARQL query interface to interact with an external system, Chem2Bio2RDF, an aggregated repository of chemogenomic and chemical biology data
PlotViz supports the following key functions:
- Zoom in/out : Use the slider in the toolbar
- Auto rotate and stop : Click the play/stop button in the toolbar or use Mode > Rotation (F3)
- Tracking : Use Mode > Tracking (F4) to show the label of each point
- Selection : Use Mode > Selection (F5) to select multiple points and see selected labels
- Bounding box : Use Ctrl-B to turn on/off bounding box display
Other functions are experimental at this moment, and more details will be added here as they become available. See the contacts below for any questions.
The following papers related to PlotViz have been published:
- Jong Youl Choi, Seung-Hee Bae, Judy Qiu, Geoffrey Fox, Bin Chen, and David Wild, "Browsing Large Scale Cheminformatics Data with Dimension Reduction," Proceedings of Emerging Computational Methods for the Life Sciences Workshop of ACM HPDC 2010 conference, Chicago, Illinois, June 20-25, 2010.
- Seung-Hee Bae, Jong Youl Choi, Judy Qiu, Geoffrey Fox, "Dimension Reduction and Visualization of Large High-dimensional Data via Interpolation," In the Proceedings of ACM HPDC 2010 conference, Chicago, Illinois, June 20-25, 2010.
- Jong Youl Choi, Judy Qiu, Marlon Pierce, Geoffrey Fox, "Generative Topographic Mapping by Deterministic Annealing," In the Proceedings of the 10th International Conference on Computational Science and Engineering (ICCS 2010), May 31 - Jun 2, 2010, Amsterdam, The Netherlands.
- Jong Youl Choi, Seung-Hee Bae, Xiaohong Qiu and Geoffrey Fox, "High Performance Dimension Reduction and Visualization for Large High-dimensional Data Analysis," In the Proceedings of the 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2010), May 17-20, 2010, Melbourne, Australia.
PlotViz is currently in beta testing, and we are working toward a public release that includes the source code. If you have any questions or problems using PlotViz or our dimension reduction algorithms, please contact one of the following members:
Jong Youl Choi (Graduate student)
Yang Ruan (Graduate student)
Seung-Hee Bae (Graduate student)
Judy Qiu (Professor)
Geoffrey Fox (Professor)