## Introduction

Multidimensional Scaling (MDS) is an algorithm for constructing mappings of the given points
in a target dimensional space based on pairwise proximity information.
It is a compute and data intensive application, which requires O(N^{2}) computation as well as memory.
For the purpose of large-scale data visualization via MDS algorithm,
In addition to parallelization of MDS as shown in HP-MDS,
we also develop interpolation method (a.k.a. *out-of-sample* method) of MDS,
which reduce computational complexity and memeory requirement of MDS dramatically.

As shown in the above figure, the idea of MDS interpolation method is two-step approach.
As a preprocessing, we divide the given data set (total *N* points) into *In-Sample* data set, which has *n* points,
and the remaining *Out-of-Sample* data set, which has *N - n* points.
First, we run full MDS with In-Sample data set, and the points in the Out-of-Sample data set are mapped in the target dimension
based on the mapping result of In-Sample data set via full MDS method.
The interpolation method (we called *MI-MDS*) reduces the computational complexity from O(N^{2}) to O(n(N-n)).
It also reduces the memory requirement from O(N^{2}) to O(n),
since the proposed interpolation method finds a mapping for an interpolated point based on the mappings of In-Sample data points
and the distances between the interpolated point and the points in In-Sample data set.

For the details of the proposed MDS interpolation method, please refer to our HPDC 2010 paper.

## Download

Source code is available from here.

## Building and Dependency

MI-MDS implementation is developed under .NET framework 3.5 and you can recompile or build through MS Visual Studio 2008 or higher version. As shown in the paper, we implemented MI-MDS in hybrid parallel model, which combines MPI and threading, so you need to configure the following runtimes appropriately:

The DLL file for CCR library could be found in

**bin/Debug/**directory.

## User Guide

You can run the MI-MDS application by using command as follows:

```
[USAGE]: > mpiexec -n [#Proc] hybrid_MIMDS.exe [sampleMappingFile] [sampleDataFile] [outOfSampleDataFile]
[labelFile("NoLabelFile")] [outputFile] [threshold] [origDim] [targetDim] [#NN] [numSample] [#thread] [ignore1?] [ignore2?] [stress?]
```

The following is the parameter list which should be included to run MI-MDS:

```
arguments:
```

args[0]: **sampleMappingFile** - prior MDS mappings of the in-sample data.

args[1]: **sampleDataFile** - in-sample data.

args[2]: **outOfSampleDataFile** - out-of-sample data to each process.

args[3]: **labelFile** - label file (or "NoLabelFile")

args[4]: **outputFile** - output file name.

args[5]: **threshold** - threshold value for the stop condition.

args[6]: **origDim** - the original dimension number

args[7]: **targetDim** - the target dimension number

args[8]: **#NN** - the number of nearest neighbors for interpolation.

args[9]: **numSample** - the number of points of in-sample data.

args[10]: **numThread** - the number of threads in each process.

args[11]: **ignore1?** - 1: ignore first line of sampleData, 0: otherwise

args[12]: **ignore2?** - 1: ignore first line of outSampleData, 0: otherwise

args[13]: **stress?** - 1: skip STRESS calculation, 0: not skip