Introduction

The obesity epidemic is a well-documented public health problem in the United States. The dramatic increase in childhood obesity raises concern about the health of these youth as they approach adulthood. Environmental conditions have been identified as intervening factors through their impact on physical activity and eating habits. Multidimensional scaling (MDS) provides a visualization method that integrates easily with clustering models (e.g., pairwise clustering) for analysis of the structure of medical and geographic data and of the relationship between obesity and the environment.

 
Step 1 Extracting medical record and contextual data

Data are drawn from existing databases maintained by university research centers, health care systems, government agencies, and commercial data providers.

Step 2 Parallel Pairwise data clustering by Deterministic Annealing

The data are pre-processed, and a pairwise distance matrix is generated from the patient records and geographic data.
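As a rough illustration of the distance-matrix step, the following minimal sketch builds a symmetric pairwise matrix from numeric feature vectors. The Euclidean metric, the function name `pairwise_distances`, and the three-feature records are assumptions made for the example; the project's actual features and dissimilarity measure are not specified here.

```python
import math


def pairwise_distances(records):
    """Return the symmetric N x N Euclidean distance matrix for a list of
    equal-length numeric feature vectors (hypothetical pre-processed records)."""
    n = len(records)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(records[i], records[j])
            dist[i][j] = d  # fill both triangles so the matrix is symmetric
            dist[j][i] = d
    return dist


# Made-up, already normalized feature vectors, for illustration only.
records = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 1.0)]
D = pairwise_distances(records)
```

Only the upper triangle is computed and then mirrored, which halves the work but still leaves N x N entries to store.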

 
  Table Classification of patient data

Clusters of 3000 patient data

 
Step 3 MDS visualization of clustering results of patient data

The clustering results are displayed using multidimensional scaling (MDS) for dimension reduction.
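To make the dimension-reduction idea concrete, here is a small sketch of classical (Torgerson) MDS, which recovers low-dimensional coordinates from a distance matrix by double centering and extracting the leading eigenvectors. This variant is a stand-in chosen for brevity; the MDS method actually used in the project may differ. The function name `classical_mds`, the power-iteration eigen-solver, and the triangle example are all illustrative assumptions.

```python
import math


def classical_mds(dist, dims=2, iters=500):
    """Embed N points in `dims` dimensions from an N x N distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J (dist is assumed symmetric).
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Power iteration for the dominant eigenvector of the current B.
        v = [math.sin(i + k + 1.0) for i in range(n)]  # deterministic start
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate.
        lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        if lam <= 1e-12:
            break  # no further positive eigenvalue to use
        scale = math.sqrt(lam)
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate B so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords


# Distances of a 3-4-5 right triangle, used purely as a check.
triangle = [[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]]
points = classical_mds(triangle)
```

For data that is exactly Euclidean, as in the triangle above, the distances between the embedded points match the input distances; for real patient data the 2D or 3D picture is only an approximation.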

Childhood Obesity Patient Database
2000 records
6 Clusters
Will use our 8-node system to run 36,000 records
Working with IU Medical School to map patient clusters to environmental factors
 
  Clusters of 4000 patient data


 
 
4000 records
8 Clusters
 
   
 

Step 4 Analysing associations between childhood obesity, urban form, and the social environment

  • Using the clustering results to relate patient data and their geographical distribution to environmental and social data.
  • Conducting multi-variate analyses to explore the relative contributions of individual factors, physical environmental factors, and social environmental factors to the risk of obesity in children.
  • Data development is often the primary bottleneck in launching data mining projects.
  • The complexity of several important data mining algorithms, including Deterministic Annealing Clustering (DAC), scales proportionally to the square of the number of data points, so these algorithms need significant computing resources.
  • The graphics above show good parallel performance for our parallel DAC implementation, which combines multi-threaded and classic MPI-style parallelism (run on multicore and cluster systems).
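
The quadratic scaling noted in the list above can be made tangible with a short back-of-the-envelope calculation. The sketch below counts the entries of a full N x N distance matrix for the record counts mentioned on this page. The helper name `distance_matrix_cost` and the 8 bytes per entry (double precision) are assumptions for illustration, not measurements from the actual implementation.

```python
def distance_matrix_cost(n_points, bytes_per_entry=8):
    """Return (entries, bytes) for a full N x N distance matrix.

    bytes_per_entry=8 assumes double-precision storage (an assumption).
    """
    entries = n_points * n_points
    return entries, entries * bytes_per_entry


for n in (2000, 4000, 36000):
    entries, size = distance_matrix_cost(n)
    print(f"{n:>6} records: {entries:>13,} entries, {size / 1e9:.2f} GB")
```

Under these assumptions, 36,000 records give about 1.3 billion entries, roughly 10.4 GB, which is 81 times the storage needed for 4,000 records and is one reason a multi-node system is useful at that size.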