|
Deterministic Annealing for Pairwise Clustering

|
"Alu" is one class of mobile elements in the genome. The purpose of this classification is to identify subfamilies and to analyze hidden structure. Each sequence is labeled with fields such as the following:
- "chr3" is the chromosome name
- "4579922" is the start position and "4580207" is the end position
- "+"/"C" is the strand ("+" is the plus strand and "C" is the minus strand, since genomic DNA is double-stranded)
- "AluJb" is the family name. All the sequences in this data set are AluJb.
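The fields listed above can be collected into a small record type. The following is only a sketch: the page does not show the actual label layout, so the whitespace-separated form `chr3 4579922 4580207 + AluJb` and the names `AluRecord` and `parse_alu_label` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AluRecord:
    """One sequence label (hypothetical container, not from the source)."""
    chromosome: str  # e.g. "chr3"
    start: int       # start position
    end: int         # end position
    strand: str      # "+" (plus strand) or "C" (minus strand)
    family: str      # e.g. "AluJb"

    @property
    def is_minus_strand(self) -> bool:
        return self.strand == "C"

    @property
    def span(self) -> int:
        # Plain coordinate difference; whether the end position is
        # inclusive is not stated in the source.
        return self.end - self.start


def parse_alu_label(label: str) -> AluRecord:
    """Parse an ASSUMED whitespace-separated label into an AluRecord."""
    parts = label.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {label!r}")
    chromosome, start, end, strand, family = parts
    if strand not in ("+", "C"):
        raise ValueError(f"unknown strand symbol: {strand!r}")
    return AluRecord(chromosome, int(start), int(end), strand, family)


record = parse_alu_label("chr3 4579922 4580207 + AluJb")
print(record.chromosome, record.span, record.is_minus_strand)
```

If the real labels use a different delimiter or field order, only the `split` and unpacking lines in `parse_alu_label` would need to change.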
|
|
|
Complete decomposition of 3000 Alu sequences into clusters

|
Pairwise annealing of 4500 points, with distances determined pairwise

|
|
|
The same Alu sequences, with all sequences aligned using Clustal W (multiple alignment)

|
- Distances were scaled before visualization to correspond to an effective dimension of 4
- The original effective dimension was 20
|