TTClust: A molecular simulation clustering program
TTClust is a Python program for clustering molecular dynamics simulation trajectories. It only requires a trajectory and a topology file (compatible with most molecular dynamics packages such as Amber, GROMACS, CHARMM or NAMD, or a trajectory in PDB format, thanks to the MDTraj package). Easy to use, the program gives visual feedback on the clustering through a dendrogram. Other plots are generated to describe all clusters.
TTClust is published in the Journal of Chemical Information and Modeling (JCIM). If you use it, please cite this paper:
TTClust: A versatile molecular simulation trajectory clustering program with graphical summaries.
Thibault Tubiana, Jean-Charles Carvaillo, Yves Boulard, Stéphane Bressanelli
J. Chem. Inf. Model, Just Accepted Manuscript, 2018
doi: 10.1021/acs.jcim.8b00512
Sources and installation
The source code is available on my GitHub page (https://github.com/tubiana/ttclust), or you can install it from PyPI:
sudo pip install ttclust
Note that it can be easier to install it using conda with precompiled libraries, since compiling mdtraj and wxpython can be quite tricky. To do so:
git clone https://github.com/tubiana/ttclust.git
cd ttclust
conda env create -f environment.yml
This creates a new virtual environment within your conda installation. You can then use ttclust after activating it:
conda activate ttclust
Usage and more details can be found in the project README: https://github.com/tubiana/ttclust
A dendrogram is generated at the end of the clustering, drawn with the corresponding cluster colors. The file has the same name as the log file, with a ".png" extension. Example: example.log -> example.png
The grey horizontal line is the cutoff value used.
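How a cutoff on the dendrogram turns into cluster labels can be sketched with SciPy's hierarchical clustering functions. This is only an illustration under stated assumptions: the distance values are made up, "average" linkage is chosen for the example, and none of it is taken from the TTClust source code.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Made-up symmetric RMSD-like matrix for 5 frames (illustrative values only).
dist = np.array([
    [0.0, 0.5, 0.6, 3.0, 3.2],
    [0.5, 0.0, 0.4, 3.1, 3.0],
    [0.6, 0.4, 0.0, 2.9, 3.1],
    [3.0, 3.1, 2.9, 0.0, 0.3],
    [3.2, 3.0, 3.1, 0.3, 0.0],
])

# linkage() expects the condensed (upper-triangle) form of the matrix.
condensed = squareform(dist)

# Assumption: average linkage, picked for this sketch only.
tree = linkage(condensed, method="average")

# Cutting the tree at this height plays the role of the grey horizontal line.
cutoff = 1.0
labels = fcluster(tree, t=cutoff, criterion="distance")

print(labels)  # one cluster label per frame
```

Branches that merge below the cutoff end up in the same cluster, so frames 0 to 2 share one label and frames 3 and 4 share another.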
A linear projection of the clusters along the trajectory is also made. Every bar represents a frame, and its color represents the cluster number. Note that:
- If there are 12 clusters or fewer, a fixed color map is used, in this order: red, blue, lime, gold, darkorchid, orange, deepskyblue, brown, gray, black, darkgreen, navy
- Otherwise, the matplotlib "hsv" color map is used, so the colors change according to the number of clusters.
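The color rule above can be sketched as a small function. The function name `cluster_colors` is hypothetical, and the standard-library `colorsys` module is used here as a dependency-free stand-in for sampling matplotlib's "hsv" color map; how TTClust actually samples that map is not stated here, so the even hue spacing is an assumption.

```python
import colorsys

# Fixed palette listed above, used when there are at most 12 clusters.
FIXED_COLORS = [
    "red", "blue", "lime", "gold", "darkorchid", "orange",
    "deepskyblue", "brown", "gray", "black", "darkgreen", "navy",
]


def cluster_colors(n_clusters):
    """Return one color per cluster (hypothetical helper, not TTClust's API)."""
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    if n_clusters <= len(FIXED_COLORS):
        # Small number of clusters: color names, always in the same order.
        return FIXED_COLORS[:n_clusters]
    # Assumption: evenly spaced hues stand in for matplotlib's "hsv" map.
    # The colors therefore depend on the total number of clusters.
    return [colorsys.hsv_to_rgb(i / n_clusters, 1.0, 1.0) for i in range(n_clusters)]


print(cluster_colors(3))
print(len(cluster_colors(15)))
```

With the fixed palette, cluster 1 is always red and cluster 2 always blue, which keeps plots comparable between runs; beyond 12 clusters a given cluster's color shifts as the cluster count changes.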
A vertical bar plot gives an overview of the cluster sizes. Each bar's color matches the cluster's color in the linear projection and in the dendrogram.
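The bar heights are simply the number of frames assigned to each cluster. A minimal sketch of that count, using a made-up list of per-frame labels and a hypothetical helper name `cluster_sizes`:

```python
from collections import Counter


def cluster_sizes(labels):
    """Count the frames in each cluster, ordered by cluster number."""
    counts = Counter(labels)
    return {cluster: counts[cluster] for cluster in sorted(counts)}


# Made-up cluster label for each of 8 frames.
frame_labels = [1, 1, 2, 1, 3, 3, 2, 1]
sizes = cluster_sizes(frame_labels)
print(sizes)
```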
2D distance projection
A 2D projection of the distances (RMSD) between the representative frames of each cluster is made. The method used is multidimensional scaling (MDS) from the scikit-learn Python module. The relative distances between clusters make it possible to follow how the clusters evolve. The color of each point is the same as in the other graphs (i.e. the cluster's color), and the radius of each point depends on the cluster's spread.
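To illustrate what such a projection does, here is a sketch of classical (Torgerson) MDS written with NumPy only. This is a deliberate substitution: it is not the scikit-learn implementation mentioned above, which is iterative and may give different coordinates. The function name `classical_mds` and the distance values are assumptions made for the example.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Embed points in n_components dimensions from a distance matrix."""
    n = dist.shape[0]
    # Double-centre the squared distances to get a Gram (inner product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the largest eigenvalues; clip tiny negatives caused by rounding.
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Made-up distances between three cluster representatives (a 3-4-5 triangle).
dist = np.array([
    [0.0, 3.0, 4.0],
    [3.0, 0.0, 5.0],
    [4.0, 5.0, 0.0],
])
coords = classical_mds(dist)

# Distances measured in the 2D projection.
projected = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.round(projected, 3))
```

Because these three distances can be realised exactly in a plane, the projected distances match the input. With real RMSD values the projection is generally only an approximation.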
Distance matrix plot
A plot of the distance matrix is also made, which makes it easy to visualize the distance between any two frames.
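For readers unfamiliar with this kind of matrix, the sketch below builds a frame-by-frame RMSD matrix from raw coordinates with NumPy. It is a simplified illustration: the frames are compared without any structural superposition, the coordinates are made up, and `pairwise_rmsd` is a hypothetical helper rather than a function of TTClust or MDTraj.

```python
import numpy as np


def pairwise_rmsd(xyz):
    """RMSD between every pair of frames, with no superposition applied.

    xyz has shape (n_frames, n_atoms, 3).
    """
    diff = xyz[:, None, :, :] - xyz[None, :, :, :]  # (frames, frames, atoms, 3)
    squared = (diff ** 2).sum(axis=-1)              # squared distance per atom
    return np.sqrt(squared.mean(axis=-1))           # average over atoms


# Made-up trajectory: 3 frames of 4 atoms, each frame a rigid shift of frame 0.
frame0 = np.zeros((4, 3))
frame1 = frame0 + np.array([1.0, 0.0, 0.0])
frame2 = frame0 + np.array([0.0, 2.0, 0.0])
xyz = np.stack([frame0, frame1, frame2])

matrix = pairwise_rmsd(xyz)
print(np.round(matrix, 3))
```

The result is symmetric with a zero diagonal, which is why such a plot is mirrored about its diagonal.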