Transcriptome Platform Genomic Service Biology department Ecole Normale Superieure


Finding co-expressed genes - clustering methods

Version 1.9

This tutorial can be downloaded as a PDF file (2 Mb).

J-Express:

The J-Express software classifies genes according to their expression profiles. Several clustering methods are available, such as:

  • Principal Component Analysis (PCA)
  • K-means clustering
  • Hierarchical clustering

We propose the following analysis:

  1. Open an expression matrix in J-Express. To do so, launch the software and select “Load Data”. A new window opens. Select “Get Data”, then “Set Data End”, and click on the bottom rightmost value of your data table. Next, select “Set Data Start” and click on the upper leftmost value of your data table. Be careful to select a ratio value, not a gene name or an experiment name. Click “Ok” to load the data into J-Express. If the data have loaded successfully, the number of genes found appears in the upper left part of the software window.



  2. Perform a PCA using J-Express. In the resulting plot, most invariant genes lie at the centre, while the most variable genes are spread out along the axes. You can select genes (by encircling them with the mouse) to obtain a list of these genes and their profiles.
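
To illustrate why invariant genes end up at the centre of the PCA plot, here is a minimal Python sketch. It is not how J-Express is implemented: the gene names, the log-ratio values, the distance threshold and all function names are hypothetical, and the two principal components are approximated by a simple power iteration with deflation.

```python
import math
import random


def center_columns(matrix):
    """Subtract each experiment's (column's) mean from every gene (row)."""
    n_genes = len(matrix)
    n_exps = len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_genes for j in range(n_exps)]
    return [[row[j] - means[j] for j in range(n_exps)] for row in matrix]


def covariance(centered):
    """Experiment-by-experiment covariance of a column-centred matrix."""
    n_genes = len(centered)
    n_exps = len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n_genes - 1)
         for j in range(n_exps)]
        for i in range(n_exps)
    ]


def leading_component(cov, iterations=200, seed=0):
    """Approximate the leading eigenvector and eigenvalue by power iteration."""
    rng = random.Random(seed)
    size = len(cov)
    vec = [rng.random() + 0.1 for _ in range(size)]
    for _ in range(iterations):
        product = [sum(cov[i][j] * vec[j] for j in range(size))
                   for i in range(size)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break
        vec = [x / norm for x in product]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    eigenvalue = sum(
        vec[i] * sum(cov[i][j] * vec[j] for j in range(size))
        for i in range(size)
    )
    return vec, eigenvalue


def pca_scores_2d(matrix):
    """Project every gene onto the first two principal components."""
    centered = center_columns(matrix)
    cov = covariance(centered)
    size = len(cov)
    pc1, value1 = leading_component(cov)
    # Deflation: remove the first component so the next run finds the second.
    deflated = [[cov[i][j] - value1 * pc1[i] * pc1[j] for j in range(size)]
                for i in range(size)]
    pc2, _ = leading_component(deflated, seed=1)
    return [
        (sum(r * a for r, a in zip(row, pc1)),
         sum(r * a for r, a in zip(row, pc2)))
        for row in centered
    ]


def select_variable_genes(names, scores, min_distance):
    """Keep genes lying far from the plot centre (a stand-in for encircling)."""
    return [name for name, (x, y) in zip(names, scores)
            if math.hypot(x, y) > min_distance]


# Hypothetical log-ratio matrix: one row per gene, one column per experiment.
expression = {
    "geneA": [0.0, 0.1, -0.1, 0.0],
    "geneB": [0.1, 0.0, 0.0, -0.1],
    "geneC": [-0.1, 0.0, 0.1, 0.0],
    "geneD": [0.0, -0.1, 0.0, 0.1],
    "geneUp": [0.2, 1.5, 2.5, 3.0],
    "geneDown": [-0.1, -1.4, -2.6, -2.9],
}
names = list(expression)
scores = pca_scores_2d([expression[name] for name in names])
selected = select_variable_genes(names, scores, min_distance=1.0)
print(selected)
```

With these made-up values, the four nearly flat genes project close to the origin, while the two genes with strong trends fall far from it and are the only ones selected.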





TIGR MultiExperiment Viewer (MeV):

As with J-Express, MeV offers several clustering methods. However, the MeV PCA visualization is not very user-friendly, and selecting interesting genes with it is quite difficult, so we prefer J-Express for that step. We will use MeV to continue the analysis begun with J-Express:

  1. Load in MeV the file you used previously in J-Express. The file format (Stanford file) is the same. As with J-Express, you must select the first data cell; because MeV can load several annotation columns, this indicates where the expression values begin.





  2. Next, perform a K-means clustering (KMC). You can change the number of groups (clusters) to see how it influences the classification of the profiles.



  3. With MeV, you can get a global view of all clusters or display each one individually. By right-clicking on an expression cluster, you can save the corresponding expression matrix or only the names of the genes it contains. In addition, the “all cluster view” function automatically scales the display to the largest variation in gene expression.





  4. You can display the “centroid view”, which shows the mean profile of the cluster and lets you evaluate the variability of gene expression within it.



  5. You can also apply a hierarchical clustering method. It is crucial to restrict this method to a small subset of genes, so that the analysis is not disturbed by invariant profiles, which also slow down the calculation. To help with this filtering step, you can use the filtering options found in GEPAS (see the tutorial on pre-treatment of expression matrices). Because hierarchical clustering displays all the profiles, keeping too many genes makes the result impossible to analyse.




  6. MeV can perform a hierarchical clustering on a cluster obtained with the K-means algorithm. This combines the group detection power of the K-means algorithm with the gene ordering of hierarchical clustering.



  7. MeV allows you to plot the expression profile of each gene across all the loaded experiments.
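
The K-means and centroid view steps above (steps 2 to 4) can be sketched in Python as follows. This is only an illustration under stated assumptions, not MeV's own code: it uses Euclidean distance and a seeded random choice of starting centroids, and the gene names, values and function names are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_profile(rows):
    """Per-experiment mean of a list of profiles."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def kmeans(profiles, k, iterations=100, seed=0):
    """Assign each gene to one of k clusters; returns (assignment, centroids)."""
    names = list(profiles)
    rng = random.Random(seed)
    # Assumption: start from k randomly chosen genes.
    centroids = [list(profiles[n]) for n in rng.sample(names, k)]
    assignment = {}
    for _ in range(iterations):
        new_assignment = {
            name: min(range(k),
                      key=lambda c: euclidean(profiles[name], centroids[c]))
            for name in names
        }
        if new_assignment == assignment:
            break  # no gene changed cluster: converged
        assignment = new_assignment
        for c in range(k):
            members = [profiles[n] for n in names if assignment[n] == c]
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = mean_profile(members)
    return assignment, centroids


def centroid_view(profiles, assignment, cluster):
    """Mean profile of one cluster and its per-experiment standard deviation."""
    members = [profiles[n] for n, c in assignment.items() if c == cluster]
    centroid = mean_profile(members)
    spread = [
        math.sqrt(sum((row[j] - centroid[j]) ** 2 for row in members)
                  / len(members))
        for j in range(len(centroid))
    ]
    return centroid, spread


# Hypothetical log-ratio profiles over three experiments.
profiles = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [1.2, 2.2, 2.8],
    "g3": [0.8, 1.8, 3.2],
    "g4": [-1.0, -2.0, -3.0],
    "g5": [-1.2, -1.8, -3.2],
    "g6": [-0.8, -2.2, -2.8],
}
assignment, centroids = kmeans(profiles, k=2)
for cluster in range(2):
    genes = sorted(n for n, c in assignment.items() if c == cluster)
    centroid, spread = centroid_view(profiles, assignment, cluster)
    print(cluster, genes, centroid, spread)
```

Re-running `kmeans` with a different `k` is the scripted equivalent of changing the number of groups in step 2; the `spread` values play the role of the variability shown in the centroid view.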



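The filtering and hierarchical clustering described in steps 5 and 6 can be sketched as follows. This is a simplified illustration, not the GEPAS or MeV implementation: the variance threshold stands in for the GEPAS filtering options, average linkage with a Pearson correlation distance is just one possible choice, and all gene names, values and function names are hypothetical.

```python
import math


def variance(profile):
    """Sample variance of one expression profile."""
    mean = sum(profile) / len(profile)
    return sum((x - mean) ** 2 for x in profile) / (len(profile) - 1)


def filter_variable_genes(profiles, min_variance):
    """Drop invariant profiles before clustering."""
    return {g: p for g, p in profiles.items() if variance(p) >= min_variance}


def pearson_distance(a, b):
    """1 - Pearson correlation: 0 for identical shapes, 2 for opposite ones."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # a flat profile is treated as uncorrelated
    return 1.0 - cov / (norm_a * norm_b)


def hierarchical_order(profiles):
    """Average-linkage agglomeration; returns genes in dendrogram leaf order."""
    if not profiles:
        return []
    clusters = [[gene] for gene in profiles]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                dist = sum(pearson_distance(profiles[a], profiles[b])
                           for a, b in pairs) / len(pairs)
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters)
                    if idx not in (i, j)] + [merged]
    return clusters[0]


# Hypothetical log-ratio profiles over four experiments.
profiles = {
    "flat1": [0.0, 0.1, 0.0, -0.1],
    "flat2": [0.1, 0.0, -0.1, 0.0],
    "up1": [0.0, 1.0, 2.0, 3.0],
    "up2": [0.1, 1.2, 1.9, 3.1],
    "down1": [0.0, -1.0, -2.0, -3.0],
    "down2": [-0.1, -0.9, -2.2, -2.8],
}
kept = filter_variable_genes(profiles, min_variance=0.5)
order = hierarchical_order(kept)
print(order)
```

For step 6, the same `hierarchical_order` function could be applied to the genes of a single K-means cluster rather than to the whole filtered matrix. The pairwise search is quadratic at every merge, which is one reason filtering out invariant genes first matters.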
This page is also available in French | Last page update: 9/7/2011 - 11:28