Finding co-expressed genes - clustering methods
Version 1.9
This tutorial can be downloaded as
a PDF file (2 Mb).
J-Express:
The J-Express software classifies genes according
to their expression profiles. Several methods are available, such
as:
- Principal Component Analysis (PCA)
- K-means clustering
- Hierarchical clustering

We propose the following analysis:
- Open an expression matrix in J-Express. To do so, launch the software
and select “Load Data”. A new window opens. Select “Get
Data”, then “Set Data End”, and click on the rightmost value in the
last line of your data table. Then select “Set Data Start”
and click on the upper leftmost value of your data table. Be careful to
select a ratio value, not a gene name or an experiment name. Confirm by clicking
“Ok” to load the data into J-Express. If the data have been loaded
successfully, the number of genes found appears in the upper left
part of the software window.
- Perform a PCA using J-Express. In the resulting plot, the bulk of the
genes, whose expression is invariant, lie at the centre, while the most
variable genes are spread out along the axes. You can select genes (by
encircling them with the mouse) to obtain a gene list and their profiles.
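
The idea behind the PCA step can be sketched in a few lines of Python. This is only a dependency-free illustration, not the J-Express implementation: it estimates the first two principal axes by power iteration, projects each gene onto them, and ranks genes by their distance from the plot centre. The gene names, the ratio values and the function names (`leading_axis`, `pca_2d`, `most_variant`) are all made up for the example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def leading_axis(rows, iterations=500):
    """Estimate the direction of largest variance by power iteration."""
    dim = len(rows[0])
    # Deterministic, non-uniform start vector (an arbitrary choice).
    axis = [1.0 / (j + 1) for j in range(dim)]
    norm = math.sqrt(dot(axis, axis))
    axis = [v / norm for v in axis]
    for _ in range(iterations):
        scores = [dot(row, axis) for row in rows]
        new_axis = [sum(s * row[j] for s, row in zip(scores, rows))
                    for j in range(dim)]
        norm = math.sqrt(dot(new_axis, new_axis))
        if norm < 1e-12:  # no variance left in the data
            break
        axis = [v / norm for v in new_axis]
    return axis

def pca_2d(matrix):
    """Return the (PC1, PC2) coordinates of every gene (one row per gene)."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    centred = [[v - m for v, m in zip(row, means)] for row in matrix]
    pc1 = leading_axis(centred)
    # Deflation: remove what PC1 explains, then look for the next axis.
    residual = [[v - dot(row, pc1) * a for v, a in zip(row, pc1)]
                for row in centred]
    pc2 = leading_axis(residual)
    return [(dot(row, pc1), dot(row, pc2)) for row in centred]

def most_variant(names, matrix, top):
    """Names of the `top` genes lying farthest from the plot centre."""
    coords = pca_2d(matrix)
    ranked = sorted(zip(names, coords),
                    key=lambda item: math.hypot(*item[1]), reverse=True)
    return [name for name, _ in ranked[:top]]

# Hypothetical log-ratios: four nearly invariant genes and two variable ones.
names = ["g1", "g2", "g3", "g4", "g5", "g6"]
matrix = [
    [0.10, 0.00, -0.10, 0.00],
    [0.00, 0.10, 0.00, -0.10],
    [2.00, 2.50, 3.00, 2.80],
    [-2.20, -2.40, -3.10, -2.60],
    [0.05, -0.05, 0.10, 0.00],
    [-0.10, 0.00, 0.05, 0.10],
]
print(most_variant(names, matrix, top=2))
```

Ranking by distance from the centre plays the role of encircling the outlying points with the mouse: the nearly invariant genes stay close to the origin and are left out of the list.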

TIGR Multiexperiment Viewer (MeV):
Like J-Express, MeV offers several clustering methods.
The MeV PCA visualization is not very user-friendly and makes it difficult
to select interesting genes, so we prefer J-Express for that step. We will
use MeV to continue the analysis we began with J-Express:
- Load the file you used previously in J-Express into MeV. The file format
(Stanford file) is the same. As with J-Express, you must select the first data
cell; because MeV can load several annotation columns, this selection tells it
where the expression values begin.
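
To make the role of the first data cell concrete, here is a small Python sketch that reads a tab-delimited matrix once the position of that cell is known. It is an illustration only and does not reproduce the MeV loader: the layout (one header line, gene identifier in the first column, annotation columns before the values), the sample values and the function name `load_matrix` are assumptions.

```python
import csv
import io

def load_matrix(text, first_data_row, first_data_col):
    """Split a tab-delimited table into experiments, genes, annotations, values.

    first_data_row / first_data_col are the 0-based coordinates of the
    upper leftmost expression value (the "first data cell").
    """
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    experiments = rows[0][first_data_col:]
    genes, annotations, values = [], [], []
    for row in rows[first_data_row:]:
        genes.append(row[0])
        # Every column between the identifier and the first value is annotation.
        annotations.append(row[1:first_data_col])
        # Empty cells are kept as missing values (None).
        values.append([float(c) if c.strip() else None
                       for c in row[first_data_col:]])
    return experiments, genes, annotations, values

# Hypothetical file content: one identifier column, one annotation column.
sample = (
    "ID\tNAME\texp1\texp2\texp3\n"
    "gene1\tannotA\t0.12\t-0.40\t1.30\n"
    "gene2\tannotB\t\t0.85\t-0.22\n"
)
experiments, genes, annotations, values = load_matrix(
    sample, first_data_row=1, first_data_col=2)
print(experiments, genes, values)
```

Moving `first_data_col` one column to the left would make the parser try to read an annotation as a number, which is why a ratio value, and not a name, has to be selected.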


- Next, perform a K-means clustering (KMC). You can change the number of
groups to see how it influences the classification of the profiles.
- With MeV, you can get a global view of all clusters or display each one
individually. By right-clicking on an expression cluster, you can save the
corresponding expression matrix or only the names of the genes it contains.
In addition, the “all cluster view” function automatically sets its scale
to the largest variation in gene expression.
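
For readers who want to see what K-means does with the profiles, the following Python sketch implements the basic assign-and-update loop. It is not the MeV code: the initial centroids are simply the first k profiles (MeV's own initialisation may differ), the distance is Euclidean, and the profiles are invented values.

```python
import statistics

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def mean_profile(profiles):
    return [sum(col) / len(profiles) for col in zip(*profiles)]

def kmeans(profiles, k, max_iterations=100):
    """Return (labels, centroids) after iterating until assignments are stable."""
    # Assumption: start from the first k profiles (deterministic, simplistic).
    centroids = [list(p) for p in profiles[:k]]
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:  # nothing moved: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_profile(members)
    return labels, centroids

def cluster_spread(profiles, labels, cluster):
    """Standard deviation per experiment inside one cluster."""
    members = [p for p, lab in zip(profiles, labels) if lab == cluster]
    return [statistics.pstdev(col) for col in zip(*members)]

# Hypothetical profiles: three induced genes and three repressed genes.
profiles = [
    [1.0, 2.0, 3.0],
    [1.2, 2.1, 2.9],
    [-1.0, -2.0, -3.0],
    [0.9, 1.8, 3.2],
    [-1.1, -2.2, -2.8],
    [-0.8, -1.9, -3.1],
]
for k in (2, 3):
    labels, centroids = kmeans(profiles, k)
    sizes = [labels.count(c) for c in range(k)]
    print(k, "groups, cluster sizes:", sizes)
```

Running the loop with several values of k and comparing the cluster sizes and spreads is a quick way to see how the number of groups changes the classification.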

- You can display the “centroid view”, which shows the mean profile of the
cluster and indicates the variability of gene expression within it.
- You can also apply a hierarchical clustering method. It is crucial to restrict
this method to a small subset of genes, so that the analysis is not cluttered by
invariant profiles, which also slow down the calculation. To help with this
filtering step, you can use the filtering options found in GEPAS (see the
tutorial on expression matrices pre-treatment). Hierarchical clustering displays
all the profiles, so keeping too many genes makes the result impossible to analyse.
- MeV can perform a hierarchical clustering on a cluster obtained with the
K-means algorithm. This combines the power of K-means for detecting groups
with the gene ordering provided by hierarchical clustering.
- MeV lets you plot the expression profile of each gene across all the
experiments loaded.
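
The filtering and hierarchical clustering steps described above can be sketched as follows. This is a naive illustration under stated assumptions, not what MeV or GEPAS actually run: genes are filtered by keeping the highest variances, the distance is Euclidean, and clusters are merged by average linkage. The gene names and values are invented.

```python
import math
import statistics

def filter_by_variance(names, profiles, keep):
    """Keep the `keep` genes whose profiles vary most across experiments."""
    ranked = sorted(zip(names, profiles),
                    key=lambda item: statistics.pvariance(item[1]),
                    reverse=True)
    return ranked[:keep]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def average_distance(profiles, left, right):
    total = sum(euclidean(profiles[i], profiles[j])
                for i in left for j in right)
    return total / (len(left) * len(right))

def leaf_order(profiles):
    """Agglomerative clustering; returns gene indices in dendrogram order."""
    if not profiles:
        return []
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > 1:
        pairs = [(a, b) for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        # Merge the two clusters with the smallest average pairwise distance.
        a, b = min(pairs, key=lambda ab: average_distance(
            profiles, clusters[ab[0]], clusters[ab[1]]))
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters)
                    if i not in (a, b)] + [merged]
    return clusters[0]

# Hypothetical data: two flat genes, two induced genes, two repressed genes.
names = ["flat1", "up1", "down1", "flat2", "up2", "down2"]
profiles = [
    [0.0, 0.1, 0.0, -0.1],
    [0.0, 1.0, 2.0, 3.0],
    [0.0, -1.0, -2.0, -3.0],
    [0.1, 0.0, -0.1, 0.0],
    [0.2, 1.1, 2.1, 2.9],
    [-0.1, -1.2, -1.9, -3.1],
]
kept = filter_by_variance(names, profiles, keep=4)
kept_names = [name for name, _ in kept]
kept_profiles = [profile for _, profile in kept]
order = [kept_names[i] for i in leaf_order(kept_profiles)]
print(order)
```

The pairwise search is repeated at every merge, so the cost grows quickly with the number of genes. That is one reason for filtering out invariant profiles first. The same ordering function could also be applied to the members of a single K-means cluster.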
Useful links:
- J-Express, one of the first clustering programs available.
The principal component analysis part is still very clear and easy to use.
However, this software is no longer free for academic users, and newer versions
must be purchased. Website: http://www.ii.uib.no/%7Ebjarted/jexpress/
- TIGR Multi-experiment Viewer, a very complete set of
tools including a wide variety of clustering techniques. Website: http://www.tigr.org/software/tm4/mev.html
- Genesis, a very complete clustering program that uses the most
classical methods to search for co-regulated genes. It also offers many
tools for mining microarray results using Gene Ontology. Website: http://genome.tugraz.at/Software/Genesis/Description.html