
Using yMGV - yMGV features - Troubleshooting


If you didn't find an answer to your question in the above list, please email us.



Using yMGV

Can I filter the data so that I will only see the more interesting datasets?

Yes, filtering options are provided in the left frame. Available options are:
   -> no: no filtering
   -> 1.5: show only experiments where the gene is regulated (up or down) 1.5-fold, i.e. abs(log2(ratio)) > 0.58, at least once
   -> 2: show only experiments where the gene is regulated (up or down) 2-fold, i.e. abs(log2(ratio)) > 1, at least once
   -> 3: show only experiments where the gene is regulated (up or down) 3-fold, i.e. abs(log2(ratio)) > 1.58, at least once
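
The correspondence between fold thresholds and log2 cut-offs listed above can be checked with a short Python sketch. This is only an illustration, not yMGV code: the function name passes_filter and the handling of non-positive ratios are assumptions.

```python
import math

def passes_filter(ratios, fold):
    """Return True if at least one ratio is regulated (up or down) beyond `fold`.

    `fold` of 0 means "no filtering". Non-positive ratios are skipped
    because their log2 is undefined (an assumption of this sketch).
    """
    if not fold:
        return True
    cutoff = math.log2(fold)
    return any(r > 0 and abs(math.log2(r)) > cutoff for r in ratios)

# Fold thresholds and their log2 equivalents: 1.5 -> 0.58, 2 -> 1.0, 3 -> 1.58
for fold in (1.5, 2, 3):
    print(fold, round(math.log2(fold), 2))
```

Working on log2(ratio) makes induction and repression symmetric: a ratio of 0.5 and a ratio of 2 are both at distance 1 from log2 = 0.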

How do I visualise ortholog expression data?

Access 1: the S. pombe orthologs are accessible when interrogating yMGV via the "Show orthologs" checkbox of the "Analyse one gene" access. The S. cerevisiae <--> S. pombe orthologs table was compiled by Valerie Wood (val@sanger.ac.uk).
Brief assignment methods: all inferences of orthology/homology in this dataset have been assessed manually by pairwise or multiple alignment. This provides many advantages over automated analysis of BLAST/FASTA results: although global automated comparisons are useful to provide an overview of a genome and to assess its level of redundancy, they are not accurate enough for evaluating functional genomics datasets when specific functional inferences are made for individual genes.
Coverage information:
S. cerevisiae ortholog assigned            3320
S. cerevisiae ortholog not yet assigned     477
No S. cerevisiae ortholog                  1172
Total number of CDS                        4969
More detailed information about this work is available here.

Access 3: you can specify filters on experiments/publications from different organisms. In this case, a supplemental filter is implicitly added to remove all ORFs that are not present in the orthologs table. If either ORF of a pair does not meet its organism-specific criterion, the pair is discarded.
For example, if you enter a query like:
- I want to select ORFs from experiment L of S.cerevisiae with ratio > 2.5
and
- I want to select ORFs from experiment N of S.cerevisiae with ratio < 0.33
and
- I want to select ORFs from experiment M of S.pombe with ratio > 3.5
yMGV will return all the ORF pairs having 1) an S.cerevisiae ORF meeting the two S.cerevisiae criteria and 2) an S.pombe ORF meeting the S.pombe criterion.
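
The pair selection described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the actual yMGV query: the ORF identifiers, the ratio values and the function name select_pairs are all hypothetical.

```python
def select_pairs(orthologs, ratios, criteria):
    """Keep ortholog pairs whose ORFs meet every criterion of their organism.

    orthologs: list of (S. cerevisiae ORF, S. pombe ORF) pairs
    ratios:    dict mapping (organism, experiment, orf) -> ratio
    criteria:  list of (organism, experiment, predicate on the ratio)
    """
    selected = []
    for sc_orf, sp_orf in orthologs:
        orf_for = {"S.cerevisiae": sc_orf, "S.pombe": sp_orf}

        def meets(criterion):
            organism, experiment, predicate = criterion
            ratio = ratios.get((organism, experiment, orf_for[organism]))
            # A missing measurement counts as a failed criterion.
            return ratio is not None and predicate(ratio)

        if all(meets(c) for c in criteria):
            selected.append((sc_orf, sp_orf))
    return selected

# Made-up identifiers and ratios, for illustration only.
orthologs = [("sc_A", "sp_A"), ("sc_B", "sp_B")]
ratios = {
    ("S.cerevisiae", "L", "sc_A"): 3.0, ("S.cerevisiae", "N", "sc_A"): 0.2,
    ("S.pombe", "M", "sp_A"): 4.0,
    ("S.cerevisiae", "L", "sc_B"): 3.0, ("S.cerevisiae", "N", "sc_B"): 0.2,
    ("S.pombe", "M", "sp_B"): 1.0,  # fails the S.pombe criterion
}
criteria = [
    ("S.cerevisiae", "L", lambda r: r > 2.5),
    ("S.cerevisiae", "N", lambda r: r < 0.33),
    ("S.pombe", "M", lambda r: r > 3.5),
]
print(select_pairs(orthologs, ratios, criteria))
```

In this example only the first pair is kept: the second pair is discarded because its S. pombe ORF does not meet the S. pombe criterion, even though its S. cerevisiae ORF meets both S. cerevisiae criteria.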

Warning on ortholog comparison queries

Using an ortholog table with the idea that sequence similarity equals functional conservation is dangerous. When looking at ortholog expression, one has to keep in mind that this sequence/function relationship is fuzzy.
Similar expression of orthologous genes during the same biological process or stress exposure can give some interesting hints about the underlying regulation, but one has to keep in mind that this kind of relationship can occur by chance. This is especially true when you are looking at conditions affecting a large part of the transcriptome.
The yMGV philosophy is to provide tools that help biologists dig into genomic data in order to find interesting tips/ideas about the biological system they are studying. This ortholog expression comparison tool has been designed with this idea in mind. You should not conclude anything just by looking at the transcriptome of orthologous genes; it is only an indication that has to be confirmed by other data.

How is Gene Ontology (GO) annotation done?

The goal of the Gene Ontology(TM) Consortium is to produce a dynamic controlled vocabulary that can be applied to all organisms even as knowledge of gene and protein roles in cells is accumulating and changing.
Please visit GO project homepage at http://www.geneontology.org for complete information.

Things to know about GO before surfing the yMGV website:

        -> (C) = Cellular Component
                  Subcellular structures, locations, and macromolecular complexes.
                  Examples include nucleus, telomere, and origin recognition complex.
        -> (M) = Molecular Function
                  The tasks performed by individual gene products.
                  Examples are transcription factor and DNA helicase.
        -> (P) = Biological Process
                  Broad biological goals that are accomplished by ordered assemblies of molecular functions.
                  Examples are mitosis or purine metabolism.
        A full listing of the terms used for C, M and P is available at http://www.geneontology.org/#ontologies
 
GO evidence codes:
IC inferred by curator
IDA inferred from direct assay
IEA inferred from electronic annotation
IEP inferred from expression pattern 
IGI inferred from genetic interaction
IMP inferred from mutant phenotype
IPI inferred from physical interaction
ISS inferred from sequence or structural similarity
NAS non-traceable author statement
ND no biological data available
NR not recorded
TAS traceable author statement
Full details about the usage of GO evidence codes are available at http://www.geneontology.org/doc/GO.evidence.html
Currently, most of the annotations are classified IEA (inferred from electronic annotation). For statistics on current annotations, see http://www.geneontology.org/#annotations

Can I use other filters than 1.5, 2 or 3?

You can use another cut-off only with access 3, when comparing experiments. For access 1 and 2, yMGV uses pre-calculated tables containing the information needed for filtering by cut-off; these statistics are precomputed only for thresholds 1.5, 2 and 3.

How do I check that a ratio is not artefactual?

yMGV was designed to store a ratio and a significance score for each measurement. However, most of the published datasets do not contain such data, so this feature is not currently in use. We recommend going to the publication website (you can find the URL in the publication description page) and looking for the original intensities and background data, if available.

What does the overall log2 ratio distribution mean?

The standard deviation gives you an idea of the overall distribution of the log2(ratio) for the experiment. Usually, most genes have unchanged expression and you should observe roughly a bell curve centered on log2=0 (ratio=1).
A distribution very different from this model curve can be obtained if the expression dataset compares two very different mRNA samples, or when the microarray quality or the computational post-processing was not good.
The standard deviation of ratio over one microarray dataset is calculated using the formula: 
Note: For the statistics calculation, we exclude ratios less than or equal to 0 and ratios greater than or equal to 100 (see here for details).
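
As a rough illustration of the exclusion rule in the note above, the following Python sketch computes a standard deviation of log2(ratio) after discarding ratios <= 0 and >= 100. The exact yMGV formula is not reproduced on this page, so the use of the population standard deviation on log2 values, as well as the function name log2_ratio_stdev, are assumptions.

```python
import math
import statistics

def log2_ratio_stdev(ratios):
    """Standard deviation of log2(ratio) over one dataset.

    Ratios <= 0 or >= 100 are excluded, as stated in the note above.
    Assumption: population standard deviation (pstdev) of the log2 values;
    the formula actually used by yMGV may differ.
    """
    kept = [math.log2(r) for r in ratios if 0 < r < 100]
    return statistics.pstdev(kept)

# 0.5 and 2.0 are kept (log2 = -1 and +1); -1 and 250 are excluded.
print(log2_ratio_stdev([0.5, 2.0, -1, 250]))
```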

How do I interpret the statistics provided about ORFs?

A query from "Access 1: view one gene profile" gives you some basic statistics on the gene's variation across all the experiments contained in the yMGV database.
            1.5          2            3
changed     323 (31%)    169 (16%)    69 (7%)
induced     146 (14%)    81 (8%)      40 (4%)
repressed   177 (17%)    88 (9%)      29 (3%)
This table presents a view of the ORF's transcriptional variations across all the experiments contained in yMGV. The first line gives the threshold used to decide that ORF transcription is changed. "induced" means that only experiments where the ratio was above the threshold were counted, "repressed" means that only experiments where the ratio was less than 1/threshold were counted, and "changed" means that variations up or down were both counted.
The next figure presents the variation of all the ORFs contained in yMGV using "changed" and thresholds 1.5, 2 and 3.
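
The "changed", "induced" and "repressed" counts described above can be reproduced for one ORF with a short Python sketch. The ratios below are made up and the function name orf_variation is hypothetical; the sketch only illustrates the counting rule, not the precomputed yMGV tables.

```python
def orf_variation(ratios, thresholds=(1.5, 2, 3)):
    """Count experiments where one ORF is induced, repressed or changed."""
    stats = {}
    for t in thresholds:
        induced = sum(1 for r in ratios if r > t)        # ratio above threshold
        repressed = sum(1 for r in ratios if r < 1 / t)  # ratio below 1/threshold
        stats[t] = {
            "changed": induced + repressed,  # variation up or down
            "induced": induced,
            "repressed": repressed,
        }
    return stats

# Made-up ratios of one ORF across ten experiments.
example = [4.0, 2.5, 1.6, 1.0, 0.6, 0.4, 0.2, 1.1, 0.9, 1.0]
for threshold, counts in orf_variation(example).items():
    line = ", ".join(
        f"{name} {n} ({round(100 * n / len(example))}%)" for name, n in counts.items()
    )
    print(threshold, line)
```

Because an experiment cannot be both induced and repressed for the same ORF, "changed" is simply the sum of the two other counts, as in the table above (for example 146 + 177 = 323).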

How do I interpret the statistics provided about experiments?

Some basic statistics are available for each individual experiment. You can see them by clicking on any publication name.
General rules for statistics calculation are: Individual statistical elements are:
PLEASE NOTE: general rules described above are applied.

How do I create a direct link to yMGV pages?

A special script is provided to link to yMGV from remote servers.
This page is available at: http://www.transcriptome.ens.fr/ymgv/incoming.php
Parameters are (name, description, accepted values):
   -> access: selects access 1 (one gene) or access 2 (list of genes). If not specified, access 1 is used. Values: 1 (access 1), 2 (access 2)
   -> organism_id: the internal yMGV organism identifier. If not provided, it is set to 0 (S. cerevisiae search). Values: 0 (S. cerevisiae), 1 (S. pombe)
   -> query: the ORF/gene name (access 1) or list of ORFs/genes (access 2). For access 2, the list must be a string without spaces. Values: any string without spaces (case insensitive)
   -> onlysig: ratio-based filtering option. If not provided, it is set to 3. Values: 0, 1.5, 2, 3
   -> show_orthologs: if equal to 'ok', show the expression profiles of orthologs from other organisms. If not provided, it is set to 'ok'. Values: ok, no
The script name and the parameters are delimited by '?'. The parameters are appended to the URL of the script, separated by '&'.
If you are not using a parameter, you don't need to add it to the address.
Please check the spelling of your parameters: a misspelled parameter is silently ignored, without any error message (except for query).
Examples:
yMGV information for S.cerevisiae PDR5 gene:
http://www.transcriptome.ens.fr/ymgv/incoming.php?organism_id=0&query=pdr5
yMGV information for S.cerevisiae PDR5 gene, with no ratio filter and no ortholog display:
http://www.transcriptome.ens.fr/ymgv/incoming.php?organism_id=0&query=pdr5&onlysig=0&show_orthologs=no
yMGV information for S.pombe CDC2 gene:
http://www.transcriptome.ens.fr/ymgv/incoming.php?organism_id=1&query=cdc2
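
Such links can also be built programmatically. The following Python sketch assembles the URL from the parameters documented above using the standard urllib.parse module; the helper name ymgv_link is hypothetical, and parameters left as None are simply omitted so that the script applies its defaults.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.transcriptome.ens.fr/ymgv/incoming.php"

def ymgv_link(query, organism_id=None, access=None, onlysig=None, show_orthologs=None):
    """Build a direct link to yMGV (hypothetical helper, not part of yMGV)."""
    candidates = [
        ("access", access),
        ("organism_id", organism_id),
        ("query", query),
        ("onlysig", onlysig),
        ("show_orthologs", show_orthologs),
    ]
    # Keep only the parameters that were actually provided.
    params = {name: value for name, value in candidates if value is not None}
    # '?' separates the script name from the parameters, '&' separates parameters.
    return BASE_URL + "?" + urlencode(params)

print(ymgv_link("pdr5", organism_id=0))
print(ymgv_link("pdr5", organism_id=0, onlysig=0, show_orthologs="no"))
print(ymgv_link("cdc2", organism_id=1))
```

The three calls reproduce the three example links given above.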


yMGV features

Do you plan to add other organisms to yMGV?

Yes, the database has been designed with this idea in mind. Many yeast species will be fully sequenced soon and we plan to add them when microarray datasets become available.
For more information, you can visit the homepages of the following yeast sequencing projects:
A mouse/rat version is also part of the project.
To add an organism, we just need microarray datasets, ORF descriptions in GO format and orthology tables between this organism and the organisms already present.

Can I use yMGV representations?

Yes, of course, but please cite the yMGV publications.

Can I download the whole dataset?

No, the data are used in yMGV with the permission of the authors, and we don't have authorization to distribute them. However, you can find the download URL of each publication in the details of the "publication description" page.

Can I do clustering online?

Not yet, but it will be one of the next improvements of yMGV.
You can use the external tool MiCOViTO to explore the yMGV S. cerevisiae datasets.

I'm going to publish a paper containing microarray results. Can I send you the data?

The best way is to distribute a file containing the data required by MIAME in MAGE format (see the MGED website). This will make your data compatible with all the major microarray databases.
For yMGV more specifically, you can send us a file containing one row per ORF and one column per experiment (please discard ratios corresponding to artefactual and low-intensity spots), or the address of your MAGE file.
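
A file with one row per ORF and one column per experiment could be written as in the Python sketch below. The tab delimiter, the header row, the empty cell for discarded spots and all names (write_matrix, orf_1, exp_1) are assumptions made for illustration; the exact layout expected by the yMGV team is not specified here.

```python
import csv
import io

def write_matrix(rows, experiments, out):
    """Write one row per ORF and one column per experiment (tab-delimited).

    rows: dict mapping ORF name -> {experiment name: ratio or None}
    A None (discarded artefactual or low-intensity spot) is written as an
    empty cell, which is an assumption of this sketch.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["orf"] + experiments)
    for orf, values in rows.items():
        cells = ["" if values.get(e) is None else values[e] for e in experiments]
        writer.writerow([orf] + cells)

# Made-up ORF names and ratios, for illustration only.
buffer = io.StringIO()
write_matrix(
    {"orf_1": {"exp_1": 2.1, "exp_2": None}, "orf_2": {"exp_1": 0.4, "exp_2": 1.0}},
    ["exp_1", "exp_2"],
    buffer,
)
print(buffer.getvalue())
```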

What software are you using?

yMGV is a PostgreSQL database; pages are generated by PHP scripts and sent to you by an Apache server. All of this software is free and open source, and you can find it as RPM packages for most Linux distributions.

What hardware are you using?

The development database computer is a dual Pentium III 450 MHz with three 9 GB SCSI 10k rpm hard drives and 256 MB of RAM.

How to cite yMGV?

Please cite the articles:
 
yMGV: helping biologists for yeast microarray data mining
      S. Le Crom, F. Devaux, C. Jacq and P. Marc
      Nucleic Acids Research, vol. 30(1), p. 76-79 (January 2002)
yMGV: a database for visualisation and data mining of published genome-wide yeast expression data
      P. Marc, F. Devaux, C. Jacq
      Nucleic Acids Research, vol. 29(13) (July 2001)

Who supported this work?

French Genopole network
Centre National de la Recherche Scientifique (CNRS)
French Cancer Research Association (ARC)


Troubleshooting

I don't see log2 distribution images

Log2 ratio distributions are saved as images in PNG format.
Recent browsers are able to display them; for older ones, you have to install a plug-in in order to view them.
PNG was designed to be the successor to the once-popular GIF format, which became decidedly less popular right around New Year's Day 1995 when Unisys and CompuServe suddenly announced that programs implementing GIF would require royalties, because of Unisys' patent on the LZW compression method used in GIF. Since GIF had been showing its age in a number of ways even prior to that, the announcement only catalyzed the development of a new and much-improved replacement format. PNG is the result.

How do I unsubscribe from yMGV mailing list?

If you want to unsubscribe, please send an email containing "unsubscribe your_email_pasted_here" to ymgv@biologie.ens.fr

Last updated August 5th 2003
The yMGV team