Using yMGV
yMGV features
Troubleshooting
If you didn't find an answer to your question in the above list,
please
email us.
Using yMGV
Can I filter the data so that I will only see the more interesting datasets?
Yes, filtering options are provided in the left frame. Available options
are:
-> no: no filtering
-> 1.5: show only experiments where the gene is regulated
(up or down) 1.5-fold, i.e. abs(log2) > 0.58, at least once
-> 2: show only experiments where the gene is regulated
(up or down) 2-fold, i.e. abs(log2) > 1, at least once
-> 3: show only experiments where the gene is regulated
(up or down) 3-fold, i.e. abs(log2) > 1.58, at least once
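The filter rule above can be sketched in a few lines of Python. This is only an illustration, not the yMGV code: the function name `passes_filter` is hypothetical, and treating a ratio that sits exactly on the cut-off as regulated is an assumption.

```python
import math

def passes_filter(ratios, fold):
    """Return True if at least one ratio is regulated `fold`-fold, up or down.

    `fold` is one of 0 (no filtering), 1.5, 2 or 3.
    Assumption: a ratio exactly on the cut-off counts as regulated.
    """
    if fold == 0:
        return True  # "no": every dataset is kept
    cutoff = math.log2(fold)  # 1.5 -> 0.58, 2 -> 1, 3 -> 1.58
    # Non-positive ratios have no log2 and are skipped.
    return any(r > 0 and abs(math.log2(r)) >= cutoff for r in ratios)

print(passes_filter([1.1, 0.4], 2))  # one ratio is repressed more than 2-fold
print(passes_filter([1.1, 0.9], 1.5))  # nothing reaches 1.5-fold
```

Working on abs(log2(ratio)) treats induction and repression symmetrically, which is why one cut-off covers both directions.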
How do I visualise ortholog expression data?
Access 1: the S. pombe orthologs are accessible when querying yMGV
via the "Show orthologs" checkbox of the "Analyse one
gene" access. The S. cerevisiae <--> S. pombe ortholog
table was compiled by Valerie Wood (val@sanger.ac.uk).
Brief assignment methods: all inferences of orthology/homology in this
dataset have been assessed manually by pairwise or multiple alignment.
This provides many advantages over automated analysis of BLAST/FASTA results.
Although global comparisons of this type are useful to provide an overview
of a genome and assess the level of redundancy, they are not accurate enough
to be useful for the evaluation of functional genomics datasets if specific
functional inferences are made for individual genes.
Coverage information:
| S. cerevisiae ortholog assigned | 3320 |
| S. cerevisiae ortholog not yet assigned | 477 |
| No S. cerevisiae ortholog | 1172 |
| Total number of CDS | 4969 |
More detailed information about this work is available here.
Access 3: you can specify filters on experiments/publications from different organisms. In this case, an additional filter is implicitly applied to exclude all ORFs that are not present in the ortholog table.
If one ORF of a pair does not meet its organism-specific criterion, the pair is discarded.
For example, if you enter a query like:
- I want to select ORFs from experiment L of S. cerevisiae with ratio > 2.5
and
- I want to select ORFs from experiment N of S. cerevisiae with ratio < 0.33
and
- I want to select ORFs from experiment M of S. pombe with ratio > 3.5
yMGV will return all the ORF pairs having 1) a S. cerevisiae ORF meeting the two S. cerevisiae criteria and 2) a S. pombe ORF meeting the S. pombe criterion.
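The pair selection described above can be sketched as follows. This is a minimal illustration, not the yMGV implementation: the ORF names, ratios and function names are all made up, and only the three criteria come from the example query.

```python
import operator

# Hypothetical ortholog table: (S. cerevisiae ORF, S. pombe ORF) pairs.
orthologs = [("scA", "spA"), ("scB", "spB"), ("scC", "spC")]

# Hypothetical ratios, indexed by experiment and then by ORF.
ratios = {
    "L": {"scA": 3.0, "scB": 3.1, "scC": 1.0, "scD": 4.0},
    "N": {"scA": 0.2, "scB": 0.9, "scC": 0.1, "scD": 0.1},
    "M": {"spA": 4.0, "spB": 5.0, "spC": 6.0},
}

# The three criteria of the example query: (experiment, comparison, cut-off).
cerevisiae_criteria = [("L", operator.gt, 2.5), ("N", operator.lt, 0.33)]
pombe_criteria = [("M", operator.gt, 3.5)]

def meets_all(orf, criteria):
    """True if the ORF has a ratio satisfying every criterion."""
    for experiment, compare, cutoff in criteria:
        ratio = ratios.get(experiment, {}).get(orf)
        if ratio is None or not compare(ratio, cutoff):
            return False
    return True

def select_pairs(pairs, sc_criteria, sp_criteria):
    """Keep a pair only if both of its ORFs meet their organism's criteria."""
    return [
        (sc, sp)
        for sc, sp in pairs
        if meets_all(sc, sc_criteria) and meets_all(sp, sp_criteria)
    ]

selected = select_pairs(orthologs, cerevisiae_criteria, pombe_criteria)
print(selected)
```

Iterating over the ortholog table, not over the measured ORFs, is what applies the implicit filter: scD meets both S. cerevisiae criteria but is never returned because it has no entry in the table.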
Warning on ortholog comparison queries
Using an ortholog table on the assumption that sequence similarity equals functional conservation is dangerous. When looking at ortholog expression, one has to keep in mind that this sequence/function relationship is fuzzy.
Similar expression of orthologous genes during the same biological process or stress exposure can give interesting hints about the underlying regulation, but one has to keep in mind that this kind of relationship can occur by chance. This is especially true when you are looking at conditions affecting a large part of the transcriptome.
The yMGV philosophy is to provide tools that help biologists dig into genomic data in order to find interesting hints and ideas about the biological system they are studying. This ortholog expression comparison tool has been designed with this idea in mind. You should not conclude anything from the transcriptome of orthologous genes alone; it is only an indication that has to be confirmed by other data.
How is Gene Ontology (GO) annotation done?
The goal of the Gene Ontology™ Consortium
is to produce a dynamic controlled vocabulary that can be applied to all
organisms even as knowledge of gene and protein roles in cells is accumulating
and changing.
Please visit GO project homepage at http://www.geneontology.org
for complete information.
Things to know about GO before browsing the yMGV website:
- The GO project divides the annotations into three major branches, represented
in yMGV by a colored letter code:
-> (C) = Cellular Component
Subcellular structures, locations, and macromolecular complexes.
Examples include nucleus, telomere, and origin recognition complex.
-> (M) = Molecular Function
The tasks performed by individual gene products.
Examples are transcription factor and DNA helicase.
-> (P) = Biological Process
Broad biological goals that are accomplished by ordered assemblies of
molecular functions.
Examples are mitosis or purine metabolism.
A full listing of the terms used
for C, M and P is available at http://www.geneontology.org/#ontologies
- The way the annotation was inferred is coded
by a 2- or 3-letter code:
| IC | inferred by curator |
| IDA | inferred from direct assay |
| IEA | inferred from electronic annotation |
| IEP | inferred from expression pattern |
| IGI | inferred from genetic interaction |
| IMP | inferred from mutant phenotype |
| IPI | inferred from physical interaction |
| ISS | inferred from sequence or structural similarity |
| NAS | non-traceable author statement |
| ND | no biological data available |
| NR | not recorded |
| TAS | traceable author statement |
Full details about the usage of GO evidence codes are available at http://www.geneontology.org/doc/GO.evidence.html
Currently, most of the annotations are classified IEA (inferred from
electronic annotation). For statistics on current annotations, see http://www.geneontology.org/#annotations
Can I use filters other than 1.5, 2 or 3?
You can use another cut-off only through access 3, when comparing experiments. For access 1 and 2, yMGV uses pre-calculated tables containing the information needed to filter by cut-off. Statistics are precomputed for thresholds 1.5, 2 and 3.
How do I check that a ratio is not artefactual?
yMGV was designed to store a ratio and a significance score for each measurement.
But most of the published datasets do not contain such data, so this feature
is not currently in use. We recommend going to the publication website
(you can find the URL in the publication description page) and looking for
the original intensities and background data, if available.
What does the overall log2 ratio distribution mean?
The standard deviation gives you an idea of the overall distribution
of the log2(ratio) for the experiment. Usually, most genes have unchanged
expression and you should roughly observe a bell curve centered on log2=0
(ratio=1).
A distribution very different from this model curve can be obtained if
the expression dataset compared two very different mRNA samples or when
the microarray quality or the computational post-processing was not good.
The standard deviation of ratio over one microarray dataset is calculated
using the formula:
Note: For the statistics calculation, we exclude ratios less than or equal
to 0 and ratios greater than or equal to 100 (see here
for details).
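As an illustration of the exclusion rule and of a standard deviation taken over log2(ratio), here is a minimal Python sketch. It is not the yMGV code: the function name is hypothetical, and the use of the population standard deviation (rather than the sample one) is an assumption.

```python
import math
import statistics

def log2_ratio_std(ratios):
    """Standard deviation of log2(ratio) over one microarray dataset.

    Ratios <= 0 and ratios >= 100 are excluded, as stated in the note.
    Assumption: the population standard deviation (pstdev) is used.
    """
    kept = [math.log2(r) for r in ratios if 0 < r < 100]
    return statistics.pstdev(kept)

# 0, 100 and 150 are discarded; log2 of the rest is -1 and +1.
print(log2_ratio_std([0.5, 2.0, 0, 100, 150]))
```

A dataset in which most genes are unchanged gives a value close to 0; a large value indicates a wide log2(ratio) distribution.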
How do I interpret the statistics provided about ORFs?
A query from "Access 1: view one gene profile" gives you some basic statistics
on the gene's variation across all the experiments contained in the yMGV database.
| | 1.5 | 2 | 3 |
| changed | 323 (31%) | 169 (16%) | 69 (7%) |
| induced | 146 (14%) | 81 (8%) | 40 (4%) |
| repressed | 177 (17%) | 88 (9%) | 29 (3%) |
This table presents a view of the ORF's transcriptional variations across all
the experiments contained in yMGV. The first line gives the threshold used
to decide that the ORF's transcription has changed. "induced" means that only
experiments where the ratio was above the threshold were considered, "repressed"
means that only experiments where the ratio was less than 1/threshold were
considered. "changed" means that variations up or down were both considered.
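The three rows of such a table can be reproduced from a list of ratios with a short Python sketch. This is an illustration only: the function name and the input ratios are made up, and the strict comparisons at the boundaries are an assumption.

```python
def gene_statistics(ratios, thresholds=(1.5, 2, 3)):
    """Count, per threshold, the experiments where one gene is regulated.

    `ratios` holds one expression ratio per experiment for a single gene.
    Assumption: comparisons are strict (ratio > t, ratio < 1/t).
    """
    table = {}
    for t in thresholds:
        induced = sum(1 for r in ratios if r > t)
        repressed = sum(1 for r in ratios if r < 1 / t)
        table[t] = {
            "changed": induced + repressed,  # up or down
            "induced": induced,
            "repressed": repressed,
        }
    return table

example = gene_statistics([1.0, 1.6, 2.5, 4.0, 0.6, 0.4, 0.2])
for threshold, row in example.items():
    print(threshold, row)
```

Because a ratio cannot be above the threshold and below 1/threshold at the same time, "changed" is the sum of "induced" and "repressed", as in the table above (146 + 177 = 323).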
The next figure presents the variation of all the ORFs contained in yMGV
using "changed" and thresholds 1.5, 2 and 3.
How do I interpret the statistics provided about experiments?
Some basic statistics are available for each individual experiment.
You can see them by clicking on any publication name.
General rules for the statistics calculation are:
- Only ratios corresponding to an ORF existing in the current GO table are considered.
- Ratios less than or equal to 0 and ratios greater than or equal to 100 are discarded.
- If more than one spot corresponds to an individual ORF, the ORF is considered
as regulated if at least one of them reaches the specified threshold.
Individual statistical elements are:
PLEASE NOTE: the general rules described above are applied.
- Ratio min: minimal ratio observed
- Ratio max: maximal ratio observed
- log2 avg: mean of the log2(ratio) values observed
- Std dev: standard deviation of log2(ratio) over
the experiment (see also here)
- ORF measured: number of distinct ORFs present in GO and having a non-null
ratio in this experiment
- -3 fold: number of distinct ORFs having a ratio
less than or equal to 0.333 (1/3)
- -2 fold: number of distinct ORFs having a ratio
less than or equal to 0.5
- -1.5 fold: number of distinct ORFs having a ratio less than or
equal to 0.667 (2/3)
- +1.5 fold: number of distinct ORFs having a ratio greater than or equal
to 1.5
- +2 fold: number of distinct ORFs having a ratio greater than
or equal to 2
- +3 fold: number of distinct ORFs having a ratio greater than
or equal to 3
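The general rules and the individual elements listed above can be put together in a short Python sketch. It is an illustration under stated assumptions, not the yMGV code: the function name, the input layout (one `(orf, ratio)` tuple per spot) and the choice of the population standard deviation are assumptions, and "ORF measured" is counted here after the 0/100 exclusion.

```python
import math
import statistics

def experiment_statistics(spots, go_orfs):
    """Summarise one experiment from (orf, ratio) spots.

    Rules applied: only ORFs in the GO table are kept, ratios <= 0 or
    >= 100 are discarded, and an ORF with several spots is counted once
    as soon as one spot reaches the threshold.
    Assumes at least one spot survives the filtering.
    """
    kept = [(orf, r) for orf, r in spots if orf in go_orfs and 0 < r < 100]
    ratios = [r for _, r in kept]
    logs = [math.log2(r) for r in ratios]
    stats = {
        "ratio_min": min(ratios),
        "ratio_max": max(ratios),
        "log2_avg": statistics.mean(logs),
        "std_dev": statistics.pstdev(logs),  # assumption: population form
        "orf_measured": len({orf for orf, _ in kept}),
    }
    for fold in (1.5, 2, 3):
        # Sets give "distinct ORFs": duplicate spots count once.
        stats[f"+{fold} fold"] = len({orf for orf, r in kept if r >= fold})
        stats[f"-{fold} fold"] = len({orf for orf, r in kept if r <= 1 / fold})
    return stats

spots = [("a", 2.0), ("a", 1.0), ("b", 0.5), ("c", 4.0),
         ("d", 150.0), ("x", 8.0), ("e", 0.0)]
result = experiment_statistics(spots, go_orfs={"a", "b", "c", "d", "e"})
print(result)
```

In this made-up input, ORF "a" has two spots and is counted as +2 fold because one of them reaches the threshold; "x" is ignored because it is not in the GO table, and "d" and "e" are dropped by the ratio limits.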
How do I create a direct link to yMGV pages?
A special script is provided to link to yMGV from remote servers.
This script is available at: http://www.transcriptome.ens.fr/ymgv/incoming.php
Parameters are:
| Parameter name | Description | Value |
| access | Access 1 (one gene) and access 2 (list of genes) can be reached. If not specified, access 1 is used. | Access 1=1, Access 2=2 |
| organism_id | The internal yMGV organism identifier. If not provided, it is set to 0 (S. cerevisiae search). | S. cerevisiae=0, S. pombe=1 |
| query | The ORF/gene name (access 1) or list of ORFs/genes (access 2). For access 2, list must | any string without space (case insensitive) |
| onlysig | Ratio-based filtering option. If not provided, it is set to 3. | [0, 1.5, 2, 3] |
| show_orthologs | If equal to 'ok', show the expression profile for orthologs from other organisms. If not provided, it is set to 'ok'. | [ok, no] |
The script name and the parameters are separated by '?'. The parameters are
appended to the URL of the script, separated from each other by '&'.
If you are not using a parameter, you don't need to add it to the address.
Please check the spelling of your parameters: a misspelled parameter is
silently ignored, without any error message (except for query).
Examples:
yMGV information for S.cerevisiae PDR5 gene:
http://www.transcriptome.ens.fr/ymgv/incoming.php?organism_id=0&query=pdr5
yMGV information for S.cerevisiae PDR5 gene, with no ratio filter
and no ortholog display:
http://www.transcriptome.ens.fr/ymgv/incoming.php?organism_id=0&query=pdr5&onlysig=0&show_orthologs=no
yMGV information for S.pombe CDC2 gene:
http://www.transcriptome.ens.fr/ymgv/incoming.php?organism_id=1&query=cdc2
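Links of this form can also be built programmatically. The Python sketch below uses the standard `urllib.parse.urlencode` function; the helper name `ymgv_link` is hypothetical, and only the script URL and the parameter names come from the table above.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.transcriptome.ens.fr/ymgv/incoming.php"

def ymgv_link(query, organism_id=None, access=None,
              onlysig=None, show_orthologs=None):
    """Build an incoming.php link; parameters left as None are omitted."""
    params = {
        "access": access,
        "organism_id": organism_id,
        "query": query,
        "onlysig": onlysig,
        "show_orthologs": show_orthologs,
    }
    given = {name: value for name, value in params.items() if value is not None}
    # urlencode joins name=value pairs with '&' and escapes special characters.
    return BASE_URL + "?" + urlencode(given)

print(ymgv_link("pdr5", organism_id=0))
print(ymgv_link("pdr5", organism_id=0, onlysig=0, show_orthologs="no"))
print(ymgv_link("cdc2", organism_id=1))
```

Passing the parameter names as keyword arguments avoids the silent failure described above, since a misspelled keyword raises an error in Python instead of being ignored by the script.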
yMGV features
Do you plan to add other organisms to yMGV?
Yes, the database has been designed with this idea in mind. Many
yeast species will be fully sequenced soon and we plan to add them when
microarray datasets become available.
For more information, you can visit the homepages of the following yeast sequencing
projects:
- Genolevures (Candida glabrata, Kluyveromyces thermotolerans, Yarrowia lipolytica,
Kluyveromyces lactis, Debaryomyces hansenii)
- Pc genome project (Pneumocystis carinii)
- Stanford Genome Technology Center (Candida albicans, Cryptococcus neoformans)
- TIGR (Aspergillus fumigatus)
- Washington University Medical School Saccharomyces sequencing (S. mikatae,
S. kudriavzevii, S. bayanus, S. castellii, S. kluyveri)
- Whitehead Institute for Biomedical Research (Aspergillus nidulans,
Magnaporthe grisea, Neurospora crassa)
A mouse/rat version is also part of the project.
To add an organism, we just need microarray datasets, ORF descriptions
in GO format and orthology tables between this organism and the organisms
already present.
Can I use yMGV representations?
Yes, of course, but please cite the yMGV publication.
Can I download the whole dataset?
No, the data are used in yMGV with the permission of the authors. We don't
have authorization to distribute them. But you can find the download URL
of each publication in the details on its "publication
description" page.
Can I do clustering online?
Not yet. But it will be one of the next improvements of yMGV.
You can use the external tool MiCOViTO
to dig yMGV S. cerevisiae datasets.
I'm going to publish a paper containing microarray results. Can I send you
the data?
The best way is to distribute a file containing the data required by MIAME in
MAGE format (see the MGED website). This
will make your data compatible with all the major microarray databases.
For yMGV more specifically, you can send us a file containing one row
per ORF and one column per experiment (please discard ratios corresponding to artefactual and low-intensity spots) or the address of your MAGE file.
What software are you using?
yMGV is a PostgreSQL database;
pages are generated by PHP scripts and
sent to you by an Apache server. All
of this software is free and open source.
You can find it as RPM packages for most Linux
distributions.
What hardware are you using?
The development database computer is a dual Pentium III 450 MHz with three
9 GB SCSI 10k rpm hard drives and 256 MB of RAM.
How to cite yMGV?
Please cite the articles:
Who supported this work?
French Genopole network
Centre National de la Recherche
Scientifique (CNRS)
French Cancer Research Association (ARC)
Troubleshooting
I don't see log2 distribution images
Log2 ratio distributions are images saved in PNG format.
Recent browsers can display them; for older ones, you have
to install a plug-in in order to view them.
PNG was designed to be the successor to the once-popular GIF format,
which became decidedly less popular right around New Year's Day 1995 when
Unisys and CompuServe suddenly announced that programs implementing GIF
would require royalties, because of Unisys' patent on the LZW compression
method used in GIF. Since GIF had been showing its age in a number of ways
even prior to that, the announcement only catalyzed the development of
a new and much-improved replacement format. PNG is the result.
How do I unsubscribe from yMGV mailing list?
If you want to unsubscribe, please send an e-mail containing "unsubscribe
your_email_pasted_here" to ymgv@biologie.ens.fr