How are the compounds unified from multiple data sources?

The Compounds section in GeneAnalytics is the only section that does not yet rely on a single database that unifies multiple compound sources and provides one web page for each compound. In contrast, tissue and cell data are unified in LifeMap Discovery, diseases in MalaCards, pathways in PathCards, and GO terms in the GeneOntology database. Instead, the Compounds section draws on multiple sources that together cover more than 83,000 compounds, including those found in GeneCards® (for more information about the compound data sources, click here).

The Novoseek data source extracts knowledge from biological databases and text repositories, and provides relationships between chemical compounds and genes based on a scoring algorithm run on PubMed articles. Note that the Novoseek website is no longer available. Read more about Novoseek data in GeneCards and about its literature text-mining algorithm.

We have applied a unification process that seeks out similar compounds described in different data sources. This enables gene aggregation for unified compounds and avoids redundancy in the resulting compounds list. Compounds are unified when they share an identical name and/or a combination of other identifiers, such as CAS number, PubChem ID, and synonyms. Unified compounds are shown with links to all relevant data sources (the exact compound name is shown next to the original data source name).
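
The unification step described above can be illustrated with a minimal Python sketch. This is not the GeneAnalytics implementation: the record fields (`source`, `name`, `cas`, `pubchem_id`, `synonyms`, `genes`), the source names, and the rule that any single shared identifier is enough to merge two records are all assumptions made for illustration, and the gene lists are toy data.

```python
from collections import defaultdict

def unify_compounds(records):
    """Merge records that share a name, synonym, CAS number, or PubChem ID.

    Assumption: one shared identifier is enough to merge two records.
    """
    parent = list(range(len(records)))  # union-find forest over record indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)  # keep the earliest record as root

    first_seen = {}  # identifier -> index of the first record carrying it
    for idx, rec in enumerate(records):
        keys = {("name", rec["name"].strip().lower())}
        keys.update(("name", s.strip().lower()) for s in rec.get("synonyms", []))
        if rec.get("cas"):
            keys.add(("cas", rec["cas"]))
        if rec.get("pubchem_id"):
            keys.add(("pubchem", rec["pubchem_id"]))
        for key in keys:
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)

    unified = []
    for members in groups.values():
        unified.append({
            # exact compound name as it appears in each original source
            "names_by_source": {records[i]["source"]: records[i]["name"] for i in members},
            # genes aggregated across all merged records
            "genes": sorted(set().union(*(records[i]["genes"] for i in members))),
        })
    return unified

# Toy records; source names and gene lists are illustrative only.
records = [
    {"source": "SourceA", "name": "Aspirin", "cas": "50-78-2", "pubchem_id": None,
     "synonyms": ["acetylsalicylic acid"], "genes": ["PTGS1", "PTGS2"]},
    {"source": "SourceB", "name": "Acetylsalicylic Acid", "cas": None, "pubchem_id": "2244",
     "synonyms": [], "genes": ["PTGS2", "NFKB1"]},
    {"source": "SourceC", "name": "Caffeine", "cas": "58-08-2", "pubchem_id": "2519",
     "synonyms": [], "genes": ["ADORA2A"]},
]
unified = unify_compounds(records)
```

In this sketch the first two records merge because a synonym of one matches the name of the other, so their genes are aggregated under a single unified compound while each source keeps its own compound name.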

Metabolite unification: the following compound families contain thousands of metabolites, which were unified based on their primary name and associated genes.

If genes associated with these compounds match your gene set, GeneAnalytics presents only the matched group, to avoid a multitude of identical results. The evidence link lets you view all the relevant metabolites in the original database. The unified compounds and their specific groups are as follows:

1. Triglycerides

Group name               # of associated genes    # of metabolites in the group
Triglycerides group A    26                       170
Triglycerides group B    30                       113
Triglycerides group C    39                       6
Triglycerides group D    34                       13631

2. Diglycerides

Group name              # of associated genes    # of metabolites in the group
Diglycerides group A    130                      803
Diglycerides group B    131                      39
Diglycerides group C    131                      1
Diglycerides group D    115                      435


3. Phosphatidylcholines

Group name                      # of associated genes    # of metabolites in the group
Phosphatidylcholines group A    78                       955
Phosphatidylcholines group B    72                       119
Phosphatidylcholines group C    44                       73


4. Phosphatidylethanolamines

Group name                           # of associated genes    # of metabolites in the group
Phosphatidylethanolamines group A    43                       959
Phosphatidylethanolamines group B    30                       114
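
The metabolite grouping described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual GeneAnalytics logic: it assumes a group is formed by metabolites that share a family name and an identical set of associated genes, and the field names, metabolite names, and gene symbols (`GENE_A`, etc.) are placeholders.

```python
from collections import defaultdict

def group_metabolites(metabolites):
    """Bucket metabolites by (family name, exact gene set).

    Assumption: metabolites in one group share an identical gene set.
    """
    buckets = defaultdict(list)
    for m in metabolites:
        buckets[(m["family"], frozenset(m["genes"]))].append(m["name"])
    return buckets

def matched_groups(buckets, query_genes):
    """Return one result per group whose genes overlap the query gene set."""
    query = set(query_genes)
    results = []
    for (family, genes), names in buckets.items():
        hits = genes & query
        if hits:
            results.append({
                "family": family,
                "matched_genes": sorted(hits),
                "n_metabolites": len(names),  # shown once, not once per metabolite
            })
    return results

# Placeholder data: names and genes are invented for illustration.
metabolites = [
    {"family": "Triglycerides", "name": "TG-1", "genes": ["GENE_A", "GENE_B"]},
    {"family": "Triglycerides", "name": "TG-2", "genes": ["GENE_B", "GENE_A"]},
    {"family": "Triglycerides", "name": "TG-3", "genes": ["GENE_A", "GENE_B"]},
    {"family": "Triglycerides", "name": "TG-4", "genes": ["GENE_C"]},
]
buckets = group_metabolites(metabolites)
results = matched_groups(buckets, ["GENE_A"])
```

Here the three metabolites with the same gene set collapse into a single group, so a query containing `GENE_A` yields one result row instead of three identical ones.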

  1. Yaron Guan Golan