One of the exciting things about our installation of Semantic MediaWiki over at GeneWiki+ is the opportunity to merge in arbitrary datasets and resources. We’ve already brought in the SNP database from SNPedia and linked the gene pages and SNP pages together with the Disease Ontology.
The Disease Ontology (DO) and MediaWiki’s category system share the same structure, a directed acyclic graph (DAG), so mapping the DO onto GeneWiki+ was actually fairly painless. The mapping created a series of common “nodes” in our wiki for annotations mined from the gene and SNP text to point to. And because it’s a Semantic MediaWiki installation, we can transitively associate SNPs, genes, and diseases. Now we can create queries that ask for all the genes and SNPs related to a given disease: if a gene->disease link is known and a SNP->gene link is known, we can posit that the SNP is also related to that disease.
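To make the transitive step concrete, here is a minimal Python sketch of the idea. It is not how GeneWiki+ does it (there the links live in the wiki as semantic annotations and are queried from the wiki itself); the SNP, gene, and disease names and the `infer_snp_diseases` function are made up purely for illustration.

```python
# Hypothetical links, standing in for annotations mined from wiki text.
snp_to_genes = {
    "rs0000001": {"GENE_A"},
    "rs0000002": {"GENE_A", "GENE_B"},
    "rs0000003": {"GENE_C"},  # GENE_C has no known disease link
}
gene_to_diseases = {
    "GENE_A": {"disease X"},
    "GENE_B": {"disease Y"},
}


def infer_snp_diseases(snp_to_genes, gene_to_diseases):
    """Posit SNP->disease links by chaining SNP->gene and gene->disease."""
    inferred = {}
    for snp, genes in snp_to_genes.items():
        diseases = set()
        for gene in genes:
            # A gene with no known disease link contributes nothing.
            diseases |= gene_to_diseases.get(gene, set())
        if diseases:
            inferred[snp] = diseases
    return inferred


candidates = infer_snp_diseases(snp_to_genes, gene_to_diseases)
for snp in sorted(candidates):
    print(snp, "->", sorted(candidates[snp]))
```

Note that these are candidate associations only: the chain says the SNP sits in or near a gene that has been linked to the disease, not that the SNP itself has been shown to matter.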
It’s pretty cool. We wrote a paper about it that should be out in Journal of Biomedical Statistics soon.
While we’re waiting for that paper to go through the review process, we’ve discussed expanding the resources grafted onto GeneWiki+. One piece of low-hanging fruit is the Gene Ontology (GO), a massive collection of terms describing most eukaryotic gene functions. Like the Disease Ontology, it is a DAG and as such can be mapped directly onto MediaWiki’s categories. Sites like GONUTS have already done it; all we’d be doing is bringing it into a wiki that understands semantic relationships. Then all the genes that carry GO annotations can be linked to common GO nodes. If (when?) we start bringing in genes from other organisms, these links can serve as transitive bridges between different species. This sort of linkup was described in the original GO paper, so it’s nothing too original, just really neat.
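For the curious, the category mapping itself can be sketched in a few lines of Python. This is only an illustration under assumptions: the `is_a` dictionary and its term names are invented, and `category_pages` is a hypothetical helper, not the import code used by GeneWiki+ or GONUTS. The one real piece is the `[[Category:...]]` wikitext link, which is how a MediaWiki category page declares its parent categories.

```python
# Hypothetical is_a edges: each term maps to its parent terms.
# A term may have several parents, which is why this is a DAG, not a tree.
is_a = {
    "root": [],
    "term A": ["root"],
    "term B": ["root"],
    "term C": ["term A", "term B"],
}


def category_pages(is_a):
    """Return {category page title: wikitext body} for every ontology term."""
    pages = {}
    for term, parents in is_a.items():
        # One category link per parent; the root term gets an empty body.
        body = "\n".join(f"[[Category:{parent}]]" for parent in sorted(parents))
        pages[f"Category:{term}"] = body
    return pages


pages = category_pages(is_a)
print(pages["Category:term C"])
```

Because a category page can sit in more than one parent category, a term with multiple parents needs no special handling, which is what makes the ontology-to-category mapping so direct.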