Interactive Biomedical Data Visualization

TripleMap

Continuing our theme of visualization, some pretty interesting tools are still being developed. One example is TripleMap:

TripleMap is a data-driven software framework which gives biomedical research scientists access to massive interconnected networks of life science data. Using TripleMap you can analyze, visualize and share this information by creating “maps” of associated data which are relevant to your research.

Using a proprietary algorithm called Inferential Connectivity Analysis (ICA), TripleMap can identify connections for you between any two entities in its network. Want to know about potential connections between a protein and a disease? Want to know about potential connections between a compound and a cellular pathway? With ICA, TripleMap can perform a comprehensive, “deep” traversal of the entire TripleMap data network and identify any connecting entities. How powerful is identification of novel connections? It can be the difference between success and failure, novel insight and (less than) blissful ignorance.
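To make the idea of a "connection search" a bit more concrete, here is a minimal sketch of finding a chain of linked entities between two nodes in a network. ICA is proprietary and its details are not public, so this is not TripleMap's method: it is a plain breadth-first search, and the toy network, the entity names, and the `find_connection` function are all made up for illustration.

```python
from collections import deque

# Hypothetical toy network: each entity maps to the entities it is linked to.
TOY_NETWORK = {
    "protein:P1": ["pathway:W1", "compound:C1"],
    "pathway:W1": ["protein:P1", "disease:D1"],
    "compound:C1": ["protein:P1"],
    "disease:D1": ["pathway:W1"],
}


def find_connection(network, start, goal):
    """Return one shortest chain of entities linking start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # entity -> the entity we reached it from
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in network.get(current, []):
            if neighbor in previous:
                continue  # already visited
            previous[neighbor] = current
            if neighbor == goal:
                # Walk back from the goal to the start, then reverse.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None  # no chain of links joins the two entities
```

Calling `find_connection(TOY_NETWORK, "compound:C1", "disease:D1")` walks outward from the compound one link at a time and returns the intermediate entities that join it to the disease. A real system would presumably also rank or filter the connections it finds, which this sketch does not attempt.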

Although TripleMap is still in a closed “alpha” mode, the developer told me that they will be integrating the MedDRA ontology over the weekend, and he’ll send me a trial code early next week. I’ll post a follow-up after I give it a try.

visualcomplexity.com | A visual exploration on mapping complex networks


I found an interesting site for visualizations of networks. Here’s their description of what the site is about:

VisualComplexity.com intends to be a unified resource space for anyone interested in the visualization of complex networks. The project’s main goal is to leverage a critical understanding of different visualization methods, across a series of disciplines, as diverse as Biology, Social Networks or the World Wide Web. I truly hope this space can inspire, motivate and enlighten any person doing research on this field.

A GitHub of science

A Quora conversation about scientists’ favorite online tools led to several ideas for tools scientists wish existed. The most popular was Marius Kembe’s idea:

“Github for scientists – a distributed hosting and version control system for all parts of scientific communication, including writing, code, data, and audio/video/images. So that you could build on somebody else’s work by versioning it! Isn’t that what science is meant to be about?”

As a GitHub user in non-biomedical domains, I think this makes so much sense. Marius went on to describe the idea further on his blog:

“GitHub is a social network of code, the first platform for sharing validated knowledge native to the social web…I believe it represents a demonstrably superior way of distributing validated knowledge than academic publishing. How are these even related? Software developers rarely write applications from scratch. Instead, they often start with various modular bundles of open source code…Scientists never begin a research project from an intellectual vacuum. They stand on the shoulders of giants, building on the knowledge contained in previous publications to form a new, coherent finding…Gems on GitHub are not just code. They also have authors whose relative contributions are automatically catalogued…This impact graph can let you know precisely which developers are responsible for this awesome-ness…By contrast, current Open Science efforts that ask scientists to ‘share all your data’ have not become mainstream, because they do not appropriately reward knowledge producers.”


Data sharing licenses to avoid

So you want to share scientific data, but what license to use? The Panton Principles have something to say:

“Many widely recognized licenses are not intended for, and are not appropriate for, data or collections of data. A variety of waivers and licenses that are designed for and appropriate for the treatment of data are described here. Creative Commons licenses (apart from CCZero), GFDL, GPL, BSD, etc are NOT appropriate for data and their use is STRONGLY discouraged.”

Instead, the Panton Principles recommend the four licenses that conform to the 11 requirements of the Open Knowledge Definition: the Open Data Commons Public Domain Dedication and Licence (PDDL), the Open Data Commons Attribution License, the Open Data Commons Open Database License (ODbL), and Creative Commons’ CC Zero license.

Database replication for global health applications

Can solid database replication support have an impact on global health? Global health tech company Dimagi describes how it uses CouchDB (a NoSQL document-oriented database) for health data management in rural Zambia:

“We’ve got computers at clinics that are maintaining patient records…None of these clinics have Internet out of the box, so most of the time our only Internet connection is through a GSM modem that connects over the local cell network. It’s very hard to move data in that environment, and you can’t do anything that relies on an always-on Internet connection with a web app that is always accessing data remotely…CouchDB was a really good option for us because we could install a Couch database at each clinic site, and then that way all the clinic operations would be local. There would be no Internet use in terms of going out and getting the patient records, or entering data at the clinic site. Couch has a replication engine that lets you synchronize databases — both pull replication and push replication — so we have a star network of databases with one central server in the middle and all of these satellite clinic servers that are connecting through that cell network whenever they’re able to get on, and sending the data back and forth. That way we’re able to get data in and out of these really remote, rural areas without having to write our own synchronization protocols and network stack.”
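For a rough idea of what the push and pull replication described above might look like from a clinic machine, here is a small sketch against CouchDB's `_replicate` HTTP endpoint, which takes a JSON body with a `source` and a `target`. This is not Dimagi's actual setup: the helper names and the timeout are my own assumptions, and authentication and retry handling are left out.

```python
import json
import urllib.request


def replication_jobs(clinic_db, central_db):
    """Build the push and pull request bodies for one clinic in the star network."""
    push = {"source": clinic_db, "target": central_db}  # clinic -> central server
    pull = {"source": central_db, "target": clinic_db}  # central server -> clinic
    return [push, pull]


def run_replication(clinic_server, job, timeout=60):
    """POST one job to the _replicate endpoint of the clinic's CouchDB server."""
    request = urllib.request.Request(
        clinic_server.rstrip("/") + "/_replicate",
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Raises urllib.error.URLError when the connection is down.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A clinic machine could build its two jobs once with `replication_jobs(...)`, passing full database URLs for both sides, and then call `run_replication` for each job whenever the modem manages to get a connection, catching `urllib.error.URLError` and trying again later when it does not.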
