Healthy Communities Data Summit

The Healthy Communities Data Summit, held at UC San Francisco’s Mission Bay campus, organized by Health 2.0 and the Foundation for Healthcare Innovation, and sponsored by the California HealthCare Foundation, attracted a mix of civic leaders, medical and health professionals, academics, hackers, and communicators – all eager to better understand and share innovative uses of open health data.

Key topics from the event included cooperation and trust-building among government, community, and enterprise players, along with the high-impact applications made possible by opening health data to the public.

As a communicator, I took a particular interest in the “A Better Pie Chart & Beyond: The Evolution of Visualization & Analysis” panel. Wes Grubbs, founder of Pitch Interactive, emphasized the human element behind all this data and reminded the audience to prioritize narrative when communicating health data. Even as visualizations lift data out of its silos and make it easily accessible, the data should tell a complete story – not just grab for views, or “instant gratification.”

Here’s an event summary from the California HealthCare Foundation’s California Healthline.

Measuring total scholarly impact, beyond the cite

The new Total-Impact tool takes a series of references to someone’s work (e.g. publications, Slideshare slides, URLs, Github or Mendeley accounts) and generates reports based on a wide variety of impact metrics. It starts with traditional citations, but adds in bookmarks (from Mendeley, Delicious, etc.), mentions (on Twitter and Facebook), and downloads (from publishers’ websites).
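The aggregation idea is simple to picture in code. Here is a minimal sketch – not Total-Impact’s actual implementation – where each “provider” stands in for a real lookup against a source like Mendeley or Twitter, and a report merges whatever metrics come back for each artifact (the DOI and all counts below are invented examples):

```python
from collections import defaultdict

def build_report(artifact_ids, providers):
    """Merge (metric, count) pairs from every provider into one
    per-artifact report, as an altmetrics aggregator might."""
    report = defaultdict(dict)
    for artifact in artifact_ids:
        for provider in providers:
            for metric, count in provider(artifact):
                report[artifact][metric] = count
    return dict(report)

# Toy providers standing in for real API lookups (illustrative only).
def mendeley(artifact):
    return [("mendeley:readers", 12)]

def twitter(artifact):
    return [("twitter:mentions", 3)]

report = build_report(["10.1371/journal.pone.0000000"], [mendeley, twitter])
print(report)
```

The point is that citations become just one column among many: adding a new source is just adding another provider function.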

Check out some examples:

It’s fun seeing the various metrics. Chad, for example, has work cited on Wikipedia, and read on Mendeley:

This 2011 entry from Clay’s report doesn’t appear to have any cites from PubMed, but shows interest and activity from a variety of sources, including the PLoS website, CrossRef, CiteULike, and Mendeley. Some of his papers have even been discussed on Facebook!

Read more:

Keys to a Successful Data Repository

Recently, Cameron Neylon posted an interesting article on his blog, reflecting on some of the challenges in building a data repository:

One of the problems with many efforts in this space is how they are conceived and sold to the user. “Making it easy to put your data on the web” and “helping others to find your data” solve problems that most researchers don’t think they have. Most researchers don’t want to share at all, preferring to retain as much of an advantage through secrecy as possible. Those who do see a value in sharing are for the most part highly skeptical that the vast majority of research data can be used outside the lab in which it was generated. The small remainder who see a value in wider research data sharing are painfully aware of how much work it is to make that data useful.

A successful data repository system will start by solving a different problem, a problem that all researchers recognize they have, and will then nudge the users into doing the additional work of recording or allowing the capture of the metadata that could make that data useful to other researchers. Finally it will quietly encourage them to make the data accessible to other researchers. Both the nudge and the encouragement will arise by offering back to the user immediate benefits in the form of automated processing, derived data products, or other incentives.

He goes on to discuss how the system needs to be as simple and as automated as possible, and he mentions a few tools that could help in this process. All in all, required reading for those of us interested in this domain.

New Online Lab Network at UCSF

This morning UCSF’s McCormick lab announced the launch of LabCollaborate, a new website with the goal to “provide a way to easily share data, ideas and generally foster communication between labs as well as provide some useful tools for running the lab.”

I signed up to learn more about how it works. Here is what I have learned so far:

1. Lab Home Page: This is the page you see when you sign in. All the lab members’ profiles appear across the top, and you can see individual contact info and research interests (as well as update your own) by clicking on the pictures. As the first person to sign up for the lab, you are an “admin”. Admins can add/remove lab members, edit library files, and approve/delete friendships with other labs. You can extend these powers to any other user by clicking “Make admin” on their profile, if you want to.

2. Whiteboard: Here you can post comments or questions – they will be seen by your lab as well as your lab friends, but not by labs you are not friends with.

3. Friends: These are labs you want to keep in touch with and share data with. They can see and download all protocols, presentations and papers in your Library (unless marked “visible to my lab only”) as well as write on your whiteboard. A newsfeed to keep updated with what they’re doing is coming soon.

4. Libraries: These are collections of papers, presentations and protocols. Files can be tagged with keywords to organize into projects, ideas, lab members, whatever. And they are searchable! So you can group any number of protocols, literature references and presentations by whatever tag(s) you choose and find them all later with a simple search.

5. Ordering: The ordering system records vendor, quantity, and description, as well as providing a direct link to the product page. It is also searchable, to easily find past orders. Admins can mark orders as placed, and the time of initial request and placement is recorded.

6. Find collaborators: The search box at the top of the page searches for words in the research interests of all labs and lab members on the network. So if you want to find other labs interested in “cancer”, just search and connect with new friends.
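The tag-and-search model in the Libraries feature (item 4) is easy to sketch. This is not LabCollaborate’s code – just an illustration, with invented file names, of how tagging files and retrieving them by tag might work:

```python
# Each file carries a set of free-form tags: protocols, projects,
# lab members, whatever. A search returns every file with a given tag.
library = [
    {"name": "pcr_protocol.pdf", "tags": {"protocol", "pcr", "project-x"}},
    {"name": "lab_meeting_slides.ppt", "tags": {"presentation", "project-x"}},
    {"name": "smith2010.pdf", "tags": {"paper", "pcr"}},
]

def search(library, tag):
    """Return the names of all files carrying the given tag."""
    return [f["name"] for f in library if tag in f["tags"]]

print(search(library, "pcr"))        # every file tagged "pcr"
print(search(library, "project-x"))  # everything grouped under one project
```

Because any file can carry several tags at once, the same protocol can live under a project, a technique, and a lab member without being duplicated.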

I am wondering whether – at some point – we can leverage the information LabCollaborate provides to enrich UCSF Profiles, and how, on the other hand, LabCollaborate can benefit from the UCSF Profiles data and tools.

I guess our tech team is aware of this. Looking forward to getting your thoughts, guys.

Take Advantage of Web-Based Tools to Present Complex Data

Research to Action published a great overview article that highlights an “ever-growing open-data source for development statistics in the fields of economics, healthcare, education, social science, technology,” and more.

Including data and statistics within research findings can enhance their impact; however, large tables or spreadsheets of numbers take time to decipher, and the true meaning behind the data can sometimes be misinterpreted.

Here are some of the tools that the article points out:

  • StatPlanet: a browser-based interactive data visualization and mapping application for creating a wide range of visualizations, from simple Flash maps to more advanced infographics.
  • Xtimeline: to create your own timelines of data.
  • Gapminder: to upload data and create interactive motion charts and graphs.
  • Creately: online diagramming software, purpose-built for team collaboration.
  • Google Chart Tools: lets you include constantly changing research data sourced online. Google has also released Fusion Tables, where you can share, discuss, and track your charts and graphs with specific people online.
  • TagCrowd: to upload texts and highlight the most common concepts. The clouds can be exported as images and inserted in a website or PowerPoint presentation.
  • Wordle: similar to TagCrowd; lets you create images out of key phrases and words relevant to your research, great for use in PowerPoint presentations.
  • Tableau: free Windows-only software for creating colourful data visualisations.
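Under the hood, tag-cloud tools like TagCrowd and Wordle start from a plain word-frequency count before sizing each word. A minimal sketch of that first step (the sample text is invented):

```python
import re
from collections import Counter

def word_frequencies(text, top_n=5):
    """Count word occurrences, ignoring case and punctuation,
    and return the most common words with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)

sample = "Open data helps research. Open data helps communities."
print(word_frequencies(sample, top_n=3))
```

A real tool would also drop stop words (“the”, “and”, etc.) and map each count to a font size, but the core is just this counter.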

View all and read the original article

Is outsourcing experiments “the future of research”?

Palo Alto-based Science Exchange, which bills itself as “an online marketplace for science experiments”, thinks so.

According to their website: “Our goal is to make it easier for researchers to access core resources across institutions. Our first product brings together research scientists looking to outsource experiments with other scientists at core facilities of major research universities who have the capacity to conduct the experiments. By dealing with all the paying/billing administration, quality assurance and dispute resolution, [it] makes outsourcing experiments easy.”

Wired magazine on open publishing

Biologist Howard Eisen's son tries to put his late father's papers freely online. Hilarity ensues.

Trying to explain open publishing to someone outside the field? “Free Science, One Paper at a Time” in the May Wired tells a compelling story about the need for open publishing, referencing PLoS, academic librarians’ woes, Mendeley, and Michael Weiner’s work on ADNI, through the story of Jonathan Eisen‘s attempt to make his late father’s scientific publications freely accessible online. Very readable, and highly recommended.

Read more…

Upcoming talk on Open Science by Michael Nielsen

For those of us interested in open science, Dr. Michael Nielsen will be speaking in San Francisco later this month.  Dr. Nielsen is a leading advocate in this field and his book, “Reinventing Discovery” will be published later this year.  Here’s some information about his upcoming talk:

The net is transforming many aspects of our society, from finance to friendship.  And yet scientists, who helped create the net, are extremely conservative in how they use it.  Although the net has great potential to transform science, most scientists remain stuck in a centuries-old system for the construction of knowledge. Michael will describe some leading-edge projects that show how online tools can radically change and improve science using projects in Mathematics and Citizen Science as examples, and he will then go on to discuss why these tools haven’t spread to all corners of science, and how we can change that. [via]

The wine, beer, and cheese event will be held at the Public Library of Science on June 29th at 6pm. The event is free and open to the public, but they ask that you RSVP if you plan to attend.

Open Notebook Science

Thinking about our recent posting regarding project and document management, along with a number of postings on open source data, people might be interested in learning more about a movement that takes open source to its most basic level. As described in Wikipedia:

Open Notebook Science is the practice of making the entire primary record of a research project publicly available online as it is recorded. This involves placing the personal, or laboratory, notebook of the researcher online along with all raw and processed data, and any associated material, as this material is generated. The approach may be summed up by the slogan ‘no insider information’.

While not everyone thinks this is a great idea, a number of labs in a variety of disciplines have begun to embrace the concept.  Similar to the Creative Commons movement, there are a number of ways to implement open science in your lab (with associated logos, of course!).

So, does open notebook science have a place in biomedical research, and does it have a role in translational science?

Further reading:

Open Source Genetics

We’re familiar with open source software and open source data.  Now it looks like we need to add open source molecular biology to the list.

The same concepts that have led to open source rocking the software world have spawned the beginning of a revolution in biotech. An organization called Biofab, funded by the NSF and run through teams at Stanford and Berkeley, is applying open development approaches to creating building blocks (BioBricks™, from the BioBricks Foundation) for the bio products of the future. Now, the first of those building blocks, based on E. coli, are just rolling off the production line. This, according to the organizers, represents “a new paradigm for biological research.” (via)

Read more:


Meet the Biohackers

Credit: Penguin Books

When we think of Translational Science, we imagine going from bench to bedside to community. But what if the research itself is happening in the community? Meet the biohackers:

These do-it-yourself biology hobbyists want to bring biotechnology out of institutional labs and into our homes. Following in the footsteps of revolutionaries like Steve Jobs and Steve Wozniak, who built the first Apple computer in Jobs’s garage, and Sergey Brin and Larry Page, who invented Google in a friend’s garage, biohackers are attempting bold feats of genetic engineering, drug development, and biotech research in makeshift home laboratories.

In Biopunk, journalist Marcus Wohlsen surveys the rising tide of the biohacker movement, which has been made possible by a convergence of better and cheaper technologies. For a few hundred dollars, anyone can send some spit to a sequencing company and receive a complete DNA scan, and then use free software to analyze the results. Custom-made DNA can be mail-ordered off websites, and affordable biotech gear is available on Craigslist and eBay.

Is there a place for this movement in the CTSI continuum?

A GitHub of science

A conversation on scientists’ favorite online tools on Quora led to several ideas on online tools scientists wish existed. The most popular was Marius Kembe’s idea:

“Github for scientists – a distributed hosting and version control system for all parts of scientific communication, including writing, code, data, and audio/video/images. So that you could build on somebody else’s work by versioning it! Isn’t that what science is meant to be about?”

As a GitHub user in non-biomedical domains, this makes so much sense to me. Marius went on to describe the idea further on his blog:

“GitHub is a social network of code, the first platform for sharing validated knowledge native to the social web…I believe it represents a demonstrably superior way of distributing validated knowledge than academic publishing. How are these even related? Software developers rarely write applications from scratch. Instead, they often start with various modular bundles of open source code…Scientists never begin a research project from an intellectual vacuum. They stand on the shoulders of giants, building on the knowledge contained in previous publications to form a new, coherent finding…Gems on GitHub are not just code.  They also have authors whose relative contributions are automatically catalogued…This impact graph can let you know precisely which developers are responsible for this awesome-ness…By contrast, current Open Science efforts that ask scientists to ‘share all your data’ have not become mainstream, because they do not appropriately reward knowledge producers.”


Data sharing licenses to avoid

So you want to share scientific data, but what license to use? The Panton Principles have something to say:

“Many widely recognized licenses are not intended for, and are not appropriate for, data or collections of data. A variety of waivers and licenses that are designed for and appropriate for the treatment of data are described here. Creative Commons licenses (apart from CCZero), GFDL, GPL, BSD, etc are NOT appropriate for data and their use is STRONGLY discouraged.”

Instead, the Panton Principles recommends the four licenses conforming to the 11 requirements of the Open Knowledge Definition: the Open Data Commons Public Domain Dedication and Licence (PDDL), the Open Data Commons Attribution License, the Open Data Commons Open Database License (ODbL), and Creative Commons’ CC Zero license.

