How to run a hack day

Science Hack Day San Francisco 2010

We’re considering running a half-day event for campus developers and webmasters to learn about and tinker with UCSF Profiles’ open APIs and OpenSocial development platform. Whether you call it a hack day, a hackathon, a code-a-thon, or a developer day, the idea’s the same: bringing together technologists to learn, experiment, create, and share.

So how do you run a hack day? Here are some essential hackathon to-dos from my friend Sumana Harihareswara, based on work done for the Wikimedia Foundation:

  • A public wiki page stating the date, time, and venue, and specifying that everyone is welcome. Also tell people what to bring (laptop and power cord), ask them for topic ideas, and ask them to put their names down — no obligation.
  • Outreach/publicity drive, starting at least six weeks in advance, to relevant communities. Ideally you’d get the word out to technical interest groups, local user groups, consultants and other businesses in the industry, individuals whom you want to attend, professors and colleges and universities and technical schools and trainers, email lists, and (if relevant to your audience) newspapers.
  • Some experienced developers. I don’t know the exact ratio, but perhaps a fifth of your participants should be people who have had some experience in developing Wikimedia/MediaWiki stuff, loosely defined. You need some seeds.
  • Documentation tools & some people who will take notes with them (more below).
  • Lightweight tracking. At some point, somehow, at the event, get every participant’s name and email address. That way you can follow up and continue encouraging them after the event.

Because this would be our first time sharing our UCSF Profiles APIs with a wide internal audience, we’ll also need to get our own house in order, to make sure we’re ready to share:

  • Document every API that will be presented, and ensure that it’s comprehensible to our target audience
  • Develop sample “hello world” applications, so our audience can get started quickly, and pull apart working examples
  • Finalize policies around API licensing and data reuse, so developers aren’t left in the lurch if they want to build on our work

Too many websites?

Sometimes it feels like UCSF has way too many separate websites, but we’re not the only ones with that problem. The US federal government’s .gov Task Force has identified 1,759 distinct federal websites, most operating under the .gov domain. The .gov Task Force is cracking down on confusing duplicative content, e.g. www.invasivespecies.gov and www.invasivespeciesinfo.gov, or redundant websites like www.centennialofflight.gov, untouched since 2003.

How are they dealing with out-of-control namespace and content?

  • there’s now a freeze on the issuance of new executive branch .gov domains until the end of 2011
  • 25% of executive branch .gov domain websites must be eliminated or redirected by the end of September 2011
  • 50% of executive branch .gov domain websites must be eliminated or redirected by July 2012

Harsh, but effective.

How to analyze internal site search stats


In an article on A List Apart, web analytics guru Avinash Kaushik outlines a five-step process for understanding data about internal search engine usage.

Why is this important?

“Now when people show up at a website, many of them ignore our lovingly crafted navigational elements and jump to the site search box.…All the search and clickstream data you have (from Google Analytics, Omniture, WebTrends, etc.) is missing one key ingredient: Customer intent. You have all the clicks, the pages people viewed, and where they bailed, but not why people came to the site, except where your referral logs contain information from search engines. For example, you can look at the “top ten pages viewed” report in your web analytics tool and know what people saw, but how do you know what they wanted to see? Your internal site-search data contains that missing ingredient: intent. Internal search queries contain, in your customers’ own words, what they want and why they’re there. Once you understand intent, you can easily figure out whether your website has the content your users need, and, if it does, where they can actually find it.”
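
As a rough starting point for this kind of analysis, here is a minimal Python sketch that tallies internal search queries to surface the most common ones. It assumes the queries have already been exported from the search tool as plain strings; the function names and the sample queries are made up for illustration and are not taken from Kaushik's article.

```python
from collections import Counter


def normalize(query: str) -> str:
    """Lowercase a query and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def top_queries(queries, n=10):
    """Return the n most frequent normalized queries with their counts."""
    # Skip blank entries so empty searches don't pollute the tally.
    counts = Counter(normalize(q) for q in queries if q.strip())
    return counts.most_common(n)


if __name__ == "__main__":
    # Hypothetical sample data, invented for illustration only.
    sample = [
        "Parking permits",
        "parking  permits",
        "campus map",
        "Campus Map",
        "parking permits",
        "library hours",
        "",
    ]
    for query, count in top_queries(sample, n=3):
        print(f"{count:3d}  {query}")
```

Normalizing before counting keeps trivially different spellings of the same query from splitting the tally. A real analysis would likely go further, for example by grouping synonyms or flagging queries that return no results.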

Health 2.0 code-a-thons in DC, SF

Andy Oram writes about his first Health 2.0 code-a-thon, held in Washington, DC. He discusses the setup, and how five teams of biomedical health technologists competed to build a quick and dirty system over the course of a day. The winning project:

“Team Avanade, the quietly intense team whose activity was totally opaque to me, pulled off a stunningly deft feat of programming. They are trying to improve patient compliance by using SMS text messaging to help the patient stay in contact with the physician and remain conscious of his own role in his treatment. A patient registers his cell phone number (or is registered by his doctor) and can then enter relevant information, such as a daily glucose reading, which the tool displays in a graph.”

There will be a Health 2.0 code-a-thon in San Francisco September 24-25. Anyone interested?

Mining internal search engine data

We do some limited analysis of search terms on CTSI web properties, but this is a big gap, per user experience author Lou Rosenfeld in his new book Search Analytics for Your Site. Rosenfeld’s the author of the seminal Information Architecture for the World Wide Web, so when he speaks, I tend to pay attention. An interview in O’Reilly Radar digs into the details of analyzing search data in internal search engines and systems:

“[Site search isn’t] necessarily overlooked by users, but definitely by site owners who assume it’s a simple application that gets set up and left alone. But the search engine is only one piece of a much larger puzzle that includes the design of the search interface and the results themselves, as well as content and tagging. So search requires ongoing testing and tuning to ensure that it will actually work.

“Does SSA [site search analytics] reveal user intent better than other forms of analytics?
I think so, as the data is far more semantically rich. While you might learn something about users’ information needs by analyzing their navigational paths, you’d be guessing far less if you studied what they’d actually searched for. Again, site search data is the best example of users telling us what they want in their own words. Site search analytics is a great tool for closing this feedback loop. Without it, the dialog between our users and ourselves — via our sites — is broken.”

Algorithms for diagnosis

According to O’Reilly Radar, Predictive Medical Systems is touting algorithms it has developed which can reportedly predict cardiac arrest and respiratory failure in an ICU setting, based on analysis of electronic medical record data. They’re currently running a validation trial, and working towards a formal FDA trial.

Blogging about peer-reviewed research

The aptly named Research Blogging service aggregates blog posts about peer-reviewed research. They scan supported blogs for references to published papers, aggregating the content, and supporting structured links to references. While the tagging and citation addition process seems still a bit too manual and brittle for my tastes, it’s clearly scaled, with bloggers publishing about 17 posts a day in 7 supported languages.

Research Blogging looks like a great way to explore post-publication review and discussion—but would they open up their data so that it can be mashed up by third parties? Imagine being able to read a publication, and see not only articles that cite it, but blog posts as well.

Cigarette warning labels around the world

The FDA’s new cigarette warning labels have been getting a lot of buzz, underscoring the role of design in public health communication. The new designs cover half of each cigarette pack and 20% of the area of each ad. According to the Wall Street Journal, the FDA estimates that the design will reduce the number of smokers by over 200,000 in the first year after launch, based on the impact of new warning labels in Canada.

Cigarette Health Warning Images

There are a variety of approaches to tobacco packaging warnings, but bold graphic warnings are clearly the emerging international consensus. Here are some examples from around the world:

Brazil:
Cigarette warning labels

Thailand:
Gruesome

Wired magazine on open publishing

Biologist Howard Eisen's son tries to put his late father's papers freely online. Hilarity ensues.

Trying to explain open publishing to someone outside the field? “Free Science, One Paper at a Time” in the May Wired tells a compelling story about the need for open publishing, referencing PLoS, academic librarians’ woes, Mendeley, and Michael Weiner’s work on ADNI, through the story of Jonathan Eisen’s attempt to make his late father’s scientific publications freely accessible online. Very readable, and highly recommended.
