Clinical trial? Research study? Medical trial? Medical study? Research trial? Paid study?

It helps to use the same language as your customers.

As part of the UCSF Clinical Trials website, we worked hard to make the language as accessible as possible, particularly in text that will be seen by search engine users. The most important phrase on the site is, obviously, clinical trials. But is that the wording our users actually use? Google Trends to the rescue!

I used Google Trends to see what language the general public uses to look for trials. And as it turns out, it’s complicated.

While clinical trial is the most searched-for term among Google users across the U.S., it’s closely trailed by research study, followed by a bevy of lesser-used terms. (Not all research studies are clinical trials, but patients sometimes use these and other terms interchangeably.)

UCSF has locations both in the San Francisco Bay Area and in Fresno. And in Fresno, research studies is actually more popular than clinical trials.

via Google Trends search for clinical trials vs. research studies

But it’s not just a matter of those two terms. I used Google Trends to explore the relative search volume for a wide variety of popular synonyms for clinical trials. Every one of the terms is used.

So while we will continue to use clinical trials across our clinical trials website, we’ve been making increasing use of alternate terms in places where it makes sense, to reflect the language that our users prefer.

RNS SEO 2016: How 90 research networking sites perform on Google — and what that tells us


Research networking systems (RNS) like VIVO, Profiles, and Pure are often undiscoverable by real users because of poor search engine optimization (SEO).

Last year, we released RNS SEO 2015, the first-ever report describing how RNS performs in terms of real-world discoverability on Google.

We re-ran our analysis for 2016 to see which of 90 research networking sites have the highest proportion of their people pages among the top 3 search results on Google.

1. Methodology

  • Pick 90 different VIVO, Profiles, Pure, and custom RNS websites
  • Retrieve a large number of people page URLs (via sitemaps, crawling)
  • Grab 100 random people names and URLs from each site
  • For each name, search Google for PersonName InstitutionName
    • e.g. “Jane Doe Harvard”
  • Count what % of people pages come up in the top 3
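The final counting step above can be sketched in a few lines of Python. This is a minimal sketch: the names, URLs, and result lists below are made up, and the real analysis retrieved live Google results for each "PersonName InstitutionName" query.

```python
def top3_score(results_by_person, people_urls):
    """results_by_person maps a person's name to the ordered list of result
    URLs returned for "PersonName InstitutionName"; people_urls maps the same
    name to that person's RNS page URL. Returns the percentage of sampled
    people whose RNS page appears in the top 3 results."""
    hits = 0
    for name, rns_url in people_urls.items():
        top3 = results_by_person.get(name, [])[:3]
        if rns_url in top3:
            hits += 1
    return 100.0 * hits / len(people_urls)

# Hypothetical example: 2 of 3 sampled researchers rank in the top 3.
people = {
    "Jane Doe": "https://profiles.example.edu/jane-doe",
    "John Roe": "https://profiles.example.edu/john-roe",
    "Ann Poe": "https://profiles.example.edu/ann-poe",
}
results = {
    "Jane Doe": ["https://profiles.example.edu/jane-doe", "https://x.com/a"],
    "John Roe": ["https://x.com/b", "https://x.com/c", "https://x.com/d"],
    "Ann Poe": ["https://x.com/e", "https://profiles.example.edu/ann-poe"],
}
print(round(top3_score(results, people)))  # → 67
```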

2. Results

  1. Brown 93% [VIVO] [under official domain]
  2. University of California, San Francisco 90% [Profiles] [under official domain]
  3. University of Colorado Profiles 87% [Profiles] [under official domain]
  4. Stephenson Cancer Center 87% [Pure]
  5. Mayo Clinic 85% [Pure]
  6. University of Bristol 84% [Pure] [under official domain]
  7. Royal Holloway, University of London 83% [Pure] [under official domain]
  8. University of Stirling 83% [Custom] [under official domain]
  9. King’s College London 80% [Pure] [under official domain]
  10. Oregon Health & Science University 78% [Pure]
  11. University of the Highlands and Islands 76% [Pure] [under official domain]
  12. University of Melbourne 76% [Custom] [under official domain]
  13. Lancaster University 73% [Pure] [under official domain]
  14. University of New Mexico 72% [VIVO] [under official domain]
  15. Queen’s University Belfast 71% [Pure] [under official domain]
  16. University of Strathclyde 70% [Pure] [under official domain]
  17. University of St. Andrews 70% [Pure] [under official domain]
  18. Northern Arizona University 69% [Pure]
  19. Duke 69% [VIVO] [under official domain]
  20. MD Anderson Cancer Center 68% [Pure]
  21. University of Michigan 64% [Pure] [under official domain]
  22. UT Health Science Center at San Antonio 60% [Pure] [under official domain]
  23. University of Texas at Tyler 59% [Pure] [under official domain]
  24. University of York 59% [Pure] [under official domain]
  25. The University of Texas at Austin 56% [Pure] [under official domain]
  26. Medical College of Wisconsin 56% [Custom] [under official domain]
  27. Boston University 55% [Profiles] [under official domain]
  28. Northwestern University 55% [Pure] [under official domain]
  29. University of Texas at San Antonio 51% [Pure] [under official domain]
  30. University of Dundee 50% [Pure] [under official domain]
  31. University of Minnesota 50% [Pure] [under official domain]
  32. Heriot-Watt University, Edinburgh 49% [Pure] [under official domain]
  33. UT Southwestern Medical Center 48% [Pure] [under official domain]
  34. Johns Hopkins University 48% [Pure]
  35. University of Miami 46% [Pure]
  36. University of Arizona 46% [Pure]
  37. University of Nebraska 45% [Pure]
  38. University of Utah 44% [Pure]
  39. Michigan State University 44% [Pure] [under official domain]
  40. University of Texas of the Permian Basin 44% [Pure] [under official domain]
  41. Wake Forest Baptist Medical Center 42% [Profiles] [under official domain]
  42. University of Massachusetts 41% [Profiles] [under official domain]
  43. The University of Texas at Dallas 40% [Pure] [under official domain]
  44. UT Health Northeast 40% [Pure] [under official domain]
  45. Scripps 39% [VIVO] [under official domain]
  46. Case Western Reserve University 39% [Pure]
  47. Augusta University 38% [Pure]
  48. Western Michigan University 38% [Pure]
  49. University of Texas Medical Branch at Galveston 38% [Pure] [under official domain]
  50. University of Illinois at Chicago 36% [Pure]
  51. Houston Methodist 35% [Pure] [under official domain]
  52. Albert Einstein College of Medicine 33% [Pure]
  53. University of Edinburgh 33% [Pure] [under official domain]
  54. University of Florida 32% [VIVO] [under official domain]
  55. Arizona State University 31% [Pure]
  56. University of Texas Arlington 30% [Pure] [under official domain]
  57. Stanford University 30% [Custom] [under official domain]
  58. Thomas Jefferson University 29% [Profiles] [under official domain]
  59. The University of Texas at El Paso 28% [Pure] [under official domain]
  60. Cornell 28% [VIVO] [under official domain]
  61. University of Rochester 26% [Profiles] [under official domain]
  62. New York University 23% [Pure]
  63. University of Iowa 23% [Custom] [under official domain]
  64. Clemson University College 22% [Pure]
  65. Baylor College of Medicine 20% [Profiles]
  66. Indiana University School of Medicine 18% [Pure]
  67. Wayne State University 18% [Pure] [under official domain]
  68. University of Texas Health Science Center at Houston 17% [Pure] [under official domain]
  69. University of South Africa 12% [Pure]
  70. University of Idaho 10% [VIVO] [under official domain]
  71. Dartmouth 9% [VIVO] [under official domain]
  72. Griffith 8% [Custom] [under official domain]
  73. George Washington University 4% [VIVO] [under official domain]
  74. Tufts 4% [Profiles] [under official domain]
  75. US Department of Agriculture 3% [VIVO] [under official domain]
  76. University of Montana 2% [VIVO]
  77. East Carolina University 1% [VIVO] [under official domain]
  78. Texas A&M 0% [VIVO] [under official domain]
  79. Boise State 0% [VIVO]
  80. University of Hawai‘i 0% [VIVO]
  81. Idaho State 0% [VIVO]
  82. Montana State University 0% [VIVO]
  83. New Mexico State 0% [VIVO]
  84. University of Alaska Anchorage 0% [VIVO]
  85. UCLA School of Medicine 0% [Custom] [under official domain]
  86. University of Nevada, Las Vegas 0% [VIVO]
  87. University of Nevada, Reno 0% [VIVO]
  88. University of Pennsylvania 0% [VIVO] [under official domain]
  89. University of Wyoming 0% [VIVO]
  90. Virginia Commonwealth University 0% [VIVO] [under official domain]

3. Conclusions

Which software has the best real-world SEO performance?

Average scores by platform

  • Pure = 50%
  • Profiles RNS = 44%
  • Custom = 39%
  • VIVO = 15%

Average scores, by use of official vs. other domain

  • Official domain? (e.g. vivo.cornell.edu)
    average score = 44%
  • Other domain? (e.g. stephenson.pure.elsevier.com)
    average score = 30%

Average scores by platform, taking domain names into account (where n >= 5)

  • Pure + Institutional Domain = 53%
  • Profiles + Institutional Domain = 47%
  • Pure + other domain = 45%
  • Profiles + other domain = 35%*
  • Custom + Institutional Domain = 39%
  • VIVO + Institutional Domain = 26%
  • VIVO + other domain = 18%*

* includes some data from 2015 survey
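The platform and domain averages above boil down to a simple group-by-mean. A minimal sketch, using a small made-up subset of scores rather than the full 90-site table:

```python
from collections import defaultdict

def average_by_platform(rows):
    """rows: iterable of (platform, score) tuples -> {platform: mean score}"""
    sums = defaultdict(lambda: [0.0, 0])
    for platform, score in rows:
        sums[platform][0] += score
        sums[platform][1] += 1
    return {p: total / n for p, (total, n) in sums.items()}

# Hypothetical subset of (platform, top-3 %) pairs from the results table.
rows = [("Pure", 87), ("Pure", 85), ("VIVO", 93), ("VIVO", 0), ("Profiles", 90)]
print(average_by_platform(rows))
# → {'Pure': 86.0, 'VIVO': 46.5, 'Profiles': 90.0}
```

The same function works for the domain breakdown: just feed it ("official", score) / ("other", score) pairs instead.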

Does getting lots of incoming links help?

It appears to. The top 10 sites have a median 560 linking root domains — one of several metrics related to incoming link diversity mentioned in the Moz Search Engine Ranking Factors 2015.

The correlation between linking root domains and search rankings holds true across our dataset:

RNS SEO 2016 root linking domains
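The correlation check behind that chart can be sketched as a plain Pearson computation. The numbers below are toy values, not the real dataset, and the real analysis used all 90 sites:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

linking_domains = [560, 480, 300, 120, 40, 10]  # hypothetical per-site counts
top3_scores = [93, 90, 70, 44, 18, 2]           # hypothetical top-3 %
r = pearson(linking_domains, top3_scores)
print(round(r, 2))  # strongly positive on this toy data
```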

4. How do you increase your site’s search rankings?

Read our helpful guides:

RNS SEO: How 52 research networking sites perform on Google, and what that tells us

Research networking systems (RNS) like Vivo, Profiles, SciVal, and Pure are meant to be used — but often fail to be discoverable by real users because of poor search engine optimization (SEO).

That’s why we’re releasing RNS SEO 2015, the first-ever report describing how RNS performs in terms of real-world discoverability on Google.


SEO for Research Networking: How to boost Profiles/VIVO traffic by an order of magnitude

"Redwoods" by Michael Balint (cc-by)

The UCSF Profiles team has increased site usage by over an order of magnitude since the site’s big campus-wide launch in 2010. This “growth hacking” cheat sheet distills the key lessons learned during that period, and can be applied to almost any research networking platform, including VIVO, Profiles, and home-grown solutions.

“Am I having a stroke?” UCSF Researchers Test New Way of Connecting Physicians With Information Seekers Online

Five questions with UCSF neurologist and stroke researcher Anthony Kim about his new study on how the Internet can help connect with people who are searching for health information online, and potentially reduce the incidence of preventable diseases.

Anthony S. Kim, MD, MAS is assistant clinical professor of neurology at the University of California, San Francisco (UCSF) and Medical Director of the UCSF Stroke Center. His research focuses on improving the diagnosis and cost-effective management of stroke and transient ischemic attack (TIA, also called “mini-stroke”).  An estimated 800,000 new strokes occur each year in the U.S., making it the fourth leading cause of death in America. Anthony Kim believes that the Internet opens up new opportunities that will change the way we develop interventions and conduct research to improve health.

Q: Millions of Americans search online for health information each year. Scientists are using this type of data to better understand flu outbreaks, the seasonal variance of kidney stones, the demographic prevalence of stroke, and even to demonstrate the online effectiveness of health awareness campaigns. What did you learn in your latest study?

We were surprised to see that tens of thousands of people were regularly ‘asking’ a search engine about stroke-related symptoms, in many cases shortly after the onset of symptoms. In fact, every month, about 100 people were finding our study website by entering the query “Am I having a stroke?” directly into their Google search box.

One of the challenges with mini-stroke is that most people do not seek urgent medical attention because the symptoms are transitory by definition. So people don’t realize that it is a medical emergency. Even though the symptoms may have resolved, the risk of a subsequent stroke is very high—upwards of 11% within the next 90 days—with most of this risk concentrated in the first hours and days after the mini-stroke. So getting the message out there about urgent medical attention is key.

We started this study because we thought that if people who have had a mini-stroke are looking online for information on their symptoms, then rather than just listing static health information about the disease on a website, maybe we could engage them by making the website more interactive and asking them to enter some of their symptoms online. And we wondered whether we could use this information to assess whether or not it was a true TIA or stroke, and then encourage them to seek urgent medical attention as appropriate.

One third of the people we identified hadn’t had a medical evaluation for mini-stroke yet, which is critical, because it is a medical emergency. Instead of calling a doctor or going to the emergency room, many people were turning to the Internet as the first source for health information.

Q: How did your approach work exactly?

When a person searched on Google for stroke-related keywords, a paid text advertisement “Possible Mini-Stroke/TIA?” appeared with a link to the study website (Image). The ad appeared on the search results page and on related websites with existing content about the topic.

When users clicked on the text ad link, they were directed to the study website. Those visitors who met all of the study’s entry criteria were asked to provide informed consent online. They then reported their demographic information and symptoms based on a risk score developed for use by clinicians.

We were notified in real-time as soon as someone enrolled, and then we arranged for two vascular neurologists to follow up with the patient by telephone.

Q: You tested the approach for about four months. What’s your verdict?

We definitely think that there is a lot of potential here. About 60% of U.S. adults say that their real-life medical decisions have been influenced by information they read online. This changes the way we think about providing medical care and conducting research.

With a modest advertising budget, we were able to attract more than 200 people to our study website each day from all 50 states. About one percent of them (251 out of 25,000) completed the online questionnaire, which allowed us to contact them for follow up. Although this seems low at first, it is comparable to conversion rates in other domains of online advertising.
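The conversion arithmetic quoted above, spelled out (numbers from the interview):

```python
visitors = 25_000   # study-site visitors reached via the ads
completed = 251     # visitors who completed the online questionnaire

conversion = completed / visitors
print(f"{conversion:.1%}")  # → 1.0%
```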

Also, even though the people who joined the study were a highly selected group, the incremental costs for reaching an additional person were low and the potential for applying a targeted and cost-effective public health intervention in this group would still be very interesting to evaluate in the future.

Before we started, we thought that we might lose people throughout the enrollment process since we confirmed eligibility and asked for consent online, but we didn’t. For the most part, if people were interested in participating, they completed the entire online enrollment process.

During follow up calls, we learned that 38% of enrollees actually had a mini-stroke or stroke. But fully a third of them had not seen a doctor yet. Our approach made it possible to connect with these people fairly efficiently and early on in order to influence their behavior acutely.

Despite these potential advantages, Internet-based public health interventions that target people who are looking for health information online are still underdeveloped and understudied. There’s a lot for us to learn in this space.

Q: What online tools did you use to carry out your project?

We used Google AdWords and Google’s Display Network to target English-speaking adults in the U.S. During the four-month enrollment period, the tool automatically displayed our ads more than 4.5 million times based on criteria such as location, demographics, and search terms.

Ideally, to minimize ongoing costs you would want to build and optimize a website so that it ranks highly among the non-paid (organic) search results. Non-profits can also take advantage of Google Grants, a program that supports in-kind donations of advertising resources to help selected organizations promote their websites on Google.

Q: Do you have any tips for others who want to develop similar projects?

We quickly realized that it helped to work closely with our Institutional Review Board (IRB) given that this is a new and evolving area of research, and to ensure data security and safety mechanisms are in place to protect participants. I definitely recommend that.

It’s also important to be realistic about the goals and metrics of success, and not to over-interpret numbers that seem to reflect low engagement. We saw that most visitors (86%) immediately exited the website within a few seconds of arriving at the home page. This probably reflected people who were looking for something else and clicked away immediately. But the beauty of the Internet is that it is very efficient to reach people across a wide geographic area very quickly. So it is not unexpected that we would also screen visitors who may not be qualified for the study or are not interested in enrolling.

Groups interested in using this approach should think about selection bias, authentication, validation, and the “digital divide”. Even though there is some evidence that disparities in access and adoption of Internet technologies are narrowing in the U.S., depending on the goals and target for your study or intervention the reach of the Internet is not uniform.

But selection bias issues aside, for a public health intervention you may be most interested in other metrics such as the number of people reached per dollar spent, or the burden of disease averted per dollar spent, which the Internet is particularly suited to help optimize.

And, it’s definitely beneficial to bring different subject matter and methods experts to the table. Knowledge of search engine optimization, online accessibility, website and user interface design is not necessarily part of the core expertise of a traditional clinical researcher, but developing these skills and interacting with experts in these areas could become very important for the new cadre of clinical researchers and public health professionals coming down the pipeline.

The original article was published on CTSI at UCSF

AMIA 2012 Joint Summit: a report back in tweets

Eric, Leslie, and I from CTSI at UCSF’s Virtual Home team spent the past three days at the AMIA 2012 Joint Summit in San Francisco.

Here’s some of what was happening on the researcher networking, social networking, knowledge representation, and public search fronts, via Twitter:

Other tweets that caught my eye from the rest of the conference:

Notes from the 2011 Medicine 2.0 Summit at Stanford

Some argue that as technology advances it turns into a barrier and prevents essential human interactions, such as at the bedside. Even though this is a concern we need to address, the Medicine 2.0 Summit 2011 provided many examples of how technology can act as a powerful mediator.

For those who did not get the chance to attend the event, here is a list of the main topics and initiatives presented that use social media, mobile applications, and Web 2.0 in healthcare and medicine to create new ways for people to connect. Please feel free to add your impressions and ideas from the summit. Thanks!

1. If you are interested in learning from ePatients on how to build and leverage communities of practice and participatory medicine, you might want to explore the following blogs and platforms: 

  • Amy Tenderich’s blog Diabetesmine.com,
  • SmartMobs, authored by Howard Rheingold, who was diagnosed with colon cancer and shared his experience on a blog called Howard’s Butt
  • PatientsLikeMe, where more than 115,000 members with over 1,000 conditions share their experiences to see what interventions are working for others

2. Patients have been connecting for some time. However, how can we help connect physicians and patients in a meaningful way? During the session “The Healthcare Transformers”, the panelists presented their views on personalizing healthcare and new ways for physicians and patients to communicate. 

  • Jay Parkinson, founder of HelloHealth and Futurewell, shared his passion for using creative design to improve health — and a few critical lessons learned (including “innovation is lonely” and “colleagues are critics”) as he and colleagues opened a “virtual clinic”, a “web-based patient communication, practice management and electronic health record in one solution”.
  • Lee Aase from the Mayo Clinic Center for Social Media gave a very entertaining talk on social media in the spirit of “Suus non ut Difficile” (It’s not that hard).  See one of their latest success stories: “When Patients Band Together – Using Social Networks To Spur Research for Rare Diseases”. They are very proactive about arming their health care professionals with the right tools to leverage social media for their successful communication. They even started a “Social Media Residency”. Aase also introduced the Social Media University, Global (SMUG), a post-secondary educational institution dedicated to providing practical, hands-on training in social media to lifelong learners.
  • Bryan Vartabedian, pediatric gastroenterologist, writes an interesting blog, 33charts, about “the convergence of social media and medicine”.
  • Wendy Sue Swanson, practicing pediatrician, mother, and author of SeattleMamaDoc, walks a fine line and shares resources and methods that she learns from her patients, friends and family, both in and out of the field of medicine. She applies the concept of storytelling to achieve her goal of helping parents decipher some of the current medical news.
  • Ron Gutman, founder and CEO of HealthTap, whom we wrote about in our earlier post, presented his solution to ending health care communication in silos. Some of the latest updates include 1) peer-review features that will help give great questions more weight in the HealthTap environment, 2) a mobile solution, and 3) allowing participating doctors to be notified of questions coming from local patients.

3. “The Knowledge Revolution”: If you are interested in using innovations in Medical Education, you might find the following projects of interest:

  • Bertalan Mesko from Webicina.com provides curated medical social media resources in over 80 medical topics in over 17 languages to help patients and medical professionals access the most relevant social media content in their own languages on a customizable, easy-to-use platform for free.
  • Parvati Dev from Clinispace presented their 3D virtual training environment for healthcare professionals, where learners can practice realistic virtual medical scenarios and recover safely from errors.

4. The panel on “The Interconnected Life” discussed social tools and platforms such as Epocrates, Google Correlate, which finds search patterns that correspond with real-world trends, and Quora.

5. During the panel “The New Scientist”, Michael Conlon presented VIVO, an “open source semantic web application”, a tool that is – like Profiles, Loci and others – used or being implemented by universities across the nation to enable and support scientific collaboration and expertise discovery.

  • Jan Reichelt, Co-Founder and President at Mendeley, talked about how the tool, a free reference manager and academic social network, helps investigators organize their research, collaborate with others online, and discover the latest research.
  • Peter Binfield from PLOS ONE reminded us that most of the 1.5 million papers published every year are still “closed access”. However, as established publishers experiment with “open access”, e.g., Sage Open, BMJ Open, Biology Open, and Scientific Reports, they validate the model…
  • And, David Pescovitz explained how he is looking for “signals” to identify far-out ideas. He is editor for Boing Boing and MAKE as well as research director with the Institute for the Future.

6. Dennis Boyle, IDEO Founding Member and Partner, gave an interesting closing keynote on “design thinking” and “a human-centered approach to innovation.” He highlighted some of their recent projects… worth exploring….

 More information:

Mining internal search engine data

We do some limited analysis of search terms on CTSI web properties, but this is a big gap, per user experience author Lou Rosenfeld in his new book Search Analytics for Your Site. Rosenfeld is the author of the seminal Information Architecture for the World Wide Web, so when he speaks, I tend to pay attention. An interview in O’Reilly Radar digs into the details of analyzing search data from internal search engines and systems:

“[Site search isn’t] necessarily overlooked by users, but definitely by site owners who assume it’s a simple application that gets set up and left alone. But the search engine is only one piece of a much larger puzzle that includes the design of the search interface and the results themselves, as well as content and tagging. So search requires ongoing testing and tuning to ensure that it will actually work.

Does site search analytics reveal user intent better than other forms of analytics?

“I think so, as the data is far more semantically rich. While you might learn something about users’ information needs by analyzing their navigational paths, you’d be guessing far less if you studied what they’d actually searched for. Again, site search data is the best example of users telling us what they want in their own words. Site search analytics is a great tool for closing this feedback loop. Without it, the dialog between our users and ourselves — via our sites — is broken.”
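In that spirit, a first pass at site search analytics can be as simple as tallying query frequencies from the internal search log. The log below is hypothetical:

```python
from collections import Counter

# Hypothetical raw queries pulled from a site's internal search log.
search_log = [
    "clinical trials", "parking", "clinical trials", "research study",
    "parking", "clinical trials", "irb forms",
]

# Normalize and count; the top queries show what users want, in their words.
top_queries = Counter(q.strip().lower() for q in search_log).most_common(3)
print(top_queries)
# → [('clinical trials', 3), ('parking', 2), ('research study', 1)]
```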

Read more:

“Search Needs a Shake-Up: From Simple Document Retrieval to Question Answering”

If you are thinking about making Internet trawling more efficient, take a look at the recent perspective published in Nature. Researcher Oren Etzioni (whose lab introduced open information extraction) “calls on researchers to think outside the keyword box…”

Open information extraction obviates topic-specific collections of example sentences, and instead relies on its general model of how information is expressed in English sentences to cover the broad, and unanticipated, universe of topics on the Internet.

The basic idea is remarkably simple: most sentences contain highly reliable syntactic clues to their meaning. For example, relationships are often expressed through verbs (such as invented, married or elected) or verbs followed by prepositions (such as invented by, married to or elected in). It is often quite straightforward for a computer to locate the verbs in a sentence, identify entities related by the verb, and use these to create statements of fact. Of course this doesn’t always go perfectly. Such a system might infer, for example, that ‘Kentucky Fried Chicken’ means that the state of Kentucky fried some chicken. But massive bodies of text such as the corpus of web pages are highly redundant: many assertions are expressed multiple times in different ways. When a system extracts the same assertion many times from distinct, independently authored sentences, the chance that the inferred meaning is sound goes up exponentially.

Much more research has to be done to improve information-extraction systems — including our own. Their abilities need to be extended from being able to infer relations expressed by verbs to those expressed by nouns and adjectives. Information is often qualified by its source, intent and the context of previous sentences. The systems need to be able to detect those, and other, subtleties. Finally, automated methods have to be mapped to a broad set of languages, many of which pose their own idiosyncratic challenges.

One exceptional system — IBM’s Watson — utilizes a combination of information extracted from a corpus of text equivalent to more than 1 million books combined with databases of facts and massive computational power. Watson won a televised game of Jeopardy against two world-class human players in February this year. The multi-billion dollar question that IBM is now investigating is ‘can Watson be generalized beyond the game of Jeopardy?’
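The verb-based heuristic described in the excerpt can be illustrated with a toy extractor. This is a deliberately naive sketch with a tiny hand-picked relation list; real open-information-extraction systems learn general syntactic patterns and exploit redundancy across the whole web corpus.

```python
import re

# Toy relation vocabulary drawn from the article's examples (verbs and
# verb+preposition phrases). A real system would not hard-code these.
PATTERN = re.compile(
    r"^(?P<subj>[A-Z][\w .]+?)\s+"
    r"(?P<rel>was invented by|married|was elected in)\s+"
    r"(?P<obj>[\w .]+?)\.?$"
)

def extract_triple(sentence):
    """Return a (subject, relation, object) triple, or None if no match."""
    m = PATTERN.match(sentence)
    return (m["subj"], m["rel"], m["obj"]) if m else None

print(extract_triple("The telephone was invented by Alexander Graham Bell."))
# → ('The telephone', 'was invented by', 'Alexander Graham Bell')
```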