Monday, August 25, 2014

Preparing for MIE2014

After a fantastic warm and sunny summer here in Sweden it's time for me to get prepared for the European Medical Informatics Conference - MIE2014, Istanbul, 31 Aug. to 3 Sept.

Our joint paper, co-authored by members across the IMI EHR4CR, Open PHACTS and SALUS projects and the W3C HCLS community, describing "A Framework for Evaluating and Utilizing Medical Terminology Mappings" has been accepted. And I have got the opportunity to present it in the main conference on the 2nd of September.

For me the paper started from some great discussions at ICBO (Int. Conference on Biomedical Ontologies) in Montreal last year with Trish Whetzel (@TrishWhetzel) and Jim McCusker (@jpmccu) on the topic: "mappings are not sufficient - need the justifications for the mappings". We started to talk about using so-called Nanopublications to capture the justification for each mapping, so that users can make better use of, for example, the mappings provided via the NCBO Bioportal.
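To make the idea of "a mapping plus its justification" a bit more concrete, here is a minimal sketch in Python. It is not the Nanopublication model and not the framework from the paper; the class names, field names and codes are all hypothetical, and only the SKOS relation name (skos:exactMatch) comes from an existing vocabulary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class TermMapping:
    """The bare assertion: a source code relates to a target code."""
    source_code: str
    target_code: str
    relation: str  # e.g. "skos:exactMatch"


@dataclass
class JustifiedMapping:
    """A mapping together with the information needed to judge it (hypothetical fields)."""
    mapping: TermMapping
    asserted_by: str            # who or what produced the mapping
    method: str                 # e.g. "lexical match" or "manual review"
    evidence: List[str] = field(default_factory=list)

    def is_justified(self) -> bool:
        # A mapping counts as justified here only if we know who made it,
        # how it was made, and at least one piece of supporting evidence.
        return bool(self.asserted_by and self.method and self.evidence)


# Made-up codes, for illustration only.
bare = TermMapping("EX-A:001", "EX-B:9001", "skos:exactMatch")

with_justification = JustifiedMapping(
    mapping=bare,
    asserted_by="terminology curator",
    method="manual review",
    evidence=["labels and definitions compared by a curator"],
)
without_justification = JustifiedMapping(mapping=bare, asserted_by="", method="")

print(with_justification.is_justified())
print(without_justification.is_justified())
```

The point of the sketch is only that a consumer of mappings can filter on the justification, not just on the mapping itself.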

When I came back from the ICBO conference I wrote a blog post outlining some more ideas on using Nanopublications and/or Linksets, both stemming from the IMI Open PHACTS project. Some nice comments on, and sharing of, my blog post Justifications of Mappings encouraged me to work more on these ideas. My colleague in the EHR4CR project, Sajjad Hussain (+Sajjad Hussain), pointed me to a very interesting blog post: SALUS project on Terminology Mappings. After some great discussion over a lunch at the SWAT4LS conference in Edinburgh with Hong Sun, from SALUS, Charlie Mead and Eric Prud'hommeaux, from W3C HCLS, Alasdair Gray (@gray_alasdair) from Open PHACTS, and many more, Sajjad and I started to outline a paper describing a framework combining solutions and ideas on evaluating and utilizing terminology mappings.

Besides presenting this paper, I look forward to participating in an MIE2014 tutorial and workshop:
  • Tutorial on the IEEE 11073 Standards for Personal Health Devices (Wikipedia: ISO/IEEE_11073). This is a standard I have been looking into earlier. It nicely combines my interest in clinical trials and health care data standards together with my previous industrial PhD studies in Mobile Informatics (see the slides presenting my PhLic thesis from 2001: Mobile Newsmaking).
  • Workshop on Interoperability Challenges for enabling secondary use of Electronic Health Records (ICEH 2014). In this workshop I look forward to meeting and talking with many people, including the great metadata and ontology experts Gokce Laleci Ertukmen and Anil Pacaci (@aasinaci), Software Research, Development and Consultancy, Turkey.

I hope to be able to use my Twitter (@kerfors) feed to share interesting things I learn at the conference, and from the historic city of Istanbul, and to gather tweets, links and photos from each day using Storify, in the same way as I have done at earlier conferences.

So, have a look at my MIE2014 Storify for daily updates 31 Aug. to 3 Sept.

Friday, June 13, 2014

openFDA a Game Changer?

I’ve been fascinated by innovative people in the FDA organization since I had the pleasure of meeting Dr Norman Stockbridge, the father of FDA’s Janus data warehouse model, F2F back in 2005 in Washington, DC.

So when I saw some early notes about an openFDA initiative in June 2013 and early 2014 I posted a couple of tweets.

In April I wrote a short blog post about openFDA. And, when I saw how the new Chief Health Informatics Officer at FDA, Taha Kass-Hout (@DrTaha_FDA) started to count down on Twitter a couple of weeks ago I got really excited. It was nice to follow the #hdpalooz feed on Twitter from the health care data event in early June when openFDA was launched.

And also to see how others directly picked up the first openFDA API and launched services and apps to search the 3.4 million adverse events, such as Research AE.
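For readers who want a feel for what such a service does under the hood, here is a small sketch of building a query against the adverse events API. The endpoint path and the field name patient.drug.medicinalproduct are taken from the public openFDA documentation as I understand it, not from this post, so treat them as assumptions and check the current documentation before relying on them. The helper name build_adverse_event_query is my own.

```python
from urllib.parse import urlencode

# Assumed endpoint for drug adverse event reports (see the openFDA docs).
OPENFDA_DRUG_EVENT_URL = "https://api.fda.gov/drug/event.json"


def build_adverse_event_query(drug_name: str, limit: int = 5) -> str:
    """Build a URL that searches adverse event reports mentioning a drug name."""
    params = {
        # Field name assumed from the openFDA documentation.
        "search": f'patient.drug.medicinalproduct:"{drug_name}"',
        "limit": limit,
    }
    return f"{OPENFDA_DRUG_EVENT_URL}?{urlencode(params)}"


url = build_adverse_event_query("aspirin", limit=3)
print(url)
```

The sketch only builds the URL and does not call the service. Fetching it, for example with urllib.request.urlopen(url), should return JSON that can be parsed with the json module.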

For a brilliant intro to what sits behind the first openFDA API I recommend Alex Howard's (@digiphile) excellent article: openFDA launches open data platform for consumer protection.
"Instead of contracting with a huge systems integrator.. FDA worked with a tiny data science startup.. to harmonize the data, create a cutting-edge website, and write and release open source code for a data publishing platform for it [on GitHub]"

I think this will be a game changer for how we think about open data, open source and open communities in industry. And yes, I do think we will soon see much more Open, and Linked, Data from FDA, and hopefully also from EMA and across industry.

Kudos to the developers behind all of this great work,
e.g. Sean Herron (@seanherron) and Brian Norris (@Geek_Nurse)

Wednesday, April 2, 2014


It's exciting to see how the FDA (Food and Drug Administration) now starts to make some nice buzz about their new project called openFDA: a research project to provide open APIs, raw data downloads, documentation and examples, and a developer community for an important collection of FDA public datasets.

Excellent blog post from Dr. Taha Kass-Hout (@DrTaha_FDA), Chief Health Informatics Officer of FDA. He writes: "Our initial pilot project will cover a number of datasets from various areas within FDA, defined into three broad focus areas: Adverse Events, Product Recalls, and Product Labeling."

Introducing openFDA
I do hope that the idea of not only open, but also linked data, will be part of this effort. For a quick intro to Why Linked Data? check out this nice video explaining the utility of linked data and how it's being used by the UK's Ordnance Survey.

I don't have the full context for all of this, but I think there are some excellent opportunities for Dr Kass-Hout and his team to leverage linked data initiatives such as these:

Thursday, February 27, 2014

Why I am so obsessed with this Semantic Web thing

In an earlier blog post I reflected on the fact that it is now 25 years since the web was born. I had the opportunity to bring web technology into a large organisation. Many colleagues asked Why are you so obsessed by this "Web thing"? (remember that this was the time when a Swedish minister said that "Internet är bara en fluga", that is, "the Internet is just a fad").

So, now in 2014 many ask me Why are you so obsessed with this "Semantic Web thing"?.

I had a good chance to reflect on this question when I was asked to be one of the keynote speakers at a very nice conference: SWAT4LS, Semantic Web Applications and Tools for Life Science, in Edinburgh.

I was also interviewed together with other speakers by the eCancer organisation in relation to the EURECA (Enabling information re-Use by linking clinical REsearch and Care) project. It is always scary to see, and hear, yourself, but I think I managed to convey some of my thoughts. And it is really nice to watch the interviews with Frank van Harmelen, Eric Prud'hommeaux, Robert Stevens and David Kerr.

However, I think the one who best expressed the answer to the question was Charlie Mead. Charlie has been around for a long time in the standards world, working with HL7 for health care data and CDISC for clinical research data. Charlie is now a co-chair of the W3C interest group for semantic web for health care and life sciences (HCLS). I recommend this 7-minute interview with Charlie. Below I have transcribed the last part of it, as I think Charlie expresses well the reasons for Why I'm so obsessed by this "Semantic Web thing".

Charlie Mead
W3C HCLS semantic web interest group
"The thing that is really astonishing about the semantic web, the tools and technologies, really solve all of the core problems that we struggled with for a very long time. 
And they solve them in a very elegant way, which almost by magic, that live on top of the Internet that we know works and have brought tremendous value. 
And I think the real barrier to adopt these technologies is that is if more people understood what they can do I think the change curve will come faster and the resistance would melt more quickly."

Kudos to Scott Marshall, W3C and EURECA project, (@mscottmarshall)
for arranging the interviews and to the eCancer TV team.

Thursday, February 6, 2014

Why are you so obsessed with this Semantic Web thing

A lot of nice buzz today in social media as Tim Berners-Lee discusses the future of the web in the March issue of Wired UK. The web turns 25 in March.

It reminded me of what colleagues asked me almost 20 years ago: Why are you so obsessed with this "Web thing"?

Thanks to some great people in the Volvo business and data organisations I was exposed to "this web thing" and it made me change direction in my professional career: from a fancy job as Account Manager to leading a small network of people to get the Volvo Web Wave moving.

Today, in 2014, my colleagues ask me: Why are you so obsessed with this "Semantic Web thing"?

Recently I, together with other speakers at the SWAT4LS (Application and Tools in Semantic Web for Health Care and Life Sciences) conference, had the opportunity to reflect on the main difference the semantic web can make for patients, health care and clinical research professionals in video interviews for the EURECA project. Stay tuned for these via my Twitter (@kerfors) feed and in a coming blog post.

Sunday, November 17, 2013

De-identification and Informed Consent in Clinical Trials

On Thursday evening I was following the great #PACCR feed on Twitter from a "Patients at Center of Clinical Research" discussion hosted by the Eli Lilly Clinical Open Innovation team. (Thank you Rahlyn Gossen, @RebarInter, for the pointer)

A couple of interesting comments came up in some tweets on the topic of de-identification. As de-identification (sometimes called anonymization) is a key topic for clinical trial data transparency, I did find these quotes really interesting.
It was said in the meeting by Regina Holliday (@ReginaHolliday), a great tweeter promoting patients' rights within medicine.
Daniel Barth-Jones (@dbarthjones), Columbia University and expert in Data Privacy and De-identification Policy, asked in another tweet and referenced a very interesting blog post from Harvard Law School on Ethical Concerns, Conduct and Public Policy for Re-Identification and De-identification Practice:
"When re-identification risks are exaggerated, we need to recognize that the resulting fears cause needless harms. Such fears can push us toward diminishing our use of properly de-identified data, or distorting the accuracy of our statistical methods because we’ve engaged in ill-motivated de-identification and have altered data even in cases where there was not anything more than de minimis re-identification risks."
From the same blog post from the Online Symposium on the Law, Ethics & Science of Re-identification Demonstrations, at the Bill of Health at Harvard Law School, in the fields of health law policy, biotechnology, and bioethics.
"We must achieve an ethical equipoise between potential privacy harms and the very real benefits that result from the advancement of science and healthcare improvements which are accomplished with de-identified data."
There were also a couple of interesting #PACCR tweets on the topic of Informed Consent quoting Sharon Terry (@sharonfterry), CEO of Genetic Alliance:

I would like to learn more about this thinking and how it potentially could be realized by:
Structuring and formalizing the Informed Consent content to become a semantically rich, and machine-executable, contract/policy for transparency and accountability in using clinical trial data.
For more information see:

I do find all of this very interesting. And I hope such a "dynamic, granular, matrixed and contextual" approach can be part of new clinical trial data transparency policies:  
"To find solutions that are 'good enough' and provide both dramatic privacy protections and useful analytic data" (from the same blog post).

Monday, October 7, 2013

The future of CDISC CT:s

A poll posted by Lex Jansen (@lexjansen) in the LinkedIn group for CDISC (Clinical Data Interchange Standards Consortium) triggered me to write down some thoughts on the future of CDISC's so-called Controlled Terminologies (CT:s):

When you import CDISC Controlled Terminology from NCI EVS at or, which format do you use?
  (Excel, Text, ODM XML, or OWL/RDF)

My vote goes to the formats with the best potential for the future, that is the formats serializing RDF modeled data, e.g. Turtle, N-Triples, JSON and XML (see the blog post: Understanding RDF serialisation formats).
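To illustrate what "serializing RDF modeled data" means, the sketch below writes one and the same statement in two of these formats, N-Triples and Turtle. It uses plain Python string handling rather than an RDF library, handles only a simple string literal, and the term URI under example.org is made up. It is not taken from the published CDISC files. Only the SKOS namespace is a real vocabulary.

```python
from typing import Dict

SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/ct/"  # hypothetical namespace, for illustration only


def escape(text: str) -> str:
    """Escape backslashes and double quotes inside a string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(subject: str, predicate: str, literal: str) -> str:
    """N-Triples: full URIs in angle brackets, one statement per line."""
    return f'<{subject}> <{predicate}> "{escape(literal)}" .'


def shorten(uri: str, prefixes: Dict[str, str]) -> str:
    """Replace a known namespace with its prefix, else keep the full URI."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return f"<{uri}>"


def to_turtle(subject: str, predicate: str, literal: str,
              prefixes: Dict[str, str]) -> str:
    """Turtle: same statement, made more readable with prefix declarations."""
    header = "\n".join(f"@prefix {p}: <{ns}> ." for p, ns in prefixes.items())
    statement = (f"{shorten(subject, prefixes)} "
                 f'{shorten(predicate, prefixes)} "{escape(literal)}" .')
    return f"{header}\n\n{statement}"


term = EX + "EXAMPLE_TERM"
nt = to_ntriples(term, SKOS + "prefLabel", "Example term")
ttl = to_turtle(term, SKOS + "prefLabel", "Example term",
                {"ex": EX, "skos": SKOS})

print(nt)
print(ttl)
```

Both outputs carry exactly the same triple; the formats differ only in how compact and readable the text is, which is why the choice of serialization matters less than having the data modeled in RDF in the first place.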

Today's RDF version

The recently published OWL/RDF version of the CT:s (serialized in XML) uses the first version of the CDISC2RDF schema 1), implementing the model behind the existing export of a limited part of the content in NCI Thesaurus (NCIt).

It is modeled to support today's use of the CT:s only as text strings to populate variables in CDISC defined data sets (e.g. SDTM domains) with submission values. That is, it provides study specific clarity, making it easy for humans to read the clinical data and metadata.

Next RDF version

Based on very useful discussions with the terminology expert Julie James (LinkedIn profile), working for HL7, IMI EHR4CR and the FDA/PhUSE Metadata definition project, these are my thoughts for the next RDF version:

To provide cross study semantic interoperability, making it easy for machines to directly integrate and query clinical data and metadata across health care and clinical research, we need an enhanced model.

That is, a model that fully leverages the content in NCIt and addresses the issues people have experienced when trying to implement the CT:s in BRIDG / ISO 21090, using the insights from the IMI EHR4CR project and from the development of the IHE DEX profile (Data Element Exchange).

I think there is also an opportunity to leverage the work on binding value sets to data elements that is part of the HL7 FHIR (Fast Healthcare Interoperability Resources) development 2). Julie also pointed me to a new ISO standard: ISO/CD 17583 3). The next version should also apply both the OID (Object Identifier) standard and the URI (Uniform Resource Identifier) standard to identify each value set and each value.
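One simple way the two identifier standards can meet is the urn:oid: form defined in RFC 3061, which wraps an OID so that it can be used wherever a URI is expected. The sketch below is only an illustration of that idea, not a proposal for how CDISC or NCI EVS should assign identifiers. The function name oid_to_urn and the validation pattern are my own, and the OID is used purely as a sample value.

```python
import re

# Dotted decimal arcs: the first arc is 0, 1 or 2; later arcs have no leading zeros.
_OID_PATTERN = re.compile(r"^[0-2](\.(0|[1-9][0-9]*))+$")


def oid_to_urn(oid: str) -> str:
    """Return the urn:oid: URI form (RFC 3061) of a dotted-decimal OID."""
    if not _OID_PATTERN.match(oid):
        raise ValueError(f"not a valid dotted-decimal OID: {oid!r}")
    return f"urn:oid:{oid}"


sample_oid = "2.16.840.1.113883.3.26.1.1"  # sample value, for illustration only
print(oid_to_urn(sample_oid))
```

With something like this, a value set can keep its OID for systems that expect one, while the same identifier is also usable as a URI in RDF data.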

1)  CDISC2RDF poster (presented at DILS 2013, Data Integration in Life Science conference) and FDA/PhUSE Semantic Technology project 
3) ISO/CD 17583: Health informatics -- Terminology constraints for coded data elements expressed in (ISO 21090) Harmonized Data Types used in healthcare information interchange.