Why Context and Relevance Matter to Health Data Computability


In a December 8 report, “Realizing the Full Potential of Health Information Technology to Improve Healthcare for Americans: The Path Forward,” the President’s Council of Advisors on Science and Technology (PCAST) urgently recommends departing from much current HIT practice in America’s hospitals, doctors’ offices, pharmacies, labs, and patients’ homes. PCAST calls on ONC to accelerate making health data easily accessible, interpretable by computers, and searchable on the web.

In a companion piece to this one, John Halamka masterfully lays out the major ideas in this complex and, in places, highly technical 91-page report. Rather than comment on John’s summary and observations, with which I agree entirely, let me get to the heart of the report’s most important recommendation. While the report is breathtakingly innovative, there are some problems with PCAST’s approach.

PCAST calls on ONC to:

“…establish a “universal exchange language” that enables health IT data to be shared across institutions; and also to create the infrastructure that allows physicians and patients to assemble a patient’s data across institutional boundaries, subject to strong, persistent, privacy safeguards and consistent with applicable patient privacy preferences. Federal leadership is needed to create this infrastructure.”

Bravo! I wholeheartedly agree. It is time to abandon the document-centric and centralized relational database approaches that have typified health IT systems for so long, and that have created islands of data while locking in proprietary vendors, who can then extract exorbitant pricing from clients.

The report boldly asserts what I have worked toward for years: we need to use XML (extensible markup language) to discretely tag health data, using existing standards for vocabularies (but also agreeing upon when and how to use them), so that “systems [will] be able to send and receive data in the universal exchange language.” As one of the co-developers of the Continuity of Care Record (CCR) standard, an XML schema balloted and released in 2005 (!), all I can say is that it’s about time.

The report proposes that this new universal language be composed of the most elemental data possible – e.g., a lab value, a diagnosis, a medication, a visit note, an image – and that each of these carry “metadata” about the patient – e.g., the patient’s name, privacy and consent controls, date of birth, etc. These metadata tags would always accompany the data elements wherever located, and allow for a person’s data to be aggregated dynamically, “on the fly.” The notion is similar to how Facebook pages assemble disparate content into a coherent view that the user can control.
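To make the idea concrete, here is a minimal sketch of what "elemental data plus metadata, assembled on the fly" might look like. It is my own illustration, not anything specified in the PCAST report: the `TaggedElement` class, its field names, and the role-based `consent` tag are all assumptions chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedElement:
    """One elemental piece of health data plus the metadata that travels with it.

    All field names are hypothetical; the report does not define a schema.
    """
    kind: str                        # e.g. "lab", "medication", "diagnosis"
    value: str
    patient_id: str                  # stand-in for name, date of birth, etc.
    source: str                      # institution that holds the element
    consent: frozenset = frozenset() # roles permitted to see this element


def assemble(elements, patient_id, role):
    """Gather one patient's elements from any source, honoring the consent tag."""
    view = defaultdict(list)
    for el in elements:
        if el.patient_id == patient_id and role in el.consent:
            view[el.kind].append((el.value, el.source))
    return dict(view)


# Elements scattered across institutions, each carrying its own metadata.
pool = [
    TaggedElement("lab", "HbA1c 7.2%", "p-001", "Hospital A",
                  frozenset({"clinician", "patient"})),
    TaggedElement("medication", "metformin 500 mg", "p-001", "Pharmacy B",
                  frozenset({"clinician", "patient"})),
    TaggedElement("diagnosis", "type 2 diabetes", "p-001", "Clinic C",
                  frozenset({"clinician"})),
    TaggedElement("lab", "TSH 2.1 mIU/L", "p-002", "Hospital A",
                  frozenset({"clinician"})),
]

patient_view = assemble(pool, "p-001", "patient")
clinician_view = assemble(pool, "p-001", "clinician")
```

Because the privacy tag rides along with each element, the same pool yields a different view for the patient than for the clinician, with no record or document in between.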

In essence, the report advocates leapfrogging over records, documents, or standardized views of health data lodged in centralized databases. It’s just data out there, tagged and waiting to be assembled. Perform the equivalent of a Bing or Google search and, “poof,” the data are there. The report is sketchy on how to do this, but the idea is recognizable and has appeal.

I have two problems with this approach. First, why build a new health care XML variant? Though the report largely ignores it, in the past 24 months particularly, the health care industry has made significant progress toward structured XML for computable health data tagging and exchange. Google Health and CVS/MinuteClinic are current examples of production XML-based interoperability implementations using the CCR standard. Indivo, SMArt, and hdata are examples of ongoing research in this area.

Second, to be meaningful, views of health data need a framework that provides context and relevance. Clinicians will tell you that, just as a single data point is usually not helpful, aggregated data, even if searchable, are almost useless for understanding a patient’s situation and/or what decisions need to be made next.

Health and health care data elements should be grouped into categories that facilitate pattern recognition by a trained professional, and that can undergird and support knowledge-based decisions. Not all patient data is relevant all the time, and the relevance of some data will change based on the context of the presentation.

For example, if I know a patient well and am seeing her for a follow-up visit, not every lab value is relevant. In this context, the lab values that are recent or have changed are most helpful in guiding care decisions going forward. On the other hand, if I’m admitting a patient to the hospital with his first episode of grand mal seizures, then a much larger and more expansive set of data – including history, medications, active problem list, allergies, and so on – would be appropriate to satisfy relevance in this context. In both cases these data could be expressible in XML, but the desirable views of the data would be quite different.
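The difference between those two contexts can be sketched in a few lines. This is only an illustration under my own assumptions: the `LabResult` fields, the 90-day window for "recent," and the 10 percent threshold for "changed" are hypothetical numbers, not clinical guidance.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class LabResult:
    name: str
    value: float
    taken: date
    previous: Optional[float] = None  # prior value, if one is known


def relevant_labs(labs, context, today, recent_days=90, change_threshold=0.10):
    """Pick the lab values worth showing for a given clinical context.

    "admission" returns everything; any other context (e.g. "follow_up")
    keeps only values that are recent or have changed noticeably.
    The window and threshold defaults are illustrative assumptions.
    """
    if context == "admission":
        return list(labs)
    cutoff = today - timedelta(days=recent_days)
    selected = []
    for lab in labs:
        is_recent = lab.taken >= cutoff
        has_changed = (
            lab.previous is not None
            and lab.previous != 0
            and abs(lab.value - lab.previous) / abs(lab.previous) >= change_threshold
        )
        if is_recent or has_changed:
            selected.append(lab)
    return selected


today = date(2010, 12, 20)
labs = [
    LabResult("HbA1c", 7.2, date(2010, 12, 1), previous=7.1),  # recent
    LabResult("LDL", 130.0, date(2010, 3, 1), previous=100.0), # old, but changed
    LabResult("TSH", 2.1, date(2009, 6, 1), previous=2.0),     # old and stable
]

follow_up = [lab.name for lab in relevant_labs(labs, "follow_up", today)]
admission = [lab.name for lab in relevant_labs(labs, "admission", today)]
```

The same underlying data produce a short list for the follow-up visit and the full list for the admission, which is the point: relevance is a property of the context, not of the data element.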

The physician, nurse, and technologist authors of the CCR standard understood this requirement for clinical relevance and context quite well. The CCR standard is an XML schema that tags elements in ways consistent with the PCAST recommendations. But it adds tagged data objects to help organize, or provide context for, those elements. Those data objects include demographics, problem list, medication list, lab results, allergies, procedures, and so on. The CCR standard’s basic schema is also deployed in the CDA CCD.

The CCR standard’s XML schema design does not require all these data objects for a valid view of the patient’s information. In fact, depending on the context, the view can be mixed with inclusions or exclusions. Need a basic summary? Use these 6 data objects and the data elements belonging to them. Need a cardiology referral? Use these 6 plus two other objects and the data elements relevant to referral and return to the primary care doctor.
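One way to picture this mix-and-match is as a set of named profiles, each listing the data objects a view should include. The sketch below is my own simplification and does not use the CCR schema's actual element names; I have assumed the six basic objects are the ones named above, and `referral_reason` and `return_instructions` are hypothetical stand-ins for the two referral-related objects.

```python
# Profiles name the data objects a view should contain (names are illustrative).
BASIC_SUMMARY = (
    "demographics", "problems", "medications",
    "lab_results", "allergies", "procedures",
)
CARDIOLOGY_REFERRAL = BASIC_SUMMARY + ("referral_reason", "return_instructions")


def build_view(record, profile):
    """Return only the data objects the profile asks for, skipping absent ones."""
    return {name: record[name] for name in profile if name in record}


# A patient record holding more data objects than any single view needs.
record = {
    "demographics": {"name": "Jane Doe", "dob": "1955-04-02"},
    "problems": ["hypertension", "atrial fibrillation"],
    "medications": ["warfarin 5 mg", "metoprolol 50 mg"],
    "lab_results": ["INR 2.4"],
    "allergies": ["penicillin"],
    "procedures": ["echocardiogram"],
    "immunizations": ["influenza 2010"],
    "referral_reason": "evaluate for rate vs. rhythm control",
    "return_instructions": "return to primary care with recommendations",
}

summary = build_view(record, BASIC_SUMMARY)
referral = build_view(record, CARDIOLOGY_REFERRAL)
```

Both views are drawn from the same record and both remain valid on their own; neither needs the immunization data, so neither carries it.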

In other words, the basic XML structure in the CCR standard and CDA CCD looks like this:

  • View / display / summary record
    • Data object
      • Data elements
    • Data object
      • Data elements

The PCAST report seems to recommend a structure like this:

  • View / display / summary record
    • Data element + metadata
    • Data element + metadata
    • Data element + metadata

If you imagine a huge search engine with bots and spiders that can index the metadata for every piece of data about a person, then the second structure above seems logical. But if you need profiles that capture relevant data sets about a patient for particular clinical contexts, then this level of granularity might simply deliver data overload and not much information.
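The two shapes are easy to compare side by side in actual XML. The following sketch uses Python's standard `xml.etree.ElementTree` module; the tag and attribute names (`View`, `DataObject`, `DataElement`, `patient`, `kind`) are placeholders I made up for illustration and are not taken from the CCR schema or the PCAST report.

```python
import xml.etree.ElementTree as ET


def grouped_view(patient, objects):
    """First shape: view -> data objects -> data elements."""
    root = ET.Element("View", patient=patient)
    for object_name, elements in objects.items():
        obj = ET.SubElement(root, "DataObject", name=object_name)
        for text in elements:
            ET.SubElement(obj, "DataElement").text = text
    return root


def flat_view(patient, objects):
    """Second shape: view -> data elements, each carrying its own metadata."""
    root = ET.Element("View")
    for object_name, elements in objects.items():
        for text in elements:
            el = ET.SubElement(root, "DataElement", patient=patient, kind=object_name)
            el.text = text
    return root


data = {
    "medications": ["metformin 500 mg"],
    "lab_results": ["HbA1c 7.2%", "LDL 130 mg/dL"],
}

grouped = grouped_view("p-001", data)
flat = flat_view("p-001", data)

print(ET.tostring(grouped, encoding="unicode"))
print(ET.tostring(flat, encoding="unicode"))
```

In the grouped shape the patient is stated once and the data objects supply the organizing context. In the flat shape every element repeats its metadata, and any grouping has to be reconstructed by whoever consumes it.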

I am pleased by the PCAST advisors’ bold recommendations, in part because they frame the interesting trade-offs between the objectives and uses of health data for providers, patients, researchers, and public health. With luck, they’ll facilitate a much-needed discussion of the next version design of the CCR standard and similar XML tools for health care.

David C. Kibbe, MD, MBA consults and writes on health information technologies.
