Clinical Models and SNOMED Kaiser Perspective
Clinical Models and SNOMED. Comments on use from Kaiser Permanente
Peter Hendler MD, Jamie Ferguson, Michael Rossman, Moon Hee Lee, Mark Shafarman
Authors of clinical models must understand that two fundamentally different kinds of modeling logic are in use, and must keep them straight when creating a clinical model. You can stick to one or the other, but if you mix them you have to follow rules for mixing the two types of models, or else the result will not be implementable.
Both kinds of model logic use the words "Class" and "Subclass", but the meanings of these words differ significantly in each type of logic.
The first kind of modeling logic is called "Extensional". It is based on the "Closed World Assumption" (CWA) and it is widely understood. Many models created by clinicians are of this kind of logic, and this kind of logic is also more familiar to software developers outside of healthcare.
One major characteristic of this Extensional logic is that "classes must be extended by the authors of the model." For example, if you had a class named "living subject" and another class named "person", the "person" could never be a subtype of "living subject" unless the author "extends" the class "living subject" to include "person." "Person" would never automatically be a "living subject". The relationship would have to be explicitly stated.
Most relational database logic and Object Oriented models use this type of "extensional" logic. The class hierarchies do not "automatically rearrange themselves". Any rearrangement in the class hierarchies must be explicitly asserted by the modelers.
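As a minimal sketch of this point, consider the following Python classes. The class names come from the example above, while the Animal class and the helper function are hypothetical and added only for illustration. Person is a subtype of LivingSubject only because the author declared it; Animal carries a similar characteristic but is never treated as a subtype.

```python
class LivingSubject:
    """Base class from the example in the text."""
    is_alive = True


class Person(LivingSubject):
    """A LivingSubject only because the author explicitly extended it."""


class Animal:
    """Hypothetical class: same characteristic, but no declared parent."""
    is_alive = True


def is_declared_subtype(child: type, parent: type) -> bool:
    # The hierarchy is exactly what the modeler asserted, nothing more.
    return issubclass(child, parent)
```

Here is_declared_subtype(Person, LivingSubject) is true and is_declared_subtype(Animal, LivingSubject) is false, even though both classes look alike. Nothing rearranges the hierarchy on the modeler's behalf.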
An important point about these Extensional models is how they are queried. In databases most queries are performed with SQL, and in OO models the queries are essentially the same as SQL: you specify a partial pattern and ask the query engine to bring back all of the instances in the database or model that match it. For example, SELECT PERSON WHERE SEX EQUALS MALE.
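The pseudo-query above could look roughly like this in practice. This is only a sketch using Python's built-in sqlite3 module; the table name, column names, and sample rows are assumptions made up for the illustration.

```python
import sqlite3
from typing import List


def build_demo_db() -> sqlite3.Connection:
    # Hypothetical in-memory table with made-up rows, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (name TEXT, sex TEXT)")
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [("John Doe", "male"), ("Jane Roe", "female"), ("Sam Poe", "male")],
    )
    return conn


def find_people_by_sex(conn: sqlite3.Connection, sex: str) -> List[str]:
    # The query is a partial pattern; the engine returns every stored
    # instance that matches it. No inference is involved.
    rows = conn.execute(
        "SELECT name FROM person WHERE sex = ? ORDER BY name", (sex,)
    ).fetchall()
    return [name for (name,) in rows]
```

The engine only returns rows that were explicitly stored with the matching value, which is the closed-world behavior described above.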
Reasoners and Description Logic have no place in these extensional models.
The second kind of model logic is called "intensional" (often written "intentional"). It is based on the "Open World Assumption" (OWA), and it is used in linked data models of the semantic web, but it is less widely understood by most software developers. Clinicians, and developers building clinician-driven models, may produce models that not only ignore this kind of logic but also have features that make them unfit for adding intensional logic unless they are first modified.
This is the kind of logic found in models of RDF, OWL and, most relevant for our discussion, in SNOMED CT. None of the other controlled vocabularies in health care are based on this kind of logic. SNOMED is unique in this respect.
In this kind of logic the "classes" can be extended by "intension", that is, by their defining characteristics. You don't need to explicitly state that a "person is a kind of living subject". That relationship will be inferred from the model itself. For example, suppose there are defined characteristics that are "necessary and sufficient" to define a "living subject", and someone else designs a new class called "person". As long as the person class has all the "necessary and sufficient" characteristics of a "living subject", the reasoner will infer that "person" belongs in the "living subject" hierarchy and place it there, without anyone asserting the relationship.
This kind of logic is very powerful and useful in clinical medicine. It allows you to infer relationships between model elements that would otherwise escape the attention of the modelers. More importantly, these relationships would otherwise escape the attention of the queries on the model. This logic allows "reasoners", also called "classifiers", to find these "inferred relationships".
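The inference described here can be illustrated with a toy classifier. This is a deliberately simplified sketch and not how a real description logic reasoner works: each class is reduced to a flat set of defining characteristics, and the characteristic names and the "statue" class are hypothetical.

```python
from typing import Dict, FrozenSet, Set

# Each class is defined only by its necessary-and-sufficient characteristics.
# No class names its parent anywhere in this table.
DEFINITIONS: Dict[str, FrozenSet[str]] = {
    "living_subject": frozenset({"is_alive"}),
    "person": frozenset({"is_alive", "is_human"}),
    "statue": frozenset({"is_human_shaped"}),
}


def inferred_superclasses(
    name: str, definitions: Dict[str, FrozenSet[str]]
) -> Set[str]:
    # A class is placed under every other class whose defining
    # characteristics it fully satisfies.
    own = definitions[name]
    return {
        other
        for other, traits in definitions.items()
        if other != name and traits <= own
    }
```

With these definitions, "person" is placed under "living_subject" although nobody asserted that relationship, while "statue" is placed under nothing.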
Using "subsumption searching" you can return more complete results when, for example, you are searching for "all patients that have an infectious disorder of the lung caused by a member of the mycobacteria family". This complex kind of query cannot be done with the first, "extensional" kind of logic, but it is important in clinical systems using SNOMED CT.
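A rough sketch of what a subsumption search does is shown below. The concept names are hypothetical stand-ins for SNOMED CT concepts, and the tiny hierarchy and the two attributes (finding site and causative agent) are assumptions made up for the illustration; a real terminology server does far more.

```python
from typing import Dict, List, Tuple

# child -> parents; hypothetical stand-ins for SNOMED CT concepts.
IS_A: Dict[str, List[str]] = {
    "lung_structure": ["body_structure"],
    "upper_lobe_of_lung": ["lung_structure"],
    "kidney_structure": ["body_structure"],
    "mycobacterium": ["bacterium"],
    "mycobacterium_tuberculosis": ["mycobacterium"],
    "streptococcus_pneumoniae": ["bacterium"],
}

# disorder -> (finding site, causative agent); also hypothetical.
DISORDERS: Dict[str, Tuple[str, str]] = {
    "pulmonary_tuberculosis": ("lung_structure", "mycobacterium_tuberculosis"),
    "upper_lobe_mycobacterial_infection": ("upper_lobe_of_lung", "mycobacterium"),
    "pneumococcal_pneumonia": ("lung_structure", "streptococcus_pneumoniae"),
    "renal_tuberculosis": ("kidney_structure", "mycobacterium_tuberculosis"),
}


def is_subsumed_by(concept: str, ancestor: str) -> bool:
    # True when concept is the ancestor itself or any descendant of it.
    if concept == ancestor:
        return True
    return any(is_subsumed_by(p, ancestor) for p in IS_A.get(concept, []))


def find_disorders(site: str, agent: str) -> List[str]:
    # Keep disorders whose site and agent both fall under the query concepts.
    return sorted(
        name
        for name, (s, a) in DISORDERS.items()
        if is_subsumed_by(s, site) and is_subsumed_by(a, agent)
    )
```

Calling find_disorders("lung_structure", "mycobacterium") finds both the tuberculosis of the lung and the upper lobe infection, even though neither was stored with exactly those two query concepts. An exact-match query would have missed them.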
How do you make a model using both kinds of logic?
There are some who advocate making the entire model out of the second kind of logic. These models would be all RDF or OWL. There are good reasons not to do this. First of all, too few clinical modelers understand intensional logic well enough to ensure that you would end up with accurate and correct models without unintended inferences. Secondly, it is not a good idea to abandon traditional databases and SQL, which are more scalable when used correctly and which can be the best choice for a lot of simpler data.
The ideal solution is to use the "extensional" logic for part of the model and the "intensional" logic for another part of the model. But be very clear what parts are modeled in what logic. Don't mix them up; keep them confined to their own parts of the model. Apply a simple modeling rule to keep each kind of logic in its place.
What parts of the model should be "extensional" and what parts should be "intensional"?
The "intensional" part is far more difficult to produce, and fortunately, modelers do not have to create this part of the model. This part of the model is provided by SNOMED CT.
CIMI (the Clinical Information Modeling Initiative) has agreed to use SNOMED CT as its primary vocabulary. Therefore, SNOMED CT should be used in CIMI models for anything that is in SNOMED CT, and this is where the intensional logic lives in the model. Modeling concepts outside SNOMED that also exist in SNOMED would violate the CIMI agreement to use SNOMED CT as its primary vocabulary.
In general, if you think of the model as being composed of What, When, Who, Where, and Why, the "what" is represented by SNOMED CT, whereas the when, who, where, and optional why can be represented in the OO "extensional" part of the model.
Suppose we want to say that patient John Doe was diagnosed with a myocardial infarction on July 26, 2012, as observed by EKG and CPK in the emergency room of Kaiser Permanente Fremont Medical Center. The what, "myocardial infarction", is represented as a SNOMED code. The who, when, and where are represented in the "extensional model", either in a relational database or an OO model.
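One possible shape for this record is sketched below. The field names are hypothetical, and placing the EKG and CPK observations in a separate evidence field is an assumption, since the text does not say where they belong. The code 22298006 is believed to be the SNOMED CT concept for myocardial infarction and should be verified against a current SNOMED CT release before use.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class DiagnosisRecord:
    """Hypothetical record; only the 'what' is a SNOMED CT code."""
    what_snomed_code: str           # the "what": meaning lives in SNOMED CT
    who_patient: str                # the "who": ordinary data
    when: date                      # the "when": ordinary data
    where: str                      # the "where": ordinary data
    evidence: Tuple[str, ...] = ()  # assumption: how the finding was observed


record = DiagnosisRecord(
    what_snomed_code="22298006",  # believed to be Myocardial infarction; verify
    who_patient="John Doe",
    when=date(2012, 7, 26),
    where="Emergency room, Kaiser Permanente Fremont Medical Center",
    evidence=("EKG", "CPK"),
)
```

Note that the record contains no field named "myocardial infarction". The clinical meaning is carried entirely by the code, and everything else is plain data that a relational table or an object can hold.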
In some clinical models you might find words that indicate concepts that exist in SNOMED CT, like "systolic blood pressure" or "oral temperature". These words express the "what" of the model. If the model is to use SNOMED CT according to the CIMI agreement, they should not be in the OO part of the model at all, because then you would be reinventing SNOMED CT all over again. They should be left entirely to the SNOMED CT "intensional" part of the model.
Both the creation and the querying of mixed models must take place in two steps.
Once you have a well designed model that uses SNOMED CT, you must do the query in two steps. First you do the subsumption search in SNOMED. You can isolate SNOMED; you need nothing but SNOMED for this phase of the query. Let's say you search for all kinds of diseases that are "a disorder of the lung with causative agent mycobacteria". The result of that query gives you a list of SNOMED codes. This is an intermediate result, step one of the two step query. You now take that list and use it in an SQL-like query: SELECT PATIENTS WHERE DIAGNOSIS_CODE IS IN <<LIST OF RESULTS FROM STEP ONE>>.
In the creation of models there is a similar two step process. First, clinicians, with the help of an expert modeler, create the models for the conditions the way they understand them. These UML or ADL models will have "what" words in them like "systolic blood pressure" or "pneumonia". This first step of clinical model creation is the one that can be done by clinicians.
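The two steps can be sketched as follows. This is only an illustration: the concept names are hypothetical stand-ins for SNOMED codes, the tiny hierarchy stands in for a real SNOMED CT subsumption search, and the diagnosis table and its rows are made up.

```python
import sqlite3
from typing import Dict, Iterable, List, Set

# child -> parents; hypothetical stand-ins for SNOMED CT codes.
IS_A: Dict[str, List[str]] = {
    "mycobacterial_lung_disorder": ["lung_disorder"],
    "pulmonary_tuberculosis": ["mycobacterial_lung_disorder"],
    "cavitary_pulmonary_tuberculosis": ["pulmonary_tuberculosis"],
    "asthma": ["lung_disorder"],
}


def descendants_and_self(root: str, is_a: Dict[str, List[str]]) -> Set[str]:
    # Step one: the subsumption search needs nothing but the terminology.
    result = {root}
    changed = True
    while changed:
        changed = False
        for child, parents in is_a.items():
            if child not in result and any(p in result for p in parents):
                result.add(child)
                changed = True
    return result


def build_demo_db() -> sqlite3.Connection:
    # Hypothetical patient data, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE diagnosis (patient TEXT, code TEXT)")
    conn.executemany(
        "INSERT INTO diagnosis VALUES (?, ?)",
        [
            ("John Doe", "cavitary_pulmonary_tuberculosis"),
            ("Jane Roe", "asthma"),
            ("Sam Poe", "pulmonary_tuberculosis"),
        ],
    )
    return conn


def patients_with_any_code(
    conn: sqlite3.Connection, codes: Iterable[str]
) -> List[str]:
    # Step two: a plain SQL IN query over the codes found in step one.
    code_list = sorted(codes)
    if not code_list:
        return []
    placeholders = ",".join("?" for _ in code_list)
    sql = (
        "SELECT DISTINCT patient FROM diagnosis "
        f"WHERE code IN ({placeholders}) ORDER BY patient"
    )
    return [patient for (patient,) in conn.execute(sql, code_list)]
```

The intermediate result of descendants_and_self is just a set of codes, so the database never needs to know anything about the hierarchy, and the terminology never needs to know anything about patients.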
These unfinished models then go to experts who replace the "what words" with SNOMED CT concepts. The "what words" may be retained in the finished model, but now they are officially defined as human readable labels for clinicians. The real definition of the model is based on the SNOMED codes that replaced the words and that are now the official definition of the "what" part of the complete what, when, where, who, why model.
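As a rough sketch of this second step, the binding of "what words" to codes might look like the following. The data structure and function are hypothetical, and in practice the mapping is made by terminology experts rather than by a lookup table. The two codes shown are believed to be the SNOMED CT concepts for systolic blood pressure and pneumonia and should be verified against a current release.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class BoundElement:
    label: str        # the clinician's word, kept only as a readable label
    snomed_code: str  # the official definition of the "what"


# Illustrative mapping; codes are believed correct but should be verified.
TERM_TO_CODE: Dict[str, str] = {
    "systolic blood pressure": "271649006",
    "pneumonia": "233604007",
}


def bind_what_words(
    labels: Iterable[str], term_to_code: Dict[str, str]
) -> Tuple[List[BoundElement], List[str]]:
    # Words with no known code are returned for expert review, not guessed.
    bound: List[BoundElement] = []
    unresolved: List[str] = []
    for label in labels:
        code = term_to_code.get(label.strip().lower())
        if code is None:
            unresolved.append(label)
        else:
            bound.append(BoundElement(label, code))
    return bound, unresolved
```

The original word survives in the label field for clinicians to read, while anything that queries or reasons over the model uses only snomed_code.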