Data Normalization

“The complexity of modern medicine exceeds the inherent limitations of the unaided human mind” – David M. Eddy, MD, PhD

Dr. Huff on Data Normalization (video: http://www.youtube.com/watch?v=1OSHKdNYYR8) – Stanley M. Huff, M.D.; SHARPn Co-Principal Investigator; Professor (Clinical), Biomedical Informatics, University of Utah College of Medicine; and Chief Medical Informatics Officer, Intermountain Healthcare. Dr. Huff discusses how providing patient care at the lowest cost with advanced decision support requires structured and coded data.

Resources

Releases - Download, install, configure, and use the software produced.

Data Normalization Releases
  • CouchDB is the default target database where CEMs are placed.
  • Fully automated approach for sending documents through.
  • cTAKES (Natural Language Processing) function embedded.
  • Only one Mirth Connect channel to store CEMs to DBMS.
  • MySQL is the default target database where CEMs are placed.
  • Manual intervention is required at times to move documents to stages in the pipeline.
  • HL7 message sorting channel used at the front end. Only this version uses these channels.
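
The bullets above outline a common flow: HL7 documents arrive at the front end, free text optionally passes through cTAKES natural language processing, results are mapped into CEM instances, and the instances are stored in a target database (CouchDB or MySQL). The sketch below is only a rough illustration of that flow; its interfaces and class names are hypothetical stand-ins and do not use the actual Mirth Connect, cTAKES, or database APIs.

```java
/** Illustration only: the stages the release notes describe, with hypothetical interfaces.
 *  This does not use the real Mirth Connect, cTAKES, or CouchDB/MySQL APIs. */
public class NormalizationPipelineSketch {

    interface NlpEngine { String annotate(String clinicalText); }       // stand-in for cTAKES
    interface CemMapper { String toCem(String annotatedOrStructured); } // stand-in for the CEM mapping step
    interface CemStore  { void save(String cemInstance); }              // stand-in for the CouchDB or MySQL target

    private final NlpEngine nlp;
    private final CemMapper mapper;
    private final CemStore store;

    NormalizationPipelineSketch(NlpEngine nlp, CemMapper mapper, CemStore store) {
        this.nlp = nlp; this.mapper = mapper; this.store = store;
    }

    /** One document through the pipeline: optional NLP, then CEM mapping, then storage. */
    void process(String document, boolean isFreeText) {
        String input = isFreeText ? nlp.annotate(document) : document;
        store.save(mapper.toCem(input));
    }

    public static void main(String[] args) {
        var pipeline = new NormalizationPipelineSketch(
                text -> "[annotated] " + text,
                input -> "{\"cem\":\"" + input + "\"}",
                cem -> System.out.println("stored: " + cem));
        pipeline.process("MSH|^~\\&|...", false);             // structured HL7 path
        pipeline.process("Patient denies chest pain.", true); // free-text path through NLP
    }
}
```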

Clinical Element Models - CEMs are at the core of data normalization.

Presentations - Presentations made or found during the course of this grant that are relevant to this project.

Documents - Documents created by or used by this project.

References - Additional resources relevant to this project.

Introduction to Data Normalization

Clinical Data Normalization

Clinical data comes in many different forms, even for the same piece of information. For example, age could be reported as 40 years for an adult, 18 months for a toddler, or 3 days for an infant. Database normalization of clinical data fields in general fosters a design that allows for efficient storage, avoids duplication or repetition of data, and makes data querying easier. Without normalization, data from different sources can’t be used as a single dataset.

Un-normalized   Normalized (days)   Normalized (months)
40 years        1436                47
18 months       543                 18
3 days          3                   0.1
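
As a rough illustration of the normalization shown in the table, the Java sketch below converts age strings into a single unit. It is not part of the SHARPn pipeline; the conversion factors (365.25 days per year, 30.44 days per month) are assumptions for illustration, so its outputs will not exactly match the table’s example values.

```java
import java.util.List;

/** Minimal sketch: normalize age strings such as "40 years", "18 months", "3 days"
 *  into a single unit (days), so values from different sources are comparable. */
public class AgeNormalizer {

    /** Convert an age expressed as "<number> <unit>" into days. */
    static double toDays(String age) {
        String[] parts = age.trim().split("\\s+");
        double value = Double.parseDouble(parts[0]);
        switch (parts[1].toLowerCase()) {
            case "year":  case "years":  return value * 365.25; // average year length (assumption)
            case "month": case "months": return value * 30.44;  // average month length (assumption)
            case "day":   case "days":   return value;
            default: throw new IllegalArgumentException("Unknown unit: " + parts[1]);
        }
    }

    public static void main(String[] args) {
        for (String age : List.of("40 years", "18 months", "3 days")) {
            System.out.printf("%-10s -> %10.1f days (%.1f months)%n",
                    age, toDays(age), toDays(age) / 30.44);
        }
    }
}
```
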
The Need for Clinical Models

Detailed clinical models are the basis for retaining computable meaning when data is exchanged between heterogeneous computer systems. Detailed clinical models are also the basis for shared computable meaning when clinical data is referenced in decision support logic.

  • The need for the clinical models is dictated by what we want to accomplish as providers of health care
  • The best clinical care requires the use of computerized clinical decision support and automated data analysis
  • Clinical decision support and automated data analysis can only function against standard structured coded data
  • The detailed clinical models provide the standard structure and terminology needed for clinical decision support and automated data analysis

Data normalization and clinical models are at the heart of secondary use of clinical data. If the data is not comparable between sources, it can’t be aggregated into large datasets and used, for example, to reliably answer research questions or survey populations across multiple health organizations. Without models, there are too many ways to say the same thing.
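
As a toy illustration of “too many ways to say the same thing,” the sketch below maps two differently shaped source representations of a blood pressure reading onto one agreed structure. The record and method names are hypothetical stand-ins, not the actual CEM schema.

```java
/** Simplified stand-in for a modeled observation (not the actual CEM schema): without an
 *  agreed model, "the same thing" arrives in several shapes; with one, every source maps
 *  to one structure. */
public class OneModelManyInputs {

    /** One agreed shape for a blood pressure reading. */
    record BloodPressure(int systolicMmHg, int diastolicMmHg) {}

    /** Source A sends free text such as "BP 120/80". */
    static BloodPressure fromFreeText(String text) {
        var m = java.util.regex.Pattern.compile("(\\d+)\\s*/\\s*(\\d+)").matcher(text);
        if (!m.find()) throw new IllegalArgumentException("No reading in: " + text);
        return new BloodPressure(Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)));
    }

    /** Source B sends the two numbers as separate structured fields. */
    static BloodPressure fromFields(int systolic, int diastolic) {
        return new BloodPressure(systolic, diastolic);
    }

    public static void main(String[] args) {
        // Two very different inputs, one normalized representation.
        System.out.println(fromFreeText("BP 120/80"));
        System.out.println(fromFields(120, 80));
    }
}
```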

For more details, see our information on Clinical Element Models (CEMs).

Practical modeling issues: Representing coded and structured patient data in EHR systems
Stanley Huff
AMIA Annual Symposium
October 22, 2011
Clinical Use Cases

In all of these situations, the goal is not just to have the data available for humans to read and understand, but to have the data structured and coded in a way that will allow computers to understand and use the information.

  • Data sharing
  • Real time decision support
  • Sharing of decision logic
  • Direct assignment of billing codes
  • Bio-surveillance
  • Data analysis and reporting
    • Reportable diseases
    • HEDIS measurements
    • Quality improvements
    • Adverse drug events
  • Clinical research
    • Clinical trials
    • Continuous quality improvement
Real time, patient specific, decision support
  • Alerts
    • Potassium and digoxin (see the sketch after this list)
    • Coagulation clinic
  • Reminders
    • Mammography
    • Immunizations
  • Protocols
    • Ventilator weaning
    • ARDS protocol
    • Prophylactic use of antibiotics in surgery
  • Advising
    • Antibiotic assistant
  • Critiquing
    • Blood ordering
  • Interpretation
    • Blood gas interpretation
  • Management – purpose specific aggregation and presentation of data
    • DVT management
    • Diabetic report
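
As one concrete illustration of the potassium and digoxin alert listed above, the sketch below applies a simple rule to normalized, coded observations. The LOINC code, threshold, and class names are assumptions for illustration, not SHARPn decision logic or clinical guidance.

```java
import java.util.List;

/** Illustrative only: a potassium/digoxin alert over normalized, coded lab data.
 *  The LOINC code and threshold below are assumptions for the sketch, not clinical guidance. */
public class PotassiumDigoxinAlert {

    record LabResult(String loincCode, double value, String unit) {}
    record MedicationOrder(String drugName) {}

    static final String SERUM_POTASSIUM_LOINC = "2823-3"; // shown for illustration
    static final double LOW_POTASSIUM_MMOL_PER_L = 3.0;   // illustrative threshold

    /** Fire when the patient is on digoxin and any potassium result is below the threshold.
     *  A production system would match medications on codes (e.g., RxNorm), not names. */
    static boolean shouldAlert(List<LabResult> labs, List<MedicationOrder> meds) {
        boolean onDigoxin = meds.stream()
                .anyMatch(m -> m.drugName().equalsIgnoreCase("digoxin"));
        boolean lowPotassium = labs.stream()
                .filter(l -> l.loincCode().equals(SERUM_POTASSIUM_LOINC))
                .anyMatch(l -> l.value() < LOW_POTASSIUM_MMOL_PER_L);
        return onDigoxin && lowPotassium;
    }

    public static void main(String[] args) {
        var labs = List.of(new LabResult(SERUM_POTASSIUM_LOINC, 2.8, "mmol/L"));
        var meds = List.of(new MedicationOrder("digoxin"));
        System.out.println("Alert? " + shouldAlert(labs, meds)); // Alert? true
    }
}
```
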
What Needs to be Modeled?

All data in the patient’s EMR, including:

  • Allergies
  • Problem lists
  • Laboratory results
  • Medication and diagnostic orders
  • Medication administration
  • Physical exam and clinical measurements
  • Signs, symptoms, diagnoses
  • Clinical documents
  • Procedures
  • Family history, medical history and review of symptoms
How are Clinical Models used?
  • Data entry screens, flow sheets, reports, ad hoc queries
    • Basis for application access to clinical data
  • Computer-to-Computer Interfaces
    • Creation of maps from departmental/foreign system models to the standard database model
  • Core data storage services
    • Validation of data as it is stored in the database (see the sketch after this list)
  • Decision logic
    • Basis for referencing data in decision support logic
  • Does NOT dictate physical storage strategy
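
The validation role listed above can be pictured with the toy check below: before an instance is stored, it is tested against the fields and value domains a model requires. The constraints and names here are hypothetical and are not the CEM constraint language.

```java
import java.util.List;
import java.util.Set;

/** Toy constraint check, not the CEM constraint language: before an observation is stored,
 *  verify it carries the fields and value domains the model requires. */
public class ModelValidator {

    record Observation(String code, Double value, String unit) {}

    // Assumed constraints for one hypothetical model: a coded numeric result in mmol/L or mg/dL.
    static final Set<String> ALLOWED_UNITS = Set.of("mmol/L", "mg/dL");

    static List<String> validate(Observation o) {
        var problems = new java.util.ArrayList<String>();
        if (o.code() == null || o.code().isBlank()) problems.add("missing code");
        if (o.value() == null)                      problems.add("missing value");
        if (o.unit() == null || !ALLOWED_UNITS.contains(o.unit()))
            problems.add("unit not in model's allowed set: " + o.unit());
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(validate(new Observation("2345-7", 5.4, "mmol/L"))); // []
        System.out.println(validate(new Observation(null, 5.4, "bananas")));    // two problems
    }
}
```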

Project Team

Thanks go to the Data Normalization team.

Aims
  • Build generalizable data normalization pipeline
  • Semantic normalization annotators involving LexEVS
  • Establish a globally available resource for health terminologies and value sets
  • Establish and expand modular library of normalization algorithms
  • Consistent and standardized common model to support large-scale vocabulary use and adoption
  • Support mapping into canonical value sets (see the sketch after this list)
  • Normalize the data against CEMs.
  • Normalize retrospective data from the EMRs and compare it to normalized data that already exists in our data warehouses (Mayo Enterprise Data Trust, Intermountain).
  • Iteratively test normalization pipelines, including NLP where appropriate, against normalized forms, and tabulate discordance.
  • Use cohort identification algorithms in both EMR data and EDW data.
  • Sharing data through NHIN Connect and/or NHIN Direct
  • Comparison of data processed through SHARP to data in existing Mayo and Intermountain data trust, EDW, AHR
  • Evaluation of NLP outputs and value? Focus on a specific domain: X-rays, operative notes, progress notes, sleep studies?
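
For the value-set mapping aim above, the sketch below shows the general shape of mapping local codes into a canonical value set. In a real deployment this lookup would be backed by a terminology service such as LexEVS rather than the hard-coded, illustrative map used here; the local codes and LOINC targets are shown for illustration only.

```java
import java.util.Map;
import java.util.Optional;

/** Hedged sketch of mapping local codes into a canonical value set. A real deployment would
 *  query a terminology service (e.g., LexEVS) instead of this in-memory map. */
public class ValueSetMapper {

    // Hypothetical local-to-canonical map; local codes and LOINC targets are illustrative.
    static final Map<String, String> LOCAL_TO_CANONICAL = Map.of(
            "LAB:GLU", "2345-7",  // local glucose code -> LOINC
            "LAB:HGB", "718-7",   // local hemoglobin code -> LOINC
            "LAB:K",   "2823-3"); // local potassium code -> LOINC

    static Optional<String> toCanonical(String localCode) {
        return Optional.ofNullable(LOCAL_TO_CANONICAL.get(localCode));
    }

    public static void main(String[] args) {
        System.out.println(toCanonical("LAB:GLU").orElse("UNMAPPED")); // 2345-7
        System.out.println(toCanonical("LAB:XYZ").orElse("UNMAPPED")); // UNMAPPED
    }
}
```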