Difference between revisions of "Data Normalization"

From SHARP Project Wiki

Revision as of 13:11, 22 October 2011

“The complexity of modern medicine exceeds the inherent limitations of the unaided human mind” – David M. Eddy, MD, PhD

Clinical Data Normalization

Clinical data arrives in many different forms, even for the same piece of information. For example, age may be reported as 40 years for an adult, 18 months for a toddler, or 3 days for an infant. Without normalization, such data cannot be used as a single dataset.

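Un-normalized    Normalized (days)    Normalized (months)
40 years         14,610               480
18 months        548                  18
3 days           3                    0.1

Values are approximate, assuming 365.25 days per year and 365.25/12 (about 30.44) days per month.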
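
The unit conversion behind the age example can be sketched in a few lines of Python. This is only an illustration, not the project's normalization pipeline: the function names, the accepted unit spellings, and the conversion factors (365.25 days per year, one twelfth of that per month) are all assumptions chosen for the sketch.

```python
# Assumed conversion factors (not taken from the source page):
# an average year of 365.25 days and an average month of 365.25 / 12 days.
DAYS_PER_YEAR = 365.25
DAYS_PER_MONTH = DAYS_PER_YEAR / 12

# Hypothetical lookup of the unit spellings this sketch accepts.
DAYS_PER_UNIT = {
    "day": 1.0,
    "days": 1.0,
    "month": DAYS_PER_MONTH,
    "months": DAYS_PER_MONTH,
    "year": DAYS_PER_YEAR,
    "years": DAYS_PER_YEAR,
}


def age_in_days(reported_age):
    """Convert a string such as '18 months' to an age in days."""
    value_text, unit = reported_age.strip().split()
    unit = unit.lower()
    if unit not in DAYS_PER_UNIT:
        raise ValueError("unsupported age unit: " + unit)
    return float(value_text) * DAYS_PER_UNIT[unit]


def age_in_months(reported_age):
    """Convert a string such as '40 years' to an age in months."""
    return age_in_days(reported_age) / DAYS_PER_MONTH


if __name__ == "__main__":
    # Once every age is in the same unit, the values can be compared or pooled.
    for reported in ["40 years", "18 months", "3 days"]:
        print(reported, round(age_in_days(reported)), round(age_in_months(reported), 1))
```

Converting everything to a single base unit (days) first and deriving other units from it keeps the conversion table small. A production system would more likely rely on a standard unit vocabulary than on free-text unit names.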

The Need for Clinical Models

  • The need for the clinical models is dictated by what we want to accomplish as providers of health care
  • The best clinical care requires the use of computerized clinical decision support and automated data analysis
  • Clinical decision support and automated data analysis can only function against standard, structured, coded data
  • The detailed clinical models provide the standard structure and terminology needed for clinical decision support and automated data analysis

What Needs to be Modeled?

All data in the patient’s EMR, including:
  • Allergies
  • Problem lists
  • Laboratory results
  • Medication and diagnostic orders
  • Medication administration
  • Physical exam and clinical measurements
  • Signs, symptoms, diagnoses
  • Clinical documents
  • Procedures
  • Family history, medical history and review of symptoms


Data normalization is at the heart of secondary use of clinical data. If the data is not comparable between sources, it can’t be aggregated into large datasets and used reliably to answer research questions or survey populations from multiple health organizations.