SHARP Project Wiki:Project Background

Revision as of 02:29, 9 April 2010 by Admin (talk | contribs)

Project Proposal


We propose research that will generate a framework of open-source services that can be dynamically configured to transform electronic health record (EHR) data into standards-conforming, comparable information suitable for large-scale analyses, inferencing, and integration of disparate health data. We will apply these services to phenotype recognition (disease, risk factor, eligibility, or adverse event) in medical centers and population-based settings. Finally, we will examine data quality and repair strategies, evaluating their real-world behavior in Clinical and Translational Science Awards (CTSAs), health information exchanges (HIEs), and Nationwide Health Information Network (NHIN) connections.

We have assembled a federated informatics research community committed to open-source resources that can scale industrially to address barriers to the broad-based, facile, and ethical use of EHR data for secondary purposes. We will collaborate to create, evaluate, and refine informatics artifacts that advance the capacity to efficiently leverage EHR data to improve care, generate new knowledge, and address population needs. Our goal is to make these artifacts available to the community of secondary EHR data users as open-source tools, services, and scalable software. In addition, we have partnered with industry developers who can make these resources available through commercial deployment. We propose to assemble modular services and agents from existing open-source software to improve the utilization of EHR data for a spectrum of use cases, focusing on three themes: Normalization, Phenotypes, and Data Quality/Evaluation. Each of our six projects spans one or more of these themes, and together they constitute a coherent ensemble of related research and development. Finally, these services will have open-source deployments as well as commercially supported implementations.

There are six strongly intertwined, mutually dependent projects: 1) Semantic and Syntactic Normalization; 2) Natural Language Processing (NLP); 3) Phenotype Applications; 4) Performance Optimization; 5) Data Quality Metrics; and 6) Evaluation Frameworks. The first two projects align with our Normalization theme; Phenotype Applications and Performance Optimization span themes 1 and 2 (Normalization and Phenotypes); and the last two projects correspond to our third theme, Data Quality/Evaluation.