Mayo Text Analysis
Our Goal
To work with Mayo practice leaders to record, index, retrieve, collate, and analyze, innovatively and efficiently, the data needed to provide and enhance excellent, cost-effective patient care, research, and education.
Overview
Free, or unstructured, text accounts for a significant portion of the clinical description and detail in an Electronic Medical Record. It is typically found in clinical documents such as radiology reports, pathology reports, surgical notes, and clinical notes. These documents contain important information that can be used in clinical research, and the demand for access to that information is growing. On its own, however, unstructured text offers little access to the information beyond keyword searching. The main goal of the text analysis team at Mayo is to develop tools and methods for structuring the unstructured text of clinical documents for subsequent indexing, retrieval, and text mining. We address the following issues in our work:
- detection of token boundaries (tokenization)
- detection of sentence boundaries (sentence detection)
- context-sensitive spelling correction
- word sense disambiguation (WSD)
- part-of-speech tagging
- shallow and deep parsing
- near-synonymy and semantic relatedness
- structured concept representation in terms of semantic frames
- automatic term extraction
- mapping of free text to ontologies and nomenclatures
- automatic text categorization/classification
- negation identification
- concept status identification (probable/history of/family history of)
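
To make a few of the tasks above concrete, the following Python sketch illustrates sentence detection, tokenization, negation identification, and concept status identification on a short invented note. It is a minimal rule-based illustration and not the method used by the Mayo text analysis team. The trigger lists, the function names (split_sentences, tokenize, annotate), and the sample note are all assumptions made for this example; a production system would rely on much larger lexicons, would limit the scope of each trigger, and would likely use statistical models.

```python
import re

# Hypothetical trigger phrases chosen for illustration only.
NEGATION_TRIGGERS = ("no", "denies", "without", "negative for")

# Longer phrases come first so "family history of" wins over "history of".
STATUS_TRIGGERS = {
    "family history of": "family_history_of",
    "history of": "history_of",
    "probable": "probable",
}


def split_sentences(text):
    """Naive sentence detection: split after . ! or ? when followed by
    whitespace and an uppercase letter."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Naive tokenization: alphanumeric words (allowing internal hyphens or
    apostrophes), or single punctuation characters."""
    return re.findall(r"[A-Za-z0-9]+(?:[-'][A-Za-z0-9]+)*|[^\sA-Za-z0-9]", sentence)


def has_phrase(phrase, text):
    """True if `phrase` occurs in `text` on word boundaries."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text) is not None


def annotate(sentence, concept):
    """Flag negation and status for `concept` using trigger phrases that
    appear anywhere before the concept in the same sentence."""
    lowered = sentence.lower()
    position = lowered.find(concept.lower())
    if position < 0:
        return None  # concept not mentioned in this sentence
    prefix = lowered[:position]
    negated = any(has_phrase(t, prefix) for t in NEGATION_TRIGGERS)
    status = "asserted"
    for phrase, label in STATUS_TRIGGERS.items():
        if has_phrase(phrase, prefix):
            status = label
            break
    return {"concept": concept, "negated": negated, "status": status}


# Invented sample note, not real patient data.
note = ("Patient denies chest pain. Family history of diabetes. "
        "Probable pneumonia on chest x-ray.")
sentences = split_sentences(note)
for sentence, concept in zip(sentences, ["chest pain", "diabetes", "pneumonia"]):
    print(tokenize(sentence), annotate(sentence, concept))
```

Even this small sketch shows why the tasks depend on one another: negation and status triggers only make sense within a sentence, so sentence boundaries must be found first, and matching on word boundaries keeps a trigger such as "no" from firing inside a longer word such as "diagnosis".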