Demos November 14, 2006
Unified Medical Language System Semantic Navigator
Olivier Bodenreider
Lister Hill National Center for Biomedical Communications
Bethesda, Maryland - USA

Issues
- Size
    - Large number of concepts (>1 million)
- Complexity
    - Polyhierarchical structures
    - Multiple information sources
    - Multiple properties
- Lack of formality
    - Redundant relations
    - Hierarchies vs. hierarchical relations

Challenges
- Restrict information space to selected information sources
- Reduce complexity
    - Group concepts by semantic groups
    - Transitive reduction on hierarchical relations
    - Select co-occurring concepts
- Reduce the cognitive burden on the user
    - Use graph-based rather than tree-based representations
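
As an illustration of the "transitive reduction on hierarchical relations" item, here is a minimal Python sketch. It assumes the hierarchy is an acyclic set of (parent, child) pairs; the function names and the sample edges are hypothetical and are not taken from SemNav itself.

```python
from collections import defaultdict

def transitive_closure(edges):
    """Return {node: set of all descendants} for an acyclic set of (parent, child) pairs."""
    children = defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)

    closure = {}

    def descendants(node):
        # Memoized depth-first traversal; assumes there are no cycles.
        if node not in closure:
            reached = set()
            for child in children.get(node, ()):
                reached.add(child)
                reached |= descendants(child)
            closure[node] = reached
        return closure[node]

    for node in list(children):
        descendants(node)
    return closure

def transitive_reduction(edges):
    """Drop each edge (p, c) where c is also reachable through another child of p."""
    edges = set(edges)
    closure = transitive_closure(edges)
    children = defaultdict(set)
    for parent, child in edges:
        children[parent].add(child)

    reduced = set()
    for parent, child in edges:
        redundant = any(
            child in closure.get(other, set())
            for other in children[parent]
            if other != child
        )
        if not redundant:
            reduced.add((parent, child))
    return reduced

# Hypothetical hierarchy: A -> C is implied by A -> B -> C.
sample_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")]
print(sorted(transitive_reduction(sample_edges)))
```

Only the direct edges remain, so the displayed graph keeps the same reachability with fewer arrows.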

UMLS Semantic Navigator (SemNav)
http://umlsks.nlm.nih.gov* ► SN Resources ► Semantic Navigator
(* free UMLS registration required)

Unified Medical Language System®
- Developed at NLM since 1990
- 139 source vocabularies
    - 17 languages
- Broad coverage of biomedicine
    - 5.1M names
    - 1.3M concepts
    - 16M relations
- Integration
    - Synonymous terms are clustered in a concept
    - Hierarchies (trees) are combined in a graph structure

Terminology integration: Terms
(term, followed by the source vocabularies that contain it)
- Duchenne muscular dystrophy: MeSH, SNOMED CTV3, Jablonski, CRISP, DxPlain, MedDRA, LOINC
- Duchenne’s muscular dystrophy: COSTAR
- Duchenne de Boulogne muscular dystrophy: Jablonski
- Duchenne type progressive muscular dystrophy: SNOMED
- pseudohypertrophic muscular dystrophy: MeSH, CTV3 SNOMED
- X-linked recessive muscular dystrophy: Jablonski
- severe generalized familial muscular dystrophy: SNOMED
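
The clustering of synonymous terms into one concept can be sketched as a simple grouping step. This is a hedged illustration only: the concept key "DMD" is a hypothetical placeholder rather than a real UMLS identifier, the row layout is an assumption, and only one source per term is listed for brevity.

```python
from collections import defaultdict

# (term, source vocabulary, concept key) rows.
# "DMD" is a hypothetical placeholder key, not a real UMLS identifier.
rows = [
    ("Duchenne muscular dystrophy", "MeSH", "DMD"),
    ("Duchenne muscular dystrophy", "MedDRA", "DMD"),
    ("Duchenne's muscular dystrophy", "COSTAR", "DMD"),
    ("Duchenne de Boulogne muscular dystrophy", "Jablonski", "DMD"),
    ("Duchenne type progressive muscular dystrophy", "SNOMED", "DMD"),
    ("X-linked recessive muscular dystrophy", "Jablonski", "DMD"),
]

def cluster_terms(rows):
    """Group terms under their concept, remembering which sources supply each term."""
    concepts = defaultdict(lambda: defaultdict(set))
    for term, source, concept in rows:
        concepts[concept][term].add(source)
    return concepts

clusters = cluster_terms(rows)
for term, sources in sorted(clusters["DMD"].items()):
    print(term, "<-", ", ".join(sorted(sources)))
```

Each distinct term appears once under the concept, with the set of vocabularies that contributed it.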

Terminology integration: Relationships
- Inter-concept relationships: hierarchies from the source vocabularies
- Redundancy: multiple paths
- One graph instead of multiple trees (multiple inheritance)
[Figure: several source trees over nodes A through H, combined into a single graph]
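
A minimal sketch of combining several source trees into one graph follows. The source names and edges are hypothetical and do not reproduce the exact edges of the slide's figure; the point is only that a node may end up with more than one parent after the merge.

```python
from collections import defaultdict

def merge_hierarchies(trees):
    """Union the (parent, child) edges of several source trees into one parent map."""
    parents = defaultdict(set)
    for source, edges in trees.items():
        for parent, child in edges:
            parents[child].add(parent)
    return parents

# Hypothetical source trees that place E under different parents.
trees = {
    "source_1": [("A", "B"), ("B", "E")],
    "source_2": [("A", "C"), ("C", "E")],
}

graph = merge_hierarchies(trees)
print(sorted(graph["E"]))  # E now has two parents (multiple inheritance)
```

In each individual tree E has a single parent; in the merged graph it has two, which is why a tree view no longer suffices.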

UMLS: A two-level structure
- Semantic Network
    - 135 Semantic Types (STs)
    - 54 types of relationships among STs
- Metathesaurus
    - >1M concepts
    - ~12M inter-concept relationships
- Link = categorization
[Figure: a Semantic Type (Semantic Network) linked by categorization to a Concept (Metathesaurus)]

Semantic Types
[Figure: two-level view.
Semantic Network (semantic types): Anatomical Structure; Fully Formed Anatomical Structure; Embryonic Structure; Body Part, Organ or Organ Component; Disease or Syndrome; Pharmacologic Substance; Population Group.
Metathesaurus (concepts, with the numbers shown on the slide): Heart; Mediastinum (4); Saccular Viscus; Angina Pectoris (97); Esophagus (12); Left Phrenic Nerve; Heart Valves; Fetal Heart (31); Cardiotonic Agents (225); Tissue Donors (22).]
[…]
[…]
MeSH Browser
SemNav Visualization options

SemNav: Relationships
[Figure: two-level view.
Semantic Types: Biologically Active Substance; Disease or Syndrome; Amino Acid, Peptide or Protein.
Concepts: Muscular Dystrophy, Duchenne; Dystrophin; with the number 190 shown on the slide.]
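
The categorization link between the two levels can be sketched as a mapping from concepts to semantic types, which also supports grouping concepts by type. The assignments below are read off this slide and should be treated as illustrative assumptions; the variable and function names are hypothetical.

```python
from collections import defaultdict

# Concept -> semantic types (categorization links), as read from the slide.
# Treat these assignments as illustrative assumptions.
categorization = {
    "Muscular Dystrophy, Duchenne": ["Disease or Syndrome"],
    "Dystrophin": [
        "Amino Acid, Peptide or Protein",
        "Biologically Active Substance",
    ],
}

def concepts_by_type(categorization):
    """Invert the categorization so each semantic type lists its concepts."""
    grouped = defaultdict(list)
    for concept, semantic_types in categorization.items():
        for semantic_type in semantic_types:
            grouped[semantic_type].append(concept)
    return dict(grouped)

grouped = concepts_by_type(categorization)
for semantic_type in sorted(grouped):
    print(semantic_type, "->", grouped[semantic_type])
```

A concept with several semantic types, such as Dystrophin here, appears under each of them.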

Technical details
- Simple web/CGI technology (Apache, Perl)
- dot (Graphviz)
    - PNG file (-Tpng)
    - Client-side map (-Tcmap)
- Precompute the transitive closure on hierarchical relations to perform the transitive reduction fast
- Remove cycles (UMLS)
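
To make the dot step concrete, the following Python sketch emits DOT source for a small graph. It is not the Perl code used by SemNav; the function name, the sample edge, and the link target are hypothetical, and labels are assumed to contain no double quotes. The `URL` node attribute is what lets Graphviz produce clickable regions in the client-side map.

```python
def to_dot(edges, urls=None):
    """Render (parent, child) edges as DOT source; urls maps node name -> link target."""
    urls = urls or {}
    nodes = sorted({node for edge in edges for node in edge})
    lines = ["digraph semnav {"]
    for node in nodes:
        attrs = f'label="{node}"'
        if node in urls:
            # URL is a standard Graphviz attribute used by the cmap output.
            attrs += f', URL="{urls[node]}"'
        lines.append(f'  "{node}" [{attrs}];')
    for parent, child in sorted(edges):
        lines.append(f'  "{parent}" -> "{child}";')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical edge and link target, for illustration only.
dot_source = to_dot(
    [("Muscular Dystrophies", "Muscular Dystrophy, Duchenne")],
    urls={"Muscular Dystrophy, Duchenne": "/cgi-bin/example?concept=dmd"},
)
print(dot_source)

# With the output saved as graph.dot, the two renderings named on the slide are:
#   dot -Tpng  graph.dot -o graph.png
#   dot -Tcmap graph.dot -o graph.map
```

The PNG supplies the image and the cmap output supplies the matching clickable areas, so the web page needs no client-side graph layout.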

Medical Ontology Research
Contact: [email protected]
Web: mor.nlm.nih.gov
Olivier Bodenreider
Lister Hill National Center for Biomedical Communications
Bethesda, Maryland - USA