PSC

Using GIS to Understand a Rare Disease: Primary Sclerosing Cholangitis (PSC)
ESRI Regional User Conference
Estella M. Geraghty, MD, MS, MPH/CPH, FACP, GISP
Christopher L. Bowlus, MD - Gastroenterologist
March 7, 2012

Outline § What is PSC? § GIS Techniques 1. Distribution Mapping 2. Hot Spot Analysis 3. Distance to Transplant Centers 4. Validation 5. Interpolation § Next Steps

What is PSC? § A rare, chronic & progressive disease of the biliary tract

Image from Johns Hopkins Medicine, Gastroenterology and Hepatology website

What is PSC? § The normal gallbladder and biliary tree are smooth § Inflammation causes changes in PSC § Strictures § Beads on a string

Images from Johns Hopkins Medicine, Gastroenterology and Hepatology website

What is PSC? § Can be quiescent for a long time (yrs) § Symptoms § Abdominal pain § Itching & jaundice § Fever and chills § Weight loss, fatigue

Image from Johns Hopkins Medicine, Gastroenterology and Hepatology website

Complications of PSC? § Cirrhosis § Cholangiocarcinoma § Colon Cancer

Images from Johns Hopkins Medicine, Gastroenterology and Hepatology and Daily Health News websites

Causes of PSC…

Ischemic vascular damage

Toxins from intestinal bacteria

GIS Step 1. US Distribution of PSC? § Data § United Network for Organ Sharing (UNOS) 1995-2008 § U.S. Census population 2008 § Enumeration unit – ZIP code § Analysis variable – PSC prevalence

PSC prevalence per 100,000 population = (PSC cases on transplant list / Population) * 100,000
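The prevalence formula can be sketched in a few lines of Python. The function name and the example ZIP code figures are hypothetical and are not taken from the UNOS or Census data.

```python
def prevalence_per_100k(cases: int, population: int) -> float:
    """PSC cases on the transplant list per 100,000 residents of one ZIP code."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply first so that round numbers stay exact in floating point.
    return cases * 100_000 / population


# Hypothetical ZIP code: 3 listed cases among 25,000 residents.
example_rate = prevalence_per_100k(3, 25_000)
```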

GIS Step 1. US Distribution of PSC? § Some stats: § 6767 PSC cases over 14 years § Lost 78 cases - no ZIP code (1.15%) § Excluded ZIP codes with < 1000 population (302 cases) § Cases occur in 4615 of 30322 US ZIP codes (15.22%)

GIS Step 2. Are there ‘hot spots’ of PSC? § Used the method prescribed by ‘the Laurens’ § Control for the heterogeneity of ZIP code sizes § Determine the distance where clustering is most intense § Use spatial weights matrix to include large ZIP codes See Spatial Statistics Resources at: http://blogs.esri.com/esri/arcgis/2010/07/13/spatial-statistics-resources/
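For readers who want to see what a hot spot statistic computes, here is a minimal pure-Python sketch of the Getis-Ord Gi* z-score. It assumes binary weights within a fixed distance band, which is a simplification: the analysis above was run in ArcGIS with a spatial weights matrix, and this sketch does not reproduce that setup. The function name and arguments are illustrative.

```python
import math


def getis_ord_gi_star(points, values, band):
    """Return one Gi* z-score per feature.

    points: list of (x, y) centroids in projected units (assumption)
    values: one attribute value per point, e.g. prevalence
    band:   distance threshold; every feature is its own neighbor
    """
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation; max() guards against tiny negative rounding.
    s = math.sqrt(max(0.0, sum(v * v for v in values) / n - mean * mean))
    z_scores = []
    for xi, yi in points:
        # Binary weights: 1 inside the distance band, 0 outside.
        w = [1.0 if math.hypot(xi - xj, yi - yj) <= band else 0.0
             for xj, yj in points]
        sum_w = sum(w)
        sum_w2 = sum(wi * wi for wi in w)
        local_sum = sum(wi * v for wi, v in zip(w, values))
        denom = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        # A zero denominator means no variation to test; report 0.
        z_scores.append((local_sum - mean * sum_w) / denom if denom else 0.0)
    return z_scores
```

Positive z-scores mark features surrounded by high values, negative z-scores mark features surrounded by low values.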

GIS Step 3. Distance to Transplant Centers § Euclidean distance calculated from each ZIP code centroid to nearest transplant center § Mean distance to a transplant center was shorter in hot spots than in cold spots (p < …)

GIS Step 4. Validation § GiPValue < 0.0501 & GiZScore > 0 § Hot spot = 10 § GiPValue < 0.0501 & GiZScore < 0 § Cold spot = -1 § GiPValue > 0.0501 § Equivalent to mean = 0
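The Euclidean distance calculation in GIS Step 3 could look like the sketch below. It assumes centroids and transplant centers are already in a projected coordinate system with linear units; the function name and coordinates are hypothetical.

```python
import math


def nearest_center_distance(centroid, centers):
    """Straight-line distance from one ZIP code centroid to the closest center."""
    if not centers:
        raise ValueError("at least one transplant center is required")
    return min(math.dist(centroid, center) for center in centers)


# Hypothetical projected coordinates (same linear unit for every point).
distance = nearest_center_distance((0.0, 0.0), [(3.0, 4.0), (6.0, 8.0)])
```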

GIS Step 4. Validation § Union parts A & B § Add a field (sum) § Add parts A & B

Possible Combinations    Meaning
10 + 10 = 20             Both hot spots
-1 + -1 = -2             Both cold spots
 0 +  0 =  0             Both ‘mean’ spots
10 +  0 = 10             One hot & one ‘mean’ spot
-1 +  0 = -1             One cold & one ‘mean’ spot
10 + -1 =  9             One hot & one cold spot

Grey areas are non-concordant
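The coding scheme behind the combination table can be sketched as follows. The 0.0501 threshold and the codes 10, -1 and 0 come from the slides; the function names are hypothetical, and treating sums of 20, -2 and 0 as the concordant cases is an assumption based on the table.

```python
HOT, COLD, MEAN = 10, -1, 0

MEANING = {
    20: "Both hot spots",
    -2: "Both cold spots",
    0: "Both 'mean' spots",
    10: "One hot & one 'mean' spot",
    -1: "One cold & one 'mean' spot",
    9: "One hot & one cold spot",
}


def reclassify(gi_z_score, gi_p_value, alpha=0.0501):
    """Turn a hot spot result into the 10 / -1 / 0 code."""
    if gi_p_value < alpha and gi_z_score > 0:
        return HOT
    if gi_p_value < alpha and gi_z_score < 0:
        return COLD
    return MEAN


def combine(code_a, code_b):
    """Sum the codes from parts A and B and look up the meaning."""
    total = code_a + code_b
    return total, MEANING[total]


def is_concordant(total):
    """Both parts agree only when the sum is 20, -2 or 0."""
    return total in (20, -2, 0)
```

Because hot and cold use different magnitudes, every pairing produces a unique sum, so a single added field is enough to tell the six cases apart.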

Spearman Rank Correlation Coefficient = 0.2359
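As a reminder of what the coefficient measures, here is a small pure-Python sketch of Spearman's rank correlation: rank both variables (averaging tied ranks), then take the Pearson correlation of the ranks. This is the textbook definition, not the tool used for the result above, and the helper names are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))
```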

Spatial Spearman’s § Permutation analysis (2 models) § To determine significance (thanks to Lauren Scott and Mark Janikas)!

Model #1

Add ‘0’ to every row (default)

Determine the number of iterations

Clear selection

Send to next model

Clear selection

Also selecting and counting those = -2 and 0 (all concordant rows)

Model #2

Create a table name and location

Add field to contain the concordant row count

Receives data from permutation model

Last 3 rows of code block:
    icursor.insertRow(row)
    del icursor
    return rtable
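The logic of the two permutation models can be approximated outside ModelBuilder. The sketch below is an assumption about the general approach, not a copy of the models: it shuffles the part B codes, counts concordant rows (sums of 20, -2 or 0) on each iteration, and reports how often a shuffled count reaches the observed one. The function names, iteration count and seed are hypothetical.

```python
import random

CONCORDANT_SUMS = {20, -2, 0}


def count_concordant(codes_a, codes_b):
    """Number of rows where parts A and B carry the same class."""
    return sum((a + b) in CONCORDANT_SUMS for a, b in zip(codes_a, codes_b))


def permutation_p_value(codes_a, codes_b, iterations=999, seed=0):
    """Return (observed concordant count, one-sided pseudo p-value)."""
    rng = random.Random(seed)
    observed = count_concordant(codes_a, codes_b)
    shuffled = list(codes_b)
    at_least_as_large = 0
    for _ in range(iterations):
        rng.shuffle(shuffled)  # break any spatial pairing between A and B
        if count_concordant(codes_a, shuffled) >= observed:
            at_least_as_large += 1
    # Add one to both terms so the observed arrangement counts as a draw.
    return observed, (at_least_as_large + 1) / (iterations + 1)
```

A small pseudo p-value suggests that the observed agreement between the two parts is unlikely to arise from a random arrangement of the same codes.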

Idaho, Utah & Colorado – mostly mountainous areas

Ohio & Susquehanna River Valleys – mostly low lying

Lead & Inflammation § Conflicting evidence

GIS Step 5. Interpolation of lead levels § TOXMAP

GIS Step 5. Interpolation of lead levels § Challenges: § Data are aggregated over many years for each station § >41,300 stations in U.S. § Latitude and longitude are included…but… § Multiple datums are used § e.g., Astro, OLDHI, Unknown

GIS Step 5. Interpolation of lead levels § Separated dataset by datum § Geocoded stations in each datum § Transformed each result to WGS84 § Appended results into one file § Dealt with multiple outliers § Performed IDW & Kriging interpolations
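To illustrate the IDW half of the interpolation step, here is a minimal sketch of inverse distance weighting at a single location. It assumes planar coordinates, uses every station, and defaults to a power of 2; none of these settings are stated in the slides, and Kriging is not covered. The function name and station values are hypothetical.

```python
import math


def idw(stations, target, power=2.0):
    """Inverse-distance-weighted estimate at `target`.

    stations: list of ((x, y), value) pairs, e.g. lead level per station
    target:   (x, y) location to estimate
    """
    if not stations:
        raise ValueError("at least one station is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in stations:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0:
            return value  # the target sits exactly on a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical stations with made-up lead values.
estimate = idw([((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)], (1.0, 0.0))
```

Nearby stations dominate the estimate, which is one reason outliers need attention before interpolating.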

Next Steps § Re-aggregate the interpolated lead results to ZIP code § Zonal analysis vs § Gaussian geostatistical simulation § Use the exploratory regression tool to evaluate variable relationships § Consider other environmental variables § Ideas?

Thank you for your attention!

Este Geraghty