DEDUCE-GEO tool uses latitude and longitude to visualize links between clinical data and physical environments
Deverick Anderson, MD, an epidemiologist for Duke Medicine, is using a new tool at Duke to hunt down hidden sources of contamination by antibiotic-resistant microbes. His target is Clostridium difficile (C. difficile), a dangerous antibiotic-resistant bug that the Centers for Disease Control and Prevention estimates contributes to 14,000 deaths in the U.S. each year.
“You didn’t used to get infected with antibiotic-resistant C. difficile unless you were already in the hospital,” said Anderson, the director of the Duke Infection Control Outreach Network (DICON). “But now we are seeing more and more people come into the hospital already infected. This has made us wonder about what exposures there are to these antibiotic-resistant organisms outside of the hospital.”
Anderson’s new tool is DEDUCE-GEO. It is Duke’s newest addition to DEDUCE, a web-based query tool that allows investigators to filter millions of rows of data in Duke’s vast enterprise data warehouse of clinical information.
Duke launched DEDUCE-GEO in 2013, after embarking on a mammoth project to verify and standardize over 4.5 million patients’ home addresses in the electronic health record. By creating systems that automatically link addresses to latitude and longitude, the team made it possible to accurately display where patients live on an interactive map. How detailed this map is depends upon the investigator’s assigned research role within DEDUCE and the Institutional Review Board (IRB) policies governing access to patient information.
But the tool now goes far beyond simple geospatial visualization of addresses.
Using DEDUCE-GEO, researchers can access trillions of data points that define the geography of the local community, tie that information to patient care data, and visualize the results on an interactive map. This technology can help define a research cohort, inform healthcare interventions, and help take some of the guesswork out of determining how the environment is affecting people’s health and healthcare.
Sohayla Pruitt, a senior geospatial scientist for Duke Medicine, has led the DEDUCE-GEO team in collecting multiple layers of geographic, demographic, and socioeconomic data and linking it to the electronic health record. This data includes information from the 2010 Census; local infrastructure data such as streets, highways and parks; as well as geographic data for each business establishment in North Carolina.
DEDUCE-GEO now has about 12,000 socioeconomic and demographic data points tied to each block group in Durham. This data includes average age, income, education level, race, commute time to work, family size, and much more. In the coming year, the team hopes to incorporate methods that compute the distance from each address to the nearest business or infrastructure feature.
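To make the distance idea concrete, here is a minimal sketch of one common way to measure how far a geocoded address is from the nearest feature: the haversine great-circle formula. It is an illustration only, not a description of how DEDUCE-GEO computes distances. The function names, the feature names, and the coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_feature(address, features):
    """Return (name, distance_km) of the feature closest to an address.

    address:  (lat, lon) tuple
    features: dict mapping a feature name to its (lat, lon) tuple
    """
    if not features:
        raise ValueError("features must not be empty")
    distances = {name: haversine_km(*address, *loc) for name, loc in features.items()}
    name = min(distances, key=distances.get)
    return name, distances[name]


# Hypothetical coordinates, invented for illustration (not real patient or business data).
home = (36.00, -78.90)
features = {"park": (36.01, -78.90), "pharmacy": (36.10, -78.90)}
closest_name, closest_km = nearest_feature(home, features)
```

A straight-line distance like this ignores roads and travel time, so a production system might prefer network distances; the sketch only shows the basic calculation.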
The integration of all of this information into one data source allows researchers using the DEDUCE Cohort Manager and DEDUCE-GEO to:
• refine a cohort by geographic, demographic or socioeconomic characteristics – for example, choosing only those patients who have a certain level of education or live near particular businesses
• explore the neighborhood characteristics associated with their cohort through interactive charts and maps, and
• export the 12,000+ geospatial variables so that more advanced statistical models can select those that have statistical relevance.
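The refine-and-export workflow in the list above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the DEDUCE Cohort Manager's actual interface: the record layout, the block-group identifiers, the variable names (`median_income`, `pct_college`), and all values are hypothetical.

```python
import csv
import io

# Hypothetical patient records: each carries only an ID and a census block-group ID.
patients = [
    {"patient_id": "P1", "block_group": "BG-A"},
    {"patient_id": "P2", "block_group": "BG-B"},
    {"patient_id": "P3", "block_group": "BG-A"},
]

# Hypothetical neighborhood variables per block group (invented values).
block_groups = {
    "BG-A": {"median_income": 38000, "pct_college": 22.0},
    "BG-B": {"median_income": 71000, "pct_college": 54.0},
}


def refine_cohort(patients, block_groups, predicate):
    """Keep patients whose block group satisfies predicate, attaching its variables."""
    cohort = []
    for patient in patients:
        attrs = block_groups.get(patient["block_group"])
        if attrs is not None and predicate(attrs):
            cohort.append({**patient, **attrs})
    return cohort


def export_csv(rows):
    """Serialize the refined cohort, with its neighborhood variables, as CSV text."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Refine: keep patients living in block groups where under 30% hold a college degree.
cohort = refine_cohort(patients, block_groups, lambda a: a["pct_college"] < 30)
csv_text = export_csv(cohort)
```

The exported table has one row per patient with the neighborhood variables as columns, which is the shape a statistical package would need in order to select the variables that matter.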
In the past, a few intrepid researchers with a bit of knowledge of Geographic Information Systems (GIS) could laboriously pull some of this data together for a cohort of patients they were studying. However, researchers often had the time and resources to gather data on only one or two aspects of the environment that they could pre-conceive might relate to health. Seldom did researchers get a robust picture with tens of thousands of geographic variables that could help indicate how geography affects health or healthcare.
“We’ve taken out the manual labor and mysterious processes of GIS and built the infrastructure so that regardless of what researchers are interested in looking at, they have a database that is project agnostic and on demand,” Pruitt said.
Using Geospatially Enhanced Health Data
Duke researchers have been using geospatial data in many ways to improve research and clinical care.
This map of the terrain of diabetes in Durham County includes streets (lowest layer), two layers of economic and demographic information (purple and blue), and concentrations of diabetes patients (top layer). The green spines indicate locations of key social or commercial institutions that healthcare workers might partner with to provide community-based care. Published in Health Affairs 2013;32:608-1615
Maps of participants in the MURDOCK Study in Kannapolis have helped guide recruitment efforts to ensure the sample accurately reflects the demographics of the local community. “We’ve been able to see where our samples are under- or over-representative of the community, and look at possible geographic barriers to recruitment of certain segments of the population, such as the distance to a recruitment site,” said Melissa Cornish, a business development consultant for the study.
Researchers have also created a map of diabetic patients in Durham County linked to data on education, economics, utilization of the emergency room, and distance from healthcare clinics. This map has guided a Duke Medicine program that sends teams of social workers and nurses into neighborhoods with a high density of diabetic patients to help patients better manage their chronic disease and avoid costly trips to the emergency room. Researchers hope to duplicate this type of geospatially informed intervention in other areas of the U.S.
Recently, Pruitt helped create on-demand geospatial predictive models of where and when people smoke. By using mobile electronic devices to track where people lit up and modeling those locations against thousands of geospatial variables, the system was able to predict new areas where smokers are likely to light up in the future. This study is helping researchers imagine the possibility of using mobile devices to provide personalized smoking cessation interventions, such as motivational text messages, based not only on time, but also on location.
“It pushes past the reactive approach to interventions based on past behavior and uses the statistical power of modeling to predict where that behavior will occur in the future,” Pruitt said.
Deverick Anderson, the epidemiologist, hopes that mapping out the geography of patients who come to Duke’s hospitals with an existing C. difficile infection will help uncover patterns of community exposures, including possible culprits such as water sources, farms, high-density living areas, or socioeconomic factors.
“We lack understanding about the transmission of C. difficile outside the walls of the hospital, so if we find some clustering, it will probably just be scratching the surface,” he admitted. But he sees great potential in the level of granularity that the DEDUCE-GEO data can produce.
“If we can find even a weak signal, a weak connection between something in the environment and these infections, it could give us data that could lead to an expanded study or perhaps allow us to apply for other grants,” he said. “To be honest, we don’t know what we are going to find, but it is exciting to think that we now have tools to open the lid on the question and see what comes out.”