Nerd in the herd
How a SAS data scientist applies machine learning to prevent a pachyderm pandemic and improve herd health
By: Alison Bolen, SAS Insights Editor
When Packy, a beloved Asian elephant at the Oregon Zoo, was diagnosed with tuberculosis in 2015, zookeepers did everything they could to save him. Packy, the first elephant born in captivity in the US, was diagnosed with TB at age 52 through an annual “trunk wash.” This cumbersome procedure involves rinsing the elephant’s trunk with fluid and then collecting the fluid in a large bag for testing.
For more than a year after the diagnosis, the zoo treated Packy for TB, all the while remaining hopeful that he would recover. Indeed, two other elephants at the zoo were treated for TB at the same time and recovered.1 But when Packy tested positive for a drug-resistant form of the disease in 2017, elephant experts made the hard decision to euthanize him to end his suffering and prevent the spread of the disease to others.2
Elephants and TB are a dangerous combination
Packy is the latest casualty in an elephant tuberculosis outbreak across the United States. Since 1996, about 60 elephants have been diagnosed with TB, out of a population of nearly 1,300 individuals recorded over the past 50 years. A number of factors are making it hard to curtail the outbreak:
- Testing for TB in elephants is an expensive and unreliable process. The disease can lie dormant for years before an elephant tests positive, and elephants that have TB often do not show signs of the disease.
- Elephants are social animals that live in close contact with others in herds, making transmission likely within a herd. Infected elephants can be quarantined, but long periods of isolation are unhealthy for these naturally social creatures.
- Within the US, elephants are transferred often from one facility to another, bringing them into contact with many other elephants and humans, allowing for transmission across large populations.
The risk beyond elephants
Tuberculosis is a bacterial infection that attacks the lungs. According to the US Centers for Disease Control and Prevention, TB is one of the deadliest diseases in the world. While it’s uncommon in humans in the US, 10.4 million people around the world became sick with the disease in 2016, and there are 1.7 million TB-related deaths annually worldwide.
In humans and animals, most forms of TB can be treated with antibiotics when the disease is caught early enough, but epidemiologists fear a drug-resistant outbreak of the disease could cause a global pandemic.
Transmission of TB from animals to humans is rare, but not unheard of. In 2013, seven humans who had close contact with TB-infected elephants in Oregon tested positive for the disease.3
A better way to test for TB
A passionate data scientist and a zoological expert have partnered to develop a more accurate and less invasive method for identifying TB in elephants using neural networks, a type of machine learning.
Currently, zoos and sanctuaries perform annual trunk wash TB tests on every elephant in their care. And new elephants in their herds are often retested or isolated temporarily to reduce the transmission of the disease.
New research from Sarah Harden, Systems Engineer at SAS, and Dr. Ramiro Isaza, Professor of Zoological Medicine at the University of Florida, analyzes 20 years of data on elephants in the US. The research compares traditional analytics methods, like logistic regression and decision trees, to more advanced methods, like neural networks and ensemble modeling.
According to Harden, the network models outperformed all other methods in identifying the likelihood of TB in individual elephants because they analyze relational factors. Those factors include elephant locations, herd dynamics and social groups that exist within the elephant population.
Analyzing network variables shows relationships and connections between the elephants and identifies which animals are more likely to be at risk of the disease.
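As a rough illustration of what a "network variable" could look like, the sketch below links two elephants whenever they lived at the same facility in overlapping years, then counts each animal's contacts and how many of those contacts are known TB cases. It is not the researchers' model. The elephant names, facilities, years and function names are all hypothetical, invented only to make the idea concrete.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical housing records: (elephant, facility, first_year, last_year).
STAYS = [
    ("Asha", "Zoo A", 1990, 2005),
    ("Bala", "Zoo A", 2000, 2010),
    ("Cora", "Zoo B", 1995, 2015),
    ("Asha", "Zoo B", 2006, 2015),
    ("Dina", "Zoo A", 2011, 2015),
]
TB_POSITIVE = {"Bala"}  # hypothetical set of known TB cases


def build_contact_graph(stays):
    """Link two elephants if they shared a facility in overlapping years."""
    graph = defaultdict(set)
    for (e1, f1, s1, t1), (e2, f2, s2, t2) in combinations(stays, 2):
        same_place = f1 == f2
        overlapping = s1 <= t2 and s2 <= t1
        if e1 != e2 and same_place and overlapping:
            graph[e1].add(e2)
            graph[e2].add(e1)
    return graph


def network_features(stays, positives):
    """Per-elephant contact counts derived from the co-housing graph."""
    graph = build_contact_graph(stays)
    elephants = sorted({name for name, *_ in stays})
    return {
        name: {
            "contacts": len(graph[name]),
            "positive_contacts": len(graph[name] & positives),
        }
        for name in elephants
    }


FEATURES = network_features(STAYS, TB_POSITIVE)
for name, feats in FEATURES.items():
    print(name, feats)
```

Counts like these could be passed to a predictive model alongside each animal's individual attributes. How the actual study encodes herd relationships is not described here.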
“In many cases, the network model can detect TB before the standard test detects it,” explains Harden. Zookeepers could use this information to know which elephants to test, which to treat and which to isolate. Plus, the analysis is less invasive, less expensive and faster.
A field model for elephant TB
Harden and Isaza have two primary goals for their research:
- Provide zookeepers and veterinarians with a practical field model they can use to quickly predict the likelihood of TB infection in any elephant within a herd.
- Promote the use of more advanced modeling techniques to improve the accuracy of disease prediction in animal populations.
Ultimately, succeeding at both goals could help curtail the spread of TB in elephants. “When it comes to emerging pathogens, accuracy is incredibly important,” explains Harden. “Preventing even a few elephants or a few people from getting TB could have a huge impact on the spread of the disease or the development of a drug-resistant form of the disease.”
Isaza adds, “Similarly, falsely identifying a healthy elephant as infected could have major implications to the elephant, the herd and the zoo.”
Overall, Harden estimates that the new network model is 6 percent more accurate than any existing model used in the field. That means it could identify the disease in about 15 elephants that were missed previously, and it could reduce false positives as well.
In practical terms, if a zoo is considering adopting an elephant, the method could help predict whether the animal has TB by answering questions like:
- Is she an African or Asian elephant?
- Where was she born?
- How many elephants were in that herd?
- How many times has she moved?
- How many and which elephants has she encountered in those herds?
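The questions above can be read as inputs to a model. The sketch below shows one way they might be turned into a flat feature record for a single elephant. It is a minimal illustration under stated assumptions, not the field model itself: the `Stay` and `ElephantRecord` classes, the `intake_features` function and the example animal are all hypothetical, and the sketch assumes each stay is at a different facility from the one before it.

```python
from dataclasses import dataclass, field


@dataclass
class Stay:
    facility: str
    herd_mates: frozenset  # names of elephants housed with her during the stay


@dataclass
class ElephantRecord:
    name: str
    species: str      # "African" or "Asian"
    birthplace: str
    stays: list = field(default_factory=list)  # chronological, birth herd first


def intake_features(rec):
    """Answer the five intake questions as a flat dictionary."""
    encountered = set().union(*(s.herd_mates for s in rec.stays))
    # Birth herd size counts the elephant herself plus her first herd mates.
    birth_herd_size = (len(rec.stays[0].herd_mates) + 1) if rec.stays else 0
    return {
        "species": rec.species,
        "birthplace": rec.birthplace,
        "birth_herd_size": birth_herd_size,
        "moves": max(len(rec.stays) - 1, 0),
        "encountered": sorted(encountered),
        "n_encountered": len(encountered),
    }


# Hypothetical example animal.
EXAMPLE = ElephantRecord(
    name="Asha",
    species="Asian",
    birthplace="Zoo A",
    stays=[
        Stay("Zoo A", frozenset({"Bala", "Cora"})),
        Stay("Zoo B", frozenset({"Cora", "Dina"})),
    ],
)
EXAMPLE_FEATURES = intake_features(EXAMPLE)
print(EXAMPLE_FEATURES)
```

A record like this would be only the input side. The risk estimate itself would come from whatever model has been trained on historical data.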
“Our model takes the analysis to the next level. If the zoo is looking for a 30-year-old Asian elephant and they don’t run the analysis, they’re taking a risk to introduce her to an existing herd of elephants,” explains Harden. But if they run the analysis, they’ll have a better idea of the risk associated with the individual, and can take appropriate precautionary measures.
Since a single set of trunk washes can cost around $500, and TB treatment can cost around $100,000 per year, a small improvement could save zoos a lot of money while also improving animal health. The biggest benefit, of course, would be preventing elephants from suffering the same fate as Packy. Keeping more elephants healthy and slowing the spread of the disease would be a true success.
1 The Infected Elephant in the Room. Slate, March 24, 2015.
2 Packy, Oregon Zoo's Beloved Elephant Dies at 54. Seattle Times, February 9, 2017.
3 Elephants infected seven Oregon zoo workers with tuberculosis: CDC. Reuters, January 8, 2016.