Bringing data to the stream
Meet a data scientist who combines his love for streams with analytics
By: Alison Bolen, SAS Insights Editor
Charles Dow, PhD, has always been fascinated with streams. As a child, Dow spent most of his free time in or around streams and ponds. He still likes to bike, hike and fish in and around water, and he continues to be fascinated with water flowing through streams and rivers.
“When I’m traveling, I have this somewhat obsessive need,” says Dow. “I have to look at every stream, river or lake that I pass. I don’t necessarily have to stop and get out of the car – but I have to slow down and look. It’s something I do to this day.”
Dow is the Director of Information Services at Stroud Water Research Center, where he uses SAS® to analyze and understand changes in freshwater systems.
According to the World Health Organization, access to fresh water has been improving around the globe. However, rising populations and increased water use continue to strain the world’s supply of this finite resource.
A vast majority of people living in the United States rely on a public water supply, and more than half of that supply is from surface water sources, according to data from the United States Geological Survey. Yet more than half the nation’s streams and rivers remain in poor condition.
The Stroud Center has been committed to the preservation and restoration of freshwater systems since 1967, and its use of SAS in that mission started in the 1980s.
Continuity matters in stream research
Using SAS has allowed research scientists at the Stroud Center to treat their data the same way over the years – and to repeat complex analyses with complete confidence in the results, says Dow. “It’s important to understand how stream conditions have changed, for better or worse, and how we can relate those changes to other things going on in the environment. That’s a big part of our research focus.”
For example, White Clay Creek, a stream that flows through the research facility, has been at the center of stream research since 1967. For the last 20 years, this stream has been designated as a site for long-term research in environmental biology (LTREB), funded through the National Science Foundation. The LTREB project recognizes that long-term time series data is critical in understanding ecology. The ability to consistently manage and analyze these data sets has helped the Stroud Center maintain LTREB support.
“We can go back to programs that were written decades ago and still run them. That continuity is an important thing when dealing with our data. We rely on historical data sets, and understanding how a new project fits into existing data streams is very important to our research,” explains Dow. “We consider these ‘core’ data sets on stream water quality, dating back in some cases to the late 1960s, as one of our more valuable assets.”
Studying insects with SAS®
The team at the Stroud Center studies fresh water, including the physical nature of streams and how their location changes through time. It looks at the chemistry of water and the biology in streams, including macroinvertebrates (aquatic insects and related invertebrates) that live in flowing waters.
Since different types of macroinvertebrates tolerate different stream conditions and varied levels of pollution, scientists use their presence or absence to understand water quality. “Macroinvertebrates are a very important indicator of stream health and stream water quality,” says Dow. “They provide an integrated and all-encompassing view of water quality and not a snapshot view at a given moment in time. We use the bugs to see how things are changing in the stream over time, or how things are different between forested and urban areas.”
A challenge with sampling insect populations across multiple sites is that sample sizes vary from site to site, making it difficult to make accurate comparisons. “At one site you might have collected 100 bugs. At another site you’ve collected 1,000 bugs,” says Dow. “The trick is making sure what you’ve collected at one site is really comparable to another site so you’re not biasing your conclusions based on the site with the larger sample size.”
Analysis of this macroinvertebrate data begins with a SAS resampling technique, developed and refined over the years by Stroud Center entomologists in cooperation with data scientists, to reduce sample sizes to an unbiased equal number across all samples. “You have to randomly resample the data a lot of times until you have some confidence that you have a random sample,” says Dow. “It takes a lot of time and a bit of computing power to come up with a routine that does that relatively quickly and easily for thousands of sample sites over dozens of years. We have a really nice routine built in SAS that allows us to do that quickly.”
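The general idea of reducing samples to an equal size can be sketched in a few lines. The Python example below is only an illustration of repeated random subsampling (often called rarefaction), not the Stroud Center's SAS routine. The function names, the site counts and the number of draws are all hypothetical, chosen to mirror the 100-bug and 1,000-bug sites Dow describes.

```python
import random
from collections import Counter


def rarefy(counts, size, rng):
    """Draw `size` individuals without replacement from one sample."""
    # Expand taxon counts into one list entry per individual collected.
    individuals = [taxon for taxon, n in counts.items() for _ in range(n)]
    if size > len(individuals):
        raise ValueError("target size exceeds the number of individuals collected")
    return Counter(rng.sample(individuals, size))


def mean_rarefied_richness(counts, size, draws=1000, seed=0):
    """Average number of distinct taxa seen across repeated random subsamples."""
    rng = random.Random(seed)
    total = sum(len(rarefy(counts, size, rng)) for _ in range(draws))
    return total / draws


# Hypothetical counts for illustration only (not Stroud Center data).
site_a = {"mayfly": 40, "stonefly": 25, "caddisfly": 20, "midge": 15}
site_b = {"mayfly": 300, "stonefly": 50, "caddisfly": 150, "midge": 480, "beetle": 20}

# Reduce every site to the size of the smallest sample before comparing.
target = min(sum(site_a.values()), sum(site_b.values()))
for name, counts in [("site_a", site_a), ("site_b", site_b)]:
    print(name, mean_rarefied_richness(counts, target))
```

Averaging over many draws is what keeps a single unlucky subsample from driving the comparison, which is the concern Dow raises about needing to resample "a lot of times."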
The results of water research
This meticulous study of thousands upon thousands of tiny, aquatic critters, along with other chemical, physical and biological aspects of streams over 50 years, has brought about many tangible results and insights into how streams work, from drinking water sources for Philadelphia and New York City to mountainous regions of Costa Rica and tributaries of the Amazon in Brazil.
A few recent projects include:
- Working with farmers to reduce runoff and improve crop yields. This six-year project compares the impact of different farming practices on soil health, farm productivity and water quality throughout the 8.7 million-acre Delaware River watershed.
- Redesigning wetlands to reduce flooding in Pennsylvania’s White Clay Creek. A primary goal of this project is to reduce flooding impacts like the devastation that occurred after Superstorm Sandy in 2012.
- Educating students and community members through hands-on programs and online resources that motivate them to become responsible stewards of freshwater resources.
Additionally, the Stroud Center uses SAS in conjunction with geographic information system geospatial data to examine the interaction between land surfaces and the streams and rivers that drain them. Understanding how our activities on the land affect stream and river health allows residents, conservation organizations and government officials to better manage and protect our freshwater systems.
Dow, for one, is grateful for the opportunity to work on these projects and to make a difference with data and analytics. “I’ve been lucky enough to take an interest I had as a kid – a love of being in and around streams – and make a career out of it. Part of that is the challenge of making sense of data, and helping scientists find answers to their questions,” he says.
“I’ve always been interested in how much the character of a stream can change over time or after a large storm,” continues Dow. “The average layperson thinks stream locations are static. But they’re very dynamic. It’s amazing how much they change if you look at a series of aerial photos of a stream over decades. It’s that dynamic nature of streams that makes this work so satisfying and keeps it interesting. We know a lot about how streams and rivers work, but there is so much more that we don’t know. Because of all we don’t know, my work life at Stroud Water Research Center continues to be filled with challenges and is never dull.”
Closeup bug photo credit: Dave Funk, Stroud Water Research Center.