Every Wednesday during the warmer months, volunteers test the water for bacteria at sites up and down the Anacostia. But what if you want to go paddleboarding on Saturday? The results from three days earlier could be completely irrelevant, especially if a big rainstorm has since caused sewage to overflow into the river.
The environmental group Anacostia Riverkeeper is trying to fill in those water quality gaps, partnering with the artificial intelligence company DataRobot to predict bacteria levels, based on known current conditions and past information.
“It’s kind of similar to how they do climate change models or weather forecasting,” says Robbie O’Donnell, with Anacostia Riverkeeper. There is a long list of factors that can impact bacteria levels on any given day, he says. “There’s rain, there’s the turbidity at that moment, there’s the water level, there’s the tide, there’s the pH of the water, how sunny it is outside,” O’Donnell says.
Paul Fornia, a data scientist at DataRobot, says the company is offering select nonprofits access to its artificial intelligence software, along with consulting on how to use it, on a pro bono basis. The software uses machine learning: “If you have a problem and you have a bunch of historical data, you can train machine learning algorithms to make forecasts on future data,” Fornia explains.
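To make Fornia's description concrete, here is a minimal sketch of the general idea: fit a model to past observations, then apply it to new conditions. It is not DataRobot's software or the Riverkeeper model. The records are invented for illustration, the single input (recent rainfall) is an assumption standing in for the many factors O’Donnell lists, and `fit_line` and `predict` are hypothetical helper names.

```python
from statistics import mean

# Made-up illustrative records, NOT real Anacostia measurements:
# (rainfall over the prior 48 hours in inches, log10 of bacteria count)
history = [(0.0, 1.8), (0.2, 2.0), (0.5, 2.4), (1.0, 2.9), (1.5, 3.5), (2.0, 4.0)]

def fit_line(records):
    """Ordinary least squares with one input: returns (slope, intercept)."""
    xs = [x for x, _ in records]
    ys = [y for _, y in records]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in records)
        / sum((x - x_bar) ** 2 for x in xs)
    )
    return slope, y_bar - slope * x_bar

def predict(model, rainfall):
    """Apply the fitted line to a new rainfall value."""
    slope, intercept = model
    return slope * rainfall + intercept

# "Train" on the historical data, then forecast for a day with 1.2 inches of rain.
model = fit_line(history)
forecast = predict(model, 1.2)
print(f"forecast log10 bacteria level: {forecast:.2f}")
```

A real forecast would draw on many more inputs, such as turbidity, water level, tide, pH and sunlight, and would likely use a more flexible algorithm than a straight line.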
The model for predicting water quality is still in the works. O’Donnell hopes to start posting daily water quality forecasts in May or June, and to expand to sites on the Potomac and Rock Creek later this year. It’s still too early to say how accurate the predictions will be.
“This is not a yes or no to go jump in the water,” says O’Donnell, noting that swimming is illegal in all D.C. waterways. Rather, he sees the forecasts as one more tool to help people understand water quality, in addition to the weekly water testing his group is conducting. “If people have all the information, they can make the most up-to-date and accurate decision themselves,” O’Donnell explains.
Using artificial intelligence to forecast water quality is not unprecedented, and research has shown it to be effective. A 2018 study of water quality in the Chicago River found that a machine learning model was able to accurately predict bacteria levels 86.5 percent of the time. A 2008 study of beaches in San Diego found machine learning models were accurate more than 90 percent of the time.
Fornia says this is the first time DataRobot has developed such a model for water quality, and he hopes it’s successful enough to reproduce elsewhere. “One of the things we’re most excited about is that it could potentially scale, to other urban riverkeepers in the U.S.,” Fornia says. He suggests it could be even more useful elsewhere in the world. “Water quality is obviously a huge issue globally, in rural communities in Asia and Africa, for example.”