A machine learning model may significantly reduce the error for annual suicide rate estimation, according to results of a cross-sectional national study published in JAMA Network Open.
“Researchers have attempted to strengthen and supplement the signal produced from online search or social media data by examining variables that provide additional population-level environmental context for suicide risk, such as macroeconomic indicators and meteorological patterns,” Daejin Choi, PhD, of the department of computer science and engineering at Incheon National University in South Korea, and colleagues wrote. “To our knowledge, no study has attempted to combine information from disparate real-time data sources to evaluate the ability of multiple streams to tackle the pressing public health need of enabling estimation of suicide fatality rates for the U.S. in near real time.”
The researchers aimed to address this research gap with a machine learning pipeline that combines signals from multiple real-time information streams to estimate weekly U.S. suicide fatalities in near real time. The approach first fit an optimal machine learning model to each individual data stream, then combined the predictions from each stream using an artificial neural network. The researchers drew on several sources to obtain national-level U.S. administrative data on suicide deaths, along with health services, meteorological, economic and online data, covering 2014 to 2017. The heterogeneous data streams included emergency department (ED) visits for suicide ideation and attempts via the National Syndromic Surveillance Program; calls to the National Suicide Prevention Lifeline; calls to U.S. poison control centers for intentional self-harm; consumer price index and seasonality-adjusted economic measures; hours of daylight per week; suicide-related Google and YouTube search trends; and public posts related to suicide on Reddit, Twitter and Tumblr. Choi and colleagues statistically compared the estimates with actual fatalities recorded by the National Vital Statistics System.
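The two-stage idea described above (one model per data stream, then a combiner over the per-stream predictions) can be illustrated with a minimal Python sketch. This is not the study's implementation: the per-stream model here is a simple least-squares line, and an inverse-error weighted average stands in for the artificial neural network the researchers used. The stream names and all numbers are made up for illustration, and the function names are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y ~ a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def mse(pred, actual):
    """Mean squared error between two equal-length sequences."""
    return mean((p - a) ** 2 for p, a in zip(pred, actual))


def combine_streams(streams, deaths):
    """Stage 1: fit one model per stream. Stage 2: weight each stream's
    predictions by the inverse of its error (a simple stand-in for the
    neural-network combiner described in the article)."""
    per_stream = {}
    for name, x in streams.items():
        a, b = fit_line(x, deaths)
        per_stream[name] = [a + b * xi for xi in x]

    # The small constant avoids division by zero if a stream fits perfectly.
    raw = {name: 1.0 / (mse(p, deaths) + 1e-9) for name, p in per_stream.items()}
    total = sum(raw.values())
    weights = {name: w / total for name, w in raw.items()}

    combined = [
        sum(weights[name] * per_stream[name][i] for name in per_stream)
        for i in range(len(deaths))
    ]
    return weights, combined


# Hypothetical weekly values, invented for this example only.
deaths = [850, 870, 900, 880, 910, 930]
streams = {
    "ed_visits": [400, 415, 440, 425, 450, 465],
    "lifeline_calls": [9000, 9100, 9050, 9300, 9250, 9500],
}

weights, combined = combine_streams(streams, deaths)
print(weights)
print([round(c, 1) for c in combined])
```

In this toy version the models are fit and weighted on the same weeks, which overstates accuracy; a realistic pipeline would fit on past weeks and evaluate on held-out ones.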
Results showed a high correlation between the machine learning method’s estimates of weekly suicide deaths and actual counts and trends, with a Pearson correlation of 0.811 (P < .001). The model estimated annual suicide rates with a low error of 0.55%.
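For readers unfamiliar with the two metrics, the following Python sketch shows how a Pearson correlation between weekly estimates and weekly counts, and a percent error on an annual total, could be computed. The article does not spell out how the study derived its 0.55% figure, so the annual-error formula here (relative difference between summed weekly estimates and summed weekly counts) is an assumption, and the numbers are invented.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def annual_percent_error(estimated_weekly, actual_weekly):
    """Assumed definition: relative difference of the yearly totals, in percent."""
    est_total = sum(estimated_weekly)
    act_total = sum(actual_weekly)
    return abs(est_total - act_total) / act_total * 100


# Hypothetical weekly counts, invented for this example only.
actual = [100, 110, 90, 100]
estimated = [102, 108, 93, 99]

print(round(pearson_r(estimated, actual), 3))
print(annual_percent_error(estimated, actual))
```

Weekly over- and underestimates can cancel when summed, which is one reason an annual error can be small even when the weekly correlation is well below 1.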
“Examining suicide from the perspective of multiple data sources, each representing a unique aspect of the problem, can help inform federal support of appropriate programs and policies to prevent suicide,” Choi and colleagues wrote. “For example, more timely information about rapidly increasing suicide trends could enable governmental funding and support for programs to prevent suicide in a way that better keeps pace with the growing magnitude of the problem. Such efforts might include more rapidly addressing clinician shortages in mental health care; expanding crisis intervention programs, such as hotline, chat or text services; or strengthening policies and programs that address underlying risk factors, such as economic or housing instability.”