
Juan Banda, Assistant Professor of Computer Science
Juan Banda, a Next Generation Program computer science researcher, and collaborators at the Stanford University School of Medicine have developed new software that uses a machine-learning computer program to determine whether a patient is likely to have a genetic disease that can lead to heart attack or stroke. Machine learning is a method of data analysis that uses algorithms and statistical models to identify patterns and make decisions with minimal human intervention.
The cholesterol-raising disease, known as familial hypercholesterolemia (or FH), is often misdiagnosed as ordinary high cholesterol. Without intervention, around 50 percent of men with FH have a heart attack by age 50, and about 30 percent of women with FH have one by age 60, said Banda, an assistant professor of computer science. However, early diagnosis and treatment of the disease can essentially neutralize this threat, he said.
“It’s underdiagnosed, but the prevalence of FH is roughly one in every 250 persons in the general population, more or less. So, it’s not at all rare, but is just not widely known or understood,” Banda said. Part of the reason for this is that expensive genetic testing has been required to pinpoint the disease, he said.
Banda said that the computer program developed by the research team provides a potentially low-cost, life-saving approach to quickly and accurately flagging patients who are likely to have the disease. In test runs of the program, it correctly identified 88 percent of the cases it screened. Theoretically, if the program were used in a clinic, any patient it flagged as having FH could undergo further genetic testing to verify whether they actually have the disease, Banda said.
To create the software, the team used data from Stanford’s FH clinic to learn what distinguishes an FH patient in an electronic health record. Banda and the researchers trained the algorithm to pick up on a combination of family history, current prescriptions, lipid levels, lab tests and more to understand what details signal the disease. The scientists built the algorithm’s foundation using data from 197 patients who had FH and 6,590 who did not, allowing the computer program to learn the difference between the two, he said.
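As a rough illustration of this kind of training, the sketch below fits a simple logistic regression to invented data and returns a probability for a new record. It is not the research team's algorithm, which this article does not specify. The three features (LDL cholesterol level, a family-history flag and a statin-prescription flag), the synthetic records and every function name are assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_record(has_fh, rng):
    """Return one invented feature vector (not real patient data)."""
    ldl = rng.gauss(220.0 if has_fh else 130.0, 30.0)  # mg/dL, assumed ranges
    family_history = 1.0 if rng.random() < (0.7 if has_fh else 0.1) else 0.0
    on_statin = 1.0 if rng.random() < (0.8 if has_fh else 0.2) else 0.0
    # Rescale LDL so all features are on a similar scale.
    return [(ldl - 150.0) / 50.0, family_history, on_statin]


def predict_proba(weights, bias, features):
    """Probability that a record belongs to the FH class under the model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(records, labels, epochs=300, lr=0.5):
    """Fit logistic regression with plain batch gradient descent."""
    weights = [0.0] * len(records[0])
    bias = 0.0
    n = len(records)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(records, labels):
            err = predict_proba(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


rng = random.Random(0)
# A small, imbalanced toy set: far fewer FH records than non-FH records.
labels = [1] * 50 + [0] * 300
records = [make_synthetic_record(bool(y), rng) for y in labels]

weights, bias = train(records, labels)

# Two hypothetical new patients, described with the same three features.
p_high = predict_proba(weights, bias, [(250.0 - 150.0) / 50.0, 1.0, 1.0])
p_low = predict_proba(weights, bias, [(110.0 - 150.0) / 50.0, 0.0, 0.0])
print(f"High-LDL patient with family history: {p_high:.2f}")
print(f"Low-LDL patient without family history: {p_low:.2f}")
```

In a screening setting, records whose probability exceeds a chosen threshold would be flagged for follow-up rather than treated as diagnosed.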
Once the computer program was trained, Banda and researchers moved on to the testing phase, initially running it on a set of roughly 70,000 de-identified patient records it had never encountered. The research team’s follow-up analysis determined that the software had detected patients who had FH with 88 percent accuracy.
“It is designed to work very simply,” said Banda, who was a research scientist at Stanford before joining Georgia State. “You feed in a patient’s data and it outputs the probability that the person has FH. And we validated this by having doctors review charts of the patients we predicted to have FH.”
Banda said that he and his colleagues are working to put the new technology to use at Stanford’s healthcare clinics.
The research was published in the article “Finding missed cases of familial hypercholesterolemia in health systems using machine learning” in npj Digital Medicine.
Researchers from Atomo Health in Texas, the University of Pennsylvania, and Yale University also contributed to the work. The study was funded in part by the American Heart Association and the FH Foundation’s FIND FH® (FLAG, IDENTIFY, NETWORK, DELIVER FH) initiative, which uses machine learning and big data to identify people with probable FH.
— Anna Varela, Director of Communications and Public Relations, College of Arts and Sciences