Newswise — PHILADELPHIA— What if your own medical record could figure out that you are at risk of developing a rare disease so you could receive a diagnosis months or even years earlier than you might otherwise, allowing doctors to get started on important treatment sooner? That’s what a team of researchers co-led by faculty at the Perelman School of Medicine at the University of Pennsylvania and the University of Florida College of Medicine will explore with the help of a $4.7 million grant from the National Institutes of Health (NIH).

For the next four years, researchers will work to develop a set of algorithms powered by machine learning, a form of artificial intelligence (AI), to identify which patients are at risk of five different types of vasculitis and two different types of spondyloarthritis. These predictions, derived from information already available in patients’ electronic health records, could greatly increase the chance of patients being diagnosed sooner.

The effort to develop this prediction method, called “PANDA: Predictive Analytics via Networked Distributed Algorithms for multi-system diseases,” will be led by principal investigators Yong Chen, PhD, a professor of Biostatistics, and Peter A. Merkel, MD, MPH, chief of Rheumatology and a professor of Medicine and Epidemiology, both at Penn, and Jiang Bian, PhD, chief data scientist of the University of Florida Health system and a professor in the Department of Health Outcomes & Biomedical Informatics at the University of Florida College of Medicine.

“This is an exciting step forward, building on our current PDA framework, from clinical evidence generation toward AI-informed interventions in clinical decision-making,” Chen said. “Despite the clear need to reduce the dangerous and costly delays in diagnosis, individual clinicians, especially in primary care, face important challenges.” 

Chen used one of the forms of vasculitis under study, granulomatosis with polyangiitis (GPA), as an example of the promise the PANDA system holds. GPA involves inflammation of many organs and can be very severe or even fatal. Mortality rates for patients with this condition remain high in the first year after diagnosis, and the correct diagnosis of this type of vasculitis, and all the other types, can be delayed by months or even years. 

“An earlier diagnosis of any of the types of vasculitis and spondyloarthritis we’re working on leads to a much better prognosis and better clinical outcomes,” Merkel said. “Even if we determine that a patient has just a 10 percent likelihood of developing one of these diseases, that is a much higher chance of a rare problem, and clinicians can keep that in mind and make better decisions for their patients.”

Clinicians and their patients face several challenges in diagnosis: rare diseases can masquerade as more common ones, clinicians may lack access to data or to the other clinicians a patient works with, and many are simply unfamiliar with extremely uncommon conditions. An algorithm that automatically scans known information to identify the possibility of a disease like GPA could be lifesaving.

“The increasing availability of real-world data, such as electronic health records collected through routine care, provides a golden opportunity to generate real-world evidence to inform clinical decision-making,” Bian said. “Nevertheless, to leverage these large collections of real-world data, which are often distributed across multiple sites, novel distributed algorithms like PANDA are much needed.”

The researchers plan to pull data through the National Patient-Centered Clinical Research Network (PCORnet), a national database that combines information from different health systems on more than 27 million patients. De-identified data from these patients, including lab test results, comorbid conditions, past treatments, and other commonly available information, will be used to create the algorithms. Once the algorithms are built, the researchers will test each one’s predictive power across more than 10 health systems. After these tests, the methods the team develops will be shared and made available to apply to other diseases.

Because machine learning algorithms are, as the name implies, designed to “learn” and refine themselves as they are used and fed more data, it’s possible that PANDA will continuously improve and become more helpful as time passes. “The proposed machine learning algorithms will adaptively update their key parameters as more data are made available,” said Chen. “We plan to evaluate these machine learning algorithms periodically to ensure they meet our pre-specified standards and can evolve positively over time.”

The grant funding the research is 1U01TR003709.



Penn Medicine is one of the world’s leading academic medical centers, dedicated to the related missions of medical education, biomedical research, and excellence in patient care. Penn Medicine consists of the Raymond and Ruth Perelman School of Medicine at the University of Pennsylvania (founded in 1765 as the nation’s first medical school) and the University of Pennsylvania Health System, which together form an $8.9 billion enterprise.

The Perelman School of Medicine has been ranked among the top medical schools in the United States for more than 20 years, according to U.S. News & World Report's survey of research-oriented medical schools. The School is consistently among the nation's top recipients of funding from the National Institutes of Health, with $546 million awarded in the 2021 fiscal year.

The University of Pennsylvania Health System’s patient care facilities include: the Hospital of the University of Pennsylvania and Penn Presbyterian Medical Center—which are recognized as one of the nation’s top “Honor Roll” hospitals by U.S. News & World Report—Chester County Hospital; Lancaster General Health; Penn Medicine Princeton Health; and Pennsylvania Hospital, the nation’s first hospital, founded in 1751. Additional facilities and enterprises include Good Shepherd Penn Partners, Penn Medicine at Home, Lancaster Behavioral Health Hospital, and Princeton House Behavioral Health, among others.

Penn Medicine is powered by a talented and dedicated workforce of more than 44,000 people. The organization also has alliances with top community health systems across both Southeastern Pennsylvania and Southern New Jersey, creating more options for patients no matter where they live.

Penn Medicine is committed to improving lives and health through a variety of community-based programs and activities. In fiscal year 2020, Penn Medicine provided more than $563 million to benefit our community.