
Machine learning uses electronic health records to stratify food allergy risks in infants

August 23, 2023



Disclosures: Brandwein reports receiving personal fees from MYOR Diagnostics Ltd. Please see the study for all other authors’ relevant financial disclosures.



Key takeaways:

  • Random forest regression models had an area under the curve of 0.8 with 83% accuracy.
  • The researchers called the model superior to using family or infant history of atopy alone in assessing risks.

Using routinely collected data in electronic health records, predictive modeling stratified risks for developing food allergy among infants, according to a study published in Allergy.

This information can help physicians initiate interventions to mitigate the development of food allergy among these infants, Michael Brandwein, PhD, cofounder and vice president of research and development at MYOR Diagnostics Ltd., and colleagues wrote.

Awareness of potential risks can enable timely interventions for preventing food allergy in infants. Image: Adobe Stock

“This study arose from the observation of the dramatic surge in food allergy incidences over recent decades,” Brandwein told Healio.


Believing that environmental or lifestyle risk factors might be at the core of this rise, Brandwein said that he and his colleagues have been diligently constructing extensive databases based on real-world and clinical trial data.

“Our overarching goal is to discern the key drivers behind food allergy risks and, by doing so, pave the way for effective prevention strategies,” he said.

The researchers tapped machine learning to conduct their study.

“Machine learning is like teaching a computer to recognize patterns based on examples, rather than programming it with specific instructions for every task,” Brandwein said.

Given information from the past such as how temperatures change over seasons or how a stock’s price moves over time, Brandwein said, a predictive model tries to guess what will happen next.

“The more data or examples you give it, the better its predictions typically become. So, it’s like training the computer to make educated guesses about the future based on patterns it has seen in the past,” he said.

Study design, results

The retrospective, cross-sectional database study used EHRs spanning decades from Leumit Health Services in Israel, including 4,077 patients with food allergy (57.5% boys) and 95,686 control patients (51.2% boys). All patients were born between 2010 and 2020.

“We partnered with Leumit Start, the Innovation arm of Leumit Health Services,” Brandwein said. “Our in-house team of clinical specialists and computer scientists worked on the model generation for the past year or so.”

Data from both cohorts were used to train and test random forest regression models (RFRMs), which the researchers said often outperform logistic regression models due to their bootstrapping and bagging abilities during large dataset analysis.

Logistic regression models are intuitively explainable, the researchers said, which is important to clinicians. The researchers said they included logistic regression models in their study with the RFRMs to balance maximal predictive value with clinician and researcher needs.
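As a rough illustration of what "bootstrapping and bagging" mean, the sketch below resamples a toy dataset with replacement, fits a very simple one-feature rule to each resample and averages the votes. Everything in it is an assumption made for illustration: the data are synthetic, the function names are hypothetical, and the one-feature "stumps" are a deliberate simplification rather than the random forest the researchers built (a real random forest grows full decision trees and also samples features at each split).

```python
import random

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one bootstrap replicate)."""
    return [rng.choice(rows) for _ in rows]

def fit_stump(rows):
    """Return the index of the binary feature that most often equals the label."""
    n_features = len(rows[0][0])
    best_feature, best_correct = 0, -1
    for j in range(n_features):
        correct = sum(1 for x, y in rows if x[j] == y)
        if correct > best_correct:
            best_feature, best_correct = j, correct
    return best_feature

def bagged_predict(stumps, x):
    """Average the stump votes; this is the aggregating half of bagging."""
    return sum(x[j] for j in stumps) / len(stumps)

# Synthetic toy data (not from the study): feature 0 tracks the label
# about 90% of the time, features 1 and 2 are pure noise.
rng = random.Random(0)
rows = []
for _ in range(200):
    y = 1 if rng.random() < 0.3 else 0
    f0 = y if rng.random() < 0.9 else 1 - y
    rows.append(([f0, rng.randint(0, 1), rng.randint(0, 1)], y))

# Fit one stump per bootstrap replicate, then aggregate their votes.
stumps = [fit_stump(bootstrap_sample(rows, rng)) for _ in range(25)]
risk_high = bagged_predict(stumps, [1, 1, 1])
risk_low = bagged_predict(stumps, [0, 0, 0])
```

Because each stump sees a slightly different resample, averaging many of them tends to be more stable than relying on any single fit, which is the general idea behind the ensemble approach the researchers describe.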

The logistic regression models indicated several significant risk factors for food allergy, the researchers said, including:

  • first-born child (OR = 2.08; 95% CI, 1.19-2.26);
  • percent of siblings with an atopic condition (OR = 2.06; 95% CI, 1.84-2.3);
  • siblings with food allergy (OR = 1.71; 95% CI, 1.61-1.81);
  • parental atopic history (OR = 1.31; 95% CI, 1.25-1.36);
  • use of topical antibiotics during infancy (OR = 3.87; 95% CI, 3.26-4.58);
  • use of systemic antibiotics during pregnancy (OR = 1.93; 95% CI, 1.82-2; P < .001);
  • use of systemic antibiotics during infancy (OR = 2.86; 95% CI, 2.49-3.27); and
  • prior diagnosis of atopic dermatitis (OR = 8.61; 95% CI, 7.71-9.6).
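Part of what makes logistic regression explainable is that each odds ratio is simply the exponential of a model coefficient. The sketch below works backward from one reported result, the atopic dermatitis odds ratio, to the implied coefficient and standard error, then rebuilds the interval. It assumes the intervals are standard Wald intervals on the log-odds scale, which the article does not state, and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def log_odds_from_report(odds_ratio, ci_low, ci_high):
    """Recover the coefficient and its standard error from a reported OR and CI.

    Assumes a Wald interval, i.e. one that is symmetric on the log scale.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se

def odds_ratio_with_ci(beta, se):
    """Turn a coefficient and standard error back into an OR with a 95% CI."""
    return (
        math.exp(beta),
        math.exp(beta - Z_95 * se),
        math.exp(beta + Z_95 * se),
    )

# Reported in the article: prior atopic dermatitis, OR = 8.61; 95% CI, 7.71-9.6
beta, se = log_odds_from_report(8.61, 7.71, 9.6)
rebuilt_or, rebuilt_low, rebuilt_high = odds_ratio_with_ci(beta, se)
```

Under that assumption, the rebuilt bounds land close to the published 7.71 and 9.6, and the coefficient of roughly 2.15 is the amount by which a prior atopic dermatitis diagnosis would raise the log-odds of food allergy in such a model.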

After incorporating all the risk factors from before the diagnosis of food allergy, the RFRM had a receiver operating characteristic curve with an area under the curve (AUC) of 0.8, 83% accuracy, 62% corresponding sensitivity and 84% specificity. The factor with the largest effect in constructing the RFRM was the number of courses of systemic antibiotics during pregnancy.
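One way to read an AUC of 0.8 is as the probability that the model assigns a higher risk score to a randomly chosen infant with food allergy than to a randomly chosen control. The minimal sketch below computes AUC by that pairwise definition. The scores are hypothetical values chosen for illustration, not study data, and the function name is an assumption.

```python
def auc_by_pairs(case_scores, control_scores):
    """AUC as the share of case/control pairs the model ranks correctly.

    A tie counts as half a correct ranking.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical risk scores, not taken from the study.
cases = [0.9, 0.7, 0.6, 0.4]
controls = [0.8, 0.5, 0.3, 0.2, 0.1]

auc = auc_by_pairs(cases, controls)  # 16 of 20 pairs ranked correctly
```

In this toy example, 16 of the 20 case/control pairs are ranked correctly, which gives an AUC of 0.8. A model that ranked pairs at random would score about 0.5.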

“We were taken by the strong contribution of the use of systemic antibiotics while pregnant to the model,” Brandwein said, noting that several studies have pointed to the importance of the microbiome in the development of food allergies.

“This observation further strengthens the connection between a healthy microbiome and a healthy child,” he said. “It also provides for a potential way to mitigate risk in the future.”

When the researchers trained the RFRM only with the risk factors that were available during the prenatal period, the RFRM had a 0.76 AUC, 79% accuracy, 59% corresponding sensitivity and 80% specificity.
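With only about 4% of the cohort having food allergy, overall accuracy is driven mostly by specificity. The sketch below checks that arithmetic for both models, assuming the evaluation data had the same case-to-control ratio as the full cohort. The article does not say how the test set was drawn, so treat this as a consistency check rather than a reconstruction of the researchers' method. The function name is hypothetical.

```python
CASES = 4077      # patients with food allergy, as reported
CONTROLS = 95686  # control patients, as reported

def expected_accuracy(sensitivity, specificity, cases=CASES, controls=CONTROLS):
    """Accuracy as a prevalence-weighted mix of sensitivity and specificity."""
    prevalence = cases / (cases + controls)
    return sensitivity * prevalence + specificity * (1 - prevalence)

# Model trained on all risk factors: 62% sensitivity, 84% specificity
full_model = expected_accuracy(0.62, 0.84)

# Model trained on prenatal risk factors only: 59% sensitivity, 80% specificity
prenatal_model = expected_accuracy(0.59, 0.80)
```

Under that assumption, the results round to 83% and 79%, in line with the reported accuracy figures. This also shows why sensitivity matters more than headline accuracy when the goal is to find the small group of at-risk infants.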

The researchers additionally compared their algorithm with maternal history of food allergy, parental history of atopic conditions and previous diagnoses of atopic dermatitis before age 4 months, which they said are sporadically used in clinical and research settings to determine risk for food allergy.

Calling its improvements drastic and significant, the researchers also said their regression model was superior to risk assessments that only used family or infant histories of atopy.

Conclusions, next steps

Based on these findings, the researchers said that predictive modeling that uses routinely collected data from EHRs can be a powerful tool in stratifying infant risks for developing food allergies.

With awareness of these risks, the researchers continued, physicians and caregivers can conduct timely interventions to mitigate any potential for food allergies, including early introduction of allergens into infant diets.

These findings may enable drastic reductions in clinical trials designed to assess the efficacy of prevention strategies as well, the researchers said, although future studies would be needed to further hone the predictive capability of these risk stratification techniques.

Brandwein also encouraged other health care systems to employ similar strategies.

“We believe strongly that health care systems around the world should proactively stratify risk for food allergies, to empower prevention policies and compliance,” he said.

“Our group is open to sharing our machine learning models with health care systems around the world to help educate parents and professionals and put an end to the food allergy problem,” he continued.

The researchers continue to hone the predictive capabilities of the algorithm by adding on new layers of risk factors and protective factors, Brandwein said, specifically dietary practices and other environmental influences.

“We’ve built similar models for other conditions in the atopic march, including atopic dermatitis, which we hope to publish in the near future,” he said. “We look forward to seeing our models in action.”

MYOR Diagnostics Ltd. invites caregivers to assess their child’s risk for food allergy online.

For more information:

Michael Brandwein, PhD, can be reached at [email protected].



Maria Gil, MHA

Health care has historically been data rich, but data starved, with insights buried in paper documents, inaccessible for analysis. The digitization of health care data into EHRs unlocked the potential for analysis. Data science transformed the way that we have been able to make linkages between cause and effect.

The limitations of data science are in computing power and the bandwidth for a human to direct the model. But in the last decade, computing power has become a nonissue in most cases. With the introduction of machine learning and generative AI (GenAI), the need for a human to direct the machine to compute millions and millions of data points is also changing.

This study employs machine learning to uncover relationships that dispelled even my own pre-existing notions around food allergies. As someone who struggles with food allergies, I was quite curious to see what could be gleaned from unleashing the power of machine learning on the vast amounts of digital data now available in health care.

I found the role of antibiotic use during pregnancy to be surprising, but more importantly, I was inspired by the potential of analytics demonstrated in the study. Imagine if we could deploy GenAI to process large amounts of structured and unstructured data included in medical records and apply machine learning to identify patterns that may be overlooked in a bandwidth-constrained environment.

Companies such as Truveta Data are aggregating de-identified data from EHRs across the country to take analysis like this to the next level. The future is no longer limited to one analysis at a time on the population within your catchment area. Data aggregators will allow us to create population pools that resemble your exact patient and unleash the power of AI and machine learning to understand cause and effect in a much more relevant and profound way, parsing millions of records and data points to uncover patterns previously undetected. This can lead to a new way of understanding the causes of disease, as well as identifying the most effective treatments, personalized to each individual.

This is an interesting study and just the tip of the iceberg in how we use and analyze data for health care.

Maria Gil, MHA

Partner, Genpact

Disclosures: Gil reports being a health care data-tech-AI partner with Genpact.

