Garett MacGowan

Machine learning predicts hospital-onset COVID-19 infections using patient contact networks

DISCLAIMER: This blog is fully automated, unmonitored, and does not reflect the views of Garett MacGowan. The ML model may produce content that is offensive to some readers.

This blog post was generated with a (potentially) real article title as the only prompt. A link to the original article is below.

Original Article

Photo by Jaron Nix on Unsplash

Generated: 10/4/2021

Machine learning predicts hospital-onset COVID-19 infections using patient contact networks and clinical symptoms and is also able to identify asymptomatic individuals in a population.

1. Introduction {#sec1}
===============

Since COVID-19 was first identified in humans in December 2019,[@bib1] it has infected an estimated 4,100,000 people, caused over 220,000 deaths, and spread across all continents.[@bib2] The spread of the coronavirus within hospitals is an emerging threat to the U.S. population, especially to those most vulnerable to complications from the disease (i.e., seniors and individuals with pre-existing conditions).[@bib3], [@bib4], [@bib5] Although the number of hospitalizations due to COVID-19 appeared to plateau in the U.S. in May 2020, hospitalizations are expected to increase significantly in the coming months due to the aging population. Many people diagnosed with or recovering from COVID-19 will be hospitalized, and infections acquired during a hospital stay, known as hospital-onset COVID-19 infections, put patients at risk for complications such as pneumonia and sepsis.[@bib6]

Given the expected rise in inpatient hospitalizations due to COVID-19 infections,[@bib6], [@bib7], [@bib8] identifying populations that are vulnerable to infection is important for guiding patient treatment and reducing the severity of cases. Hospital-based outbreaks have been observed before. For example, an outbreak of *Coxiella burnetii* (Q fever) in hospital laundry facilities began in 2009 among patients undergoing hospital laundry treatment[@bib9], [@bib10], [@bib11], [@bib12]; as a result, the disease gained a great deal of attention, and many cases and outbreaks of Q fever have been reported in health care facilities.[@bib13], [@bib14], [@bib15], [@bib16], [@bib17], [@bib18] In addition to Q fever, where the source of the outbreaks has been documented, multiple studies have identified potential sources of a variety of emerging and re-emerging pathogens in hospitals.[@bib19], [@bib20], [@bib21], [@bib22], [@bib23], [@bib24], [@bib25], [@bib26] Many of these outbreaks involved transmission from contaminated medical equipment such as a blood culture system.[@bib27]

In this study, we used Bayesian Markov chain Monte Carlo (MCMC) methods to build a probabilistic model of COVID-19 transmission in a hospital setting. The model uses contact data from patients' rooms to predict the likelihood that a hospital becomes a source of new COVID-19 infections and to identify asymptomatic individuals in the hospital. We used this model to explore how many COVID-19 infections are likely to occur at a specific hospital in the Washington, D.C. area during the pandemic and to estimate how many infections in patients, and suspected cases in the general public, might be avoided in the future.
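
To make the MCMC idea concrete, the following is a minimal sketch of a random-walk Metropolis-Hastings sampler for a single per-contact transmission probability. It is not the model used in the study: the uniform prior, the binomial likelihood, the function names, and the counts (12 infections among 200 contacts) are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(p, infections, contacts):
    """Unnormalized log posterior of a per-contact transmission probability p,
    assuming a uniform prior on (0, 1) and a binomial likelihood."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return infections * math.log(p) + (contacts - infections) * math.log(1.0 - p)


def metropolis_hastings(infections, contacts, n_samples=20000, step=0.02, seed=0):
    """Draw samples of p with a Gaussian random-walk Metropolis-Hastings sampler."""
    rng = random.Random(seed)
    p = 0.5
    current = log_posterior(p, infections, contacts)
    samples = []
    for _ in range(n_samples):
        proposal = p + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, infections, contacts)
        # Accept with probability min(1, posterior ratio); 1 - random() lies in (0, 1].
        if math.log(1.0 - rng.random()) < candidate - current:
            p, current = proposal, candidate
        samples.append(p)
    return samples


# Hypothetical data: 12 hospital-onset infections among 200 recorded contacts.
samples = metropolis_hastings(infections=12, contacts=200)
kept = samples[2000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
print(f"posterior mean transmission probability: {posterior_mean:.3f}")
```

Under these assumptions the exact posterior is Beta(13, 189), with mean 13/202, so the sampled mean can be checked against a known answer. A real contact-network model would have many more parameters and no closed form, which is where MCMC becomes useful.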

The contributions of this study are threefold:

1. To the best of our knowledge, this is the first study to explore the likelihood of hospital-based COVID-19 infections. In this paper, we focus exclusively on hospital-onset infections. These cases are more difficult to treat because they are more severe and associated with higher morbidity and mortality, largely because they are more likely to involve severe underlying health conditions. A majority of hospital-onset infections can be avoided through more frequent and effective contact isolation and testing of patients, including asymptomatic patients. We used simulation and data from a large hospital in the D.C. area to model the likelihood of hospital-onset infections based on patients' clinical and behavioral data collected from room-to-patient contacts, using an ensemble framework that combines these data with patients' clinical symptoms and other variables (i.e., age and gender). The overall findings provide new insight into the factors that should be considered for preventing the spread of COVID-19 in hospital settings. They also highlight the potential benefits of contact network-based methods for predicting infection outcomes in the coming months.
2. We present model prediction results that highlight the utility and limitations of the Bayesian methodology. Specifically, our study shows that the Markov model framework, which can be applied to all pathogens that are transmissible from person to person, can quickly predict COVID-19 transmission in specific settings with low error. This methodology is flexible and can help explore a range of data-driven infection control strategies (e.g., hand hygiene, contact tracing, isolation, and vaccine development). It is also feasible to include clinical variables without detailed patient-level data for each patient in the model. This provides insight into the likelihood of patients in the hospital acquiring the virus via direct routes, such as person-to-person contact, or indirect routes (e.g., from a contaminated device, such as a bed, an intensive care unit, or a bathroom), which helps prioritize infection control measures and treatment strategies. The ability to simulate disease transmission over a prolonged period of time (here, 8 weeks) and explore a wide range of initial infection rates (10^−2^ to 10^−3^) allowed us to assess and identify factors that lead to a higher likelihood of COVID-19 in the community.
3. We used the framework to explore possible strategies that a hospital (or other health care facility) in the U.S. can take to prevent or limit the number of COVID-19 infections that occur in patients and the population more broadly.
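
The kind of 8-week simulation over a range of initial infection rates described above can be sketched as follows. This is only an illustration: the random contact network, the simple susceptible-infected process, the daily per-contact transmission probability of 0.02, and the function names are assumptions, not details taken from the study.

```python
import random


def build_contact_network(n_patients, contacts_per_patient, rng):
    """Random undirected contact network as an adjacency dict (hypothetical)."""
    neighbors = {i: set() for i in range(n_patients)}
    for i in range(n_patients):
        for j in rng.sample(range(n_patients), contacts_per_patient):
            if i != j:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def simulate(neighbors, initial_rate, p_transmit, days, rng):
    """Return the cumulative number of infected patients at the end of each day."""
    infected = {i for i in neighbors if rng.random() < initial_rate}
    history = [len(infected)]
    for _ in range(days):
        newly_infected = set()
        for i in infected:
            for j in neighbors[i]:
                # Each infected-susceptible contact transmits independently.
                if j not in infected and rng.random() < p_transmit:
                    newly_infected.add(j)
        infected |= newly_infected
        history.append(len(infected))
    return history


rng = random.Random(42)
network = build_contact_network(n_patients=2000, contacts_per_patient=4, rng=rng)
for initial_rate in (1e-2, 1e-3):
    history = simulate(network, initial_rate, p_transmit=0.02, days=56, rng=rng)
    print(f"initial rate {initial_rate:g}: {history[0]} -> {history[-1]} infected after 8 weeks")
```

With an initial rate as low as 10^−3^, a run can start with no infected patients at all, in which case the count stays at zero. Averaging over many seeded runs would give a more stable comparison between starting conditions.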

2. Methods {#sec2}
==========

2.1. Data collection {#sec2.1}
--------------------

First, we collected patient clinical and behavioral information directly from the Johns Hopkins Health System COVID-19 Database.[@bib28] This includes information on both confirmed and suspected COVID-19 patients. We then used this information to build a probabilistic model that combines it with more detailed epidemiological data (e.g., a patient's known history of contacts and recent travel) and spatial network data (see below). This combination enabled prediction of the infection rate in the various populations over time. Details of the data collection are provided in [Fig. 1](#fig1){ref-type="fig"}.

Fig. 1 **Workflow of data collection and modeling approaches taken to predict hospital-onset (newly diagnosed) COVID-19 and the public (total confirmed and potential) cases in the U.S.** A. We obtained data from the Johns Hopkins Health System COVID-19 Database and used the Ensemble Learning methodology to model the probability of COVID-19 infection in the hospital setting. Two major steps were used to model the infection probability in two settings: 1) data from confirmed individuals' direct and indirect contacts from a specific patient room were collected, and 2) data from confirmed individuals' direct and indirect contacts over 8 weeks, from the date that the patient was transferred to the ICU, were collected (see [Fig. 2](#fig2){ref-type="fig"}a). These data were then combined with the patient behavioral data, including bed-to-bed distance and time spent on contact surfaces. The likelihood of hospital COVID-19 cases was then used to simulate the number of patients that are likely to be infected and the transmission of COVID-19 over time. This, in turn, enables us to model and explore infection control strategies and compare the utility of these strategies for infection prevention and containment. B. We collected data on hospital-onset cases from the Johns Hopkins Health System COVID-19 Database. We used a workflow similar to the one detailed in [Fig. 1](#fig1){ref-type="fig"}A to collect data on COVID-19 infections. The COVID-19 infection probability was then used to simulate the number of COVID-19 infections. The hospital-onset COVID-19 cases were reported by several counties in the D.C. area. However, the reported number of COVID-19 infections in the city of Baltimore is lower than in other counties, likely due to the high number of patients already diagnosed with COVID-19 in Baltimore.

2.2. Overview of the model and simulation methods {#sec2.2}
-------------------------------------------------

In this section, we describe how we combine data from the patient contact network and from the Johns Hopkins Health System COVID-19 Database to model the potential transmission of COVID-19 in the hospital setting, along with the factors we incorporated in the model and the assumptions we make. We then describe the methodology used to simulate new infections from the likelihood values, compare the two approaches using a data-driven cross-validation procedure, and discuss the results in the context of real-world infection control strategies.
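
As a rough illustration of how a cross-validation comparison of two approaches can work, the sketch below scores two stand-in predictors with k-fold cross-validation and the Brier score. The two predictors (a prevalence baseline and a contact-count threshold rule), the synthetic data, and all function names are assumptions made for this example. The text above does not specify the study's actual models or scoring rule.

```python
import random


def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def brier(preds, labels):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(labels)


def fit_baseline(rows):
    """Predict the training-set infection rate for every patient."""
    rate = sum(y for _, y in rows) / len(rows)
    return lambda contacts: rate


def fit_by_contacts(rows, threshold=5):
    """Predict separate infection rates for low- and high-contact patients."""
    overall = sum(y for _, y in rows) / len(rows)

    def rate(group):
        return sum(y for _, y in group) / len(group) if group else overall

    low = rate([r for r in rows if r[0] < threshold])
    high = rate([r for r in rows if r[0] >= threshold])
    return lambda contacts: high if contacts >= threshold else low


def cross_validate(rows, fit, k=5, seed=0):
    """Average held-out Brier score of `fit` across k folds (lower is better)."""
    folds = k_fold_indices(len(rows), k, random.Random(seed))
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [rows[i] for i in range(len(rows)) if i not in held]
        test = [rows[i] for i in held_out]
        model = fit(train)
        scores.append(brier([model(c) for c, _ in test], [y for _, y in test]))
    return sum(scores) / len(scores)


# Synthetic (contact count, infected) pairs; infection risk grows with contacts.
rng = random.Random(0)
rows = []
for _ in range(500):
    contacts = rng.randint(0, 10)
    rows.append((contacts, 1 if rng.random() < 0.03 * contacts else 0))

baseline_score = cross_validate(rows, fit_baseline)
contact_score = cross_validate(rows, fit_by_contacts)
print(f"baseline Brier: {baseline_score:.4f}, contact-based Brier: {contact_score:.4f}")
```

Both predictors are evaluated on the same folds because `cross_validate` reuses the same seed, so the comparison is not confounded by differences in how the data were split.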

We model the likelihood of hospital-onset COVID-19 infections using patient contact and clinical data from patients' rooms in the Johns Hopkins Health System. We use both direct contact (from a direct contact from an infected or suspected patient to a susceptible patient, e.g., a physician or nurse to the patient) and indirect contact (from a direct contact to an infected or suspected patient, e.g.
