Modeling of COVID-19 total hospitalizations in the United States

The SARS-CoV-2 (COVID-19) virus continues to spread across the globe, affecting all aspects of modern life. It remains unknown whether COVID-19 hospitalizations can be effectively modeled using regression analysis. Specifically, it is unknown which regression model may accurately reflect past or future trends in COVID-19 hospitalizations. We wanted to see whether we could develop a simple model to describe both previous and future COVID-19 hospitalizations. The graph of total hospital admissions for COVID-19 shows a curve similar to a sine wave, with peaks in total hospitalizations occurring in April, July, and December. We used regression analysis of total COVID-19 hospitalizations to provide insight into potential factors influencing COVID-19 hospitalizations and to predict future hospitalizations. We found that total hospitalizations in the United States followed a sine-wave pattern with peaks in hospitalizations every 3.5 months between April and November 2020. However, the sine-wave pattern disappeared when the model was extended to December 2020. In general, mathematical modeling of hospitalizations works best when there is an established pattern of disease transmission from multiple years of data collection; COVID-19 is a novel virus for which we have less than a year’s worth of data from which to draw conclusions. Furthermore, there remains uncertainty about the trajectory of COVID-19 cases and hospitalizations in the future, particularly with the recent emergency use authorization of the Pfizer and Moderna COVID-19 vaccines.


Introduction
The SARS-CoV-2 (COVID-19) virus continues to spread across the globe, affecting all aspects of modern life. Despite the increasing numbers of SARS-CoV-2 infections, there is growing pressure to re-open economic activity and bring "normalcy" to people's lives. The primary methods for detecting and monitoring the spread of the SARS-CoV-2 virus are the reverse-transcriptase polymerase chain reaction (RT-PCR) test and clinical symptoms [1]. However, RT-PCR requires expensive equipment and trained technicians at certified laboratories, involves wait times to generate results, and carries a risk of false-positive results due to excessively high cycle counts [1]. Tracking hospitalizations from SARS-CoV-2 provides another method for assessing the spread and severity of COVID-19. The number of hospitalizations is a better metric for severe disease (rather than asymptomatic cases) and a better warning indicator that hospital capacity may be overwhelmed [2]. At the beginning of the pandemic, hospitalizations rose steeply as the pandemic swept through parts of the United States (US), particularly New York and New Jersey. In response, hospital admissions for other problems, including elective procedures, fell dramatically, leading many hospitals to operate at less than 50 percent capacity [2]. This also included reduced hospital admissions for acute medical disorders, including stroke and acute myocardial infarction [3-7].
Because this is a new virus, it remains unknown whether COVID-19 hospitalizations can be effectively modeled using regression analysis. Specifically, it is unknown which regression model may accurately reflect past or future trends in COVID-19 hospitalizations. The Centers for Disease Control and Prevention (CDC) uses several mathematical models to predict national and state numbers of new and total COVID-19 cases, hospitalizations, and deaths every four weeks. These forecasts use different types of data (e.g., COVID-19 data, demographic data, mobility data), methods, and estimates of interventions, such as social distancing and use of face coverings [8]. Given the uncertainty and variability of computer models, the CDC uses an ensemble of different computer forecasts to compare what may happen in the near future [8]. The ensemble is a collaboration among the CDC, 21 academic research groups, five private industry groups, and two government-affiliated groups [8]. Each of the models used for COVID-19 forecasting provides a median for the predicted distribution and 11 prediction intervals ranging from a 10% prediction interval to a 98% prediction interval [8]. These forecasts are updated every four weeks as more data accumulate, to improve the predictive ability of the models and account for new errors.
We wanted to see whether we could develop a simple model to describe both previous and future COVID-19 hospitalizations. The graph of total hospital admissions for COVID-19 shows a curve similar to a sine wave, with peaks in total hospitalizations occurring in April, July, and December. In this paper, we used regression analysis of total COVID-19 hospitalizations to provide insight into potential factors influencing COVID-19 hospitalizations and to predict future hospitalizations.

Methods
Data on total US COVID-19 hospitalizations, positive COVID-19 tests, and COVID-19 mortality were obtained from the CDC and the COVID Tracking Project at The Atlantic (https://covidtracking.com/data/national). The initial sine wave model was based on a regression analysis of data from April to November 2020. The model was subsequently reassessed by extending the data through December 2020 when new data became available. The data were analyzed and presented using GraphPad Prism 9 software.
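The regression described above can be approximated outside of Prism. The following Python sketch is an illustration only and is not the algorithm GraphPad Prism uses: it assumes the model form y(t) = A sin(2πt/T + Φ) + B, scans a list of candidate periods, solves an ordinary linear least-squares problem at each period, and keeps the fit with the highest r². All function names are hypothetical, and the demonstration data are synthetic, not actual hospitalization counts.

```python
import math

def sine_model(t, amplitude, period, phase, baseline):
    """Sine wave with a non-zero baseline: A*sin(2*pi*t/T + phi) + B."""
    return amplitude * math.sin(2 * math.pi * t / period + phase) + baseline

def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))) / a[r][r]
    return x

def fit_fixed_period(ts, ys, period):
    """Least squares for y = a*sin(wt) + b*cos(wt) + B at one fixed period."""
    w = 2 * math.pi / period
    rows = [(math.sin(w * t), math.cos(w * t), 1.0) for t in ts]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    a, b, baseline = solve3(normal, rhs)
    # a*sin(wt) + b*cos(wt) equals A*sin(wt + phi) with A = hypot(a, b), phi = atan2(b, a).
    return math.hypot(a, b), period, math.atan2(b, a), baseline

def r_squared(ts, ys, params):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - sine_model(t, *params)) ** 2 for t, y in zip(ts, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def fit_sine(ts, ys, periods):
    """Scan candidate periods and keep the fit with the highest r^2."""
    fits = [fit_fixed_period(ts, ys, p) for p in periods]
    return max(fits, key=lambda params: r_squared(ts, ys, params))

# Synthetic demonstration data (assumed values, NOT real hospitalization counts).
days = list(range(209))
observed = [sine_model(t, 13000, 103, 0.5, 42000) + 500 * math.sin(1.7 * t) for t in days]
best = fit_sine(days, observed, periods=range(60, 151))
print(best, r_squared(days, observed, best))
```

Scanning the period keeps each sub-problem linear, so no starting guesses are needed; a general nonlinear least-squares routine would fit all four parameters at once.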

Results
Using COVID-19 hospitalization data from the COVID Tracking Project at The Atlantic, the total number of COVID-19 hospitalizations was plotted versus time (Figure 1A and B). A regression analysis was performed on the data in Figure 1A and B using a sine wave with a non-zero baseline:
$$ y(t) = A \sin\left(\frac{2\pi t}{T} + \Phi\right) + B $$
where y(t) is the total COVID-19 hospitalizations, t is the time in days, T is the period in days, A is the amplitude of the waveform, Φ is the phase angle by which the waveform is shifted left or right (in radians), and B is the baseline or average value.
Using 209 data points from April to November 2020, the regression analysis showed that the coefficient of determination (r²) for COVID-19 hospitalizations was 0.8840 (Figure 1A). This model includes the first two peaks in COVID-19 hospitalizations, in April and July of 2020. The amplitude (12,963-14,337 hospitalized patients), period (101.4-104.6 days), phase shift (30.77-31.04 days), baseline (41,396-42,446 hospitalized patients), and frequency (0.00956/day-0.00968/day) parameters for the model are shown in Table 1A. Each of these values was reported with a confidence interval of 95%. The regression analysis showed that the baseline or average number of hospitalizations (B) was estimated to be between 41,396 and 42,446 COVID-19 patients. The maximum increase or decrease (amplitude, A_max) of COVID-19 patients from the baseline was between 12,963 and 14,337 COVID-19 patients.
The first peak in hospitalizations occurred in April and corresponded to the increase in total COVID-19 hospitalizations in certain states, including New York (Figures 1 and 2). After June, the daily increase in COVID-19 hospitalizations decreased until the middle of July, when COVID-19 hospitalizations peaked a second time. This second peak was due to outbreaks in the South and West, including Texas (Figure 2). Subsequently, daily COVID-19 hospitalizations began to increase in November. This third peak is ongoing and may be related to colder weather and holiday (Thanksgiving and Christmas) gatherings. Different patterns seen in different regions and states reflect differences in local and state policies for managing the spread of COVID-19 at the beginning and later stages of the COVID-19 pandemic.
Data from the COVID Tracking Project at The Atlantic show a similar trend in cases and hospitalizations for the Northeast, Midwest, Southern, and Western regions of the US (Figures 2 and 3). See also Figure 4.
A second regression analysis using a sine wave with a non-zero baseline was performed by extending the data through December 2020. Using 256 data points from April to December 2020, the regression analysis showed that the coefficient of determination (r²) for COVID-19 hospitalizations was 0.1178 (Figure 1B). Compared to the regression model in Figure 1A, this model includes all three peaks in COVID-19 hospitalizations (April, July, and December 2020). The amplitude (7,790-15,816 hospitalized patients), period (70.05-76.29 days), phase shift (5.355-6.858 days), baseline (48,307-53,975 hospitalized patients), and frequency (0.01311/day-0.01428/day) parameters for the model are shown in Table 1B. The regression analysis showed that the baseline or average number of hospitalizations (B) was estimated to be between 48,307 and 53,975 COVID-19 patients. The maximum increase or decrease (amplitude, A_max) of COVID-19 patients from the baseline was between 7,790 and 15,816 COVID-19 patients.
In addition, changes in hospitalizations follow increases in daily positive COVID-19 tests (Figure 5): hospitalizations tend to increase within two weeks after an increase in positive COVID-19 tests. In contrast, there were higher daily COVID-19 deaths at the beginning of the pandemic than during the summer (Figure 6). From July, daily COVID-19 deaths averaged around 1,000 per day. However, average daily deaths in the winter season have increased to approximately 3,000 per day. Furthermore, the trends in hospitalizations for the coming years remain uncertain. The recent holiday travel for Christmas and New Year celebrations may increase COVID-19 cases and hospitalizations in the month of January.
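For the April-December model (Table 1B), the reported frequency range is consistent with frequency being the reciprocal of the period. The short Python check below uses only the period interval quoted in the text; the function and variable names are illustrative.

```python
def frequency_per_day(period_days):
    """Frequency of a periodic signal is the reciprocal of its period."""
    return 1.0 / period_days

# 95% interval for the period in the April-December model (Table 1B).
period_low, period_high = 70.05, 76.29

# The shortest period corresponds to the highest frequency, and vice versa.
freq_high = frequency_per_day(period_low)
freq_low = frequency_per_day(period_high)

print(f"{freq_low:.5f}/day to {freq_high:.5f}/day")
```

Rounded to five decimal places, the two values agree with the frequency interval reported for Table 1B.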

Discussion
Although the regression analysis provides some insight into the trends of COVID-19 hospitalizations, the trend is likely to change with an increase in positive COVID-19 tests and hospitalizations over the fall and winter seasons. The sine wave model assumes absolute maximum and minimum values for COVID-19 hospitalizations at constant time intervals. However, the sine wave regression model was used for all COVID-19 hospitalizations in the US during the first year of the pandemic. Specifically, the sine wave analysis in this paper is applied to the sum of all hospitalizations across all 50 states. Furthermore, the lack of historical data makes the sine wave analysis uncertain in predicting when and how many COVID-19 hospitalizations will occur in the US. As shown in Figures 3 and 4, the sine wave regression model does not accurately model individual state hospitalizations. Therefore, the sine wave regression model works by combining state COVID-19 hospitalizations into a single variable while ignoring state-to-state variations in COVID-19 hospitalizations.
The sine wave trend in COVID-19 hospitalizations is unlikely to repeat in the future. The sine wave model assumes a fixed baseline of COVID-19 hospitalizations with theoretical maximum and minimum values. Specifically, sine waves have upper and lower limits that cause movement to slow, roll over in direction, and then accelerate in the new direction. Similar to an engine, this movement requires a continuous supply of energy or resources to maintain the trend. Sine waves describe processes that are inherently unstable. Furthermore, a sine wave can continue only if there is an external source of energy to replace losses dissipated during the previous cycle. It is unclear whether the COVID-19 pandemic meets either of these conditions for a sine wave.
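The fixed upper and lower limits follow directly from the model form: because the sine term lies between -1 and 1, a sine wave with amplitude A and baseline B can never leave the range B - A to B + A. As a rough illustration, the Python sketch below combines the 95% interval endpoints reported for the April-November fit (Table 1A). Combining endpoints in this way is an assumption made for illustration and gives only an outer envelope, not a formal confidence interval; the function name is hypothetical.

```python
def sine_bounds(amplitude, baseline):
    """Return (minimum, maximum) of A*sin(...) + B, since sin() lies in [-1, 1]."""
    return baseline - amplitude, baseline + amplitude

# 95% interval endpoints reported for the April-November fit (Table 1A).
amplitude_range = (12963, 14337)
baseline_range = (41396, 42446)

# Outer envelope: widest amplitude paired with the lowest and highest baseline.
floor, _ = sine_bounds(amplitude_range[1], baseline_range[0])
_, ceiling = sine_bounds(amplitude_range[1], baseline_range[1])

print(floor, ceiling)
```

Under these assumptions, the fitted curve stays between roughly 27,000 and 57,000 hospitalized patients, so it cannot represent any later peak that rises above that ceiling.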
These problems can be seen in the sine wave regression models shown in Figures 1A and B. The sine wave model predicted peaks in COVID-19 hospitalizations every 3.5 months using total COVID-19 hospitalization data from April to November 2020 (Figure 1A and Table 1A). However, the sine wave trend disappeared when the December peak in COVID-19 hospitalizations was included (Figure 1B and Table 1B). This is due to a large increase in peak total COVID-19 hospitalizations in December compared to April and July. This increase in COVID-19 hospitalizations was likely due to increased travel, more family gatherings, and increased indoor activities during the winter season. As with other viral outbreaks, such as influenza, each outbreak is due to a change in conditions: a change in the virus, a change in the environment favoring transmission, or both. The height of the peak is determined by the number of susceptible people. As cases, hospitalizations, and deaths exhaust susceptible hosts, the growth slows, then rolls over, then declines. The curve does not decline to zero but persists until conditions are ripe for the next outbreak.

Conclusion
The continual spread of SARS-CoV-2 has driven efforts to adapt local and national responses to reduce COVID-19 cases and hospitalizations. In general, mathematical modeling of hospitalizations works best when there is an established pattern of disease transmission from multiple years of data collection; COVID-19 is a novel virus for which we have less than a year's worth of data from which to draw conclusions. Furthermore, there remains uncertainty about the trajectory of COVID-19 cases and hospitalizations in the future, particularly with the recent emergency use authorization of the Pfizer and Moderna COVID-19 vaccines. As shown in this paper, trends can appear or disappear rapidly with changes in social behavior and adherence to public health measures to prevent the spread of COVID-19.
However, the trends in COVID-19 hospitalizations provide insight into the potential risk factors influencing COVID-19 hospitalizations. The CDC data indicate that membership in a minority group, older age, and medical comorbidities (hypertension, obesity, metabolic dysfunction, and chronic lung disease) increase the risk of hospitalization with COVID-19. Among all risk factors, though, age remains the most important risk factor for COVID-19 hospitalizations and deaths. As shown in Table 2, the risk of COVID-19 hospitalization and death increases with age relative to the 18-29-year-old demographic. Beyond the age of infected COVID-19 patients, the trends in COVID-19 hospitalizations also reflect public policy with regard to re-opening the economy, school openings, social gatherings, and public fatigue with wearing masks and/or maintaining social distance. Together, these factors have a direct impact on the incidence of positive COVID-19 tests and increase the likelihood of a person being hospitalized with COVID-19. However, it is unclear whether policies intended to decrease the spread of the virus will be effective in the long term or merely defer hospitalizations and deaths until the next cycle. Longer-term studies are needed to determine whether additional public health measures and other factors related to hospital management may affect patients' mortality, morbidity, and quality of life during the COVID-19 pandemic.