We will use innovative computing to advance epidemiological science.
- Develop exascale-enabled analytic methods and simulations of spreading processes on multi-scale multi-layer (MSML) networks with a billion or more nodes.
- Use simulation-based methods for analyzing complex interventions to contain real-time epidemics.
- Combine simulations with HPC-enabled machine learning methods to support modeling and decision support tasks.
Scalable high-performance simulations, analytics, and ML
In  we described an integrated, data-driven operational pipeline based on national agent-based models to support federal and state-level pandemic planning and response. The pipeline consists of (i) an automatic semantic-aware scheduling method that coordinates jobs across two separate high-performance computing systems; (ii) a data pipeline to collect, integrate, and organize national and county-level disaggregated data for initialization and post-simulation analysis; (iii) a digital twin of national social contact networks made up of 288 million individuals and 12.6 billion time-varying interactions covering the US states and DC; and (iv) an extension of a parallel agent-based simulation model to study epidemic dynamics and associated interventions. The pipeline can run 400 replicates of national runs in less than 33 hours and reduces the need for human intervention, resulting in faster turnaround times and higher reliability and accuracy of the results. It has been used regularly for scenario projections with national agent-based models, and those projections have been used to support policymaking. The projections generated by the pipeline have been summarized in the Morbidity and Mortality Weekly Report (MMWR), on the live website, and were also briefed to the White House by the COVID-19 Response Team. This work has been discussed in the national media and by the head of the CDC during one of the agency's video briefings in summer 2021. The paper was one of the six finalists for the 2021 ACM Gordon Bell Special Prize for High Performance Computing-Based COVID-19 Research and was presented at the 2021 ACM Supercomputing Conference. The archival version of the paper will appear in an upcoming special issue of the International Journal of High Performance Computing Applications (IJHPCA).
In  we presented a high-performance, agent-based simulation model for studying contact tracing during the ongoing COVID-19 pandemic. While this work was motivated by the COVID-19 pandemic, the framework and design are generic and can be applied in other settings. Contact tracing (CT) is an important and effective intervention strategy for controlling an epidemic, and its role becomes critical when pharmaceutical interventions are unavailable. CT is resource intensive and multiple protocols are possible, so the ability to evaluate strategies is important. This work extended our HPC-oriented ABM framework, EpiHiper, to represent contact tracing efficiently. The main contributions are (i) an extension of EpiHiper to represent realistic CT processes and (ii) a realistic case study using the Virginia (VA) network, motivated by our collaboration with the Virginia Department of Health (VDH). The work was valued by VDH; Elena Diskin, COVID-19 Containment, Epidemiology Program Manager, Virginia Department of Health, commented: “The COVID-19 Contact Tracing team worked collaboratively with the Biocomplexity Institute during the first half of 2021 to consider and research key questions regarding the impact and effectiveness of contact tracing as an intervention strategy for controlling an epidemic. The team met weekly to align on the study questions, approach, and parameters. The scalable agent-based model developed as an outcome of these efforts provided important insights into how Virginia should dedicate and focus contact tracing resources.”
In  we described a study on the role of vaccine acceptance in controlling the spread of COVID-19 in the US using AI-driven agent-based models. Our study uses a 288 million node social contact network spanning all 50 US states plus Washington DC, comprising 3300 counties, with 12.59 billion daily interactions. The highly resolved agent-based models use realistic information about disease progression, vaccine uptake, production schedules, acceptance trends, prevalence, and social distancing guidelines. Developing a national model at this resolution that is driven by realistic data requires a complex, scalable workflow together with model calibration, simulation, and analytics components. Our workflow optimizes the total execution time and helps improve overall human productivity. This work develops a pipeline that can execute US-scale models and associated workflows, which typically present significant big data challenges. Our results show that, when compared to faster and accelerating vaccinations, slower vaccination rates due to vaccine hesitancy cause averted infections to drop from 6.7M to 4.5M, and averted total deaths to drop from 39.4K to 28.2K nationwide. This occurs even though the final vaccine coverage is the same in both scenarios. Improving vaccine acceptance by 10% in all states increases averted infections from 4.5M to 4.7M (a 4.4% improvement) and averted total deaths from 28.2K to 29.9K (a 6% improvement) nationwide. The analysis also reveals interesting spatiotemporal differences in COVID-19 dynamics as a result of vaccine acceptance. To our knowledge, this is the first national-scale analysis of the effect of vaccine acceptance on the spread of COVID-19 using detailed and realistic agent-based models.
We have continued to work on Loimos, a parallel epidemic diffusion simulator based on Charm++. This is a joint effort between UVA and UMD. The system builds on our earlier effort, Charmsimdemics. It is specifically designed to scale to very large networks (national-scale models) and to use exascale computing architectures with 1 million cores or more. At that scale, fault tolerance, efficient partitioning of the network, and load balancing become important components. We have ported the code to UVA’s cluster and tested its performance on a few state-level synthetic populations. We are preparing to submit the paper to the ACM Supercomputing conference.
We study the problem of designing vaccination strategies in network models of epidemic spread that satisfy budget constraints and minimize the spread of an outbreak. This problem is computationally expensive even in a non-adaptive intervention setting, where vaccinations are determined ahead of time and do not depend on the current state of the epidemic, and where vaccines are immediately effective with 100% efficacy. In , we use ideas from influence maximization for this problem. We observe that at low transmission probabilities, the number of nodes not infected is a submodular function. While this is not true in general, it motivates the use of nodes that maximize influence as a vaccination strategy. However, constructing such a set is inherently sequential since it is based on a greedy algorithm. We present a new parallel algorithm based on greedy hill climbing, along with an efficient parallel implementation for distributed CPU-GPU heterogeneous platforms. We show strong scaling results on up to 128 nodes of the Summit supercomputer. Our parallel implementation significantly reduces time to solution, from hours to minutes on large networks.
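The greedy selection idea described above can be illustrated with a minimal sequential sketch. This is not the parallel CPU-GPU implementation from the paper; it assumes an independent-cascade style model in which each edge transmits with a fixed probability, estimates influence by sampling "live-edge" graphs, and uses hypothetical function names throughout.

```python
import random
from collections import deque


def sample_live_edge_graph(edges, p, rng):
    """Keep each directed edge independently with probability p (assumed model)."""
    live = {}
    for u, v in edges:
        if rng.random() < p:
            live.setdefault(u, []).append(v)
    return live


def reachable(live, seeds):
    """Return the set of nodes reachable from the seeds in one sampled graph."""
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        u = queue.popleft()
        for v in live.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def estimate_spread(samples, seeds):
    """Average number of nodes reached from the seeds over all sampled graphs."""
    return sum(len(reachable(g, seeds)) for g in samples) / len(samples)


def greedy_select(nodes, edges, p, budget, num_samples=200, seed=0):
    """Greedy hill climbing: repeatedly add the node with the largest marginal gain."""
    rng = random.Random(seed)
    samples = [sample_live_edge_graph(edges, p, rng) for _ in range(num_samples)]
    chosen, current = [], 0.0
    for _ in range(budget):
        best_node, best_value = None, current
        for n in nodes:
            if n in chosen:
                continue
            value = estimate_spread(samples, chosen + [n])
            if value > best_value:
                best_node, best_value = n, value
        if best_node is None:  # no remaining node adds any marginal gain
            break
        chosen.append(best_node)
        current = best_value
    return chosen
```

Each round depends on the set chosen in the previous round, which is the sequential bottleneck the text refers to; the inner loop over candidate nodes is the part that is naturally amenable to parallel evaluation.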
 P. Bhattacharya, J. Chen, S. Hoops, D. Machi, B. Lewis, S. Venkatramanan, M. L. Wilson, B. Klahn, A. Adiga, B. Hurt, J. Outten, A. Adiga, A. Warren, Y. Y. Baek, P. Porebski, A. Marathe, D. Xie, S. Swarup, A. Vullikanti, H. Mortveit, S. Eubank, C. L. Barrett, and M. Marathe, “Data-Driven Scalable Pipeline using National Agent-Based Models for Real-time Pandemic Response and Decision Support,” International Journal of High Performance Computing Applications, 2022. Finalist for the 2021 ACM Gordon Bell Special Prize for High Performance Computing-Based COVID-19 Research; To Appear.
 R. K. Borchering, C. Viboud, E. Howerton, C. P. Smith, S. Truelove, M. C. Runge, N. G. Reich, L. Contamin, J. Levander, J. Salerno, et al., “Modeling of future covid-19 cases, hospitalizations, and deaths, by vaccination rates and nonpharmaceutical intervention scenarios–United States, April–September 2021,” MMWR Morb Mortal Wkly Rep 2021, vol. 70, pp. 719–724, May 2021.
 R. Walensky and A. Fauci. https://www.whitehouse.gov/briefing-room/press-briefings/2021/05/05/press-briefing-by-white-house-covid-19response-team-and-public-health-officials-34/, 2021. Last accessed: October 2021.
 S. Hoops, J. Chen, A. Adiga, B. Lewis, H. Mortveit, H. Baek, M. Wilson, D. Xie, S. Swarup, S. Venkatramanan, J. Crow, E. Diskin, S. Levine, H. Tazelaar, B. Rossheim, C. Ghaemmaghami, R. Early, C. Barrett, M. V. Marathe, and C. Price, “High performance agent-based modeling to study realistic contact tracing protocols,” in Proceedings of the 2021 Winter Simulation Conference, 2021.
 P. Bhattacharya, D. Machi, J. Chen, S. Hoops, B. Lewis, H. Mortveit, S. Venkatramanan, M. L. Wilson, A. Marathe, P. Porebski, B. Klahn, J. Outten, A. Vullikanti, D. Xie, A. Adiga, S. Brown, C. Barrett, and M. Marathe, “AI-Driven Agent-Based Models to Study the Role of Vaccine Acceptance in Controlling COVID-19 Spread in the US,” in Proceedings of the 2021 IEEE International Conference on Big Data (IEEE BigData), pp. 1566–1574, Dec. 2021.
 M. Minutoli, P. Sambaturu, M. Halappanavar, A. Tumeo, A. Kalyanaraman, and A. Vullikanti, “Preempt: Scalable epidemic interventions using submodular optimization on multi-gpu systems,” in Proc. SC, SC ’20, IEEE Press, 2020.
A cyber-environment for real-time epidemic science: Team members have developed multiple dashboards to provide timely and accurate information to public health agencies, the general population, and the research community.
At the beginning of the pandemic, there was considerable confusion about where cases of COVID-19 were present and how prevalent the virus was. We initially developed the COVID-19 Surveillance Dashboard as a one-stop source where researchers and the public could easily visualize where cases were emerging and how many cases there were by region. Dashboard coverage rapidly grew to include all countries/territories that had confirmed cases, with province- or state-level detail for 20 countries and county-level detail for the US. This tool was the first dashboard we developed under this project and is the most popular of our dashboards: by March 2022, it had been visited by 1.2 million users from 220 countries, and more than 71 million requests had been processed on the main map layer.
The dashboard, which was released in early February 2020, initially covered only case counts as collated by Johns Hopkins University. Over time, however, we expanded to include additional open data sources, including the World Health Organization (WHO), Tencent, Wikipedia, and others; as new sources were added, some were retired, so our current data source dependencies include The New York Times, USAFacts, the World Health Organization (WHO), Wikipedia, JHU (for Mainland China data), and Our World in Data for vaccination data. The dashboard is updated multiple times per day in order to provide users with the most up-to-date information.
As the pandemic progressed, additional features were added to the dashboard, including:
- The ability to view counts of Confirmed Cases, Deaths, Estimated Active, and Estimated Recovered per day on the Heat Map, Chart, and Data views; users can also drill down to view these values for specific regions or as counts per 100K people in the population.
- A timeline tool that allows users to view case counts for a specific date (default is current) and to cycle through the days of the pandemic to see how conditions have evolved over time.
- The total number of vaccines administered by country (and, for the US, by state); this includes the number of vaccines administered, people who have received at least one vaccine, and people who are fully vaccinated.
- An “Analytics” tab where users can perform analytical queries, such as the top six countries by deaths worldwide, as a different way to focus on the data that is of interest to them.
- A “Spatial Temporal” tab (for the United States only) where the number of new cases per week is displayed in chart and map formats as an animation. Users can choose to view this information for the USA, a specific HHS region, or at the county level for a selected US state.
The front-facing COVID-19 Surveillance Dashboard is a Single Page Application that loads the current data into an HTML page for display and rendering, and the display is updated dynamically as the user interacts with the application. Three main components are incorporated into the dashboard: (i) the ArcGIS API for map rendering and functionality; (ii) Responsive Web Design to ensure usability across different displays, including mobile devices; and (iii) amCharts for simple yet flexible charting capabilities. As the first of our dashboards to be developed, the COVID-19 Surveillance Dashboard framework was adopted as a standard in the development of our other dashboards, covered in the following sections.
Our paper describing our dashboard was presented at the IEEE International Conference on Big Data. In addition to the history and technical details summarized above, the paper addresses some of the difficulties of using open data sources and proposes a standard for publishing open source data. It also describes in detail our algorithm for estimating recovery and active case counts, since this is important information for researchers that is often hard to obtain.
One of the side effects of the COVID-19 pandemic is scarcity of medical resources: as more people become ill, hospitals can expect to see more patients, possibly exceeding their limited capacity. However, if public health officials can predict in advance where crises are likely to occur (and where they are less likely), they can prioritize distribution of personnel, ventilators and other medical supplies, plan where field hospitals may be most effective, and essentially address the issue in a proactive, rather than reactive, way.
Our epidemiologists were already simulating projections of the number of hospitalizations that could be expected per region. Using hospital (bed) capacities by the Virginia Healthcare Alerting and Status System (VHASS) region for Virginia and by state for the US, we were able to use our hospitalization projections to estimate where hospitals could expect to exceed capacities, and provide a dashboard for public health officials where they could rapidly identify those regions on their own. Our dashboards provide the following visualizations:
- The number of projected hospitalizations per region on a color-coded heat map.
- The maximum projected percentage of occupied hospital beds per region on a color-coded heat map; we did not attempt to ascertain the occupancy per hospital because it was assumed that patients could be transferred between hospitals within a region relatively easily.
- A timeline allowing users to update the heat maps from week to week over a six-week period, with a movie feature that iterates through the weeks to allow users to see how hospital occupancies are expected to change over time.
- A chart on the right side of the screen to allow users to see the time series of projected hospitalizations and hospital occupancy thresholds over a six-week period.
The models used to simulate the projections were based on assumptions about the spread (e.g., varying levels of transmissibility), possible government restrictions (e.g., lifting restrictions on Memorial Day or two weeks later), adherence to government or public health guidance, vaccination and other interventions, and/or emergence of new variants like Delta and Omicron. Hospitalization projections depended on these model sets, also known as scenarios, so each week the Medical Resource Demand Dashboard (MRDD) instances typically display hospitalization rates under three or more such scenarios so that the impact of different conditions can be easily compared.
In order to translate the projected weekly hospitalizations into projected maximum weekly occupancy, we had to take a couple of additional factors into account:
- It is unreasonable to assume that a hospital region’s entire capacity could be dedicated to COVID-19 patients; room must be spared for patients with other emergencies. The default assumption is that 80% of the hospital capacity would be taken up with non-COVID-19 patients, and that most hospital regions could accommodate up to 120% capacity with manageable effort. However, a Hospital Capacity slider allows users to change those settings in order to see how reducing the capacity reserved for non-COVID-19 patients (e.g., by cancelling elective surgeries) or adding beds (e.g., adding a field hospital) would impact occupancy.
- The duration of people’s stays in the hospital also has a significant impact on occupancy levels. If a hospital that has the capacity for 50 COVID-19 patients gets 10 new patients per day, and the average stay is 2 days, then the hospital will never exceed capacity; however, if the average stay is 7 days, then occupancy will match or exceed capacity by the fifth day. In the Virginia instance of the tool, the default duration is set to 8 days, and in the United States instance it is set to 7 days; however, the Duration slider allows users to adjust the average hospital stay in order to assess hospital occupancy using local hospital duration averages.
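The effect of the two factors above can be checked with a short sketch. It is illustrative only and rests on stated assumptions: every patient stays exactly the average duration, and the beds available to COVID-19 patients are taken as the surge capacity minus the share reserved for other patients (one plausible reading of the 80% and 120% defaults, not necessarily the dashboard's formula). All function names are hypothetical.

```python
def covid_capacity(total_beds, non_covid_pct=80, surge_pct=120):
    """Beds left for COVID-19 patients (assumed formula: surge minus reserved share)."""
    return total_beds * (surge_pct - non_covid_pct) / 100


def daily_occupancy(admissions, stay_days):
    """Occupied beds per day if each patient stays exactly stay_days days."""
    occupancy = []
    for day in range(len(admissions)):
        start = max(0, day - stay_days + 1)  # earliest admission still in hospital
        occupancy.append(sum(admissions[start:day + 1]))
    return occupancy


def first_day_at_capacity(admissions, stay_days, capacity):
    """1-based day on which occupancy first reaches capacity, or None if it never does."""
    for day, occupied in enumerate(daily_occupancy(admissions, stay_days), start=1):
        if occupied >= capacity:
            return day
    return None


# The example from the text: 10 admissions per day, room for 50 COVID-19 patients.
admissions = [10] * 14
short_stay = first_day_at_capacity(admissions, stay_days=2, capacity=50)
long_stay = first_day_at_capacity(admissions, stay_days=7, capacity=50)
```

With a 2-day stay, occupancy levels off at 20 beds and `short_stay` is `None`; with a 7-day stay, occupancy reaches 50 on day 5, matching the example in the text.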
Updates for 2022. The Medical Resource Demand Dashboards helped inform public health officials, especially during the surges caused by the Delta and Omicron variants. The following enhancements were also implemented to help policymakers make more informed decisions:
Dynamic Updates to Bed Counts: For the first year, the dashboards depended on a static list of bed counts to determine capacity per region; these values impacted the calculation of the percentage of occupied beds. However, with the addition of beds during surges, then the ebbing of staffing during post-surge declines in demand, these numbers could become stale. For the Virginia instance, we switched to using US Health and Human Services “staffed beds” statistics for a more dynamic and up-to-date source for available resources. (Note: this change is invisible to the user.)
Comparison to Actual Occupancy: In order for the users to assess how our projections compared to actual occupancy, we added an Occupancy chart to the Virginia instance of the dashboard that includes three weeks of actual maximum hospital occupancies per VHASS region (data provided by VHHA) that can be compared to the projected values by scenario (see Figure 2). In addition to helping evaluate how projections measure up across scenarios, this chart is also useful for checking to see if the duration of hospital stay is well-calibrated to the current conditions; for example, the average duration of 8 days is still a relatively good measure for the state of Virginia as a whole, but this chart helped us see that there is significant variation across the VHASS regions, from Northern (4-5 days) to Central and Eastern (9 days).
Chatbot Tool: A Chatbot Tool has been released to the Virginia instance of the tool so users can get quick answers to questions about the currently deployed projections. The release of the tool for the US instance is imminent.
We have submitted a paper on the Virginia instance of this dashboard to KDD 2022.
The Stanford group combined a full mobility network, constructed from SafeGraph mobility data and demographic data from the American Community Survey, with an SEIR model to predict the spread of COVID-19 in 10 Metropolitan Statistical Areas (MSAs). They found that their mobility-network-enhanced SEIR model predicted the spread of COVID-19 in those MSAs better than either an aggregate-mobility-data-enhanced SEIR model or the SEIR model alone. The model was responsive to fluctuations in traffic to Points of Interest (POIs) like restaurants, gyms, and churches, and it predicted that minorities would suffer greater impact from the pandemic. Most relevantly for our application, it found that targeted (strategic) closure of POIs could be more effective than uniform closures in slowing the disease spread. Reducing COVID-19 cases is our primary focus, to be sure, but targeted closure also reduces the economic impact of the pandemic.
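For readers unfamiliar with the baseline, the plain SEIR model referred to above can be sketched as a discrete-time compartmental update. This is a generic textbook form, not the Stanford mobility-network model; the daily time step, the parameter names (`beta`, `sigma`, `gamma`), and the function names are assumptions made for illustration.

```python
def seir_step(s, e, i, r, beta, sigma, gamma):
    """Advance Susceptible, Exposed, Infectious, Recovered counts by one day."""
    n = s + e + i + r
    new_exposed = beta * s * i / n   # susceptible people who become exposed
    new_infectious = sigma * e       # exposed people who become infectious
    new_recovered = gamma * i        # infectious people who recover
    return (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - new_recovered,
        r + new_recovered,
    )


def simulate_seir(s0, e0, i0, r0, beta, sigma, gamma, days):
    """Return the list of (S, E, I, R) states for day 0 through day `days`."""
    states = [(s0, e0, i0, r0)]
    for _ in range(days):
        states.append(seir_step(*states[-1], beta, sigma, gamma))
    return states
```

A mobility-enhanced variant would, roughly speaking, let the transmission term depend on visits to POIs rather than on a single constant `beta`; that extension is beyond this sketch.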
The MSAs that the Stanford group examined in their paper in Nature were large metropolitan areas, but the Virginia Department of Health (VDH) was interested in seeing if something similar could work at the state level. The UVA team partnered with Stanford’s team to recalibrate their model for three MSAs in Virginia, and used their projections to develop a dashboard where VDH could experiment with different mobility levels to try to ascertain where it would be most effective to reduce mobility. Stanford began by calibrating their model for three MSAs in Virginia:
- Washington-Arlington-Alexandria-DC-VA-MD-WV (referred to here for brevity as Washington DC);
- Richmond-VA (Richmond); and
- Virginia-Beach-Norfolk-Newport-News-VA-NC (Eastern).
Simulations from their calibrated models for November 2020 to early January 2021 fit the actual COVID-19 activity over those two months very well.
For the dashboard, the Stanford team ran simulations for each of the three MSAs at foot traffic levels of 0%, 50%, 100%, and current levels relative to the corresponding foot traffic before the pandemic (e.g., comparing November 2020 foot traffic to November 2019 foot traffic) for each of five Point of Interest (POI) categories:
- Restaurants, including full- and limited-service restaurants, snack bars, and cafes;
- Essential Retail, including grocery stores, pharmacies, and convenience stores;
- Retail, including clothing, hardware, book, and pet stores, and all other non-essential retail stores;
- Gyms and Fitness centers; and
- Religious Organizations, like churches, synagogues, and mosques.
The dashboard allows users to visualize the impact of increasing or reducing foot traffic to different POIs on COVID-19 case counts. To this end, the dashboard (shown in Figure 4) is divided into four panels and has an additional popup report. The panel titled “Visits to Points of Interest” on the left-hand side of the dashboard is the navigation bar of the application; users can see the current foot traffic levels for the selected region and drag indicators along sliders to change the foot traffic level for POIs of each category. As they change the mobility levels on the navigation bar, the other three visible panels are updated accordingly. The upper right panel is the map panel. Here, areas that show a decrease in case counts due to the mobility change are highlighted in blue, and areas that show an increase are highlighted in red; users can click on dates at the top of the bar to see cumulative case counts for one-, two-, three-, and four-week periods, hover over areas on the map to see more details about the case counts, or select an area to highlight on the other two panels. The plot in the lower middle panel allows the user to view the cumulative case counts over the four-week period at current and target mobility levels, along with uncertainty bounds. The table in the lower right panel shows the case count and percentage difference that would occur by MSA if foot traffic levels were changed. Finally, the user can click the “Mobility History” button on the navigation bar to display the “Mobility data analysis for Virginia” report, which includes a full history of mobility in the three MSAs since the beginning of the pandemic, with a number of visualizations where users can review and make their own assessments. Our paper about the model and the associated dashboard was awarded the Best Paper for Applied Science award at KDD 2021.
Unlike the surveillance dashboard and the MRDD, this dashboard is not open to the public. The reason for this is that a user viewing this tool in isolation might come to conclusions that are not fully informed, leading them to boycott certain POIs. Public health officials, on the other hand, can use this as part of a corpus of tools, and come to a more comprehensive conclusion regarding closures. For this reason, we maintain a password-protected site only for our colleagues in public health, although the instance we stood up for the KDD 2021 paper, which shows projections for January 2021, is available at https://nssac.bii.virginia.edu/covid-19/kdd-command/.
When the COVID-19 vaccine first became available in late 2020, the initial challenge was meeting the demand for vaccines. Six months later, however, while the majority of people who wanted the vaccine had received it, there were pockets of the population where vaccination rates continued to be low. Part of this was due to limited availability of the vaccine, as in rural areas, but even in well-populated areas, vaccination rates lagged in some demographic groups, such as young people, some racial groups, and LatinX communities.
In an effort to improve COVID-19 vaccination rates in undervaccinated populations, Virginia Department of Health (VDH) started deploying Mobile Vaccination Sites in areas where they thought these populations could be reached. The Mobile Units were stocked with the Johnson & Johnson (one shot) vaccine so people might see the site and take the opportunity to get vaccinated without the need to schedule a second dose. Initially, site locations were selected by local public health officials relying largely on intuition, but they wanted a more consistent method for identifying these sites.
In collaboration with Stanford and some of our public health collaborators, we proposed that mobility data from SafeGraph’s Weekly Patterns dataset could be used to identify high-traffic areas where mobile vaccination sites could be successful. However, looking exclusively at highly traveled areas had the problem that many of the high-traffic sites were also in areas where vaccination rates were already high; VDH suggested it would be more useful if we could identify areas that were highly visited by specific demographic groups. The initial demographic groups VDH wanted to target were people aged 20-30, people aged 30-40, Black people, LatinX people, and populations in areas where vaccination rates were low; subsequently, broader groups were added, including people aged 20-40 and people who were either Black or LatinX.
Although we could use SafeGraph data to identify high-traffic areas, that dataset did not include demographic information. However, it did include the estimated number of visitors to each Point of Interest (POI) (e.g., stores and restaurants) from different Census Block Groups (CBGs). By using American Community Survey (ACS) data to get the demographic breakdowns of the different CBGs, we were able to estimate the number of visitors to each POI by demographic group. Because of precision issues in the collection of the mobility data, instead of aggregating this to specific points of interest, we chose to aggregate visits using S2 Geometry down to L14-level areas (about 80-acre increments). We identified the top 25 highest-traffic L14 areas in each county by demographic group, and have been submitting lists of these sites to VDH as deliverables every week since June 2021; these deliverables are presented both as Comma Separated Value (CSV) files and as HTML maps. In addition, we provide the latitude and longitude of the centroid of each L14 area, the address of the most visited POI within it, and the day of the week that saw the most visits in the previous week. This process is described in more detail in our paper that was presented at the Innovative Applications of Artificial Intelligence 2022 conference in February.
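A minimal sketch of this estimation step follows. It assumes that visitors from a CBG are split across demographic groups in proportion to that CBG's ACS shares, and that each POI has already been mapped to an S2 level-14 cell by a separate step; the data layouts and function names here are hypothetical, not the production pipeline.

```python
from collections import defaultdict


def visits_by_group(poi_visits, cbg_shares):
    """Estimate visitors per demographic group for each POI.

    poi_visits: {poi_id: {cbg_id: visitor_count}}
    cbg_shares: {cbg_id: {group: fraction of that CBG's population}}
    """
    result = {}
    for poi, by_cbg in poi_visits.items():
        totals = defaultdict(float)
        for cbg, count in by_cbg.items():
            # Proportional allocation (assumption): visitors mirror the home CBG's mix.
            for group, share in cbg_shares.get(cbg, {}).items():
                totals[group] += count * share
        result[poi] = dict(totals)
    return result


def top_cells(poi_group_visits, poi_to_cell, cell_to_county, group, k=25):
    """Rank cells within each county by estimated visits from one group."""
    cell_totals = defaultdict(float)
    for poi, groups in poi_group_visits.items():
        cell_totals[poi_to_cell[poi]] += groups.get(group, 0.0)
    by_county = defaultdict(list)
    for cell, total in cell_totals.items():
        by_county[cell_to_county[cell]].append((total, cell))
    return {
        county: [cell for _, cell in sorted(cells, reverse=True)[:k]]
        for county, cells in by_county.items()
    }
```

Aggregating to cells rather than individual POIs means that small errors in a POI's recorded location are less likely to change the ranking, which is the motivation given in the text.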
The work was appreciated by VDH; Frank Diaz, MA CVC/Mobile Program Manager, Virginia Department of Health, said: “The UVA mobility data has been beneficial for planning in our new mobile vaccine unit program. We use this data to support vaccination efforts through our local health districts. The data allows us to pinpoint our populations with the lowest vaccine uptake and not be stagnant in offering vaccination services. We now have the data to target 5-10 places in a week if we do not get good enough participation at our standard mobile vaccine sites to take the mobile units out and target populations with the lowest vaccination uptake. We use the UVA mobility data to focus vaccination efforts on these target groups while also providing additional accessible and convenient locations to be vaccinated. Using the UVA’s mobility data gives us the unique advantage of being able to bring the vaccine to members of the commonwealth instead of them seeking out where to get vaccinated, an advantage that will contribute to the fight against COVID-19.”
Although we have not built our own dashboard using this data, VDH has created its own private dashboard where local public health officials can access the weekly recommendations.
As we were investigating mobility data for other applications, and considering its potential for improving our models or offering further insight into the COVID-19 pandemic, we developed a dashboard in collaboration with Stanford University to allow us and others to visualize patterns in mobility (e.g., aggregate visits to POIs) in the United States during the pandemic, and to compare mobility counts with trends in COVID-19 cases to see what correlations could be made. The COVID-19 Mobility Surveillance Dashboard is built on the same framework designed for the COVID-19 Surveillance Dashboard referred to in section 1. It relies on SafeGraph Monthly Patterns data to provide mobility (visit) counts to POIs per region, and on the COVID-19 Surveillance Dashboard for COVID-19 confirmed case counts. A single-page application, it is divided into three panels: the Control Panel, the Map Panel, and the Chart/Data Panel. These are described in more detail below:
Control Panel: The control panel allows the user to choose between visualizations (Mobility – the default – or Confirmed COVID-19 Cases), change the displayed date on the map, and filter on mobility type (currently limited to Restaurants, Essential Retail, Retail, Gyms, Religious Organizations, and All Other Categories). It should be noted that, for privacy reasons, a category or subcategory is only explicitly named for the selected region if there are at least 50 such POIs of that category/subcategory in that region; visits to categories/subcategories that fall below that threshold are aggregated under the “All Other (Category)” subcategory classification, or “All Other Categories” if fewer than 50 POIs of that category exist in the region.
Map Panel: The map panel on the left displays either the aggregated mobility counts or the confirmed case counts for the selected date, depending on the toggle selection in the control panel; if toggled to Mobility, the mobility counts are also filtered by the selected NAICS categories/subcategories. Users can click on regions of the map to see a popup containing the following metrics based on the date and selected filters: population, mobility count on that day, mobility growth rate, confirmed COVID-19 cases, mean COVID-19 confirmed cases, and the date of the last data update. From the popup, the user can also select “Next Level” to view mobility and case counts at the county level. The map panel also has a timeline that lets users cycle between dates and watch the ebb and flow of either mobility or confirmed case counts across regions over time. While the Mobility map shows the aggregated visit counts, the COVID-19 Confirmed Cases map displays the mean confirmed cases per region over the previous 7 days as of the selected date; this is because many states stopped reporting case counts over the weekends, leading to 0 case counts for some states on Saturdays and Sundays, then a surge on Mondays.
Chart Panel: The chart panel on the right displays mobility or mean confirmed case counts depending on the selected toggle, date, and filter categories from the control panel for the selected region on the map. If the display is toggled to Mobility, the chart displays mobility by category as stacked bar charts, with confirmed case counts as an overlying time series. If the display is toggled to Confirmed Cases, then the chart displays the mean case counts. In either mode, the graph can cover the previous 30 days, 60 days, or 6 months, and it can be displayed as a stacked bar chart (the default) or as time series plots; the user can also toggle to a Data tab on this panel to see the raw data.
The COVID-19 VaxStat Dashboard is a rapid-development dashboard that displays a series of modular reports for assessing the state of vaccination in Virginia and the nation, as well as one report that forecasts vaccine uptake in the near future based on Google Trends, COVIDCast, and other data. The framework was developed to be modular in order to facilitate the addition of new reports as they are developed or as new data becomes available, and the removal of reports when they are no longer relevant. Like the COVID-19 Surveillance Dashboard, the data on the dashboard is updated every day so that users can view the status in near-real-time.
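One common way to obtain this kind of modularity is a report registry, in which each report registers itself under a name and the navigation list is derived from whatever is currently registered. The sketch below is purely illustrative and is not taken from the VaxStat code base; `REPORTS`, `register_report`, `retire_report`, and `navigation_entries` are hypothetical names.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a report title to the function that builds it.
REPORTS: Dict[str, Callable[[dict], dict]] = {}


def register_report(name: str):
    """Decorator that adds a report builder to the registry."""
    def decorator(builder: Callable[[dict], dict]):
        REPORTS[name] = builder
        return builder
    return decorator


def retire_report(name: str) -> None:
    """Remove a report that is no longer relevant; no error if it is absent."""
    REPORTS.pop(name, None)


def navigation_entries() -> list:
    """Titles to show in the navigation panel, derived from the registry."""
    return sorted(REPORTS)


@register_report("Partially vs. Fully Vaccinated")
def partially_vs_fully(data: dict) -> dict:
    # Placeholder payload; a real report would build map and chart data here.
    return {"partial": data.get("partial", []), "full": data.get("full", [])}


print(navigation_entries())
retire_report("Partially vs. Fully Vaccinated")
print(navigation_entries())
```

With this design, adding or retiring a report touches only that report's module, and the navigation panel needs no changes.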
The VaxStat Dashboard is made up of two panels: a navigation panel on the left where users can select between reports, and the report panel on the right which has its own controls for filtering or toggling the views within a specific report. The reports that have been developed for this tool are described below.
Vaccinated vs. Vaccine Accepting: This report (seen in Figure 5) displays the percentage of the population that has been vaccinated in Virginia (according to VDH COVID-19 PublicUseDataset vaccination data) along with the percentage that has indicated a willingness to be vaccinated according to the Delphi COVID-19 Trends and Impact Survey, which Carnegie Mellon University has conducted on the Facebook application. One wrinkle we had to overcome is that the Delphi COVID-19 Trends and Impact Survey targets only people aged 18 and older, and that survey-based surveillance can be biased because it is limited to people willing to take online surveys. To correct for these discrepancies, we “normalize” the acceptance by comparing the number of people who indicated on the survey that they had been vaccinated to the number of people who were reported as vaccinated by VDH. The report itself is broken into two panels: a map panel, where users can toggle between viewing Vaccinated or Vaccine Accepting percentages on the map, and a time series chart that shows the trends of the two series from December 14, 2020, when the first vaccines became available.
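The text does not give the exact normalization formula, so the following is only one plausible reading, stated as an assumption: scale the survey's acceptance percentage by the ratio of the VDH-reported vaccinated percentage to the survey-reported vaccinated percentage. The function name and the numbers are hypothetical.

```python
def normalize_acceptance(survey_accepting_pct, survey_vaccinated_pct, vdh_vaccinated_pct):
    """Rescale survey-based acceptance using reported vaccinations as an anchor.

    ASSUMPTION: simple ratio scaling; the report's actual method may differ.
    """
    if survey_vaccinated_pct <= 0:
        raise ValueError("survey_vaccinated_pct must be positive")
    # How much the survey over- or under-states vaccination relative to VDH.
    scale = vdh_vaccinated_pct / survey_vaccinated_pct
    # Apply the same correction to acceptance, capped at 100 percent.
    return min(100.0, survey_accepting_pct * scale)


# Made-up example: the survey says 60% vaccinated, VDH reports 45%,
# so a surveyed acceptance of 80% is scaled down by 45/60.
print(normalize_acceptance(80.0, 60.0, 45.0))
```

Under this reading, a survey population that over-reports vaccination relative to VDH has its acceptance estimate reduced by the same factor.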
One interesting takeaway from this report is that while vaccinations plateaued in May-June 2021 with about a 10% gap between vaccine accepting and actually vaccinated, that gap narrowed substantially at the advent of the Delta variant, then again when the Omicron variant emerged as the dominant variant.
Partially vs. Fully Vaccinated: This report showed the percentage of people in Virginia who were partially vaccinated (e.g., only one shot of the Moderna or Pfizer vaccines) or fully vaccinated (one shot of Johnson & Johnson, or the second shot of Moderna or Pfizer). Its format was similar to that of the Vaccinated vs. Vaccine Accepting report: users could toggle between Partially and Fully Vaccinated on the map, and there was a chart on the right. In this case, however, the chart was a stacked bar chart, so the summed percentage of partially and fully vaccinated people corresponded to the Vaccinated curve in the previous report, providing some validation between reports. The data for this report came from the VDH COVID-19 PublicUseDataset vaccination data.
This report was retired when people started taking the Booster shots, because many people deviated from their shot of origin (e.g., taking the Pfizer shot after taking Johnson & Johnson and other iterations), so differentiating between “Partially” and “Fully Vaccinated” became more of a challenge. We will be revising this report based on the number of vaccine doses people receive.
Vaccinated vs. Recent Infections in Virginia Localities: This scatterplot places each Virginia county according to the percentage of its population that is vaccinated (x-axis) and its recent number of confirmed cases per 100K (y-axis). The data for this report come from the VDH COVID-19 PublicUseDataset and the COVID-19 Surveillance Dashboard. By and large, this report supports the claim that cases are lower in areas where vaccination is higher, but there are occasional exceptions.
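As a small worked example of the two axes, the sketch below converts raw counts into scatterplot coordinates. The locality names and numbers are invented, and the function names are hypothetical; only the per-100K scaling and the axis definitions come from the description above.

```python
def cases_per_100k(recent_cases, population):
    """Recent confirmed cases scaled to a population of 100,000."""
    return recent_cases * 100_000 / population


def scatter_points(localities):
    """Build (name, x, y) points: x = percent vaccinated, y = cases per 100K.

    localities: iterable of dicts with keys name, population, vaccinated,
    recent_cases (a layout assumed for this sketch).
    """
    points = []
    for loc in localities:
        x = 100 * loc["vaccinated"] / loc["population"]
        y = cases_per_100k(loc["recent_cases"], loc["population"])
        points.append((loc["name"], x, y))
    return points


# Invented localities, used only to show the arithmetic.
sample = [
    {"name": "County A", "population": 25_000, "vaccinated": 15_000, "recent_cases": 50},
    {"name": "County B", "population": 200_000, "vaccinated": 150_000, "recent_cases": 100},
]
for name, x, y in scatter_points(sample):
    print(f"{name}: {x:.1f}% vaccinated, {y:.1f} cases per 100K")
```

Scaling by population is what makes counties of very different sizes comparable on the same y-axis.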
This report was actually developed as one of the plots that we include in our weekly deliveries to VDH, but thanks to the modular design of this dashboard, it could be integrated into the dashboard reasonably easily.
Vaccinated vs. Recent Infections by State: This scatterplot is similar to the one described above, except that it displays vaccination rates vs recent cases per US state. The data sources for the report are the CDC and the COVID-19 Surveillance Dashboard.
Vaccine Administration by Vaccine Type: This report shows how many vaccines were administered by vaccine type (Johnson & Johnson, Pfizer, and Moderna) across the state of Virginia. This report is also divided into the map and chart format that we saw earlier; here, the map is based on the cumulative number of vaccines administered to date, while the chart shows incidence time series for each vaccine type. A couple of interesting takeaways from this report: (i) Although Pfizer was the dominant vaccine across the state, Moderna dominated in the Far Southwest region. (ii) Although vaccination dropped off considerably in early May after the first wave of people got vaccinated, there was a secondary surge in October corresponding to the recommendation that people get booster shots.
Vax Signals: The Vax Signals report displays the projected percentage of people per county in Virginia who are expected to take the COVID-19 vaccine in the coming weeks, based on measures of vaccine hesitancy in those counties. We do this by incorporating data from Google Search Trends, the Delphi COVIDCast Survey, and the CDC’s Household Pulse Survey as measures of vaccine hesitancy across Virginia. Predictions are for the week ending on the date noted. We describe this process in more detail in the paper we submitted to the 2022 International Joint Conference on Artificial Intelligence.
We currently have additional reports in the works. A report on vaccination across different age groups is completed and will be deployed as soon as the data collation process is added to the daily pipeline. The replacement for the Partially vs. Fully Vaccinated report is also under development.
 A. S. Peddireddy, D. Xie, P. Patil, M. L. Wilson, D. Machi, S. Venkatramanan, B. Klahn, P. Porebski, P. Bhattacharya, S. Dumbre, and M. Marathe, “From 5vs to 6cs: Operationalizing epidemic data management with covid-19 surveillance.,” in Proceedings of the IEEE International Conference on Big Data (BigData), 2020.
 M. L. Wilson, S. Venkatramanan, P. Porebski, B. Lewis, A. Adiga, J. Outten, A. Vullikanti, K. K. Dudakiya, A. Telionis, and M. Marathe, “Data driven models and a dashboard for real-time tracking and forecasting medical resource demand,” Submitted, 2022.
 S. Chang, E. Pierson, P. W. Koh, J. Gerardin, B. Redbird, D. Grusky, and J. Leskovec, “Mobility network models of covid-19 explain inequities and inform reopening,” Nature, vol. 589, no. 7840, p. 82–87, 2020.
 S. Chang, M. L. Wilson, B. Lewis, Z. Mehrab, K. K. Dudakiya, E. Pierson, P. W. Koh, J. Gerardin, B. Redbird, D. Grusky, M. Marathe, and J. Leskovec, “Supporting covid-19 policy response with large-scale mobility-based modeling,” in Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, KDD ’21, (New York, NY, USA), p. 2632–2642, Association for Computing Machinery, 2021.
 Z. Mehrab, M. L. Wilson, S. Chang, G. Harrison, B. Lewis, A. Telionis, J. Crow, D. Kim, S. Spillmann, K. Peters, J. Leskovec, and M. V. Marathe, “Data-driven real-time strategic placement of mobile vaccine distribution,” medRxiv, 2022.