Search here to find large public and licensed datasets
The COVID-19 Open Research Dataset is an extensive machine-readable resource of over 45,000 scholarly articles, including over 33,000 with full text, about COVID-19 and the coronavirus family of viruses for use by the global research community. This dataset is intended to mobilize researchers to apply recent advances in natural language processing to generate new insights in support of the fight against this infectious disease. The dataset is updated weekly and contains all COVID-19 and coronavirus-related research (e.g., SARS, MERS) from the following sources: PubMed's PMC open access corpus (using this query: COVID-19 and coronavirus research), additional COVID-19 research articles from a corpus maintained by the World Health Organization (WHO), and bioRxiv and medRxiv pre-prints (using this query: COVID-19 and coronavirus research). Also available is a comprehensive metadata file of 44,000 coronavirus and COVID-19 research articles with links to PubMed, Microsoft Academic, and the WHO COVID-19 database of publications (includes articles without open access full text).
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. The Times compiles this time series from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak. The data power the Times's maps and reporting on the outbreak, and are now being made public in response to requests from researchers, scientists, and government officials who want to better understand it. The data begin with the first reported coronavirus case in Washington State on Jan. 21, 2020, and regular updates are published to the Times's data repository.
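Because the counts described above are cumulative, a common first step is to difference them into per-day increases. The following is a minimal sketch, not the Times's own method: the function name `daily_new_cases`, the tuple layout, and the sample values are all made up for illustration and do not reflect the actual file format.

```python
from collections import defaultdict


def daily_new_cases(rows):
    """Convert cumulative counts into per-day increases for each region.

    rows: iterable of (iso_date, region, cumulative_cases) tuples
          (a hypothetical layout, assumed for this sketch).
    Returns {region: [(iso_date, new_cases), ...]} sorted by date.
    """
    by_region = defaultdict(list)
    for date, region, cumulative in rows:
        by_region[region].append((date, cumulative))

    result = {}
    for region, series in by_region.items():
        series.sort()  # ISO dates sort chronologically as plain strings
        previous = 0
        daily = []
        for date, cumulative in series:
            daily.append((date, cumulative - previous))
            previous = cumulative
        result[region] = daily
    return result


# Illustrative, invented values; not real case counts.
sample = [
    ("2020-03-01", "Region A", 2),
    ("2020-03-03", "Region A", 9),
    ("2020-03-02", "Region A", 5),
    ("2020-03-02", "Region B", 1),
]
print(daily_new_cases(sample))
```

If a source later revises a cumulative count downward, the difference for that day will be negative, so downstream analysis may need to decide how to treat such corrections.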
The Oxford COVID-19 Government Response Tracker (OxCGRT) systematically collects information on common policy responses that national governments have taken to the COVID-19 pandemic, scores the stringency of these measures, and aggregates the scores into a common Stringency Index. Eleven indicators of government response are provided: seven cover policies such as school closures and travel bans, and four are financial indicators such as fiscal or monetary measures. Data are collected from public sources by a team of dozens of Oxford University students and staff from every part of the world.
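To make the idea of aggregating indicator scores into a single index concrete, here is a minimal sketch. It assumes the simplest possible scheme, rescaling each indicator to 0-100 and taking an unweighted mean; the tracker's published methodology may differ, and the function name, indicator names, and maximum levels below are hypothetical.

```python
def stringency_index(scores, max_levels):
    """Unweighted mean of indicator scores, each rescaled to 0-100.

    scores:     {indicator_name: observed ordinal level}
    max_levels: {indicator_name: highest possible level}
    Both mappings and their keys are assumptions made for this sketch.
    """
    rescaled = [100 * scores[name] / max_levels[name] for name in max_levels]
    return sum(rescaled) / len(rescaled)


# Hypothetical indicators and levels, for illustration only.
max_levels = {"school_closures": 3, "travel_bans": 4}
scores = {"school_closures": 3, "travel_bans": 2}
print(stringency_index(scores, max_levels))  # 75.0
```

Rescaling before averaging keeps an indicator with more levels from dominating one with fewer.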
The Assessment Capacities Project (ACAPS) provides independent, high-quality, and timely humanitarian analysis to enable crisis responders to better understand and address the needs of affected populations. The ACAPS COVID-19 Government Measures Dataset compiles the measures implemented by governments worldwide in response to the coronavirus pandemic. The measures fall into five categories: social distancing, movement restrictions, public health measures, social and economic measures, and lockdowns. Data are compiled from government, media, United Nations, and other organizational sources.
Researchers, students, and others in the Models of Infectious Disease Agent Study (MIDAS) create and use computational models to study transmission dynamics of a broad range of infectious diseases. Many MIDAS members are conducting research on COVID-19 and are contributing to an extraordinary international collection of data and information regarding the outbreak. Through its Online Portal for COVID-19 Modeling Research, MIDAS provides access to COVID-19-related data and parameter estimates as well as a software catalog with a list of dashboards and visualizations for following the COVID-19 pandemic.
Temporal data on school closures in response to the COVID-19 pandemic at the country and country region levels.
USAFacts is collecting temporal data on confirmed cases of COVID-19 and deaths at the county level from the Centers for Disease Control and Prevention (CDC) and state- and local-level public health agencies.
Portal to COVID-19-related pre-prints, datasets, presentations, and software housed in the Figshare research repository.
The COVID Racial Data Tracker is a collaboration between the COVID Tracking Project and the Antiracist Research & Policy Center. The team tracks race and ethnicity data from every state that reports them, and the dataset is updated twice per week.