SCHARE’s cloud-based platform contains:
- Datasets relevant to chronic diseases, health disparities, health care delivery, and health outcomes research, including place-based factors, lived experiences and other social science behavioral data.
- A project data repository for NIH-funded projects, called Collections, centered on SCHARE Core Common Data Elements for enhanced data interoperability and compliance with NIH Data Management and Sharing policy.
- Secure, collaborative workspaces for researchers and relevant collaborators.
- Consortium, which functions as a data or administrative center with multiple research sites collecting ongoing data and sharing resources.
- Computational capabilities for collaboratively evaluating, designing, and assessing fit-for-purpose utilization of datasets and algorithms to generate AI models that are effective, transparent, and reproducible.
Registration is required to access the SCHARE platform. Learn more and register.
Datasets
STATUS: The SCHARE Datasets collection is accessible to all SCHARE-registered researchers. New datasets are being actively added.
SCHARE Datasets list (PDF, 291 KB)
On SCHARE, researchers can access, link, analyze, and export a wealth of datasets relevant to research in health disparities and health care outcomes, including:
- Public Datasets: publicly accessible, federated, de-identified datasets hosted by SCHARE or hosted by Google through the Google Cloud Public Dataset Program
- Examples: American Community Survey (ACS), Behavioral Risk Factor Surveillance System (BRFSS)
- Project Datasets: publicly accessible and controlled-access, funded program/project datasets using Common Data Elements and shared by NIH grantees and intramural investigators to comply with the NIH Data Sharing Policy
- Examples: Forthcoming datasets such as the HEAN Chronic Disease Center
Datasets are grouped by these categories:
- Economic Stability, Education Access and Quality, Health Care Access and Quality, Neighborhood and Built Environment, Social and Community Context, and Health Behaviors
- Diseases and Conditions
Learn more about SCHARE Datasets and access the SCHARE platform (registration required).
SCHARE/PhenX Core Common Data Elements
STATUS: The SCHARE/PhenX Core Common Data Elements are available to all researchers through the National Library of Medicine.
Endorsed by the National Institutes of Health, the SCHARE/PhenX Core Common Data Elements (CCDEs) are standardized questions and responses that can be used across different studies to ensure consistent data collection and facilitate interoperability. CCDEs enable researchers to efficiently design data collection, management, and analysis plans; link data from different sources; and enable data harmonization to generate large datasets for AI use.
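As a rough illustration of how shared data elements support harmonization, the following Python sketch pools records from two studies that code the same question differently. The element name `ccde_education`, the response codes, the study variable names, and the records are all hypothetical; they are not taken from the actual SCHARE/PhenX CCDE definitions.

```python
# Hypothetical common data element: one standardized question with fixed response codes.
# These names and codes are illustrative only, not actual SCHARE/PhenX CCDE definitions.
COMMON_RESPONSES = {
    1: "Less than high school",
    2: "High school or equivalent",
    3: "College degree or higher",
}

# Each study declares how its local variable and local values map to the common element.
STUDY_A_MAP = {"variable": "edu_level", "codes": {"LT_HS": 1, "HS": 2, "COLLEGE": 3}}
STUDY_B_MAP = {"variable": "education", "codes": {"0": 1, "1": 2, "2": 3}}


def harmonize(records, mapping, study_id):
    """Convert study-specific records to the shared element coding."""
    harmonized = []
    for rec in records:
        local_value = rec[mapping["variable"]]
        harmonized.append(
            {
                "study": study_id,
                "participant": rec["id"],
                "ccde_education": mapping["codes"][local_value],
            }
        )
    return harmonized


# Made-up records from two studies that asked the same question in different formats.
study_a = [{"id": "A1", "edu_level": "HS"}, {"id": "A2", "edu_level": "COLLEGE"}]
study_b = [{"id": "B1", "education": "0"}, {"id": "B2", "education": "2"}]

# After harmonization, both studies share one variable and one coding scheme.
pooled = harmonize(study_a, STUDY_A_MAP, "A") + harmonize(study_b, STUDY_B_MAP, "B")

for row in pooled:
    print(row["study"], row["participant"], COMMON_RESPONSES[row["ccde_education"]])
```

When studies collect data with the common elements from the start, the per-study mapping step becomes unnecessary, which is the efficiency the CCDEs are meant to provide.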
Learn more about the SCHARE Core Common Data Elements.
Data Repository
STATUS: The SCHARE Data Repository is available to SCHARE-registered researchers.
The SCHARE Data Repository (SDR) enables researchers to meet the requirements of the NIH Data Management and Sharing policy, which requires the hosting, management, and sharing of data generated by NIH-funded research programs. SCHARE provides a repository for projects focused on population science topics, such as health disparities, health care delivery, behavioral research, chronic diseases, and public health outcomes. All SCHARE-registered users, including NIH-based researchers, external researchers, and the public, can access data within the repository at varying privacy and security levels through the controlled-access process. The SCHARE Data Repository offers options for completed and ongoing projects. It also offers a Collection for single or smaller group projects and a Consortium for multi-site projects, for example a data center with 15 or more collection sites. The repository uses core common data elements to facilitate data aggregation for AI development that advances public health discoveries and generates tools to monitor and improve health outcomes.
Access the SCHARE Data Repository.
Collaborative Workspaces
STATUS: The SCHARE Collaborative Workspaces are available to all SCHARE-registered researchers.
SCHARE is powered by Terra, an open-source data analysis platform based on Google Cloud Platform. Terra was developed by the Broad Institute of MIT and Harvard in collaboration with Microsoft and Verily.
Using SCHARE's Terra resources, researchers and their collaborators can access and cross-link the same publicly available or controlled-access data. They can also create secure online spaces for collaboratively running large-scale analyses and sharing reproducible results and resources.
SCHARE supports interactive analysis tools such as Jupyter notebooks. Jupyter notebooks are human-readable executable documents that can be run to perform advanced data analyses, including artificial intelligence and machine learning tasks, using coding languages such as Python and R. The platform also supports Dockstore as a repository for Docker-based analysis workflows that allow users to automate basic steps in their analyses.
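As a small illustration of the kind of cross-linking a notebook cell might perform, the sketch below joins two toy tables on a shared area identifier using only the Python standard library. The table names, column names, identifiers, and values are invented for illustration; they are not real survey figures and do not reflect the actual layout of any SCHARE-hosted dataset.

```python
# Toy tables keyed by a shared geographic identifier.
# Identifiers and values are placeholders, not real survey data.
community_survey = [
    {"area_id": "00001", "median_income": 52000},
    {"area_id": "00002", "median_income": 61000},
    {"area_id": "00003", "median_income": 47000},
]
health_survey = [
    {"area_id": "00001", "pct_with_checkup": 71.5},
    {"area_id": "00003", "pct_with_checkup": 64.0},
]


def link_on_key(left, right, key):
    """Inner join of two lists of dicts on a shared key."""
    # Index the right-hand table once so each lookup is constant time.
    right_index = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = right_index.get(row[key])
        if match is not None:
            # Merge the two rows; the shared key has the same value in both.
            linked.append({**row, **match})
    return linked


linked = link_on_key(community_survey, health_survey, "area_id")

for row in linked:
    print(row["area_id"], row["median_income"], row["pct_with_checkup"])
```

Areas that appear in only one table are dropped by this inner join. In practice a dataframe library would typically handle the join, but the logic is the same.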
- Register for SCHARE
- Explore the SCHARE Terra Workspace, or create your own workspace using the tutorials accessible from our Tutorials and Resources page.
SCHARE-Chronic Disease NAIRR Pilot Project
STATUS: The SCHARE-Chronic Disease NAIRR Pilot Project is active.
The National Artificial Intelligence Research Resource (NAIRR) is a concept for a shared national research infrastructure that supports responsible discovery and innovation in AI by connecting U.S. researchers to the computational, data, software, model, and training resources they need to participate in AI research. The NAIRR pilot is a proof of concept for the eventual full-scale NAIRR, which aims to bridge the gap in access to AI resources and tools, making them available to the broad research and education communities in a manner that advances trustworthy AI and protects privacy, civil rights, and civil liberties.
To support these efforts, the SCHARE-Multiple Chronic Diseases Disparities Research Consortium Pilot forms a unique collaborative relationship among community partners, academia, and SCHARE to use big data and cloud-based data science analytics to improve the prevention, treatment, and management of multiple chronic diseases, such as diabetes, obesity, hypertension, coronary heart disease, congestive heart failure, chronic kidney disease, stroke, and certain cancers. The data warehouse includes data on chronic disease, ascribed and acquired attributes, and relevant environmental and living conditions. These data are mapped to the SCHARE common data elements for increased data interoperability and are highlighted in Think-a-Thons to democratize data use and adoption. This collaboration highlights the relevance of community input in fostering “fit for purpose” AI applications that provide efficiency, reduce harm, and save costs.
Learn more about the SCHARE-Chronic Disease NAIRR Pilot Project.
Page updated March 24, 2026