Open-Access Medical Image Repositories
If you would like to add a database to this list, or if you find a broken link, please email stephen@aylward.org.
Data from medical imaging challenges
Challenges typically provide the most current and easiest to use data...
This is the preferred medical imaging challenge portal! I wish everyone used it instead of fracturing the field and making search harder, but...
Portal for hundreds of grand challenges in medical imaging:
Most run by academia and featured at international conferences
For example,
CAUSE07: Segment the caudate nucleus from brain MRI.
BIOCHANGE 2008 PILOT: Measure changes.
MS lesion segmentation challenge 08: Segment brain lesions from MRI.
Liver Tumor Segmentation 08: Segment liver lesions from contrast enhanced CT.
EXACT09: Extract airways from CT data.
ANODE09: Detect lung lesions from CT.
VOLCANO09: Quantify changes in pulmonary nodules.
Coronary Artery Algorithm Evaluation Framework: Extract coronary artery centerlines from CTA data.
ROC-Retinopathy Online Challenge: Detect microaneurysms for diabetic retinopathy screening.
60 challenges from a variety of biomedical areas
A growing number of their challenges include a medical imaging component: https://openchallenges.io/challenge?searchTerms=image
Portal for grand challenges in machine learning from Microsoft
2017: Pediatric bone age
2018: Pneumonia
2019: Intracranial hemorrhage
...
2022: RSNA Cervical Spine Fracture AI Challenge
2023: RSNA Screening Mammography Breast Cancer Detection AI Challenge
Managed by the MIT Laboratory for Computational Physiology
Physiological and clinical data
Related open-source software
Challenges associated with annual Computing in Cardiology conference
Open Science Framework (OSF)
A collaborative environment for open science research. Also used to host open-access data, particularly from challenges. For example:
Image-to-Physical Liver Registration Sparse Data Challenge
Access the data at OSF: https://osf.io/u3dxy/
See the open-access publication from SPIE MI.
Commercial grand challenges
Retrospective Image Registration Experiment (RIRE)
THE database that started the concept of medical image challenges
Credited for demonstrating that mutual information typically outperforms landmark-based registration (now a generally accepted notion)
Focused on quantifying medical image rigid and affine registration accuracy
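To illustrate the similarity measure mentioned above, here is a minimal sketch of mutual information computed from the joint histogram of two equally sized lists of discrete intensities. It is not RIRE's evaluation procedure; the function name mutual_information and the toy data are assumptions made for this example.

```python
from collections import Counter
from math import log2


def mutual_information(a, b):
    """Mutual information (in bits) between two equal-length intensity sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("Inputs must be non-empty and of equal length")
    n = len(a)
    joint = Counter(zip(a, b))  # joint histogram of co-occurring intensities
    hist_a, hist_b = Counter(a), Counter(b)  # marginal histograms
    # Sum of p(x, y) * log2(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (count / n) * log2(count * n / (hist_a[x] * hist_b[y]))
        for (x, y), count in joint.items()
    )


# Toy "images" flattened to lists (hypothetical data).
fixed = [0, 0, 1, 1]
inverted = [1, 1, 0, 0]   # same structure, inverted intensities
unrelated = [0, 1, 0, 1]  # statistically independent of "fixed"
```

In this toy case the inverted image shares as much information with the fixed image as an identical copy would, while the unrelated image shares none. That insensitivity to the actual intensity mapping is why the measure suits multi-modality registration.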
Sites that list and/or host multiple collections of data
These data are typically well formatted and well documented
HuggingFace Datasets: https://huggingface.co/datasets
A collection of multiple, annotated datasets for training and evaluating AI models
Also includes links to pre-trained models that are running on HuggingFace servers - you can upload your data and apply their pre-trained models to it.
Examples include
192 datasets with the term "medical" in their name or description
Lymph node, artery, and vein segmentation from thoracic CT: https://huggingface.co/datasets/andreped/LyNoS
Retinal OCT images: https://huggingface.co/datasets/marcelhuber/downprojection_images
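For scripted access, dataset repositories like the ones listed above can be mirrored to a local folder. The following is a minimal sketch, assuming the huggingface_hub Python package is installed. The helper names repo_id_from_url and download_dataset are hypothetical, and some datasets may require accepting terms on the website before they can be downloaded.

```python
from urllib.parse import urlparse


def repo_id_from_url(url: str) -> str:
    """Extract the 'owner/name' repository id from a HuggingFace dataset URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected path layout: /datasets/<owner>/<name>
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"Not a HuggingFace dataset URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def download_dataset(url: str) -> str:
    """Download all files of a dataset repository; returns the local folder path."""
    # Imported lazily so the URL helper works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_from_url(url), repo_type="dataset")
```

Calling download_dataset("https://huggingface.co/datasets/andreped/LyNoS") would fetch the raw repository files; how to read them afterwards depends on the format each dataset uses.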
Zenodo: https://zenodo.org/
A collection of multiple datasets - providing a way to cite hosted data in your publications
Over 1,000 datasets match the filters "Open Access", "Zip File", and "Dataset" combined with the search term "Medical"
Examples include
AeroPath: Thoracic CT with segmented airways: https://zenodo.org/records/10069289
EEG Time series data: https://zenodo.org/records/4540350
COVID-19 CT Lung: https://zenodo.org/records/3757476
100,000 histological images of colorectal cancer: https://zenodo.org/records/1214456
Segmentations of 117 important anatomical structures in 1228 CT images - used to train TotalSegmentator V2: https://zenodo.org/records/10047292
A multi-speaker dataset of real-time two-dimensional speech magnetic resonance images with articulator ground-truth segmentations: https://zenodo.org/records/10046815
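Zenodo records such as those above can also be inspected programmatically through its REST API. The sketch below uses only the Python standard library. The helper names record_api_url and list_files are hypothetical, and the layout of the returned JSON (a "files" list whose entries carry "key" and "links") is an assumption that should be checked against the current API documentation.

```python
import json
from urllib.parse import urlparse
from urllib.request import urlopen


def record_api_url(record_url: str) -> str:
    """Map a record page URL (https://zenodo.org/records/<id>) to its REST API URL."""
    parts = [p for p in urlparse(record_url).path.split("/") if p]
    if len(parts) != 2 or parts[0] != "records" or not parts[1].isdigit():
        raise ValueError(f"Not a Zenodo record URL: {record_url}")
    return f"https://zenodo.org/api/records/{parts[1]}"


def list_files(record_url: str) -> list:
    """Return (file name, download link) pairs for a record (requires network access)."""
    with urlopen(record_api_url(record_url), timeout=30) as response:
        record = json.load(response)
    # Assumption: each entry in "files" has a "key" and a "links" mapping.
    return [
        (entry.get("key"), entry.get("links", {}).get("self"))
        for entry in record.get("files", [])
    ]
```

For example, list_files("https://zenodo.org/records/10069289") would query the AeroPath record. Large archives are best downloaded with a tool that supports resuming.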
The Cancer Imaging Archive (TCIA)
Formerly the National Biomedical Imaging Archive (NBIA):
Lung Image Database Consortium (LIDC)
Reference Image Database to Evaluate Response (RIDER)
Breast MRI
Lung PET/CT
Neuro MRI
CT Colonography
Virtual Colonoscopy
Osteoarthritis Initiative (OAI)
PET/CT phantom scan collection
Contains COVID CT
Data from phantoms, simulated data
Misc. clinical data
Includes links to data de-identification tools
NCI's Imaging Data Commons and Genomic Data Commons
DICOM formatted clinical data and annotations for AI/Cloud
Extension to and contains reformatted data from TCIA
Part of the larger NCI Cancer Research Data Commons effort
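When sorting through downloaded DICOM-formatted data, a quick way to spot DICOM files is to check for the standard file signature: a 128-byte preamble followed by the four bytes "DICM". The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and some older files omit the preamble, so a negative result is not conclusive. For actually reading images and metadata, a dedicated library such as pydicom is the usual choice.

```python
import io


def looks_like_dicom(stream) -> bool:
    """Check a binary stream for the 128-byte preamble followed by b'DICM'."""
    header = stream.read(132)
    return len(header) == 132 and header[128:132] == b"DICM"


def is_dicom_file(path: str) -> bool:
    """Apply the signature check to a file on disk."""
    with open(path, "rb") as handle:
        return looks_like_dicom(handle)


# In-memory demonstration with synthetic bytes (not real image data).
sample = io.BytesIO(b"\x00" * 128 + b"DICM" + b"...")
print(looks_like_dicom(sample))  # True
```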
Images and datasets from a wide variety of scientific computing (including medical imaging) domains
For example,
Liver tumors with segmentations
Human Carotid
BrainWeb
100 Healthy Brain MRIs: 18-90 years old
Age and gender balanced collection
MRA, T1, T2, and some DTI
Intracranial vessels extracted from select patients.
National Institute for Mental Health's (NIMH's) OpenNeuro.org
BIDS compliant MRI, PET, MEG, EEG, and iEEG data
Includes data from healthy volunteers (See release notes)
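BIDS-compliant datasets encode metadata in file names as underscore-separated key-value entities followed by a suffix, for example sub-01_ses-02_task-rest_bold.nii.gz. The sketch below splits such a name using only the Python standard library. The function name parse_bids_name and the example file name are assumptions for illustration. It does not validate names against the BIDS specification, and a dedicated library such as PyBIDS is better suited to real projects.

```python
def parse_bids_name(filename: str) -> dict:
    """Split a BIDS-style file name into entities, suffix, and extension."""
    name = filename.split("/")[-1]  # drop any leading directories
    stem, _, extension = name.partition(".")
    *pairs, suffix = stem.split("_")
    entities = {}
    for pair in pairs:
        key, separator, value = pair.partition("-")
        if not separator or not value:
            raise ValueError(f"Not a key-value entity: {pair!r}")
        entities[key] = value
    return {
        "entities": entities,
        "suffix": suffix,
        "extension": "." + extension if extension else "",
    }


info = parse_bids_name("sub-01/ses-02/func/sub-01_ses-02_task-rest_bold.nii.gz")
print(info["entities"])  # {'sub': '01', 'ses': '02', 'task': 'rest'}
```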
Cornell Visualization and Image Analysis (VIA) group
Provides a list of available databases, many of which are also listed here.
SICAS Medical Image Repository
Post mortem CT of 50 subjects
CT, microCT, segmentation, and models of Cochlea
Copies of select challenge data (e.g., BRATS2015)
Medical Imaging and Data Resource Center (MIDRC)
A joint effort between the American College of Radiology (ACR), the Radiological Society of North America (RSNA), and the American Association of Physicists in Medicine (AAPM)
Nightingale Open Science (jump to data at https://docs.ngsci.org/)
A non-profit initiative that works closely with health systems around the world to create and curate de-identified datasets of medical images
Includes imaging, wave-forms (ECG), and other high-dimensional data
UCI Machine Learning Repository
The father of internet data archives for all forms of machine learning.
Stanford AI in Medicine Database
Mix of X-ray, CT, and MRI of chest, hands, etc.
Computer Vision Online Image Archive
Large listing of multiple databases in computer vision and biomedical imaging
Medical Image Databases & Libraries
Data for specific topics or anatomy
NIH Database of 100,000 Chest X-Rays
Images, associated clinical data, annotations, and diagnoses
Cross-sectional MRI Data in Young, Middle Aged, Nondemented and Demented Older Adults
Longitudinal MRI Data in Nondemented and Demented Older Adults
Alzheimer’s Disease Neuroimaging Initiative (ADNI) unites researchers with study data as they work to define the progression of Alzheimer’s disease. ADNI researchers collect, validate and utilize data such as MRI and PET images, genetics, cognitive tests, CSF and blood biomarkers as predictors for the disease.
The Federal Interagency Traumatic Brain Injury Research (FITBIR) informatics system: MRI, PET, Contrast, and other data on a range of TBI conditions
Structured Analysis of the Retina: This research concerns a system to automatically diagnose diseases of the human eye.
Digital Retinal Images for Vessel Extraction (DRIVE)
Digital images and expert segmentations of retinal vessels.
Whole-slide images from The Cancer Genome Atlas's (TCGA) glioblastoma multiforme (GBM) samples
Johns Hopkins Medical Institute's DTI collection
DTI Atlases: adults, children, ...
Duke Center for In Vivo Microscopy
Small animal MRI, CT, ...
MIT Intensive Care Unit Admissions (MIMIC)
60,000 de-identified health data records
Digital Database for Screening Mammography (DDSM)
Large collection with normal and abnormal findings and ground truth.
Japanese Society of Radiological Technology (JSRT) Database
Digital Chest X-ray images with lung nodule locations, ground truth, and controls.
Segmentation in Chest Radiographs (SCR) database
Digital Chest X-ray images with segmentations of lung fields, heart, and clavicles.
Public Lung Database to Address Drug Response
Well documented chest CT images.
Mammographic Image Analysis Society (mini-MIAS) Database
Mammographic images and markup.
Standard Diabetic Retinopathy Database (DIARETDB1)
Digital retinal images for detecting and quantifying diabetic retinopathy.
SpineWeb is an online collaborative platform for everyone interested in research on spinal imaging and image analysis.
MR data of hips, knees, and other sites affected by osteoarthritis
MR brain
3D craniofacial surface measurements
Collection of files intended for 3D printing, but includes volumetric medical scans (i.e., CT and MRI in NRRD format) for a variety of anatomic structures (bones, muscles, vessels).
For example, see
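As a side note on the NRRD format mentioned above: an NRRD file begins with a plain-text header (a magic line such as NRRD0004, then "field: value" lines, ended by a blank line) that can be inspected before loading the volume. The following is a minimal sketch using only the Python standard library. The function name read_nrrd_header and the sample bytes are assumptions for illustration, and the line parsing is deliberately simplistic. It does not decode the voxel data; libraries such as pynrrd, ITK, or SimpleITK handle that.

```python
import io


def read_nrrd_header(stream) -> dict:
    """Read the text header of an NRRD file from a binary stream."""
    magic = stream.readline().decode("ascii", "replace").strip()
    if not magic.startswith("NRRD"):
        raise ValueError("Missing NRRD magic line")
    fields = {}
    for raw in iter(stream.readline, b""):
        line = raw.decode("ascii", "replace").strip()
        if not line:  # a blank line ends the header
            break
        if line.startswith("#"):  # comment line
            continue
        if ":=" in line:  # free-form key/value pair
            key, value = line.split(":=", 1)
        elif ": " in line:  # standard field
            key, value = line.split(": ", 1)
        else:
            continue
        fields[key.strip()] = value.strip()
    return fields


# Synthetic header for demonstration (not a real scan).
sample = io.BytesIO(
    b"NRRD0004\n# example\ntype: short\ndimension: 3\n"
    b"sizes: 512 512 120\nencoding: gzip\n\n\x00\x01"
)
header = read_nrrd_header(sample)
print(header["sizes"])  # 512 512 120
```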
Data geared towards education
A free online Medical Image Database with over 59,000 indexed and curated images, from over 12,000 patients
Image Based Medical Reference: "Find Algorithms, Decision Aids, Checklists, Guidelines, Differentials, Point of Care Ultrasound (POCUS), Physical Exam clips and more"
Simulated and phantom data
Simulated brain MR database.
Free AI-generated photos for academic research
Free Human Anatomy Images and Pictures
Database of high-quality, tracked ultrasound images of phantoms
"a database of tracked sequences of US images from medical phantoms acquired with a methodology that ensures the spatial and force control of the US probe along prescribed trajectories by using a robotic arm and an optical tracking system." (Abdominal and baby phantoms)
Search tools
Google launched Dataset Search, "so that scientists, data journalists, data geeks, or anyone else can find the data required for their work and their stories, or simply to satisfy their intellectual curiosity."