Dataset | Modality | Language | Task Type | Number of Samples | Year | License |
---|---|---|---|---|---|---|
COUGHVID | Cough audio | English | COVID-19 detection | 20,000 | 2020 | CC-BY 4.0 |
Coswara | Cough, breath, speech | English | COVID-19 detection | 5,000 | 2022 | CC-BY 4.0 |
UK COVID-19 Vocal Audio Dataset | Cough, breath, speech | English | COVID-19 detection | 70,000 | 2023 | OGL v3.0 |
Respiratory Sound Database | Lung auscultation sounds | English | Respiratory disease classification | 920 | 2017 | CC-BY 4.0 |
smarty4covid | Cough, breath, voice | English | COVID-19 detection | 4,600 | 2023 | CC-BY 4.0 |
Bridge2AI-Voice | Voice recordings | English | Voice biomarker research | Not specified | 2025 | Apache-2.0 |
VOICED | Voice recordings | English | Pathological voice analysis | 208 | 2018 | ODC-BY 1.0 |
Perceptual Voice Qualities Dataset | Voice recordings | English | Perceptual voice quality | 360+ | 2020 | CC-BY 4.0 |
COVID-19 Voice Dataset | Voice recordings | English | COVID-19 detection | Not specified | 2023 | CC-BY 4.0 |
ALS IAC Speech Corpus | Speech | English | ALS | Not specified | 2024 | CC-BY 4.0 |
PMC COVID-19 Voice Dataset | Voice recordings | English | COVID-19 detection | Not specified | 2022 | OGL v3.0 |

Dataset | Modality | Language | Task Type | Number of Samples | Year | License |
---|---|---|---|---|---|---|
PulseDB | ECG, PPG, ABP waveforms | English | Cuff-less blood pressure estimation | 5,245,454 | 2023 | ODbL |
MIMIC-BP | ECG, PPG, ABP waveforms | English | Blood pressure estimation | 12,000 | 2024 | ODC-By 1.0 |
Pulse-ECG | ECG images | English | ECG interpretation | 1,160,000 | 2023 | Apache-2.0 |
MTHS Dataset | Video-PPG, ECG signals | English | Heart rate and SpO₂ estimation | 65 | 2023 | CC BY-NC-ND 4.0 |
Welltory Dataset | Video-PPG, ECG signals | English | Heart rate variability analysis | 21 | 2023 | CC BY-NC-ND 4.0 |
BUT-PPG Dataset | Video-PPG signals | English | Heart rate estimation | 65 | 2023 | CC-BY 4.0 |

Dataset | Modality | Language | Task Type | Number of Samples | Year | License |
---|---|---|---|---|---|---|
Huatuo-26M | Text | Chinese | QA, Dialogue | 26M+ | 2023 | CC-BY 4.0 |
TCMD | Text | Chinese | Syndrome-Finding Mapping | 100,000+ | 2024 | CC-BY 4.0 |
CMD | Text | Chinese | Medical Dialogue | 25,000+ dialogues | 2020 | MIT |