Recommended Repositories

In general, data should be submitted to discipline-specific, community-recognised repositories where possible, or to generalist repositories if no suitable community resource is available.

This list is derived from Scientific Data’s recommended repository list. The list is also available on figshare for free reuse, with attribution, by others. Scientific Data is an open access data journal published by Springer Nature.

Authors should consult journal information for authors in case of more specific repository recommendations. Some Springer Nature journals, such as Human Genome Variation, maintain their own research data repositories.

This list is not currently comprehensive for all research disciplines. Researchers from disciplines such as Computer Science and the Humanities should also consult re3data.org, the global registry of research data repositories, or consider one of the generalist repositories listed below, such as figshare.

If an author’s preferred repository is not listed below, we encourage repository managers, in the first instance, to investigate listing their repositories with re3data.org. Life sciences repositories should also be listed with Biosharing. Repositories that meet the criteria should apply for listing with Scientific Data.

See our data repositories FAQs for more information.

Biological sciences

Nucleic acid sequence

Sequence information should be deposited following the MIxS guidelines.
Simple genetic polymorphisms or structural variations should be submitted to dbSNP or dbVar, respectively (please note that these repositories cannot accept sensitive data derived from human subjects). The NCBI Trace Archive may be used for capillary electrophoresis data, while SRA accepts NGS data only.

Nucleic acid sequence repositories
Database of Genomic Variants Archive (DGVa)
dbSNP
dbVar

DNA DataBank of Japan (DDBJ)

EBI Metagenomics

EMBL Nucleotide Sequence Database (ENA)

European Variation Archive (EVA)
GenBank
NCBI Assembly

NCBI Sequence Read Archive (SRA)

NCBI Trace Archive

Protein sequence

Protein sequence repositories

UniProtKB*

Molecular & supramolecular structure

These repositories accept structural data for small molecules (COD); peptides and proteins (all); and larger assemblies (EMDB).

Molecular & supramolecular structure repositories
Biological Magnetic Resonance Data Bank (BMRB)

Coherent X-ray Imaging Data Bank (CXIDB)

Crystallography Open Database (COD)

Electron Microscopy Data Bank (EMDB)

Protein Circular Dichroism Data Bank (PCDDB)
Structural Biology Data Grid

Worldwide Protein Data Bank (wwPDB)

Neuroscience

These data repositories all accept human-derived data (NeuroMorpho.org additionally accepts imaging data from other organisms). Please note that human-subject data submitted to OpenfMRI must be de-identified, while FCP/INDI can handle sensitive patient data.

Neuroscience repositories
Functional Connectomes Project International Neuroimaging Data-Sharing Initiative (FCP/INDI)

NeuroMorpho.org

OpenfMRI

Omics

Please refer to the MIAME standard for microarray data. Molecular interaction data should be deposited with a member of the International Molecular Exchange Consortium (IMEx), following the MIMIx recommendations.
For data linking genotyping and phenotyping information in human subjects, we strongly recommend submission to dbGaP or EGA, which have mechanisms in place to handle sensitive data.

Omics repositories

ArrayExpress

Biological General Repository for Interaction Datasets

Database of Interacting Proteins (DIP)

dbGaP
The European Genome-phenome Archive (EGA)

Gene Expression Omnibus (GEO)

GenomeRNAi

IntAct
Japanese Genotype-phenotype Archive
NCBI PubChem BioAssay

Metabolomics

Metabolomics data should be submitted following the MSI guidelines.

Metabolomics repositories

MetaboLights

Proteomics

We ask authors to submit proteomics data to members of the ProteomeXchange consortium (listed below), following the MIAPE recommendations.

Proteomics repositories

PeptideAtlas

PRIDE

ProteomeXchange

Taxonomy & species diversity

Taxonomy & species diversity repositories
Global Biodiversity Information Facility (GBIF)

Integrated Taxonomic Information System (ITIS)

KNB: The Knowledge Network for Biocomplexity

MorphoBank.org

NCBI Taxonomy*

Mathematical & modelling resources

Mathematical & modelling resources repositories

BioModels Database

Kinetic Models of Biological Systems (KiMoSys)

Cytometry

Cytometry repositories

FlowRepository

Organism-focused resources

These resources provide information specific to a particular organism or disease pathogen. Where applicable, data records should be submitted both to a community repository and to one suitable for the type of data (e.g. transcriptome profiling; please see above).

Organism-focused resources repositories

Eukaryotic Pathogen Database Resources (EuPathDB)

FlyBase

Influenza Research Database

Mouse Genome Informatics (MGI)
Rat Genome Database (RGD)

VectorBase

Xenbase

Zebrafish Model Organism Database (ZFIN)

Chemistry & chemical biology

Health sciences

Some of the repositories in this section are suitable for datasets requiring restricted data access, which may be needed to preserve the anonymity of study participants in clinical datasets. We suggest contacting repositories directly to determine which have data access controls best suited to the specific requirements of your study.

Health sciences repositories

National Addiction & HIV Data Archive Program (NAHDAP)

National Database for Autism Research (NDAR)

Cancer Imaging Archive

ClinicalTrials.gov
Virtual Skeleton Database (SICAS medical image repository)
PhysioNet
National Database for Clinical Trials related to Mental Illness (NDCT)

Research Domain Criteria Database (RDoCdb)

Physics, astrophysics & astronomy

Generalist repositories

Springer Nature encourages authors to archive data in one of the data-type-specific repositories above. However, generalist repositories can handle a wide variety of data, and may also be appropriate for storing associated analyses or experimental-control data that supplement the primary data record.

Generalist repositories

Dryad Digital Repository

figshare

Harvard Dataverse

Open Science Framework

Zenodo


Other repositories

In areas where well-established subject-specific or data-type-specific repositories exist, we ask authors to submit their data to the appropriate resources. If none of the discipline-specific repositories listed above is suitable, we are glad to support the use of other repositories, provided they can mint DataCite DOIs for hosted data and share data under open terms of use (for example, the CC0 waiver). These might include institutional, project-specific or national archives, which only accept data deposition from researchers associated with a specific institution, project, or nation respectively.