The Access to Biological Collections Data (ABCD) Schema is an evolving standard for the access to and exchange of data about specimens and observations (a.k.a. primary biodiversity data). It aims to be comprehensive and highly structured, supporting data from a wide variety of databases, and is compatible with several existing data standards. Parallel structures exist so that atomised data, free text, or both can be accommodated.
Sponsored by Biodiversity Information Standards (TDWG, formerly the Taxonomic Databases Working Group), the current specification was last modified in 2007.
A body of standards, including a glossary of terms (in other contexts these might be called properties, elements, fields, columns, attributes, or concepts) intended to facilitate the sharing of information about biological diversity by providing reference definitions, examples, and commentaries.
Sponsored by Biodiversity Information Standards (TDWG), the current standard was last modified in October 2009.
Ecological Metadata Language (EML) is a metadata specification developed specifically for the discipline of ecology. It is based on prior work done by the Ecological Society of America and associated efforts (Michener et al., 1997, Ecological Applications).
Sponsored by ecoinformatics.org, EML Version 2.2.0 was released in 2019.
Genome metadata on PATRIC consists of 61 different metadata fields, called attributes, which are organized into the following seven broad categories: Organism Info, Isolate Info, Host Info, Sequence Info, Phenotype Info, Project Info, and Others.
The Investigation/Study/Assay (ISA) tab-delimited (TAB) format is a general purpose framework with which to collect and communicate complex metadata (i.e. sample characteristics, technologies used, type of measurements made) from 'omics-based' experiments employing a combination of technologies.
Created by core developers from the University of Oxford, ISA-TAB v1.0 was released in November 2008.
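As a minimal sketch of what "tab-delimited" means in practice, the following Python reads a small study-style table and maps each sample to its organism. The table content, the helper names (read_study_table, organisms_by_sample) and the choice of columns are assumptions made for illustration; real ISA-TAB files carry many more columns and are split across investigation, study and assay files.

```python
import csv
import io

# Hypothetical study-table content, invented for this example.
STUDY_TABLE = (
    "Source Name\tCharacteristics[organism]\tProtocol REF\tSample Name\n"
    "source1\tHomo sapiens\tsample collection\tsample1\n"
    "source2\tMus musculus\tsample collection\tsample2\n"
)


def read_study_table(text):
    """Parse tab-delimited text into a list of dicts keyed by column header."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


def organisms_by_sample(rows):
    """Map each sample name to the organism recorded for its source."""
    return {row["Sample Name"]: row["Characteristics[organism]"] for row in rows}


rows = read_study_table(STUDY_TABLE)
sample_organisms = organisms_by_sample(rows)
```

Because the format is plain tab-delimited text, a general-purpose CSV reader is enough to get at the sample characteristics; dedicated ISA tooling is still the better choice for validation.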
A common portal to a group of nearly 40 checklists of Minimum Information for various biological disciplines. The MIBBI Foundry is developing a cross-analysis of these guidelines to create an intercompatible, extensible community of standards.
The concept was realized initially through the joint efforts of the Proteomics Standards Initiative, the Genomic Standards Consortium and the MGED RSBI Working Groups. The latest project to register with MIBBI is the MIABiE guidelines for reporting biofilm research, as of January 2012.
NeXus is an international standard for the storage and exchange of neutron, x-ray, and muon experiment data. The structure of NeXus files is extremely flexible, allowing the storage of both simple data sets, such as a single data array and its axes, and highly complex data and their associated metadata, such as measurements on a multi-component instrument or numerical simulations. NeXus is built on top of the container format HDF5, and adds domain-specific rules for organizing data within HDF5 files in addition to a dictionary of well-defined domain-specific field names.
Observ-OM is founded on four basic concepts to represent any kind of observation: Targets, Features, Protocols (and their Applications), and Values. It is intended to lower the barrier for future data sharing and facilitate integrated search across panels and species. All models, formats, documentation, and software are available for free and open source (LGPLv3) at http://www.observ-om.org.
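One way to picture how the four concepts relate is the Python sketch below. It is not the Observ-OM model itself: the class names follow the concepts named above, but every field name, the example data and the values_for_target helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Target:
    """The thing being observed, e.g. an individual or a sample."""
    name: str


@dataclass(frozen=True)
class Feature:
    """What is observed about a target, e.g. height."""
    name: str
    unit: str = ""


@dataclass
class Protocol:
    """A procedure that measures one or more features."""
    name: str
    features: List[Feature] = field(default_factory=list)


@dataclass
class ProtocolApplication:
    """One application of a protocol to one target."""
    protocol: Protocol
    target: Target


@dataclass
class ObservedValue:
    """The value recorded for a feature during a protocol application."""
    application: ProtocolApplication
    feature: Feature
    value: str


def values_for_target(values: List[ObservedValue], target_name: str) -> Dict[str, str]:
    """Collect feature name -> value for every observation of one target."""
    return {
        v.feature.name: v.value
        for v in values
        if v.application.target.name == target_name
    }


# Hypothetical example data.
height = Feature("height", "cm")
weight = Feature("weight", "kg")
exam = Protocol("physical examination", [height, weight])
person = Target("individual-001")
visit = ProtocolApplication(exam, person)
observations = [
    ObservedValue(visit, height, "172"),
    ObservedValue(visit, weight, "68"),
]
summary = values_for_target(observations, "individual-001")
```

Keeping the value separate from the feature and from the protocol application is what would let the same structure describe observations on different kinds of targets.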
Open Data for Access and Mining (ODAM) Structural Metadata is a format describing how the metadata should be formatted and what should be included to ensure ODAM compliance for a data set. To comply with this format, two metadata files in TSV format are required in addition to the data file(s). These two files describe the metadata of the dataset, which includes descriptions of measures and structural metadata like references between tables. The metadata lets non-expert users explore and visualize your data. By making data interoperable and reusable by both humans and machines, it also encourages data dissemination according to FAIR principles. The structural metadata is specified in section 'Data collection and preparation' on the website.
OME-XML is a vendor-neutral file format for biological image data, with an emphasis on metadata supporting light microscopy. It can be used as a data file format in its own right, or as a way of encoding metadata within a TIFF or BigTIFF file (for which purpose there is the OME-TIFF specification).
The standard is maintained by the Open Microscopy Environment Consortium, and was last updated in June 2012.
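To give a feel for the shape of OME-XML metadata, the Python sketch below builds a very small fragment with the standard library and reads the pixel dimensions back. The namespace version (2016-06), the helper names and the example values are assumptions made for this example, and the fragment is not validated against the schema: a complete document needs further elements, such as a description of where the pixel data lives.

```python
import xml.etree.ElementTree as ET

# Assumed schema namespace; other schema releases use a different date suffix.
OME_NS = "http://www.openmicroscopy.org/Schemas/OME/2016-06"


def build_ome_xml(name, size_x, size_y):
    """Build a minimal, unvalidated OME-XML fragment for one 2D image."""
    ET.register_namespace("", OME_NS)
    ome = ET.Element(f"{{{OME_NS}}}OME")
    image = ET.SubElement(
        ome, f"{{{OME_NS}}}Image", {"ID": "Image:0", "Name": name}
    )
    ET.SubElement(
        image,
        f"{{{OME_NS}}}Pixels",
        {
            "ID": "Pixels:0",
            "DimensionOrder": "XYZCT",
            "Type": "uint8",
            "SizeX": str(size_x),
            "SizeY": str(size_y),
            "SizeZ": "1",
            "SizeC": "1",
            "SizeT": "1",
        },
    )
    return ET.tostring(ome, encoding="unicode")


def read_pixel_dimensions(xml_text):
    """Return (SizeX, SizeY) from the first Image/Pixels element."""
    root = ET.fromstring(xml_text)
    pixels = root.find(f"{{{OME_NS}}}Image/{{{OME_NS}}}Pixels")
    return int(pixels.get("SizeX")), int(pixels.get("SizeY"))


ome_xml = build_ome_xml("example", 512, 256)
dimensions = read_pixel_dimensions(ome_xml)
```

In an OME-TIFF file, metadata of this kind is embedded in the TIFF rather than stored as a separate document.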
Protein Data Bank archive (PDB) is the single worldwide archival repository of information about the 3D structures of proteins, nucleic acids, and complex assemblies, managed by the Worldwide PDB (wwPDB). The PDB Exchange Dictionary (PDBx) is used by the wwPDB to define data content for deposition, annotation and archiving of PDB entries. PDBx incorporates the community standard metadata representation, the Macromolecular Crystallographic Information Framework (mmCIF), originally developed under the auspices of the International Union of Crystallography (IUCr). PDBx has been extended by the wwPDB to include descriptions of other experimental methods that produce 3D macromolecular structure models such as Nuclear Magnetic Resonance Spectroscopy, 3D Electron Microscopy and Tomography.
Some repositories have decided that current standards do not fit their metadata needs, and so have created their own requirements.
RO-Crate is a community effort to establish a lightweight approach to packaging research data with their metadata. It is based on schema.org annotations in JSON-LD, and aims to make best-practice in formal metadata description accessible and practical for use in a wider variety of situations, from an individual researcher working with a folder of data, to large data-intensive computational research environments.
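For the "individual researcher with a folder of data" case, a crate's metadata could look roughly like the output of the Python sketch below. It assumes the RO-Crate 1.1 conventions (a JSON-LD document with a metadata descriptor and a root Dataset entity); the helper name build_ro_crate_metadata and the example file names are hypothetical, and recommended properties such as licence and publication date are left out for brevity.

```python
import json


def build_ro_crate_metadata(name, description, file_names):
    """Return a minimal RO-Crate style JSON-LD structure for a folder of files."""
    # Descriptor entity: identifies the metadata file and what it is about.
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    # Root dataset: the folder itself, listing the files it contains.
    root_dataset = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "hasPart": [{"@id": file_name} for file_name in file_names],
    }
    # One entity per file; "File" is the alias RO-Crate uses for a media object.
    file_entities = [
        {"@id": file_name, "@type": "File", "name": file_name}
        for file_name in file_names
    ]
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [descriptor, root_dataset, *file_entities],
    }


# Hypothetical folder contents.
crate = build_ro_crate_metadata(
    "Example field survey",
    "Counts of observed species per site.",
    ["counts.csv", "sites.csv"],
)
crate_json = json.dumps(crate, indent=2)
```

Written to a file named ro-crate-metadata.json at the top of the folder, this is the kind of document that turns a plain directory into a crate.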
A metadata standard for describing environmental monitoring activities, programmes, networks and facilities published by the UK Environmental Observation Framework (UKEOF).
An extension of the ABCD standard for DNA data.
Darwin Core documentation and recommendations for herbaria.
A protocol-independent XML schema for a geospatial extension to the Darwin Core.
An extension to the Darwin Core standard, it includes additional terms required to describe plant genetic resources and in particular germplasm seed samples.
The European Directory of Marine Environmental Datasets metadata scheme, which is a profile of ISO 19115.
A profile of the FGDC/CSDGM metadata standard, intended to support the collection and processing of biological data.
Established by a global network of countries and organizations, GBIF is a web portal promoting and facilitating the mobilization, access, discovery and use of biodiversity data. The portal uses a profile of EML; a How-to Guide and Reference Guide for using the profile are available.
An extension to ABCD 2.06, it is designed to allow the storage and transmission of herbarium plant specimen data.
An extension of ISA-TAB specifying the format for representing and sharing information about nanomaterials, small molecules and biological specimens along with their assay characterization data.
A list of nearly 40 Minimum Information standards projects registered with the MIBBI initiative.
An ISA-Tab-based standard for reporting the results of single nucleotide resolution nucleic acid structure mapping experiments.
Bio-Formats reads proprietary microscopy image data and metadata, and converts them to OME-TIFF, a combination of TIFF and OME-XML.
A web application that offers data publishers wishing to serve data to the GBIF network an easy interface for describing data elements held as basic text files and for composing an appropriate XML Darwin Core descriptor file to accompany them.
A tool to validate XML metadata against the Darwin Core Text Guidelines.
The open source ISA metadata tracking tools facilitate ISA-TAB-compliant collection, curation, local management and reuse of datasets in an increasingly diverse set of life science domains.
Metacat is a repository for data and metadata that helps scientists find, understand, and effectively use the data sets they manage or that have been created by others.
A software generator to rapidly build web databases and a suite of web databases for genotype, phenotype, QTL and analysis pipelines.
An application for accessing and manipulating metadata and data (both locally and on the network), with wizards creating metadata files using a subset of Ecological Metadata Language (EML).
Experimental data table management software to make research data accessible and available for reuse with minimal effort on the part of the data provider. Designed to manage experimental data tables in an easy way for users, ODAM provides a model for structuring both data and metadata that facilitates data handling and analysis. It also encourages data dissemination according to FAIR principles by making the data interoperable and reusable by both humans and machines, allowing the dataset to be explored and then extracted in whole or in part as needed.
Repository software for organising, viewing, analysing and sharing biological microscopy images. It supports proprietary file formats but normalises to OME-TIFF/OME-XML.
Tool for downloading data from PATRIC.
Bioinformatics tools to create and extract metadata compliant with the MIBBI-registered MIAPE minimum requirements.
The UKEOF Catalogue contains over 2000 metadata records of environmental observations undertaken and funded by public and third sector organisations.
The Catalogue provides a unique management tool to underpin the activities and requirements of the environmental observation community. It provides a strong basis for strategic planning, giving a holistic overview of environmental observations as well as a place to discover who is doing what, where, why and when.
An aggregation of information on all the known species in Australia, collected from museums, herbaria, community groups, government departments, individuals and universities. All data is converted to Darwin Core.
The BioCASE Biological Unit Network provides access to a transnational network of biological collections; its protocol requires providers to use the ABCD schema in their configuration files.
A repository hosting computational models of biological systems, using the MIBBI-registered MIRIAM and MIASE minimal metadata requirements.
This national facility for looking after and distributing data concerning the marine environment requires that data sets use a well-documented format such as CF-compliant NetCDF and be accompanied by a Dublin Core record as well as discovery metadata in a recognised standard such as DIF or FGDC/CSDGM.
A virtual laboratory for neurophysiology, enabling sharing and collaborative exploitation of data, analysis, code and expertise. Metadata must include the MIBBI-registered MINI recommendations.
A resource database of images, videos, and animations of cells, capturing a wide diversity of organisms, cell types, and cellular processes. Its native metadata format for images is OME-XML.
A repository-developed metadata schema for EST data in Genbank.
The Environmental Information Data Centre (EIDC) is a Natural Environment Research Council Data Centre hosted by the Centre for Ecology & Hydrology (CEH). It manages nationally-important datasets concerned with the terrestrial and freshwater sciences.
A database of flow cytometry experiments where you can query and download data collected and annotated according to the MIBBI-registered MIFlowCyt standard.
Established by a global network of countries and organizations, GBIF is a web portal promoting and facilitating the mobilization, access, discovery and use of biodiversity data. The preferred format for publishing data to the GBIF network is the Darwin Core Archive, and its Integrated Publishing Toolkit uses EML as its data standard.
One of two research centers in the US creating libraries of signatures that describe how cells respond to perturbation, it uses the ISA-TAB standard to describe its data.
An international collaboration to provide access to a non-redundant set of protein-protein interaction data from a broad taxonomic range of organisms. IMEx partner databases require data to be MIMIx (a MIBBI-registered standard) compatible.
A network of systems and projects that use the ISA-Tab file format, and/or are powered by components of the ISA software suite.
A repository for viewing and analysing multi-dimensional image data associated with articles published in The Journal of Cell Biology. Its native metadata format is OME-XML.
A network of federated institutions that have agreed to share data and metadata using a common framework, principally revolving around the use of the Ecological Metadata Language as a common language for describing ecological data.
A network providing the scientific expertise, research platforms, and long-term datasets necessary to document and analyze environmental change, it uses the Ecological Metadata Language in describing its data.
A database for metabolomics experiments and derived information in ISA-Tab format.
An EML developer, this US-based centre of cross-disciplinary research uses existing data to address major fundamental issues in ecology and allied fields.
An online portal for education and research on learning in Science, Technology, Engineering, and Mathematics, using a profile of the Dublin Core Metadata Elements for resource and collections metadata.
A data repository for marine species datasets from all of the world's oceans; it uses an extension of Darwin Core 2 as its data standard.
Ocean Networks Canada operates the world-leading NEPTUNE and VENUS cabled ocean observatories that collect data on physical, chemical, biological, and geological aspects of the ocean over long time periods, supporting research on complex Earth processes. The CF standard is used within netCDF data products delivered through the Oceans 2.0 interface and via OPeNDAP webservices.
A centralized, MIBBI standards compliant, public data repository for proteomics data, post-translational modifications and supporting spectral evidence.
A web portal using Darwin Core to describe biodiversity data collected in Madagascar.
Four distributed database networks (MaNIS, HerpNET, ORNIS and FishNet) using a Darwin Core engine to make bioinformatics specimen data interoperable, mappable and publicly available.
Protein Data Bank archive (PDB) is the single worldwide archival repository of information about the 3D structures of proteins, nucleic acids, and complex assemblies. The Worldwide PDB (wwPDB) organization manages the PDB archive and ensures that the PDB is freely and publicly available to the global community.