Biochemistry

Standards

CSMD (Core Scientific Metadata Model)

A study-data oriented model, primarily in support of the ICAT data management infrastructure software. The CSMD is designed to support data collected within a large-scale facility's scientific workflow; however, the model is also designed to be generic across scientific disciplines.

Sponsored by the Science and Technology Facilities Council, the latest full specification available is v4.0, from 2013.

DIF (Directory Interchange Format)

An early metadata initiative from the Earth sciences community, intended for the description of scientific data sets. It includes elements focusing on instruments that capture data, temporal and spatial characteristics of the data, and projects with which the dataset is associated. It is defined as a W3C XML Schema.

Sponsored by the Global Change Master Directory, the DIF Writer's Guide Version 6 is from November 2010.

FGDC/CSDGM (Federal Geographic Data Committee Content Standard for Digital Geospatial Metadata)

A widely used but no longer current standard defining the information content for a set of digital geospatial data required by the US Federal Government.

CSDGM was sponsored by the US Federal Geographic Data Committee. However, in September 2010 the FGDC endorsed ISO 19115 and began encouraging federal agencies to transition to ISO metadata.

ISA-Tab

The Investigation/Study/Assay (ISA) tab-delimited (TAB) format is a general-purpose framework with which to collect and communicate complex metadata (e.g. sample characteristics, technologies used, type of measurements made) from 'omics-based' experiments employing a combination of technologies.

Created by core developers from the University of Oxford, ISA-TAB v1.0 was released in November 2008.
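
As an illustration of the tab-delimited layout, the following minimal Python sketch groups the files of an ISA-Tab archive by role and splits one tab-delimited row into fields. It assumes the commonly used file-name prefixes (`i_` for investigation, `s_` for study, `a_` for assay); the function names `classify_isatab_files` and `parse_tab_line` and the sample file names are hypothetical and not part of any ISA tool.

```python
from collections import defaultdict

# Assumed ISA-Tab naming convention: i_*.txt, s_*.txt, a_*.txt
PREFIX_ROLES = {"i_": "investigation", "s_": "study", "a_": "assay"}


def classify_isatab_files(filenames):
    """Group file names by the role implied by their two-character prefix."""
    groups = defaultdict(list)
    for name in filenames:
        role = PREFIX_ROLES.get(name[:2].lower(), "other")
        groups[role].append(name)
    return dict(groups)


def parse_tab_line(line):
    """Split one tab-delimited row and drop surrounding double quotes."""
    return [field.strip('"') for field in line.rstrip("\n").split("\t")]


# Hypothetical archive contents, for illustration only.
files = ["i_investigation.txt", "s_study.txt", "a_assay.txt", "README"]
print(classify_isatab_files(files))
print(parse_tab_line('"Source Name"\t"Characteristics[organism]"\n'))
```

A real archive would normally be read with dedicated ISA software; the sketch only shows how little is needed to inspect the files by hand.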

MIBBI (Minimum Information for Biological and Biomedical Investigations)

A common portal to a group of nearly 40 checklists of Minimum Information for various biological disciplines. The MIBBI Foundry is developing a cross-analysis of these guidelines to create an intercompatible, extensible community of standards.

The concept was realized initially through the joint efforts of the Proteomics Standards Initiative, the Genomic Standards Consortium and the MGED RSBI Working Groups. The latest project to register with MIBBI is the MIABiE guidelines for reporting biofilm research, as of January 2012.

ODAM Structural Metadata

Open Data for Access and Mining (ODAM) Structural Metadata is a format describing how the metadata should be formatted and what should be included to ensure ODAM compliance for a data set. To comply with this format, two metadata files in TSV format are required in addition to the data file(s). These two files describe the metadata of the dataset, which includes descriptions of measures and structural metadata like references between tables. The metadata lets non-expert users explore and visualize your data. By making data interoperable and reusable by both humans and machines, it also encourages data dissemination according to FAIR principles. The structural metadata is specified in section 'Data collection and preparation' on the website.
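
The idea of two TSV metadata files that reference each other can be sketched in Python as follows. This is a minimal illustration, not the ODAM specification: the column names (`subset`, `file`, `attribute`), the sample rows, and the helper names `read_tsv` and `undeclared_subsets` are all assumptions chosen for the example. It checks that every data subset mentioned in the attribute descriptions is also declared in the structural file.

```python
import csv
import io


def read_tsv(text):
    """Parse TSV text into a list of row dictionaries keyed by the header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def undeclared_subsets(subset_rows, attribute_rows, key="subset"):
    """Return subsets used in the attribute file but missing from the subset file."""
    declared = {row[key] for row in subset_rows}
    used = {row[key] for row in attribute_rows}
    return sorted(used - declared)


# Hypothetical contents of the two metadata files (assumed column names).
subsets_tsv = "subset\tfile\nplants\tplants.txt\nsamples\tsamples.txt\n"
attributes_tsv = "subset\tattribute\nplants\tPlantID\nharvests\tDate\n"

missing = undeclared_subsets(read_tsv(subsets_tsv), read_tsv(attributes_tsv))
print(missing)  # ['harvests']
```

Keeping the check independent of file names lets the same function run on files read from disk or on in-memory text, as here.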

Repository-Developed Metadata Schemas

Some repositories have decided that current standards do not fit their metadata needs, and so have created their own requirements.

Extensions

SNRNASM ISA-Tab

An ISA-Tab-based standard for reporting the results of single nucleotide resolution nucleic acid structure mapping experiments.

Tools

ODAM Software Suite

Experimental data table management software to make research data accessible and available for reuse with minimal effort on the part of the data provider. Designed to manage experimental data tables in an easy way for users, ODAM provides a model for structuring both data and metadata that facilitates data handling and analysis. It also encourages data dissemination according to FAIR principles by making the data interoperable and reusable by both humans and machines, allowing the dataset to be explored and then extracted in whole or in part as needed.

ProteoRed Tools

Bioinformatics tools to create and extract metadata compliant with the MIBBI-registered MIAPE minimum requirements.

Use Cases

BioModels Database

A repository hosting computational models of biological systems, using the MIBBI-registered MIRIAM and MIASE minimal metadata requirements.

BODC (British Oceanographic Data Centre Published Data Library)

This national facility for looking after and distributing data concerning the marine environment requires that data sets use a well-documented format such as CF-compliant NetCDF and be accompanied by a Dublin Core record as well as discovery metadata in a recognised standard such as DIF or FGDC/CSDGM.

Chem-BLAST

A Web-based service for searching for and visualizing chemical structures. It uses data from the Protein Data Bank that has been transformed to RDF.

dbEST (Expressed Sequence Tag Database)

A repository-developed metadata schema for EST data in GenBank.

FlowRepository

A database of flow cytometry experiments where you can query and download data collected and annotated according to the MIBBI-registered MIFlowCyt standard.

International Molecular Exchange Consortium

An international collaboration to provide access to a non-redundant set of protein-protein interaction data from a broad taxonomic range of organisms. Partner databases of the consortium (IMEx) require data to be compatible with MIMIx, a MIBBI-registered standard.

ISA Commons

A network of systems and projects that use the ISA-Tab file format, and/or are powered by components of the ISA software suite.

MetaboLights

A database for metabolomics experiments and derived information in ISA-Tab format.

NCDC (National Climatic Data Center)

The world's largest climate data archive, providing climatological services and data worldwide. It currently promotes the FGDC/CSDGM metadata standard for its datasets.

PRIDE (Proteomics Identifications Database)

A centralized, MIBBI standards-compliant public data repository for proteomics data, post-translational modifications and supporting spectral evidence.