Crystallography

Standards

CIF (Crystallographic Information Framework)

A well-established standard file structure for the archiving and distribution of crystallographic information, CIF is in regular use for reporting crystal structure determinations to Acta Crystallographica and other journals.

Sponsored by the International Union of Crystallography, the current standard dates from 1997. As of July 2011, a new version of the CIF standard was under consideration.
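
As a rough illustration of the file structure described above, the sketch below reads simple tag-value pairs from one CIF data block using only the Python standard library. The sample text and the helpers `parse_simple_cif` and `strip_uncertainty` are hypothetical names introduced for this example. The parser deliberately ignores `loop_` tables, multi-line text fields, and most quoting rules, so it is not a substitute for the libraries listed under Software for CIF.

```python
# Hypothetical sample: a tiny CIF data block with a few core data items.
SAMPLE_CIF = """\
data_example
# unit cell
_cell_length_a    5.4310(2)
_cell_length_b    5.4310(2)
_cell_length_c    5.4310(2)
_cell_angle_alpha 90
_symmetry_space_group_name_H-M 'F d -3 m'
"""


def parse_simple_cif(text):
    """Return (block_name, items) for the first data block in `text`.

    Only single-line `_tag value` pairs are handled (an assumption made
    to keep the sketch short); loops and text fields are skipped.
    """
    block_name = None
    items = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.lower().startswith("data_"):
            if block_name is not None:
                break  # stop when a second data block begins
            block_name = line[len("data_"):]
        elif line.startswith("_"):
            parts = line.split(None, 1)
            if len(parts) == 2:
                tag, value = parts
                value = value.strip()
                # Remove one pair of matching surrounding quotes, if present.
                if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
                    value = value[1:-1]
                items[tag] = value
    return block_name, items


def strip_uncertainty(value):
    """Convert a CIF number such as '5.4310(2)' to a float, dropping the
    standard uncertainty given in parentheses."""
    return float(value.split("(", 1)[0])


block, items = parse_simple_cif(SAMPLE_CIF)
print(block, strip_uncertainty(items["_cell_length_a"]))
```

Keeping the values as strings until they are needed preserves the standard uncertainty, which a plain `float` conversion would discard.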

CSMD (Core Scientific Metadata Model)

A study-data oriented model, primarily in support of the ICAT data management infrastructure software. The CSMD is designed to support data collected within a large-scale facility’s scientific workflow; however, the model is also designed to be generic across scientific disciplines.

The CSMD is sponsored by the Science and Technology Facilities Council; the latest full specification available is v4.0, from 2013.

NeXus

NeXus is an international standard for the storage and exchange of neutron, x-ray, and muon experiment data. The structure of NeXus files is extremely flexible, allowing the storage of both simple data sets, such as a single data array and its axes, and highly complex data and their associated metadata, such as measurements on a multi-component instrument or numerical simulations. NeXus is built on top of the container format HDF5, and adds domain-specific rules for organizing data within HDF5 files in addition to a dictionary of well-defined domain-specific field names.
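
To make the "single data array and its axes" case concrete, the sketch below models that layout as a plain Python dictionary that mirrors the group and attribute hierarchy of a NeXus file. It does not write HDF5; a real file would be produced with an HDF5 library. The group names `entry` and `data`, the field names `counts` and `two_theta`, the numbers, and the helper `find_default_plot` are illustrative assumptions. The attribute names `NX_class`, `default`, `signal`, and `axes` and the class names `NXentry` and `NXdata` follow NeXus conventions as commonly documented, and should be checked against the current NeXus manual.

```python
# In-memory stand-in for an HDF5 hierarchy: each group is a dict whose
# "attrs" key holds its attributes; other keys are child groups or fields.
nexus_like = {
    "attrs": {"default": "entry"},
    "entry": {
        "attrs": {"NX_class": "NXentry", "default": "data"},
        "data": {
            "attrs": {
                "NX_class": "NXdata",
                "signal": "counts",     # name of the field to plot
                "axes": ["two_theta"],  # name(s) of the axis field(s)
            },
            "counts": [120, 340, 980, 310, 115],
            "two_theta": [10.0, 10.5, 11.0, 11.5, 12.0],
        },
    },
}


def find_default_plot(root):
    """Follow the `default` attributes from the root down to an NXdata
    group and return (axis_values, signal_values) for a 1-D data set."""
    group = root
    while group["attrs"].get("NX_class") != "NXdata":
        group = group[group["attrs"]["default"]]
    attrs = group["attrs"]
    return group[attrs["axes"][0]], group[attrs["signal"]]


x, y = find_default_plot(nexus_like)
print(list(zip(x, y)))
```

The point of the convention is that a reader needs no prior knowledge of the field names: the attributes alone say which array is the measurement and which is its axis.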

Open Standard for Particle-Mesh Data (openPMD)

OpenPMD provides naming and attribute conventions that allow the exchange of particle and mesh based data from scientific simulations and experiments. The primary goal is to define a minimal set/kernel of meta information that enables the sharing and exchange of data to achieve

  • portability between various applications and differing algorithms;
  • a unified open-access description for scientific data (publishing and archiving);
  • a unified description for post-processing, visualization and analysis.

OpenPMD suits any kind of hierarchical, self-describing data format, such as, but not limited to, ADIOS1 (BP3), ADIOS2 (BP4), HDF5, JSON, and XML.
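
The naming and attribute conventions can be pictured with a small sketch. The dictionary below holds a deliberately incomplete set of root-level attributes of the kind the openPMD base standard defines, and two hypothetical helpers, `mesh_path` and `particle_path`, show how a reader could resolve where a record lives for a given iteration. The attribute values, the record names `E` and `electrons`, and the JSON layout printed at the end are assumptions for illustration only; the layout is not meant to match what any particular openPMD library writes, and the standard itself remains the authoritative list of required attributes.

```python
import json

# A minimal, incomplete set of root attributes (values are assumptions).
root_attributes = {
    "openPMD": "1.1.0",             # version of the standard being followed
    "openPMDextension": 0,          # no domain-specific extension
    "basePath": "/data/%T/",        # %T is replaced by the iteration number
    "meshesPath": "meshes/",        # mesh records, relative to basePath
    "particlesPath": "particles/",  # particle records, relative to basePath
    "iterationEncoding": "groupBased",
    "iterationFormat": "/data/%T/",
}


def _iteration_base(attrs, iteration):
    """Expand the %T placeholder in basePath for one iteration."""
    return attrs["basePath"].replace("%T", str(iteration))


def mesh_path(attrs, iteration, mesh_name):
    """Hypothetical helper: path of a mesh record for one iteration."""
    return _iteration_base(attrs, iteration) + attrs["meshesPath"] + mesh_name


def particle_path(attrs, iteration, species):
    """Hypothetical helper: path of a particle species for one iteration."""
    return _iteration_base(attrs, iteration) + attrs["particlesPath"] + species


summary = {
    "attributes": root_attributes,
    "example_paths": {
        "E field, iteration 100": mesh_path(root_attributes, 100, "E"),
        "electrons, iteration 100": particle_path(root_attributes, 100, "electrons"),
    },
}
print(json.dumps(summary, indent=2))
```

Because the paths are derived from attributes rather than hard-coded, the same reading logic applies whichever hierarchical container holds the data.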

PDBx/mmCIF (Protein Data Bank Exchange Dictionary and the Macromolecular Crystallographic Information Framework)

The Protein Data Bank (PDB) archive is the single worldwide archival repository of information about the 3D structures of proteins, nucleic acids, and complex assemblies, managed by the Worldwide PDB (wwPDB). The PDB Exchange Dictionary (PDBx) is used by the wwPDB to define data content for deposition, annotation and archiving of PDB entries. PDBx incorporates the community standard metadata representation, the Macromolecular Crystallographic Information Framework (mmCIF), originally developed under the auspices of the International Union of Crystallography (IUCr). PDBx has been extended by the wwPDB to include descriptions of other experimental methods that produce 3D macromolecular structure models, such as Nuclear Magnetic Resonance Spectroscopy, 3D Electron Microscopy and Tomography.
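
In PDBx/mmCIF, data items are named `_category.item`, and repeated records such as atoms are stored in `loop_` tables. The sketch below reads the first such table from a small sample using only the Python standard library. The sample coordinates and the helper `parse_first_loop` are hypothetical, and the parser assumes one whitespace-separated row per line with no quoted or multi-line values, which real files do not guarantee. The tools listed under PDBx/mmCIF Software Resources should be preferred for actual work.

```python
# Hypothetical sample: one loop_ table in the atom_site category.
SAMPLE_MMCIF = """\
data_demo
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.type_symbol
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
ATOM 1 N 11.104 6.134 -6.504
ATOM 2 C 11.639 6.071 -5.147
#
"""


def parse_first_loop(text):
    """Return (category, rows) for the first loop_ table in `text`.

    Each row is a dict keyed by item name (the part after the dot).
    Values are kept as strings.
    """
    names, raw_rows = [], []
    in_loop = False
    for raw in text.splitlines():
        line = raw.strip()
        if not in_loop:
            in_loop = (line == "loop_")
            continue
        if line.startswith("_") and not raw_rows:
            names.append(line)  # still reading the column names
        elif not line or line.startswith(("#", "_", "loop_", "data_")):
            break  # the table has ended
        else:
            values = line.split()
            if len(values) != len(names):
                raise ValueError("row does not match the loop header: " + line)
            raw_rows.append(values)
    if not names:
        return None, []
    category = names[0].split(".", 1)[0].lstrip("_")
    item_names = [name.split(".", 1)[1] for name in names]
    return category, [dict(zip(item_names, row)) for row in raw_rows]


category, atoms = parse_first_loop(SAMPLE_MMCIF)
print(category, [(a["type_symbol"], float(a["Cartn_x"])) for a in atoms])
```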

Extensions

eBank UK Metadata Application Profile

A Dublin Core Metadata Application Profile created for the eBank UK project, which provides access to the detailed results of scientific experiments in crystallography.

TIDCC (Towards an International Data Commons for Crystallography)

A profile of the CSMD model for Australian crystallographic data.

Tools

CIF2Cell

A tool to generate the geometrical setup for various electronic structure codes from a CIF file.

ICATLite

A sister project of ICAT, consisting of a suite of CSMD-based software tools designed to support derived data management in the scientific research process.

IUCr checkCIF

A tool used to check the integrity and consistency of crystal structure encodings in CIF format.

PDBx/mmCIF Software Resources

Parsing, validation, and visualization tools and libraries supporting PDBx/mmCIF, the data standard used by the Worldwide Protein Data Bank.

Software for CIF

The International Union of Crystallography's list of programs and libraries available for use with CIF files.

Use Cases

American Mineralogist Crystal Structure Database

A CIF crystal structure database that includes every structure published in the American Mineralogist, The Canadian Mineralogist, European Journal of Mineralogy and Physics and Chemistry of Minerals, as well as selected datasets from other journals.

Cambridge Structural Database

A repository of small molecule crystal structures, many with accompanying CIF files.

Chem-BLAST

A Web-based service for searching for and visualizing chemical structures. It uses data from the Protein Data Bank that has been transformed to RDF.

Crystallography Open Database

An open-access collection of crystal structures of organic, inorganic, and metal-organic compounds and minerals, many of which are in CIF form.

CSMD Pilot Studies

A desk study of CSMD implementation in two facilities: the UK National Crystallography Service and the ISIS Neutron Source.

eCrystals Federation Project

An archive for crystal structures generated by the Southampton Chemical Crystallography Group and the EPSRC UK National Crystallography Service; its metadata conforms to the eBank UK Dublin Core Profile.

ICAT Implementations

The ICAT website's list of the facilities and organisations using the CSMD-based ICAT software.

wwPDB (Worldwide Protein Data Bank)

The Protein Data Bank (PDB) archive is the single worldwide archival repository of information about the 3D structures of proteins, nucleic acids, and complex assemblies. The Worldwide PDB (wwPDB) organization manages the PDB archive and ensures that the PDB is freely and publicly available to the global community.