Lead by Example

Introduction

NFDI4Chem has a clear vision of how chemistry research data will be collected, processed, archived, shared and published. Developing standards for metadata, minimum information, analytical data formats and publication therefore includes providing sample datasets that comply with these standards.

Here we present representative as well as substantially complex real datasets from various subdisciplines of chemistry. The datasets were collected within NFDI4Chem, and external contributions submitted by early adopters via our survey form are listed as well; for the latter we provide support through a consulting service and data stewardship.

This collection also documents the evolving FAIRness of chemistry research data, and it surfaces practical issues and suggestions for improvement to be fed back to other projects within NFDI4Chem.

Take a look at the list for inspiration as to what is already possible today!

Do you want to have your published dataset highlighted here or do you need assistance in the preparation of your dataset for publication? Pledge your dataset to NFDI4Chem! More information can be found here.

Publication Dataset Pairs

Author(s) | Title | DOI of publication | Repository and DOI of dataset | Curator(s) | Description of dataset
Carolin Huber, Erik Müller, Tobias Schulze, Werner Brack, Martin Krauss | Improving the screening analysis of pesticide metabolites in human biomonitoring by combining high-throughput in vitro incubation and automated LC−HRMS data processing | DOI of publication | MetaboLights, MassBank-data, GitHub | curated by publishing authors | The dataset, curated by the publishing authors, contains MS data and is published in MetaboLights, MassBank and MassBank-data on GitHub. All raw mass spectra were converted to mzML format. The code used is available on GitHub and is referenced in the corresponding article.
Dennis Reinhard, Frank Rominger, Michael Mastalerz | Desymmetrization strategy to achieve triptycene-based 3,6-dimethoxytriphenylenes via oxidative cyclodehydrogenation | DOI of publication | CCDC, heiDATA | curated by publishing authors | The datasets, curated by the publishing authors and published in heiDATA, contain NMR, MS and IR data in the instrument manufacturers' formats. The IR data is also available in TSV format. Elemental analysis data is provided as JPG. Crystallographic data are available as CIF from CCDC. References to the datasets in CCDC are given in the supporting information PDF of the corresponding article.
Erik Müller, Carolin Huber, Liza-Marie Beckers, Werner Brack, Martin Krauss, Tobias Schulze | A data set of 255,000 randomly selected and manually classified extracted ion chromatograms for evaluation of peak detection methods | DOI of publication | Zenodo, MetaboLights | curated by publishing authors | The dataset, curated by the publishing authors, contains 255,000 extracted ion chromatograms (EICs or XICs) of 5000 peaks randomly sampled from across 51 environmental water samples for the evaluation of peak detection and gap-filling algorithms.
Fabian Thomas, Matthias Oster, Florian Schön, Kai C. Göbgen, Benedikt Amarouch, Dominik Steden, Alexander Hoffmann, Sonja Herres-Pawlis | A new generation of terminal copper nitrenes and their application in aromatic C–H amination reactions | DOI of publication | Chemotion (7 links), CCDC, ioChem-DB | curated by publishing authors | The datasets, curated by the publishing authors, contain NMR data in the instrument manufacturers' formats; the Chemotion Repository also provides this data in the open format JCAMP-DX. Crystallographic data are available as CIF files from CCDC, and computational data was deposited in ioChem-DB. References to the crystallographic data are given in the supporting information section of the corresponding publication. References to the NMR data can be retrieved from the supplementary information PDF.
Linda Jütten, Karla Ramírez-Gualito, Andreas Weilhard, Benjamin Albrecht, Gabriel Cuevas, María del Carmen Fernández-Alonso, Jesús Jiménez-Barbero, Nils E. Schlörer, Dolores Diaz | Exploring the role of solvent on carbohydrate−aryl interactions by diffusion NMR-based studies | DOI of publication | NMRShiftDB2 (5 links) | curated by publishing authors | The dataset was curated by the publishing authors and is published in NMRShiftDB2. References to the dataset are given via DOI in the supporting information PDF.
Meike Hahn, Eric von Elert, Laurent Bigler, M. Dolores Díaz Hernández, Nils E. Schloerer | 5α‐Cyprinol sulfate: complete NMR assignment and revision of earlier published data, including the submission of a computer‐readable assignment in NMReDATA format | DOI of publication | NMRShiftDB2 | curated by publishing authors | The dataset, curated by the publishing authors, is published in NMRShiftDB2 and is also available from the publisher as supporting information. References to the dataset are given via DOI in the supporting information DOCX.
Melanie Paul, Melissa Teubner, Benjamin Grimm-Lebsanft, Christiane Golchert, Yannick Meiners, Laura Senft, Kristina Keisers, Patricia Liebhäuser, Thomas Rösener, Florian Biebl, Sören Buchenau, Maria Naumova, Vadim Murzin, Roxanne Krug, Alexander Hoffmann, Jörg Pietruszka, Ivana Ivanovic-Burmazovic, Michael Rübhausen, Sonja Herres-Pawlis | Exceptional substrate diversity in oxygenation reactions catalyzed by a bis(µ-oxo) copper complex | DOI of publication | Chemotion (12 links), CCDC (5 links) | curated by publishing authors | The datasets, curated by the publishing authors, contain NMR data in the instrument manufacturers' formats; the Chemotion Repository also provides this data in the open format JCAMP-DX. Crystallographic data are available as CIF files from CCDC. References to the datasets are given in the supporting information PDF of the corresponding article.
Nadine Vogler, Thomas Bocklitz, Firas Subhi Salah, Carsten Schmidt, Rolf Brauer, Tiantian Cui, Masoud Mireskandari, Florian R. Greten, Michael Schmitt, Andreas Stallmach, Iver Petersen, Jürgen Popp | Systematic evaluation of the biological variance within the Raman based colorectal tissue diagnostics | DOI of publication | Zenodo | curated by publishing authors | A short description of each CSV file follows. Meta data: includes the mouse ID, the scans collected from each mouse, the location of the extracted scans, the activity of the P53 gene, the mouse gender and the tissue type. MSpectra: contains the mean spectra of the tissue. Wavenumbers: includes the Raman spectra wavenumbers. TissueLabels: describes different divisions of tissue types with respect to each extracted scan, e.g. normal vs abnormal, normal vs HB vs carcinoma, normal vs HB vs adenoma vs carcinoma.
Robin M. Bär, Lukas Langer, Martin Nieger, Stefan Bräse | Bicyclo[1.1.1]pentyl sulfoximines: synthesis and functionalizations | DOI of publication | Chemotion (11 links), CCDC (8 links) | curated by publishing authors | The datasets, curated by the publishing authors, contain NMR data in the instrument manufacturers' formats; the Chemotion Repository also provides this data in the open format JCAMP-DX. Crystallographic data are available as CIF from CCDC. References to the datasets in CCDC are given in the supporting information PDF of the corresponding article.
Shuxia Guo et al. | Comparability of Raman spectroscopic configurations: A large scale cross-laboratory study | DOI of publication | Zenodo | curated by publishing authors | Slightly processed raw data for the paper 'Comparability of Raman Spectroscopic Configurations: A Large Scale Cross-Laboratory Study'. The processing includes spike removal to allow interpolation to a common wavenumber axis. Three files always belong together: a wavenumber axis file (wx_XYZ), a spectral intensity file (spec_XYZ) and a metadata file (meta_XYZ). The ‘XYZ’ refers to the samples measured (see the publication and its SI for details).
Toni Ditfe, Eileen Bette, Haider N. Sultani, Alexander Otto, Ludger A. Wessjohann, Norbert Arnold, Bernhard Westermann | Synthesis and biological evaluation of highly potent fungicidal deoxy-hygrophorones | DOI of publication | RADAR | Tillmann G. Fischer (IPB) | The dataset, prepared for publication under NFDI4Chem stewardship and published in RADAR, contains NMR and MS data in the instrument manufacturers' formats as well as in open formats such as JCAMP-DX and NMReDATA for NMR data and mzML for MS data. Results from the bioassay are available as CSV. Additionally, all structures are provided as CTfiles and are listed, corresponding to their numbering in the publication, in a CSV that includes IPB 3LC lab journal entries, SMILES structure codes, and InChI and InChIKey identifiers. The corresponding article references the dataset in the section on supporting information.
Xubin Wang, Bernd Kohl, Frank Rominger, Sven M. Elbert, Michael Mastalerz | A triptycene-based enantiopure bis(diazadibenzoanthracene) by a chirality-assisted synthesis approach | DOI of publication | heiDATA, CCDC (2 links) | curated by publishing authors | The datasets, curated by the publishing authors and published in heiDATA, contain NMR, MS and IR data in the instrument manufacturers' formats. The IR data is also available in TSV format. Elemental analysis data is provided as JPG. Crystallographic data are available as CIF from CCDC. References to the datasets in CCDC are given in the supporting information PDF of the corresponding article.
This table will be continuously updated with further publication dataset pairs.
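
The Raman cross-laboratory entry in the table describes its files as triplets that share a sample label: a wavenumber axis file (wx_XYZ), a spectral intensity file (spec_XYZ) and a metadata file (meta_XYZ). As a rough illustration of how a reuser might check such a download for completeness, the following Python sketch groups file names by sample label and reports labels with a missing file. Only the three prefixes come from the dataset description; the `.csv` extension, the example file names and the helper names `group_triplets` and `incomplete_samples` are assumptions made for this sketch, and the actual files on Zenodo may be named differently.

```python
from collections import defaultdict

# Prefixes as given in the dataset description; the role names are illustrative.
ROLES = {"wx": "wavenumber_axis", "spec": "intensity", "meta": "metadata"}


def group_triplets(filenames):
    """Group names of the form <prefix>_<sample>[.ext] by sample label."""
    groups = defaultdict(dict)
    for name in filenames:
        # Drop a trailing extension, if any (the extension is an assumption).
        stem = name.rsplit(".", 1)[0] if "." in name else name
        prefix, sep, sample = stem.partition("_")
        if sep and prefix in ROLES and sample:
            groups[sample][ROLES[prefix]] = name
    return dict(groups)


def incomplete_samples(groups):
    """Return the sample labels that lack one of the three expected files."""
    expected = set(ROLES.values())
    return sorted(s for s, files in groups.items() if set(files) != expected)


# Hypothetical file names, for illustration only.
files = ["wx_A.csv", "spec_A.csv", "meta_A.csv", "wx_B.csv", "spec_B.csv"]
groups = group_triplets(files)
print(incomplete_samples(groups))  # prints ['B']
```

Files that do not match one of the three prefixes are ignored rather than treated as errors, so a README or licence file in the same folder does not affect the check.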