Lead by Example
Introduction
NFDI4Chem has a clear vision of how chemistry research data will be collected, processed, archived, shared and published. Developing standards for metadata, minimum information, analytical data formats and publication includes providing sample datasets that comply with those standards.
Herein we present representative as well as substantially complex real datasets from various subdisciplines of chemistry. These datasets were collected within NFDI4Chem; external contributions submitted by early adopters via our survey form are also listed, and we support them through a consulting service and data stewardship.
This collection also documents the evolving FAIRness of chemistry research data and surfaces practical issues and suggestions for improvement, which are fed back to other projects within NFDI4Chem.
Take a look at the list for inspiration as to what is already possible today!
Do you want to have your published dataset highlighted here or do you need assistance in the preparation of your dataset for publication? Pledge your dataset to NFDI4Chem! More information can be found here.
Publication Dataset Pairs
Author(s) | Title | DOI of publication | Repository and DOI of dataset | Curator(s) | Description of dataset |
---|---|---|---|---|---|
Carolin Huber, Erik Müller, Tobias Schulze, Werner Brack, Martin Krauss | Improving the screening analysis of pesticide metabolites in human biomonitoring by combining high-throughput in vitro incubation and automated LC−HRMS data processing | DOI of publication | MetaboLights MassBank-data GitHub | curated by publishing authors | The dataset, curated by the publishing authors, contains MS data and is published in MetaboLights, MassBank and MassBank-data on GitHub. All raw mass spectra were converted to mzML format. The code used is available on GitHub and is referenced in the corresponding article. |
Dennis Reinhard, Frank Rominger, Michael Mastalerz | Desymmetrization strategy to achieve triptycene-based 3,6-dimethoxytriphenylenes via oxidative cyclodehydrogenation | DOI of publication | CCDC heiDATA | curated by publishing authors | The datasets, curated by the publishing authors and published in heiDATA, contain NMR, MS and IR data in the instrument manufacturers' formats. The IR data are also available in TSV format. Elemental analysis data are provided as JPG. Crystallographic data are available as CIF files from CCDC. References to the datasets in CCDC are given in the supporting information PDF of the corresponding article. |
Erik Müller, Carolin Huber, Liza-Marie Beckers, Werner Brack, Martin Krauss, Tobias Schulze | A dataset of 255,000 randomly selected and manually classified extracted ion chromatograms for evaluation of peak detection methods | DOI of publication | Zenodo MetaboLights | curated by publishing authors | The dataset, curated by the publishing authors, contains 255,000 extracted ion chromatograms (EICs or XICs) of 5000 peaks randomly sampled from across 51 environmental water samples for the evaluation of peak detection and gap-filling algorithms. |
Fabian Thomas, Matthias Oster, Florian Schön, Kai C. Göbgen, Benedikt Amarouch, Dominik Steden, Alexander Hoffmann, Sonja Herres-Pawlis | A new generation of terminal copper nitrenes and their application in aromatic C–H amination reactions | DOI of publication | Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion CCDC ioChem-BD | curated by publishing authors | The datasets, curated by the publishing authors, contain NMR data in the instrument manufacturers' formats; the NMR data are also provided by the Chemotion Repository in the open JCAMP-DX format. Crystallographic data are available as CIF files from CCDC, and computational data were deposited in ioChem-BD. References to the crystallographic data are given in the section on supporting information in the corresponding publication. References to the NMR data can be retrieved from the supplementary information PDF. |
Linda Jütten, Karla Ramírez-Gualito, Andreas Weilhard, Benjamin Albrecht, Gabriel Cuevas, María del Carmen Fernández-Alonso, Jesús Jiménez-Barbero, Nils E. Schlörer, Dolores Diaz | Exploring the role of solvent on carbohydrate−aryl interactions by diffusion NMR-based studies | DOI of publication | NMRShiftDB2 NMRShiftDB2 NMRShiftDB2 NMRShiftDB2 NMRShiftDB2 | curated by publishing authors | The dataset was curated by the publishing authors and is published in NMRShiftDB2. References to the dataset are given via DOI in the supporting information PDF. |
Meike Hahn, Eric von Elert, Laurent Bigler, M. Dolores Díaz Hernández, Nils E. Schloerer | 5α‐Cyprinol sulfate: complete NMR assignment and revision of earlier published data, including the submission of a computer‐readable assignment in NMReDATA format | DOI of publication | NMRShiftDB2 | curated by publishing authors | The dataset, curated by the publishing authors, is published in NMRShiftDB2 and is also available from the publisher as supporting information. References to the dataset are given via DOI in the supporting information DOCX. |
Melanie Paul, Melissa Teubner, Benjamin Grimm-Lebsanft, Christiane Golchert, Yannick Meiners, Laura Senft, Kristina Keisers, Patricia Liebhäuser, Thomas Rösener, Florian Biebl, Sören Buchenau, Maria Naumova, Vadim Murzin, Roxanne Krug, Alexander Hoffmann, Jörg Pietruszka, Ivana Ivanovic-Burmazovic, Michael Rübhausen, Sonja Herres-Pawlis | Exceptional substrate diversity in oxygenation reactions catalyzed by a bis(µ-oxo) copper complex | DOI of publication | Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion CCDC CCDC CCDC CCDC CCDC | curated by publishing authors | The datasets, curated by the publishing authors, contain NMR data in the instrument manufacturers' formats; the NMR data are also provided by the Chemotion Repository in the open JCAMP-DX format. Crystallographic data are available as CIF files from CCDC. References to the datasets are given in the supporting information PDF of the corresponding article. |
Nadine Vogler, Thomas Bocklitz, Firas Subhi Salah, Carsten Schmidt, Rolf Brauer, Tiantian Cui, Masoud Mireskandari, Florian R. Greten, Michael Schmitt, Andreas Stallmach, Iver Petersen, Jürgen Popp | Systematic evaluation of the biological variance within the Raman based colorectal tissue diagnostics | DOI of publication | Zenodo | curated by publishing authors | A short description of each CSV file follows. Meta data: includes information about mouse ID, scans collected from each mouse, location of extracted scans, activity of the P53 gene, mouse gender and tissue type. MSpectra: contains mean spectra of tissue. Wavenumbers: includes the Raman spectra wavenumbers. TissueLabels: describes different divisions of tissue types, e.g. normal vs abnormal, normal vs HB vs carcinoma, normal vs HB vs adenoma vs carcinoma types, with respect to each extracted scan. |
Robin M. Bär, Lukas Langer, Martin Nieger, Stefan Bräse | Bicyclo[1.1.1]pentyl sulfoximines: synthesis and functionalizations | DOI of publication | Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion Chemotion CCDC CCDC CCDC CCDC CCDC CCDC CCDC CCDC | curated by publishing authors | The datasets, curated by the publishing authors, contain NMR data in the instrument manufacturers' formats; the NMR data are also provided by the Chemotion Repository in the open JCAMP-DX format. Crystallographic data are available as CIF files from CCDC. References to the datasets in CCDC are given in the supporting information PDF of the corresponding article. |
Shuxia Guo et al. | Comparability of Raman spectroscopic configurations: A large scale cross-laboratory study | DOI of publication | Zenodo | curated by publishing authors | Slightly processed raw data for the paper 'Comparability of Raman Spectroscopic Configurations: A Large Scale Cross-Laboratory Study'. The processing includes spike removal to allow interpolation to a common wavenumber axis. Three files always belong together: a wavenumber axis file (wx_XYZ), a spectral intensity file (spec_XYZ) and a metadata file (meta_XYZ). The ‘XYZ’ refers to the samples measured (see the publication and its SI for details). |
Toni Ditfe, Eileen Bette, Haider N. Sultani, Alexander Otto, Ludger A. Wessjohann, Norbert Arnold, Bernhard Westermann | Synthesis and biological evaluation of highly potent fungicidal deoxy-hygrophorones | DOI of publication | RADAR | Tillmann G. Fischer (IPB) | The dataset, prepared for publication under NFDI4Chem stewardship and published in RADAR, contains NMR and MS data in the instrument manufacturers' formats as well as in open formats: JCAMP-DX and NMReDATA for NMR data and mzML for MS data. Results from the bioassay are available as CSV. Additionally, all structures are provided as CTfiles and are listed, corresponding to their numbering in the publication, in a CSV including IPB 3LC lab journal entries, SMILES structure codes, and InChI and InChIKey identifiers. The corresponding article references the dataset in the section on supporting information. |
Xubin Wang, Bernd Kohl, Frank Rominger, Sven M. Elbert, Michael Mastalerz | A triptycene-based enantiopure bis(diazadibenzoanthracene) by a chirality-assisted synthesis approach | DOI of publication | heiDATA CCDC CCDC | curated by publishing authors | The datasets, curated by the publishing authors and published in heiDATA, contain NMR, MS and IR data in the instrument manufacturers' formats. The IR data are also available in TSV format. Elemental analysis data are provided as JPG. Crystallographic data are available as CIF files from CCDC. References to the datasets in CCDC are given in the supporting information PDF of the corresponding article. |
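
For readers who want to reuse the Raman cross-laboratory dataset listed above, the naming convention in its description (wx_XYZ, spec_XYZ and meta_XYZ belonging together) can be checked with a few lines of Python. This is only a minimal sketch: the function names, the `.csv` extension and the example file names are assumptions made for illustration and are not taken from the dataset itself.

```python
from collections import defaultdict
from pathlib import Path

# Prefixes named in the dataset description; file extensions are an assumption.
PREFIXES = ("wx", "spec", "meta")


def group_by_sample(filenames):
    """Map each sample label (the 'XYZ' part) to its wx/spec/meta files."""
    groups = defaultdict(dict)
    for name in filenames:
        stem = Path(name).stem  # drop directory and extension
        prefix, sep, sample = stem.partition("_")
        if sep and prefix in PREFIXES and sample:
            groups[sample][prefix] = name
    return dict(groups)


def incomplete_samples(groups):
    """Return sample labels that lack one of the three expected files."""
    return sorted(s for s, files in groups.items() if set(files) != set(PREFIXES))


# Hypothetical file names, for illustration only.
example = ["wx_A.csv", "spec_A.csv", "meta_A.csv", "wx_B.csv", "spec_B.csv"]
groups = group_by_sample(example)
print(incomplete_samples(groups))
```

Running such a check before analysis makes it easy to spot samples whose wavenumber axis, intensity or metadata file is missing from a download.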