Synthetic Organic / Inorganic Chemistry
Introduction
In synthesising a compound, every step of an experiment, from planning and execution to documentation and product characterisation, produces research data. These include synthetic procedures, experimental conditions, and manually or digitally collected analytical data. Interpreting these data enables proof of concept, process optimisation, and upscaling.
Data Types
Synthetic chemistry generates diverse data beyond product characterisation. A typical experiment starts with design and planning, then proceeds in the lab while observations, conditions, and yields are documented. Ideally, an electronic lab notebook (ELN) is used to collect and structure this data.
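As an illustration of the kind of structured value an ELN records alongside observations and conditions, the percent yield can be computed from the isolated product mass and the amount of limiting reagent. The following is a minimal sketch; the function name and parameters are hypothetical, and it assumes a 1:1 stoichiometry between limiting reagent and product unless a different ratio is passed.

```python
def percent_yield(product_mass_g: float,
                  product_molar_mass_g_mol: float,
                  limiting_reagent_mol: float,
                  product_per_reagent: float = 1.0) -> float:
    """Return the isolated yield in percent.

    product_per_reagent is the stoichiometric ratio (mol product per
    mol limiting reagent); 1.0 is an assumption for simple reactions.
    """
    if min(product_mass_g, product_molar_mass_g_mol,
           limiting_reagent_mol, product_per_reagent) <= 0:
        raise ValueError("all inputs must be positive")
    actual_mol = product_mass_g / product_molar_mass_g_mol
    theoretical_mol = limiting_reagent_mol * product_per_reagent
    return 100.0 * actual_mol / theoretical_mol


# Example: 1.80 g of a product (M = 180.16 g/mol) from 12.5 mmol of limiting reagent
print(f"{percent_yield(1.80, 180.16, 0.0125):.1f} %")
```

Storing the inputs (mass, molar mass, amount of limiting reagent) together with the computed value keeps the yield reproducible from the record itself.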
After synthesis, product properties are analysed, which yields both manually recorded and digital data. Results from methods without digital output (e.g., melting/boiling point, optical rotation, TLC Rf, refractive index) can be entered manually in the ELN, while digital instrument data (e.g., NMR, IR, MS) can be uploaded and analysed there. Raw files should be preserved in their original, often proprietary, file formats and, where possible, additionally converted to interoperable open file formats; if no open standard exists, export to .txt or .csv is recommended.
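The recommendation to keep the raw file and add a plain-text export can be sketched as follows. This is a minimal example under stated assumptions, not a prescribed workflow: the function names, file names, and column labels are hypothetical, and the numeric (x, y) pairs are assumed to have already been read from the instrument software. The checksum is one simple way to document that the preserved raw file has not changed.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def export_spectrum_csv(points, csv_path: Path,
                        x_label: str, y_label: str) -> None:
    """Write (x, y) pairs to a CSV file with a header row naming both axes."""
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow([x_label, y_label])
        writer.writerows(points)


if __name__ == "__main__":
    # Hypothetical IR data points: wavenumber in cm^-1, transmittance as a fraction
    points = [(4000.0, 0.98), (3999.0, 0.97)]
    export_spectrum_csv(points, Path("ir_spectrum.csv"),
                        "wavenumber_cm-1", "transmittance")
```

Putting units into the column labels keeps the exported file interpretable even when it is separated from the ELN entry.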
Overall, metadata should always be recorded when collecting and storing data so that the research data remain understandable in the long term.
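One simple way to keep metadata together with a data file is a machine-readable sidecar file. The sketch below writes a JSON file next to the data file; the field names (sample_id, method, instrument, operator, measured_on, parameters) and all example values are illustrative assumptions, not a community metadata standard, and should be replaced by the template agreed on in the group.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class MeasurementMetadata:
    """Hypothetical minimal metadata record for one measurement."""
    sample_id: str
    method: str
    instrument: str
    operator: str
    measured_on: str  # ISO 8601 date, e.g. "2024-05-17"
    parameters: dict = field(default_factory=dict)


def write_sidecar(data_file: Path, meta: MeasurementMetadata) -> Path:
    """Write the metadata as JSON next to the data file and return its path."""
    sidecar = data_file.with_name(data_file.name + ".meta.json")
    sidecar.write_text(
        json.dumps(asdict(meta), indent=2, sort_keys=True),
        encoding="utf-8",
    )
    return sidecar


if __name__ == "__main__":
    meta = MeasurementMetadata(
        sample_id="AB-042",                 # hypothetical sample label
        method="1H NMR",
        instrument="400 MHz spectrometer",  # hypothetical instrument description
        operator="A. Example",
        measured_on="2024-05-17",
        parameters={"solvent": "CDCl3", "temperature_K": 298},
    )
    print(write_sidecar(Path("AB-042_1H.csv"), meta))
```

JSON is readable by both humans and machines, which matches the aim of capturing metadata in a form that stays interpretable outside the software that produced it.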
ELNs and Other Tools
For effective data management, tools should be selected at project or group level based on workflows. Because workflows are often method-specific, usage guidelines and metadata templates should be defined and documented in a data management plan (DMP). NFDI4Chem provides an RDMO template tailored to chemistry.
Applying FAIR principles retrospectively to analogue workflows is time-consuming. ELNs help by automating key FAIR-related tasks, such as structured metadata capture in human- and machine-readable formats and, in some cases, generation of interoperable open file formats. Because ELN selection is critical, consult the guidance on how to choose the right ELN.
The ELN finder supports tool selection across many ELNs. Since needs differ by group, there is no one-size-fits-all solution. Within NFDI4Chem, Chemotion ELN is the reference instance (see overview of Chemotion), so FAIR-related developments are implemented there first.
Chemotion is particularly suitable for synthetic chemistry and has been extended to other disciplines via LabIMotion.
Publishing Data
Publishing research data enables reuse by researchers and machine-learning applications. For machine readability, data should be published in a structured, standardised way. Open-access data repositories are recommended, preferably data- or discipline-specific repositories where possible.
Repository selection is crucial; further guidance is available under choosing the right repository.
Your institution may also provide additional publishing guidelines and resources, so consulting local research data management experts is recommended.
Challenges
For some data types and workflows, FAIR compliance is straightforward; for others, community standards or suitable open formats are still missing, especially in niche analytical methods. FAIR is a spectrum rather than an absolute, so improving each workflow as far as currently possible is still valuable.
Many legacy devices do not produce open formats, and some have no digital output at all. This complicates RDM, but tools such as Chemotion’s ChemConverter can generate open formats from otherwise incompatible analytical outputs.
A major RDM challenge in chemistry is limited inter-ELN interoperability, which makes data transfer between ELNs difficult and complicates interdisciplinary collaboration across groups using different systems. Efforts to improve this are underway, including the ELN consortium, of which Chemotion is a member.