Physical and Computational Chemistry

Introduction

Physical chemistry includes many sub-disciplines and methods that generate highly heterogeneous data. These include spectroscopy and imaging data, simulation files, and in-house analysis code, and data volumes range from small text-based files to very large datasets. Many physical chemists are data-literate and can help establish unified, streamlined data-management solutions within their research groups.

Data Types

Physical chemistry produces diverse data across sub-disciplines. Research groups may focus on imaging (e.g., super-resolution microscopy), method development, spectroscopic analysis, numeric simulations, or combinations of these.

ELNs and Other Tools

For effective data management, tools should be selected at project or group level based on existing workflows. Because workflows are often method-specific, usage guidelines and metadata templates should be defined and documented in a data management plan (DMP). NFDI4Chem provides an RDMO template tailored to chemistry.

An electronic lab notebook (ELN) supports day-to-day planning, structured experiment documentation, and sometimes workflow management. In this diverse field, ELNs should be flexible and customisable. Universities may provide central solutions, but research groups often choose tools that match their needs and resources. The ELN-Finder lists available options, and the article on choosing an ELN provides further guidance.

In addition to ELNs, local repositories and research data management tools can help prepare data for publication.

For those writing scripts and developing research software solutions, Git is a highly recommended versioning tool. Many universities also have their own instances of GitLab to assist in managing software projects.

For research data, DataLad (built on Git) helps track data and metadata during processing and analysis. It also works with GUI-based steps but is especially powerful in script-based workflows.

Because many physical chemists build their own workflows and tools, community standards for file formats and metadata should be used whenever possible. At minimum, research groups should define shared documentation and format standards to support knowledge transfer. ELNs can provide templates, and DMPs can record the agreed standards. Automated workflows (e.g., from instruments to ELN and storage), implemented via built-in integrations or REST APIs, can reduce manual work and improve data quality.
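As an illustration of a shared metadata template enforced at the file-system level, the following is a minimal sketch using only the Python standard library. The field names in REQUIRED_FIELDS, the function names, and the ".meta.json" sidecar naming are assumptions made for this example; they are not a community standard or part of any ELN. A group would substitute the fields agreed in its DMP.

```python
import json
from pathlib import Path

# Hypothetical group-level template: these field names are illustrative
# assumptions, to be replaced by the fields agreed in the group's DMP.
REQUIRED_FIELDS = ("sample_id", "method", "instrument", "operator", "date")


def missing_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


def write_sidecar(data_file, record):
    """Write the record as a JSON file next to the data file.

    Incomplete records are rejected so that gaps are caught at acquisition
    time rather than at publication time.
    """
    missing = missing_fields(record)
    if missing:
        raise ValueError("Incomplete metadata, missing: " + ", ".join(missing))
    data_file = Path(data_file)
    # Assumed naming convention: <data file name>.meta.json
    sidecar = data_file.with_name(data_file.name + ".meta.json")
    sidecar.write_text(json.dumps(record, indent=2, sort_keys=True), encoding="utf-8")
    return sidecar
```

A script that copies files from an instrument PC to group storage could call write_sidecar for each new file. The same record could then be passed to an ELN through its own integration or API, which is not shown here because it depends on the tool in use.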

Publishing Data

Publishing research data, especially data underlying publications, enables reproducibility and reuse. Research data repositories support FAIR publication and include subject-specific, general, and institutional options. In physical chemistry, suitable examples include RADAR4Chem, ioChem-BD for computational chemistry, and the Image Data Resource (IDR) or a self-hosted OMERO instance for imaging data. For repository selection guidance, see here.

In-house research software should be treated as research data and published accordingly. GitHub currently provides an automated release workflow to Zenodo, and other routes are available to assign software a DOI and make it citable, as outlined here.

Challenges

In physical chemistry, FAIR data challenges are closely linked to diverse methods and data types. Labs often have individual workflows, so groups should streamline common steps and define reusable metadata templates in ELNs or file systems to support both researchers and FAIR data infrastructures.

Especially in imaging, large data volumes can strain local storage. Central storage solutions can help when combined with good research data management practices to ensure reusability and avoid inefficient data organisation.
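Before moving data to central storage, it can help to see which file types account for most of the volume. The following is a minimal sketch using only the Python standard library. The function name volume_by_extension and the "(no extension)" label are assumptions made for this example, and grouping by file extension is only a rough proxy for data type.

```python
from collections import defaultdict
from pathlib import Path


def volume_by_extension(root):
    """Sum file sizes in bytes under root, grouped by lower-cased file extension."""
    totals = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without a suffix are collected under a placeholder label.
            key = path.suffix.lower() or "(no extension)"
            totals[key] += path.stat().st_size
    # Return the largest contributors first.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))
```

Calling volume_by_extension on a project directory returns a dictionary mapping each extension to its total size, which can indicate where compression, deduplication, or archiving would have the largest effect.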