sGDML Symmetric Gradient Domain Machine Learning

Ab initio accelerated.

Accurate global machine learning force fields with hundreds of atoms


Articles


Machine Learning of Accurate Energy-Conserving Molecular Force Fields

Chmiela, S., Tkatchenko, A., Sauceda, H. E., Poltavsky, I., Schütt, K. T., Müller, K.-R., Science Advances, 3(5), 2017, e1603015.


Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields

Chmiela, S., Sauceda, H. E., Müller, K.-R., Tkatchenko, A., Nature Communications, 9(1), 2018, 3887.

sGDML foundations book chapter

Accurate Molecular Dynamics Enabled by Efficient Physically-Constrained Machine Learning Approaches

Chmiela, S., Sauceda, H. E., Tkatchenko, A., Müller, K.-R., In: Machine Learning Meets Quantum Physics, Lecture Notes in Physics (Springer), 968, 2020, pp. 129-154.

Large-scale sGDML

Accurate Global Machine Learning Force Fields for Molecules with Hundreds of Atoms

Chmiela, S., Vassilev-Galindo, V., Unke, O. T., Kabylda, A., Sauceda, H. E., Tkatchenko, A., Müller, K.-R., Science Advances, 9(2), 2023, eadf0873.

sGDML software paper

sGDML: Constructing Accurate and Data Efficient Molecular Force Fields Using Machine Learning

Chmiela, S., Sauceda, H. E., Poltavsky, I., Müller, K.-R., Tkatchenko, A., Computer Physics Communications, 240, 2019, pp. 38-45.

Nuclear quantum effects

Dynamical Strengthening of Covalent and Non-Covalent Molecular Interactions by Nuclear Quantum Effects at Finite Temperature

Sauceda, H. E., Vassilev-Galindo, V., Chmiela, S., Müller, K.-R., Tkatchenko, A., Nature Communications, 12(1), 2021, 442.

Molecular force fields with sGDML

Molecular Force Fields with Gradient-domain Machine Learning (GDML): Comparison and Synergies with Classical Force Fields

Sauceda, H. E., Gastegger, M., Chmiela, S., Müller, K.-R., Tkatchenko, A., The Journal of Chemical Physics, 153, 2020, 124109.

sGDML application paper

Molecular Force Fields with Gradient-Domain Machine Learning: Construction and Application to Dynamics of Small Molecules with Coupled Cluster Forces

Sauceda, H. E., Chmiela, S., Poltavsky, I., Müller, K.-R., Tkatchenko, A., The Journal of Chemical Physics, 150, 2019, 114102.

sGDML applications book chapter

Construction of Machine Learned Force Fields with Quantum Chemical Accuracy: Applications and Chemical Insights

Sauceda, H. E., Chmiela, S., Poltavsky, I., Müller, K.-R., Tkatchenko, A., In: Machine Learning Meets Quantum Physics, Lecture Notes in Physics (Springer), 968, 2020, pp. 277-307.

sGDML coarse graining

Ensemble Learning of Coarse-Grained Molecular Dynamics Force Fields with a Kernel Approach

Wang, J., Chmiela, S., Müller, K.-R., Noé, F., Clementi, C., The Journal of Chemical Physics, 152, 2020, 194106.


BIGDML - Towards Accurate Quantum Machine Learning Force Fields for Materials

Sauceda, H. E., Gálvez-González, L. E., Chmiela, S., Paz-Borbón, L. O., Müller, K.-R., Tkatchenko, A., Nature Communications, 13(1), 2022, 3733.


Machine Learning Force Fields

Unke, O. T., Chmiela, S., Sauceda, H. E., Gastegger, M., Poltavsky, I., Schütt, K. T., Tkatchenko, A., Müller, K.-R., Chemical Reviews, 121(16), 2021, pp. 10142-10186.


Combining Machine Learning and Computational Chemistry for Predictive Insights into Chemical Systems

Keith, J. A., Vassilev-Galindo, V., Cheng, B., Chmiela, S., Gastegger, M., Müller, K.-R., Tkatchenko, A., Chemical Reviews, 121(16), 2021, pp. 9816-9872.

Code (latest: v1.0.2)

Replicate our numerical results or reconstruct a force field from your own dataset with a Python implementation of sGDML.

GitHub Documentation

Get started

The sgdml package uses its own dataset format, but it is easy to convert to and from Extended XYZ files and other popular file formats:

$ <xyz_dataset_file>
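For readers preparing their own data, the following minimal sketch shows what one frame of an Extended XYZ file with energy and force labels can look like. The helper `format_extxyz_frame` is hypothetical and not part of sgdml, the numbers are made-up illustration values, and the comment-line keys (`Properties=...`, `energy=...`) follow the common ASE-style convention, which is assumed here rather than taken from the sgdml converter.

```python
import numpy as np

def format_extxyz_frame(symbols, positions, energy, forces):
    """Return one Extended XYZ frame as a string (hypothetical helper).

    symbols:   list of element symbols, length n_atoms
    positions: array of shape (n_atoms, 3), in Angstrom
    energy:    scalar energy label for this geometry
    forces:    array of shape (n_atoms, 3)
    """
    positions = np.asarray(positions, dtype=float)
    forces = np.asarray(forces, dtype=float)
    n_atoms = len(symbols)
    if positions.shape != (n_atoms, 3) or forces.shape != (n_atoms, 3):
        raise ValueError("positions and forces must both have shape (n_atoms, 3)")

    # Line 1: atom count. Line 2: key=value metadata describing the columns.
    lines = [
        str(n_atoms),
        f"Properties=species:S:1:pos:R:3:forces:R:3 energy={energy:.8f}",
    ]
    # One line per atom: symbol, x y z, fx fy fz.
    for sym, pos, frc in zip(symbols, positions, forces):
        values = " ".join(f"{v:.8f}" for v in (*pos, *frc))
        lines.append(f"{sym} {values}")
    return "\n".join(lines) + "\n"

# Illustration only: a water-like geometry with placeholder labels.
frame = format_extxyz_frame(
    ["O", "H", "H"],
    [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]],
    energy=-47.5,
    forces=np.zeros((3, 3)),
)
print(frame)
```

A dataset file is then simply a concatenation of such frames, one per geometry.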

A force field is created via a single command-line call that yields a ready-to-use model file:

$ sgdml all <sgdml_dataset_file> <n_train> <n_validate> [<n_test>]

The last three parameters specify the sizes of the training, validation and test splits, which are sampled from the provided dataset file without overlap. Leave out <n_test> to use all remaining points for testing.
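The idea of non-overlapping splits can be sketched in a few lines of NumPy. This is an illustration only: `sample_splits` is a hypothetical helper, and uniform random sampling is an assumption made here; the sgdml command may select points differently.

```python
import numpy as np

def sample_splits(n_points, n_train, n_validate, n_test=None, seed=0):
    """Draw disjoint index sets for training, validation and testing.

    If n_test is None, all points not used for training or validation
    are assigned to the test split.
    """
    if n_test is None:
        n_test = n_points - n_train - n_validate
    if min(n_train, n_validate, n_test) < 0:
        raise ValueError("split sizes must be non-negative")
    if n_train + n_validate + n_test > n_points:
        raise ValueError("requested splits exceed the dataset size")

    # A single permutation guarantees that no index appears twice.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_points)
    train = perm[:n_train]
    valid = perm[n_train:n_train + n_validate]
    test = perm[n_train + n_validate:n_train + n_validate + n_test]
    return train, valid, test

# Example: 1000 points, 200 for training, 100 for validation, rest for testing.
train_idx, valid_idx, test_idx = sample_splits(1000, 200, 100)
print(len(train_idx), len(valid_idx), len(test_idx))
```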

A force field model is effectively a parametrization of your dataset that provides the energy e and forces f for any input geometry r:

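import numpy as np
from sgdml.predict import GDMLPredict
from sgdml.utils import io

# Load the trained model file and set up the predictor.
model = np.load('model.npz')
gdml = GDMLPredict(model)

# Read the input geometry r from an XYZ file, then predict energy e and forces f.
r, _ = io.read_xyz('')
e, f = gdml.predict(r)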
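For an energy-conserving force field, the predicted forces should equal the negative gradient of the predicted energy. The sketch below checks this with central finite differences. It does not call sgdml: `toy_predict` is a hypothetical harmonic stand-in, and the flattened input shape `(n_geometries, 3 * n_atoms)` is an assumption about the predictor interface. Any callable with the same shape convention could be passed in instead.

```python
import numpy as np

def toy_predict(r):
    """Hypothetical stand-in for a model's predict call (harmonic potential).

    r: array of shape (n_geometries, 3 * n_atoms).
    Returns energies of shape (n_geometries,) and forces of the same shape as r.
    """
    r = np.atleast_2d(np.asarray(r, dtype=float))
    e = 0.5 * np.sum(r**2, axis=1)
    f = -r  # analytic negative gradient of e
    return e, f

def max_force_error(predict, r, h=1e-4):
    """Largest deviation between predicted forces and -dE/dr (central differences)."""
    r = np.asarray(r, dtype=float).reshape(1, -1)
    _, f = predict(r)
    f_numeric = np.empty(r.shape[1])
    for i in range(r.shape[1]):
        step = np.zeros_like(r)
        step[0, i] = h
        e_plus, _ = predict(r + step)
        e_minus, _ = predict(r - step)
        f_numeric[i] = -(e_plus[0] - e_minus[0]) / (2.0 * h)
    return float(np.max(np.abs(f[0] - f_numeric)))

# Random three-atom geometry (9 Cartesian coordinates), illustration only.
r_test = np.random.default_rng(0).normal(size=9)
err = max_force_error(toy_predict, r_test)
print(err)
```

A small error indicates that energies and forces are consistent; for a real model the tolerance depends on the step size h and the numerical precision of the predictor.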

This flexibility enables many applications, e.g. by interfacing with the Atomic Simulation Environment (ASE).

Training service (experimental)

We offer an experimental model training service for anyone without sufficient compute resources. Simply upload your dataset, schedule some training jobs and return later to collect your model files.

We accept sGDML dataset files. By uploading, you agree to our terms of use.

Datasets

All geometries in Å, energy labels in kcal mol⁻¹ and force labels in kcal mol⁻¹ Å⁻¹.
Name                                  Size

DFT [FHI-aims, light tier 1]
  Benzene (Chmiela et al., 2018)      49,863

MD17 dataset (Chmiela et al., 2017)
  Benzene                             627,983
  Uracil                              133,770
  Naphthalene                         326,250
  Aspirin                             211,762
  Salicylic acid                      320,231
  Malonaldehyde                       993,237
  Ethanol                             555,092
  Toluene                             442,790
  Paracetamol                         106,490
  Azobenzene                          99,999

MD22 dataset (Chmiela et al., 2023)
DFT [FHI-aims, tight tiers 1&2]
  Ac-Ala3-NHMe                        85,109
  Docosahexaenoic acid (DHA)          69,753
  Stachyose                           27,272
  DNA base pair (AT-AT)               20,001
  DNA base pair (AT-AT-CG-CG)         10,153
DFT [FHI-aims, light tier 1]
  Buckyball catcher                   6,102
  Double-walled nanotube (DWNT)       5,032

CCSD [Psi4, cc-pVDZ]
  Aspirin                             1,500

CCSD(T) [Psi4, cc-pVDZ]
  Benzene                             1,500
  Malonaldehyde                       1,500
  Toluene                             1,500

CCSD(T) [Psi4, cc-pVTZ]
  Ethanol                             2,000
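Because the dataset labels use kcal mol⁻¹ and kcal mol⁻¹ Å⁻¹, they usually need converting before use with tools that work in eV and Å, such as ASE. The sketch below shows one way to do this. The function names are hypothetical, and the conversion factor is derived from 1 kcal = 4.184 kJ and 1 eV ≈ 96.48533212 kJ mol⁻¹.

```python
import numpy as np

# 1 kcal/mol expressed in eV (about 0.0433641 eV).
KCAL_PER_MOL_IN_EV = 4.184 / 96.48533212

def energy_kcal_to_ev(energy_kcal_per_mol):
    """Convert energies from kcal/mol to eV."""
    return np.asarray(energy_kcal_per_mol, dtype=float) * KCAL_PER_MOL_IN_EV

def forces_kcal_to_ev(forces_kcal_per_mol_angstrom):
    """Convert forces from kcal/(mol Angstrom) to eV/Angstrom.

    The length unit is unchanged, so the same factor applies as for energies.
    """
    return np.asarray(forces_kcal_per_mol_angstrom, dtype=float) * KCAL_PER_MOL_IN_EV

# Illustration: 23.060548 kcal/mol corresponds to roughly 1 eV.
one_ev = energy_kcal_to_ev(23.060548)
print(float(one_ev))
```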