Accurate global machine learning force fields with hundreds of atoms
Chmiela, S., Tkatchenko, A., Sauceda, H. E., Poltavsky, I., Schütt, K. T., Müller, K.-R., Science Advances, 3(5), 2017, e1603015.

Chmiela, S., Sauceda, H. E., Müller, K.-R., Tkatchenko, A., Nature Communications, 9(1), 2018, 3887.


Chmiela, S., Vassilev-Galindo, V., Unke, O. T., Kabylda, A., Sauceda, H. E., Tkatchenko, A., Müller, K.-R., Science Advances, 9(2), 2023, eadf0873.

Chmiela, S., Sauceda, H. E., Poltavsky, I., Müller, K.-R., Tkatchenko, A., Computer Physics Communications, 240, 2019, pp. 38-45.

Sauceda, H. E., Vassilev-Galindo, V., Chmiela, S., Müller, K.-R., Tkatchenko, A., Nature Communications, 12(1), 2021, 442.

Sauceda, H. E., Gastegger, M., Chmiela, S., Müller, K.-R., Tkatchenko, A., The Journal of Chemical Physics, 153, 2020, 124109.

Sauceda, H. E., Chmiela, S., Poltavsky, I., Müller, K.-R., Tkatchenko, A., The Journal of Chemical Physics, 150, 2019, 114102.

Sauceda, H. E., Chmiela, S., Poltavsky, I., Müller, K.-R., Tkatchenko, A., In: Machine Learning Meets Quantum Physics, Lecture Notes in Physics (Springer), 968, 2020, pp. 277-307.

Wang, J., Chmiela, S., Müller, K.-R., Noé, F., Clementi, C., The Journal of Chemical Physics, 152, 2020, 194106.

Sauceda, H. E., Gálvez-González, L. E., Chmiela, S., Paz-Borbón, L. O., Müller, K.-R., Tkatchenko, A., Nature Communications, 13(1), 2022, 3733.

Replicate our numerical results or reconstruct a force field from your own dataset with a Python implementation of sGDML.
The sgdml package uses its own dataset format, but it is easy to convert to and from Extended XYZ files and other popular file formats:
$ sgdml_dataset_via_ase.py <xyz_dataset_file>
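If you have not worked with Extended XYZ before, the minimal sketch below reads a single frame by hand. It is not how the converter script works internally. The helper name `parse_extxyz_frame` is hypothetical, the column order (symbol, position, forces) and the `energy=` entry in the comment line are assumptions about how the file was written, and the numbers are made up for illustration.

```python
import shlex

def parse_extxyz_frame(text):
    """Parse one Extended XYZ frame (hypothetical helper, not part of sgdml).

    Assumes each atom line reads: symbol x y z fx fy fz, and that the
    comment line carries an `energy=<float>` entry.
    """
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    # The comment line holds key=value pairs; shlex keeps quoted values intact.
    info = dict(tok.split("=", 1) for tok in shlex.split(lines[1]) if "=" in tok)
    symbols, positions, forces = [], [], []
    for line in lines[2:2 + n_atoms]:
        parts = line.split()
        symbols.append(parts[0])
        positions.append([float(v) for v in parts[1:4]])
        forces.append([float(v) for v in parts[4:7]])
    return {
        "energy": float(info["energy"]),
        "symbols": symbols,
        "positions": positions,
        "forces": forces,
    }

# Made-up numbers, for illustration only.
frame = """3
Properties=species:S:1:pos:R:3:forces:R:3 energy=-76.4
O 0.000  0.000  0.119 0.0  0.00 -0.010
H 0.000  0.763 -0.477 0.0  0.02  0.005
H 0.000 -0.763 -0.477 0.0 -0.02  0.005
"""

data = parse_extxyz_frame(frame)
print(data["symbols"], data["energy"])
```

Real files often contain many frames and extra properties, so the conversion script above remains the recommended route.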
A force field is created via a single command-line call that yields a ready-to-use model file:
$ sgdml all <sgdml_dataset_file> <n_train> <n_validate> [<n_test>]
The last three parameters set the sizes of the training, validation and test splits, which are sampled from the provided dataset file without overlap. Leave out <n_test> to use all remaining points for testing.
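To make the split semantics concrete, here is a minimal sketch of drawing three non-overlapping index sets, with the test set defaulting to everything that is left. The function `split_indices` is hypothetical and uses plain uniform shuffling as a simplifying assumption; the sampling strategy inside the sgdml package may differ.

```python
import random

def split_indices(n_total, n_train, n_validate, n_test=None, seed=0):
    """Draw disjoint train/validation/test index lists from range(n_total)."""
    if n_test is None:
        # No test size given: use all remaining points for testing.
        n_test = n_total - n_train - n_validate
    sizes = (n_train, n_validate, n_test)
    if min(sizes) < 0 or sum(sizes) > n_total:
        raise ValueError("split sizes do not fit into the dataset")
    shuffled = list(range(n_total))
    random.Random(seed).shuffle(shuffled)
    # Consecutive slices of one shuffled list can never overlap.
    train = shuffled[:n_train]
    validate = shuffled[n_train:n_train + n_validate]
    test = shuffled[n_train + n_validate:n_train + n_validate + n_test]
    return train, validate, test

train, validate, test = split_indices(1000, 200, 100)
print(len(train), len(validate), len(test))
```

Slicing a single shuffled index list keeps the three sets disjoint by construction, which is the property the command-line call guarantees.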
A force field model is effectively a parametrization of your dataset that provides energy e and forces f for any input geometry r:
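import numpy as np
from sgdml.predict import GDMLPredict
from sgdml.utils import io

# Load the trained model file and set up the predictor.
model = np.load('model.npz')
gdml = GDMLPredict(model)

# Read a geometry and predict its energy and forces.
r, _ = io.read_xyz('geometry.xyz')
e, f = gdml.predict(r)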
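Energy and forces are not independent outputs: in an energy-conserving force field, the forces are the negative gradient of the energy with respect to the atomic positions. The sketch below illustrates that relationship with a toy harmonic bond standing in for a trained model. Both `toy_predict` and `finite_difference_forces` are hypothetical helpers written for this example and are not part of sgdml.

```python
import math

def toy_predict(r, k=1.0, r0=1.0):
    """Toy harmonic bond between two atoms (stand-in, not an sGDML model).

    r: two [x, y, z] positions. Returns (energy, forces).
    """
    d = [a - b for a, b in zip(r[0], r[1])]
    dist = math.sqrt(sum(c * c for c in d))
    e = 0.5 * k * (dist - r0) ** 2
    # Gradient of e with respect to the position of atom 0.
    g = [k * (dist - r0) * c / dist for c in d]
    # Forces are the negative gradient; atom 1 feels the opposite force.
    f = [[-c for c in g], list(g)]
    return e, f

def finite_difference_forces(predict, r, h=1e-5):
    """Estimate forces as -dE/dr using central differences."""
    f = [[0.0, 0.0, 0.0] for _ in r]
    for i in range(len(r)):
        for j in range(3):
            r_plus = [list(atom) for atom in r]
            r_minus = [list(atom) for atom in r]
            r_plus[i][j] += h
            r_minus[i][j] -= h
            f[i][j] = -(predict(r_plus)[0] - predict(r_minus)[0]) / (2 * h)
    return f

r = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]
e, f = toy_predict(r)
f_num = finite_difference_forces(toy_predict, r)
max_dev = max(abs(a - b) for fa, fb in zip(f, f_num) for a, b in zip(fa, fb))
print(e, f[0], max_dev)
```

The same kind of check could in principle be run against a trained model by wrapping its predictor, keeping in mind that its input and output array layout may differ from the nested lists used here.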
Because predictions are available directly from Python, the model can be plugged into many applications, e.g. via an interface to the Atomic Simulation Environment (ASE).
We offer an experimental model training service for anyone without sufficient compute resources. Simply upload your dataset, schedule some training jobs and return later to collect your model files.
| Name | Size (data points) |
|---|---|
| **DFT [FHI-aims, light tier 1]** | |
| Benzene (Chmiela et al., 2018) | 49,863 |
| **MD17 dataset (Chmiela et al., 2017)** | |
| Benzene | 627,983 |
| Uracil | 133,770 |
| Naphthalene | 326,250 |
| Aspirin | 211,762 |
| Salicylic acid | 320,231 |
| Malonaldehyde | 993,237 |
| Ethanol | 555,092 |
| Toluene | 442,790 |
| Paracetamol | 106,490 |
| Azobenzene | 99,999 |
| **MD22 dataset (Chmiela et al., 2023)** | |
| **DFT [FHI-aims, tight tiers 1&2]** | |
| Ac-Ala3-NHMe | 85,109 |
| Docosahexaenoic acid (DHA) | 69,753 |
| Stachyose | 27,272 |
| DNA base pair (AT-AT) | 20,001 |
| DNA base pair (AT-AT-CG-CG) | 10,153 |
| **DFT [FHI-aims, light tier 1]** | |
| Buckyball catcher | 6,102 |
| Double-walled nanotube (DWNT) | 5,032 |
| **CCSD [Psi4, cc-pVDZ]** | |
| Aspirin | 1,500 |
| **CCSD(T) [Psi4, cc-pVDZ]** | |
| Benzene | 1,500 |
| Malonaldehyde | 1,500 |
| Toluene | 1,500 |
| **CCSD(T) [Psi4, cc-pVTZ]** | |
| Ethanol | 2,000 |