ICAMS / Interdisciplinary Centre for Advanced Materials Simulation

strucscan: A lightweight Python-based framework for high-throughput material simulation

I. Pietka, R. Drautz, T. Hammerschmidt.

Journal of Open Source Software, 7, 4719, (2022)

Abstract
The development of new materials by computational materials science relies to a large degree on the prediction of material properties by simulation at different time and length scales. A common challenge at the atomic scale is the need for large numbers of calculations in order to sample, e.g., different chemical compositions, different crystal structures or different simulation settings. Typical examples of the required high-throughput calculations are (i) the sampling of the combinatorial space of structure and composition for determining the most stable structure of a mixture of chemical elements, (ii) the generation of a data set for constructing an interatomic interaction model or (iii) the generation of a data set for inferring properties by machine learning. Depending on the problem at hand, the number of required calculations may range from hundreds to millions. The Python-based framework strucscan provides a robust solution to handle such high-throughput calculations in an efficient way on compute clusters with a queueing system or on the local host. The simple and transparent workflow of strucscan loops over a specified list of crystal structures and chemical compositions and computes a specified list of properties for each combination. The property calculations are represented as a pipeline of successive, interdependent steps which can easily be adapted and extended. The data is stored in a human-readable data-tree with flat hierarchy. strucscan performs a series of scalable and easily extendable pre-processing and post-processing steps and compiles the results in Python dictionaries for further evaluation. Data provenance for research-data management and analytics is realized in terms of the data-tree structure that includes all input files. The present version of strucscan is tailored to the calculation of frequently needed material properties with widely used atomistic simulation codes on common scheduling systems. 
The implemented interfaces support, in particular, the VASP software package for density-functional theory calculations on Sun Grid Engine and Slurm scheduling systems. Because the interfaces are well defined and documented, users with basic programming skills can extend strucscan to additional scheduling systems, simulation codes and material properties, at the atomic scale as well as at other simulation scales.
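
The workflow described in the abstract (a loop over structures and compositions, a pipeline of successive property steps, and a flat data-tree) can be illustrated with a minimal Python sketch. This is not strucscan's actual API: the function names `scan` and `run_pipeline`, the directory naming scheme, the `input.txt` placeholder file and the example structure and property labels are all assumptions made for illustration only.

```python
from itertools import product
from pathlib import Path
import tempfile


def run_pipeline(structure, composition, properties, root):
    """Run the property steps in order for one structure/composition pair.

    Hypothetical helper, not part of strucscan.
    """
    results = {}
    for prop in properties:
        # Flat hierarchy: one directory per (structure, composition, property).
        job_dir = root / f"{structure}__{composition}__{prop}"
        job_dir.mkdir(parents=True, exist_ok=True)
        # Placeholder for writing real inputs, submitting the job
        # and post-processing its output.
        (job_dir / "input.txt").write_text(f"{structure} {composition} {prop}\n")
        # Each step records the earlier steps it may depend on.
        results[prop] = {"path": str(job_dir), "depends_on": list(results)}
    return results


def scan(structures, compositions, properties, root):
    """Loop over all combinations and collect the results in a dictionary."""
    return {
        (s, c): run_pipeline(s, c, properties, root)
        for s, c in product(structures, compositions)
    }


# Small usage example in a temporary directory (labels are illustrative).
with tempfile.TemporaryDirectory() as tmp:
    results = scan(["fcc", "bcc"], ["Al", "Ni"], ["static", "eos"], Path(tmp))
    n_dirs = len(list(Path(tmp).iterdir()))
```

Keeping every calculation in its own top-level directory, together with its input file, is one simple way to obtain the human-readable, provenance-preserving layout that the abstract describes; the real framework may organise its data-tree differently.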


Keywords: high-throughput methods; DFT
Cite as: https://joss.theoj.org/papers/10.21105/joss.04719
DOI: 10.21105/joss.04719