A powder diffraction indexing program that uses machine learning models to initialize the SVD-Index algorithm. It takes an input peak list and returns a list of unit cells ranked by Figure of Merit.
Note: The paper describing the methods is currently in submission to the Journal of Applied Crystallography. This application is in beta. Usage reports and feedback are greatly appreciated and help improve the user experience.
pip install mlindex
mlindex.download_models
mlindex.download_models fetches the ML model files (~1 GB) from GitHub using git and git-lfs and installs them to ~/.local/share/mlindex/models/. git-lfs must be installed before running this step.
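Because the download step depends on git and git-lfs, it can help to confirm that both are on your PATH first. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical and is not part of mlindex.

```python
import shutil


def missing_tools(tools=("git", "git-lfs")):
    """Return the names of required executables that are not found on PATH.

    Hypothetical helper for illustration; mlindex does not provide it.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Install before downloading models:", ", ".join(missing))
    else:
        print("git and git-lfs found; ready to run mlindex.download_models")
```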
The model directory can be customized with --models-dir or the MLINDEX_MODELS_DIR environment variable:
mlindex.download_models --models-dir /path/to/models
export MLINDEX_MODELS_DIR=/path/to/models
Installing from source is required for model training, dataset generation, or contributing to the codebase. The machine learning models are version controlled through git-lfs.
- Clone the repository:
git clone [email protected]:dwmoreau/MLI.git
- Retrieve the model files:
git lfs fetch --all
git lfs checkout
- Install the project:
cd /path/to/the/cloned/repo
pip install .
Peak list files generated by GSAS-II can be used directly. GSAS-II provides tutorials for creating peak lists.
Alternatively, provide the positions of the observed diffraction peaks as q², where q² = (2 sin θ / λ)² = 1/d² (Å⁻²), and save the list as a NumPy array file (.npy).
Note: Only the first 20 peaks in the list are used internally.
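If your peak positions are in 2θ rather than q², the conversion defined above can be sketched as follows. This example assumes the wavelength is in Å and that a one-dimensional float array is what the program expects. The function name `two_theta_to_q2` and the 2θ values are hypothetical and for illustration only.

```python
import numpy as np


def two_theta_to_q2(two_theta_deg, wavelength):
    """Convert 2θ peak positions (degrees) to q² = (2 sin θ / λ)² = 1/d².

    With the wavelength in Å, the result is in Å⁻².
    """
    theta = np.radians(np.asarray(two_theta_deg, dtype=float)) / 2.0
    return (2.0 * np.sin(theta) / wavelength) ** 2


# Hypothetical 2θ positions in degrees, not real data
two_theta = [5.0, 7.1, 8.7, 10.0]
q2 = two_theta_to_q2(two_theta, wavelength=0.413128)

# Write the q² values to a .npy file that can be passed to --peak-file
np.save("peaks.npy", q2)
```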
mlindex.run --peak-file /path/to/your/file/peaks.npy
When using a GSAS-II pkslst file, you must supply the wavelength:
mlindex.run --peak-file /path/to/your/file/peaks.pkslst --wavelength 0.413128
Use --nproc N to run with N parallel worker processes. This is the recommended way to speed up indexing:
mlindex.run --peak-file /path/to/your/file/peaks.npy --nproc 4
If your instrument has a systematic 2θ offset, use --zero-error to correct for it during indexing. This option requires a wavelength to be specified:
mlindex.run --peak-file /path/to/your/file/peaks.pkslst --wavelength 0.413128 --zero-error
MPI mode is available for use on HPC clusters with MPI infrastructure. It requires exactly 6 MPI ranks and the --mpi flag:
mpiexec -n 6 mlindex.run --peak-file /path/to/your/file/peaks.pkslst --wavelength 0.413128 --mpi
mlindex.run_analytical uses a geometry-based guess-and-check approach instead of ML models. It covers the 11 higher-symmetry Bravais lattices (cF, cI, cP, hP, hR, tI, tP, oC, oF, oI, oP) and requires no model files.
mlindex.run_analytical --peak-file /path/to/your/file/peaks.npy
mlindex.run_analytical --peak-file /path/to/your/file/peaks.pkslst --wavelength 0.413128
mlindex.run_analytical --peak-file /path/to/your/file/peaks.npy --nproc 4
mlindex.run_analytical --peak-file /path/to/your/file/peaks.pkslst --wavelength 0.413128 --zero-error
mpiexec -n 6 mlindex.run_analytical --peak-file /path/to/your/file/peaks.pkslst --wavelength 0.413128 --mpi
Results are written to analytic_results.json.
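The option rules described for mlindex.run and mlindex.run_analytical (a wavelength for pkslst files and for --zero-error, exactly 6 ranks for MPI) can be collected in a small helper when scripting many runs. This is only a sketch. `build_command` is a hypothetical function, not part of mlindex, and it uses only the flags shown in this README.

```python
from pathlib import Path


def build_command(peak_file, program="mlindex.run", wavelength=None,
                  nproc=None, zero_error=False, mpi=False):
    """Assemble an argument list that follows the documented option rules.

    Hypothetical helper for illustration; it only builds the command.
    """
    # pkslst peak files and --zero-error both require a wavelength
    needs_wavelength = Path(peak_file).suffix == ".pkslst" or zero_error
    if needs_wavelength and wavelength is None:
        raise ValueError(
            "--wavelength is required for pkslst files and for --zero-error"
        )

    cmd = [program, "--peak-file", str(peak_file)]
    if wavelength is not None:
        cmd += ["--wavelength", str(wavelength)]
    if nproc is not None:
        cmd += ["--nproc", str(nproc)]
    if zero_error:
        cmd.append("--zero-error")
    if mpi:
        # MPI mode requires exactly 6 ranks and the --mpi flag
        cmd = ["mpiexec", "-n", "6"] + cmd + ["--mpi"]
    return cmd


cmd = build_command("peaks.pkslst", wavelength=0.413128, zero_error=True)
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to launch the run.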
The program outputs the top 20 unit cell candidates ranked by M20 score and writes them to indexing_results.json:
| Column | Description |
|---|---|
| M20 | de Wolff Figure of Merit (de Wolff, 1968) |
| Minfo | Figure of Merit from Taupin (1988) |
| n_indexed | Number of indexed peaks, using a probability from Taupin (1988) and a 95% threshold |
| bravais_lattice | Assumed Bravais lattice for the unit cell optimization |
| spacegroup | Space group whose systematic absences best match the observed peak list |
| volume | Unit cell volume (ų) |
| a, b, c | Unit cell edge lengths (Å) |
| alpha, beta, gamma | Unit cell angles (°) |
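For readers unfamiliar with the M20 column, the sketch below evaluates the textbook de Wolff definition, M20 = Q20 / (2 ⟨|ΔQ|⟩ N20). Here Q20 is the Q value of the 20th observed line, ⟨|ΔQ|⟩ is the mean absolute difference between observed and calculated Q for those 20 lines, and N20 is the number of distinct calculated lines up to Q20. How mlindex computes the score internally is not described here, so the function `m20` and its arguments are illustrative assumptions rather than the program's implementation.

```python
import numpy as np


def m20(q_obs, q_calc, n_calc):
    """Textbook de Wolff figure of merit for the first 20 observed lines.

    q_obs  : observed Q values (for example q² = 1/d²), ascending
    q_calc : calculated Q value assigned to each observed line
    n_calc : number of distinct calculated lines with Q <= Q20
    """
    q_obs = np.asarray(q_obs, dtype=float)[:20]
    q_calc = np.asarray(q_calc, dtype=float)[:20]
    q20 = q_obs.max()                         # Q of the last line used
    mean_dq = np.mean(np.abs(q_obs - q_calc)) # average discrepancy
    return q20 / (2.0 * mean_dq * n_calc)


# Synthetic numbers for illustration only, not real data
q_obs = np.arange(1, 21) * 0.01
q_calc = q_obs + 1e-4
score = m20(q_obs, q_calc, n_calc=25)
```

Because the score is a ratio of Q values, it does not depend on the units of Q as long as observed and calculated values use the same ones.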
The US Department of Energy Integrated Computational and Data Infrastructure for Scientific Discovery supported this work via grant DE-SC0022215 to Aaron S. Brewster (LBL), Tess Smidt (MIT), and Nate Hohmann (UCONN).
- Taupin, D. (1988). J. Appl. Cryst. 21, 485-489.
- de Wolff, P. M. (1968). J. Appl. Cryst. 1, 108-113.
