5.3. [mlref] section
This section sets options for extracting only the atomic configurations from the results of RXMC calculations. These configurations are used, for example, to evaluate the accuracy of neural network models and to extend the training data. The file format is as follows.
[mlref]
nreplicas = 3
ndata = 50
5.3.1. Input Format
Each keyword and its value are specified in the form keyword = value.
Comments start with #; all subsequent characters are ignored.
5.3.2. Keywords
About replica
nreplicas
Format : int (natural number)
Description : The number of replicas.
ndata
Format : int (natural number)
Description : The number of configurations (data points) to be sampled.
sampler
Format : string (default: “linspace”)
Description : The method used to extract \(N_\text{data}\) samples from the \(N\) samples generated by the Monte Carlo method.
“linspace”
Extract equally spaced samples by
numpy.linspace(0, N-1, num=ndata, dtype=int)
“random”
Random sampling by
numpy.random.choice(range(N), size=ndata, replace=False)
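The two sampler options can be compared side by side with a short sketch. It wraps the two NumPy calls quoted above in a helper; the function name select_indices and its arguments are hypothetical and only serve the illustration, and this is not necessarily how abICS organizes the code internally.

```python
import numpy as np


def select_indices(n_samples, ndata, sampler="linspace"):
    """Return ndata indices into n_samples Monte Carlo samples.

    Hypothetical helper: it only wraps the two NumPy calls quoted
    in the description of the sampler keyword.
    """
    if sampler == "linspace":
        # Equally spaced indices from the first sample (0) to the last (N-1).
        return np.linspace(0, n_samples - 1, num=ndata, dtype=int)
    if sampler == "random":
        # ndata distinct indices in random order; needs ndata <= n_samples.
        return np.random.choice(range(n_samples), size=ndata, replace=False)
    raise ValueError(f"unknown sampler: {sampler!r}")


print(select_indices(10, 4))  # [0 3 6 9]
```

With "linspace" the first and last samples are always included, whereas "random" returns a different subset on each run unless the NumPy random seed is fixed.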
5.4. [mlref.solver] section
This section configures the solver used to calculate the training data (configuration energy). It specifies solver parameters such as the solver type (VASP, QE, …), the path to the solver, and the directory containing the solver-specific input file(s). It is essentially the same as the [sampling.solver] section and has the following file format.
[mlref.solver]
type = 'vasp'
base_input_dir = './baseinput'
perturb = 0.1
5.4.1. Input Format
Each keyword and its value are specified in the form keyword = value.
Comments start with #; all subsequent characters are ignored.
5.4.2. Keywords
type
Format : str
Description : The solver type (OpenMX, QE, VASP, aenet).
base_input_dir
Format : str or list of str
Description : The path to the directory containing the base input file(s). If a list is given, one calculation is performed for each input in turn. For the second and subsequent calculations, the structure from the last step of the previous calculation is used as the initial coordinates, and the energy from the last calculation is taken as the result. For example, the first input can perform a fast structural optimization at the expense of accuracy, and the second and later inputs can then repeat the optimization with higher-accuracy settings. Alternatively, in the case of lattice vector relaxation, the same input can be run multiple times to reset the computational mesh based on the specified plane-wave cutoff.
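The chaining behaviour of a base_input_dir list can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the abICS implementation: run_chain, run_solver, and fake_solver are hypothetical names, and run_solver is assumed to return the energy together with the final structure.

```python
def run_chain(base_input_dirs, structure, run_solver):
    """Run one calculation per input directory, in order.

    run_solver(input_dir, structure) is a hypothetical callable that
    returns (energy, final_structure).
    """
    if isinstance(base_input_dirs, str):
        base_input_dirs = [base_input_dirs]  # a single path is a chain of one
    energy = None
    for input_dir in base_input_dirs:
        # Each stage starts from the final structure of the previous stage.
        energy, structure = run_solver(input_dir, structure)
    # Only the energy of the last calculation is kept.
    return energy, structure


def fake_solver(input_dir, structure):
    # Stand-in for a real solver: records which inputs have been applied.
    return float(len(structure)), structure + [input_dir]


energy, final = run_chain(["./coarse", "./fine"], [], fake_solver)
print(energy, final)  # 1.0 ['./coarse', './fine']
```

The directory names ./coarse and ./fine are placeholders for a low-accuracy and a high-accuracy input set.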
perturb
Format : float
Description : If a highly symmetric structure is used as input, structure optimization tends to stall at a saddle point. To avoid this, the initial structure is generated by randomly displacing each atom by an amount proportional to this parameter. It can also be set to 0.0 or false. Default value = 0.0.
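The effect of perturb can be pictured with the following sketch. The displacement distribution is an assumption made for illustration (each Cartesian component is shifted uniformly within ±perturb); this section does not state which distribution the solver interface actually uses, and perturb_positions is a hypothetical name.

```python
import random


def perturb_positions(positions, perturb, seed=None):
    """Return a copy of positions with each atom randomly displaced.

    Assumption: each Cartesian component is shifted by a uniform random
    amount in [-perturb, +perturb]. The actual distribution is not
    specified in this section.
    """
    if not perturb:  # 0.0 or False leaves the structure unchanged
        return [tuple(p) for p in positions]
    rng = random.Random(seed)
    return [
        tuple(x + rng.uniform(-perturb, perturb) for x in p)
        for p in positions
    ]


# Two atoms on high-symmetry sites, displaced by at most 0.1 per component.
symmetric = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5)]
print(perturb_positions(symmetric, 0.1, seed=0))
```

Breaking the symmetry in this way gives the optimizer non-zero forces along directions that would otherwise vanish by symmetry.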
ignore_species
Format : list
Description : Specify atomic species to “ignore” in neural network models such as aenet. For species that always have an occupancy of 1, it is computationally more efficient to ignore their presence when training and evaluating neural network models.
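Conceptually, ignore_species amounts to filtering atoms out of a configuration before it reaches the neural network model. The sketch below shows that idea only; drop_ignored and the example species are hypothetical, and the way abICS applies the filter internally may differ.

```python
def drop_ignored(species, positions, ignore_species):
    """Remove atoms whose species is listed in ignore_species."""
    ignored = set(ignore_species)
    kept = [(s, p) for s, p in zip(species, positions) if s not in ignored]
    return [s for s, _ in kept], [p for _, p in kept]


# Hypothetical configuration in which the O sites are always occupied.
species = ["Mg", "Al", "O", "O"]
positions = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (0.25, 0.25, 0.25), (0.75, 0.75, 0.75)]

kept_species, kept_positions = drop_ignored(species, positions, ["O"])
print(kept_species)  # ['Mg', 'Al']
```

Because the ignored atoms are identical in every configuration, removing them shrinks the model input without discarding information that distinguishes one configuration from another.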