Tutorial: How to Use Post-Processing Tools

Running PAMC Calculations

As an example, we use a PAMC calculation with the TRHEPD forward problem solver (odatse-STR). The parameter space is 3-dimensional, and the temperature is lowered over 51 points spaced logarithmically from T=1.0 to T=1.0e-6. Each annealing step consists of 20 MCMC steps. The number of replicas is set to 100 per process with 4 MPI processes, giving 400 replicas in total.
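The temperature grid described above can be sketched as follows. This is a minimal illustration, assuming a 0-based step index and an exactly geometric grid between the two end points; the names `temperature`, `temperatures`, and `betas` are hypothetical and are not part of the tools. Under these assumptions, steps 22 and 44 give beta values close to those quoted in the figure captions of this tutorial.

```python
import math

# Assumed schedule: 51 temperatures, log-spaced from T=1.0 down to T=1.0e-6.
T_MAX, T_MIN, NUM_T = 1.0, 1.0e-6, 51


def temperature(index):
    """Temperature at step `index` (0-based) on a logarithmic grid."""
    frac = index / (NUM_T - 1)
    return T_MAX * (T_MIN / T_MAX) ** frac


temperatures = [temperature(i) for i in range(NUM_T)]
betas = [1.0 / t for t in temperatures]  # inverse temperature, beta = 1/T

# Print a few representative steps.
for i in (0, 22, 44, 50):
    print(f"Tstep={i:2d}  T={temperatures[i]:.3e}  beta={betas[i]:.3e}")
```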

Results are written to the output directory. The MCMC logs for each MPI process are written to output/{rank}/result_T{index}.txt, with one file per temperature point. The expected values and variances of f(x), along with the partition function values, are written to output/fx.txt.
Note

If export_combined_files is set to True, the logs are consolidated into combined.txt. In that case, first extract result.txt:
```
python3 extract_combined.py -t result.txt -d output
```
Then split the extracted result.txt by temperature point with separateT.py, as described in the next note.

extract_combined.py is a tool for extracting lines starting with specific tags, with the following options:
- -t, --tag : Target tag for extraction (required)
- -d, --data_dir : Directory containing data files
For details, see extract_combined.py.
Note

If separate_T is False, the logs are written to a single result.txt instead of one file per temperature point. In that case, run:
```
python3 separateT.py -d output
```
This splits result.txt into result_T{index}.txt files, one per temperature point.

separateT.py is a tool for splitting MCMC data files by temperature, with the following options:
- -d, --data_dir : Directory containing data files
- -t, --file_type : Specific filename to split (processes result.txt in each directory if not specified)
For details, see separateT.py.
Calculating Model Evidence

The logarithm of the model evidence, \(\log P(D;\beta)\), is expressed as:

\[\log P(D;\beta) = \log\left(\dfrac{Z_\beta}{Z_{\beta_0}}\right) - \log V_\Omega + \sum_\mu \dfrac{n_\mu}{2}\log\left(\dfrac{\beta w_\mu}{\pi}\right)\]

Calculate the model evidence from the partition function values \(\log(Z/Z_0)\) in output/fx.txt. This requires the search space volume \(V_\Omega\) (the normalization factor of the prior probability) and the number of data points \(n\).

In this example, the search space spans [3.0, 6.0] for each of z1, z2, z3. The number of data points (rows in experiment.txt) is 70.
```
python3 plt_model_evidence.py -V 27.0 -n 70 output/fx.txt
```
Model evidence values are written to model_evidence.txt, and a plot against beta is output to model_evidence.png.
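The arithmetic behind this command can be sketched as follows. This is an illustration of the formula above and not the implementation of plt_model_evidence.py. It assumes a single data set, so the sum over \(\mu\) has one term, and a weight \(w_\mu = 1\). The function name `log_model_evidence` and its arguments are hypothetical, and the column layout of output/fx.txt is not assumed here, so the inputs are passed in directly.

```python
import math


def log_model_evidence(log_z_ratio, beta, volume, n_data, weight=1.0):
    """Evaluate log P(D; beta) for one data set.

    log_z_ratio : log(Z_beta / Z_beta0)
    volume      : search space volume V_Omega
    n_data      : number of data points n
    weight      : w in the formula (assumed to be 1.0 here)
    """
    return (
        log_z_ratio
        - math.log(volume)
        + 0.5 * n_data * math.log(beta * weight / math.pi)
    )


# Search space volume for z1, z2, z3 each in [3.0, 6.0].
bounds = [(3.0, 6.0)] * 3
volume = math.prod(hi - lo for lo, hi in bounds)
print(volume)  # 27.0, the value passed with -V

# At beta = beta_0 the ratio Z_beta / Z_beta0 is 1, so its logarithm is 0.
# beta_0 = 1.0 is an assumption based on the starting temperature T=1.0.
print(log_model_evidence(0.0, beta=1.0, volume=volume, n_data=70))
```

The data term grows with \(\log\beta\) while \(\log(Z_\beta/Z_{\beta_0})\) typically decreases as \(\beta\) increases, which is why the evidence can peak at an intermediate temperature step.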

plt_model_evidence.py accepts these options:
- -V, --Volume : Search space volume \(V_\Omega\)
- -n, --ndata : Number of data points (required)
- -o, --output : Output plot filename. The output format is determined by the file extension.
For details, see plt_model_evidence.py.

Fig. 8 Plot of model evidence. The maximum value occurs at beta = \(1.91\times 10^5\) (Tstep=44).
Summarizing Search Data by Temperature Points

For each temperature point, extract the replica configurations at the end of annealing from the MCMC step information in output/{rank}/result_T{index}.txt, and combine them into a single file.
```
python3 summarize_each_T.py -d output -o summarized
```
Results are written to summarized/result_T{index}_summarized.txt.

summarize_each_T.py extracts and combines replica configuration data for each temperature point, with these options:
- -d, --data_directory : Directory containing MCMC data files
- -o, --export_directory : Output directory
If the TOML configuration file used for the PAMC calculation is given with the -i, --input_file option, parameters such as the number of replicas are retrieved from it automatically.

For details, see summarize_each_T.py.
Creating 1D and 2D Marginalized Histograms

Plot the weighted posterior probability distributions \(P(z_i|D;\beta) = \dfrac{P(D|z_i;\beta) P(z_i)}{P(D;\beta)}\) using the replica configuration data.

To create 1D histograms marginalized along each \(z_i\):
```
python3 plt_1D_histogram.py -d summarized -o 1dhist -r 3.0,6.0
```
This creates histograms for each data file in summarized/ and writes them to 1dhist/ as 1Dhistogram_result_T{index}_beta_{beta}.png. The -r option sets the value range to 3.0 to 6.0.
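The idea of a weighted, normalized 1D histogram can be sketched in plain Python as follows. This is a conceptual illustration and not the code of plt_1D_histogram.py. It assumes that each replica contributes one value of \(z_i\) together with one weight. The function name `weighted_histogram` and the toy inputs are hypothetical; the defaults of 60 bins and the range 3.0 to 6.0 mirror the options used in this tutorial.

```python
def weighted_histogram(values, weights, bins=60, value_range=(3.0, 6.0)):
    """Return (edges, density) for a weighted histogram normalized to area 1."""
    lo, hi = value_range
    width = (hi - lo) / bins
    counts = [0.0] * bins
    for v, w in zip(values, weights):
        if lo <= v <= hi:
            # Values equal to the upper edge go into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += w
    total = sum(counts)
    edges = [lo + i * width for i in range(bins + 1)]
    if total == 0.0:
        return edges, counts
    density = [c / (total * width) for c in counts]
    return edges, density


# Toy input (not real data): four replicas, one of them outside the range.
edges, density = weighted_histogram(
    [3.0, 4.5, 6.0, 7.0], [1.0, 2.0, 1.0, 5.0], bins=3
)
print(edges)
print(density)
```

Dividing by the total weight times the bin width makes the bars integrate to one, so histograms from different temperature points can be compared on the same scale.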

plt_1D_histogram.py accepts these main options:
- -d, --data_dir : Directory containing data files
- -o, --output_dir : Output directory
- -r, --range : Variable range (“min,max” format)
- -b, --bins : Number of histogram bins (default: 60)
- -f, --format : Output file formats (comma-separated list, default: “png”)
- --config : Configuration file path (TOML format)
- --params : Path to parameter file used in PAMC calculation
Using a configuration file allows setting multiple options together.

For details, see plt_1D_histogram.py.

Fig. 9 Example 1D marginalized histogram output (Tstep=22, \(\beta=4.365\times 10^2\)).

To create 2D marginalized histograms:
```
python3 plt_2D_histogram.py -d summarized -o 2dhist -r 3.0,6.0
```
This creates 2D histograms for the combinations (z1,z2), (z1,z3), and (z2,z3), and writes them to 2dhist/ as 2Dhistogram_result_T{index}_beta_{beta}_x1_vs_x2.png and so on. (The axis labels are x1, x2, … if field_list is not specified.)

plt_2D_histogram.py has the same options as plt_1D_histogram.py plus these features:
- Generates histograms for each pair of variables
- Visualizes probability density using logarithmic color mapping
Output filenames follow the pattern: 2Dhistogram_[filename]_[x-axis label]_vs_[y-axis label].[format]

Example: 2Dhistogram_result_T44_beta_1.91e+05_x1_vs_x2.png
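The naming pattern above can be reproduced with a short sketch. It only illustrates how one file name per variable pair follows from the pattern; the helper `pair_filenames` is hypothetical and is not part of plt_2D_histogram.py, and the default labels x1, x2, x3 are taken from the description above.

```python
from itertools import combinations


def pair_filenames(stem, labels, fmt="png"):
    """Build one output name per unordered pair of axis labels."""
    return [
        f"2Dhistogram_{stem}_{x}_vs_{y}.{fmt}"
        for x, y in combinations(labels, 2)
    ]


names = pair_filenames("result_T44_beta_1.91e+05", ["x1", "x2", "x3"])
for name in names:
    print(name)
```

For three variables this yields three files, in the order (x1, x2), (x1, x3), (x2, x3).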

For details, see plt_2D_histogram.py.

Fig. 10 Example 2D marginalized histogram output (Tstep=22, z1-z2 axis plot).