Documentation status
This page was generated and edited with the assistance of an LLM and is still in development. It has not been fully vetted by the developer. Verify commands, UI labels, file paths, workflow descriptions, and scientific claims against the current code and your local workflow before relying on it.
If you notice an error, omission, or outdated guidance, please open an issue on GitHub.
MD Extraction and Cluster Preparation¶
Cluster extraction is the bridge between raw trajectory data and the SAXS model. In this repository, that bridge spans more than one tool.
Typical path¶
- Use mdtrajectory to inspect a trajectory and export frames.
- Optionally use xyz2pdb if you need molecule-aware PDB frames.
- Use clusters to extract stoichiometry-sorted cluster files.
- Use clusterdynamics to build time-dependent cluster-distribution heatmaps and lifetime tables from the extracted frames.
- Optionally use bondanalysis to measure bond or angle distributions on those clusters.
- Optionally use Debye-Waller Analysis to estimate project-backed pairwise disorder coefficients from sorted PDB cluster folders.
- Feed the resulting cluster folder into the SAXS project.
Run the examples from the repository root after creating the
saxshell-py312 conda environment.
mdtrajectory¶
This tool is responsible for:
- inspecting trajectory metadata
- optionally reading CP2K .ener files
- suggesting a cutoff
- exporting selected frames into a sibling folder
- writing mdtrajectory_export.json metadata beside the exported frames so downstream tools can recover the original frame indices and times
Example:
PYTHONPATH=src conda run --no-capture-output -n saxshell-py312 python -m saxshell.mdtrajectory inspect traj.xyz --energy-file traj.ener
PYTHONPATH=src conda run --no-capture-output -n saxshell-py312 python -m saxshell.mdtrajectory suggest-cutoff traj.xyz --energy-file traj.ener --temp-target-k 300 --window 2
PYTHONPATH=src conda run --no-capture-output -n saxshell-py312 python -m saxshell.mdtrajectory export traj.xyz --energy-file traj.ener --use-suggested-cutoff --temp-target-k 300 --window 2
When a cutoff is applied, the default folder name now uses the form
splitxyz_f995_t497p5fs or splitpdb_f995_t497p5fs, where f995 records the
first exported source-frame index and t497p5fs records the first exported
time in femtoseconds.
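If a downstream script needs to recover the first frame index and time from such a folder name, a minimal sketch could look like the following. The helper name and the regular expression are hypothetical: the pattern is inferred only from the two example names above (with p read as a decimal point), not taken from the saxshell source, so check it against real export folders before relying on it.

```python
import re

# Pattern inferred from the example names "splitxyz_f995_t497p5fs" and
# "splitpdb_f995_t497p5fs"; it is an assumption, not the tool's own parser.
_NAME_RE = re.compile(
    r"^split(?P<fmt>xyz|pdb)_f(?P<frame>\d+)_t(?P<time>\d+(?:p\d+)?)fs$"
)


def parse_export_folder_name(name):
    """Return (format, first_frame_index, first_time_fs), or None if no match."""
    match = _NAME_RE.match(name)
    if match is None:
        return None
    # "p" is assumed to stand in for the decimal point: "497p5" -> 497.5
    time_fs = float(match.group("time").replace("p", "."))
    return match.group("fmt"), int(match.group("frame")), time_fs


print(parse_export_folder_name("splitxyz_f995_t497p5fs"))
# ('xyz', 995, 497.5)
```

The mdtrajectory_export.json file written beside the frames is the more reliable source for this information; parsing the folder name is only a fallback.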
xyz2pdb¶
Use this only when residue identity matters downstream.
The current UI and CLI support:
- analyzing one sample frame and detecting the element inventory
- defining free atoms and reference molecules directly in the UI
- editing per-bond percentage tolerances and tight/relaxed search windows
- estimating molecule counts before export
- converting frames in the background while reusing the first-frame mapping template
- optional assertion mode for per-molecule geometry checks and reference updates
- browsing and creating reference molecules in the library
See the dedicated guide for the full interface and workflow.
clusters¶
The cluster workflow supports both UI and CLI usage. Its CLI exposes separate
inspect, preview, and export modes, plus settings for:
- node, linker, and shell rules
- pair cutoffs
- box dimensions
- periodic boundary conditions
- search mode
- save-state frequency
The CLI help text explicitly calls out faster neighbor search modes such as
kdtree and vectorized.
Project-backed cluster run files¶
For repeatable project runs, launch the setup window and save a run file in the SAXSShell project folder:
From the main SAXSShell window, use Tools > CLI Setup > Open Cluster Extraction CLI Setup (Beta). The same setup window can also be launched from a terminal:
PYTHONPATH=src conda run --no-capture-output -n saxshell-py312 python -m saxshell.cluster setup-ui /path/to/saxs_project
The setup window records the project folder, extracted frames folder, output
clusters folder, atom rules, pair cutoffs, PBC/box settings, shell options, and
neighbor-search settings in cluster_extraction_cli_run.json. Paths inside the
project are stored relative to the project folder so the project can move as a
unit.
After saving, run the extraction from the terminal:
PYTHONPATH=src conda run --no-capture-output -n saxshell-py312 python -m saxshell.cluster run /path/to/saxs_project
Use --run-file custom_run.json to run a different JSON file. Relative
--run-file paths are resolved against the project folder. A completed run
updates the project clusters_dir reference to the output folder while leaving
the existing frames and PDB-frames references unchanged.
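The path rules described above can be illustrated with a short sketch. Both function names are hypothetical and this is not the saxshell implementation. It assumes that paths outside the project stay absolute, that the default run file sits directly in the project folder, and that an absolute --run-file path is used as given; the text above does not confirm any of these.

```python
from pathlib import Path

# Default run-file name as given in the text above.
DEFAULT_RUN_FILE = "cluster_extraction_cli_run.json"


def to_project_relative(path, project_dir):
    """Return a project-relative string when path lies inside project_dir."""
    path = Path(path).resolve()
    project_dir = Path(project_dir).resolve()
    try:
        return path.relative_to(project_dir).as_posix()
    except ValueError:
        # Assumption: paths outside the project are kept absolute.
        return str(path)


def resolve_run_file(project_dir, run_file=None):
    """Resolve a run file, treating relative paths as project-relative."""
    candidate = Path(run_file) if run_file else Path(DEFAULT_RUN_FILE)
    if candidate.is_absolute():
        return candidate
    return Path(project_dir) / candidate
```

Storing project-internal paths in relative form is what lets the whole project folder be moved or copied without editing the run file.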
clusterdynamics¶
This application consumes the extracted XYZ or PDB frames from mdtrajectory
and applies the same cluster definitions and pair-cutoff rules used by
clusters, but bins the results over time instead of writing one
stoichiometry-folder export.
Key outputs:
- time-binned cluster-distribution heatmaps
- optional CP2K .ener overlays aligned to the same time axis
- a sortable lifetime table by stoichiometry label
- saved JSON/CSV datasets that can be reopened later for plotting
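To make the idea of binning over time concrete, here is a minimal pure-Python sketch. The function names, the input shapes, and the lifetime definition are all assumptions made for illustration. In particular, this sketch treats a lifetime as the longest run of consecutive frames in which a stoichiometry label is present at all, which may differ from how clusterdynamics defines and tracks lifetimes.

```python
from collections import Counter, defaultdict


def bin_cluster_counts(observations, bin_width_fs):
    """Count stoichiometry labels per time bin.

    observations: iterable of (time_fs, label) pairs, one per cluster per frame.
    Returns {bin_index: Counter({label: count})}; each Counter is one column
    of a label-by-time heatmap.
    """
    bins = defaultdict(Counter)
    for time_fs, label in observations:
        bins[int(time_fs // bin_width_fs)][label] += 1
    return dict(bins)


def longest_runs(frame_labels):
    """Longest run of consecutive frames in which each label is present.

    frame_labels: list of sets of labels, one set per frame, in time order.
    """
    current = Counter()
    longest = Counter()
    for labels in frame_labels:
        # A label that is absent from this frame ends its current run.
        for label in list(current):
            if label not in labels:
                del current[label]
        for label in labels:
            current[label] += 1
            longest[label] = max(longest[label], current[label])
    return dict(longest)
```

Multiplying a run length in frames by the frame spacing gives a time, provided the exported frames are evenly spaced.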
Downstream structure analysis¶
After the sorted cluster folders have been produced, the main UI exposes two structure-analysis tools that reuse them:
- Bond Analysis for bond-pair and angle-triplet distributions
- Debye-Waller Analysis for pairwise thermal-displacement coefficients from sorted PDB cluster folders
What SAXS expects from this stage¶
The SAXS workflow expects a usable cluster folder with representative structure files and enough component identity to build prior weights and scattering components.
If you are unsure whether your cluster folder is ready, start in Project Setup and confirm that the project can discover the expected clusters before moving on.
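For a quick check outside the UI, a sketch like the one below counts structure files per subfolder. The function name is hypothetical, and it assumes a layout with one subfolder per stoichiometry label containing .xyz or .pdb files. That layout is inferred from the description of stoichiometry-sorted output and is not confirmed here, so adjust it to match your actual export.

```python
from pathlib import Path

# Assumed structure-file extensions; adjust if your export differs.
STRUCTURE_SUFFIXES = {".xyz", ".pdb"}


def summarize_cluster_folder(clusters_dir):
    """Return {subfolder_name: number_of_structure_files} for clusters_dir."""
    summary = {}
    for sub in sorted(Path(clusters_dir).iterdir()):
        if not sub.is_dir():
            continue
        summary[sub.name] = sum(
            1
            for f in sub.iterdir()
            if f.is_file() and f.suffix.lower() in STRUCTURE_SUFFIXES
        )
    return summary
```

Subfolders that report zero files, or an empty summary, suggest the extraction did not produce what the SAXS project expects. Project Setup remains the authoritative check.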