Cluster Extraction¶
Cluster extraction is the bridge between raw trajectory data and the SAXS model. In this repository, that bridge spans several tools: mdtrajectory, xyz2pdb, clusters, clusterdynamics, and bondanalysis.
Typical path¶
- Use mdtrajectory to inspect a trajectory and export frames.
- Optionally use xyz2pdb if you need molecule-aware PDB frames.
- Use clusters to extract stoichiometry-sorted cluster files.
- Use clusterdynamics to build time-dependent cluster-distribution heatmaps and lifetime tables from the extracted frames.
- Use bondanalysis to measure bond or angle distributions on those clusters.
- Feed the resulting cluster folder into the SAXS project.
mdtrajectory¶
This tool is responsible for:
- inspecting trajectory metadata
- optionally reading CP2K .ener files
- suggesting a cutoff
- exporting selected frames into a sibling folder
- writing mdtrajectory_export.json metadata beside the exported frames so downstream tools can recover the original frame indices and times
Example:
mdtrajectory inspect traj.xyz --energy-file traj.ener
mdtrajectory suggest-cutoff traj.xyz --energy-file traj.ener --temp-target-k 300 --window 3
mdtrajectory export traj.xyz --energy-file traj.ener --use-suggested-cutoff --temp-target-k 300 --window 3
When a cutoff is applied, the default folder name now uses the form
splitxyz_f995_t497p5fs or splitpdb_f995_t497p5fs, where f995 records the
first exported source-frame index and t497p5fs records the first exported
time in femtoseconds.
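If a downstream script needs the frame index and start time back from a folder name, the naming scheme can be parsed as in the minimal sketch below. The pattern is inferred only from the two example names above, and it assumes that the `p` in `t497p5fs` stands in for a decimal point. `parse_export_folder` and `ExportFolderInfo` are hypothetical helper names, not part of mdtrajectory. Where available, the mdtrajectory_export.json metadata is likely the more reliable source.

```python
import re
from typing import NamedTuple


class ExportFolderInfo(NamedTuple):
    kind: str             # "xyz" or "pdb"
    first_frame: int      # first exported source-frame index
    first_time_fs: float  # first exported time in femtoseconds


# Pattern inferred from the example names; other variants may exist.
_FOLDER_RE = re.compile(r"^split(xyz|pdb)_f(\d+)_t(\d+(?:p\d+)?)fs$")


def parse_export_folder(name: str) -> ExportFolderInfo:
    """Split a cutoff-style export folder name into its recorded fields."""
    match = _FOLDER_RE.match(name)
    if match is None:
        raise ValueError(f"not a cutoff-style export folder name: {name!r}")
    kind, frame, time_token = match.groups()
    # Assumption: "497p5" encodes 497.5, with "p" replacing the decimal point.
    return ExportFolderInfo(kind, int(frame), float(time_token.replace("p", ".")))


info = parse_export_folder("splitxyz_f995_t497p5fs")
print(info.first_frame, info.first_time_fs)  # 995 497.5
```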
xyz2pdb¶
Use this only when residue identity matters downstream.
The current UI and CLI support:
- analyzing one sample frame and detecting the element inventory
- defining free atoms and reference molecules directly in the UI
- editing per-bond percentage tolerances and tight/relaxed search windows
- estimating molecule counts before export
- converting frames in the background while reusing the first-frame mapping template
- optional assertion mode for per-molecule geometry checks and reference updates
- browsing and creating reference molecules in the library
See the dedicated xyz2pdb guide for the full interface and workflow.
clusters¶
The cluster workflow supports both UI and CLI usage. Its CLI exposes separate
inspect, preview, and export modes, plus settings for:
- node, linker, and shell rules
- pair cutoffs
- box dimensions
- periodic boundary conditions
- search mode
- save-state frequency
The CLI help text explicitly calls out faster neighbor search modes such as
kdtree and vectorized.
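To make the roles of pair cutoffs, box dimensions, and periodic boundary conditions concrete, here is a minimal sketch of cutoff-based clustering. It is not the clusters implementation: it assumes an orthorhombic box, uses a brute-force pair loop (the kind of search that kdtree or vectorized modes are meant to speed up), and ignores node, linker, and shell rules. The function names and the example elements are illustrative only.

```python
import math
from itertools import combinations


def minimum_image_distance(a, b, box):
    """Distance between two points in an orthorhombic periodic box."""
    total = 0.0
    for xa, xb, length in zip(a, b, box):
        delta = xa - xb
        delta -= length * round(delta / length)  # wrap to the nearest image
        total += delta * delta
    return math.sqrt(total)


def find_clusters(atoms, pair_cutoffs, box):
    """Group atoms into clusters by linking pairs within their cutoff.

    atoms: list of (element, (x, y, z)) tuples.
    pair_cutoffs: dict mapping frozenset({elem_a, elem_b}) to a distance.
    Returns a sorted list of atom-index lists, one per cluster.
    """
    parent = list(range(len(atoms)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Brute-force O(N^2) pair loop over all atom pairs.
    for i, j in combinations(range(len(atoms)), 2):
        cutoff = pair_cutoffs.get(frozenset((atoms[i][0], atoms[j][0])))
        if cutoff is None:
            continue  # no rule for this element pair
        if minimum_image_distance(atoms[i][1], atoms[j][1], box) <= cutoff:
            parent[find(i)] = find(j)

    clusters = {}
    for index in range(len(atoms)):
        clusters.setdefault(find(index), []).append(index)
    return sorted(clusters.values())


atoms = [("Pb", (0.5, 5.0, 5.0)), ("I", (9.5, 5.0, 5.0)), ("I", (5.0, 5.0, 5.0))]
cutoffs = {frozenset(("Pb", "I")): 3.5}
print(find_clusters(atoms, cutoffs, box=(10.0, 10.0, 10.0)))  # [[0, 1], [2]]
```

In the example, the first two atoms are 9.0 apart in raw coordinates but only 1.0 apart across the periodic boundary, so they end up in the same cluster.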
clusterdynamics¶
This application consumes the extracted XYZ or PDB frames from mdtrajectory
and applies the same cluster definitions and pair-cutoff rules used by
clusters, but bins the results over time instead of writing one
stoichiometry-folder export.
Key outputs:
- time-binned cluster-distribution heatmaps
- optional CP2K .ener overlays aligned to the same time axis
- a sortable lifetime table by stoichiometry label
- saved JSON/CSV datasets that can be reopened later for plotting
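The time-binning idea behind the heatmaps can be sketched as follows. This is an illustration under stated assumptions, not the clusterdynamics code: it assumes each frame has already been reduced to a time in femtoseconds plus a list of stoichiometry labels, and `bin_cluster_counts` and the labels `A1B2` and `A2B4` are hypothetical.

```python
from collections import Counter, defaultdict


def bin_cluster_counts(frames, bin_width_fs):
    """Count stoichiometry labels per time bin.

    frames: iterable of (time_fs, [labels found in that frame]).
    Returns {bin_index: {label: count}}, ordered by bin index.
    """
    bins = defaultdict(Counter)
    for time_fs, labels in frames:
        bin_index = int(time_fs // bin_width_fs)
        bins[bin_index].update(labels)
    return {index: dict(counts) for index, counts in sorted(bins.items())}


frames = [
    (0.0, ["A1B2", "A1B2"]),
    (0.5, ["A1B2", "A2B4"]),
    (1.0, ["A2B4"]),
]
print(bin_cluster_counts(frames, bin_width_fs=1.0))
# {0: {'A1B2': 3, 'A2B4': 1}, 1: {'A2B4': 1}}
```

Each bin index corresponds to one column of a heatmap and each label to one row.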
bondanalysis¶
Bond analysis is downstream of cluster extraction. Use it to derive bond-pair and angle-triplet distributions from the stoichiometry folders produced by the cluster workflow.
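The two quantities involved are simple to state. The sketch below computes one bond length and one angle from Cartesian coordinates; collecting these values over every matching pair or triplet in a stoichiometry folder gives the distributions. It assumes coordinates within a cluster file are already unwrapped (no periodic images), and the function names are illustrative rather than taken from bondanalysis.

```python
import math


def bond_length(a, b):
    """Euclidean distance between two atoms."""
    return math.dist(a, b)


def bond_angle_deg(a, center, c):
    """Angle a-center-c in degrees, with the vertex at `center`."""
    v1 = [p - q for p, q in zip(a, center)]
    v2 = [p - q for p, q in zip(c, center)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


print(bond_length((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))  # 5.0
print(round(bond_angle_deg((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)), 6))  # 90.0
```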
What SAXS expects from this stage¶
The SAXS workflow expects a usable cluster folder with representative structure files and enough component identity to build prior weights and scattering components.
If you are unsure whether your cluster folder is ready, start in Project Setup and confirm that the project can discover the expected clusters before moving on.