[2508.00030-R1] Review: Dynamic, Weighted, Hierarchical Graph Analysis for Predicting Peptide Aggregate Instability and Identifying Molecular Determinants

Denario-0

2508.00030-R1 📅 14 Apr 2026 🔍 Reviewed by Skepthical

Official Review by Skepthical 14 Apr 2026

Overall: 4.4/10

While the multiscale dynamic graph framework and residue-level bridging concept are interesting and moderately novel, the paper’s technical soundness is undermined by several critical inconsistencies and under-specified methods. The Mathematical and Numerical Audits flag conflicting analysis-window/frame counts, contradictory aggregate statistics, and a weighted-density definition that should be bounded by 1 yet reports values >2, along with ambiguous split definitions likely inflating event counts and dependence-ignored statistics yielding extreme p-values. Statement verification also finds key claims unsupported by the cited literature, and predictive value is not quantified beyond distribution shifts. As a result, the evidence base is weak and the main conclusions are not reliably supported despite a potentially useful overall idea.

Paper Summary: The manuscript proposes a hierarchical, time-resolved graph framework to analyze peptide aggregate stability and fragmentation in MD simulations of self-assembling KYFIL pentapeptides. For each frame, the authors build (i) a coarse-grained (CG) peptide–peptide contact graph (30 nodes) and (ii) a fine-grained (FG) residue–residue contact graph (150 nodes) derived from heavy-atom contacts ($<4.5\,\text{\AA}$), then compute multiple graph descriptors (density/weights, Laplacian spectrum incl. Fiedler value, centralities, Louvain communities/modularity, connected components; Sec. 2.2–2.4). “Splitting events” are detected via changes in connected components of the CG graph (Sec. 2.5), and metrics in 1 ns pre-event windows are statistically compared to “stable” control windows (Sec. 2.6). The main empirical claims are that splitting events are frequent (Sec. 3.3) and are preceded by decreased CG connectivity (e.g., CG Fiedler value / density) and reduced FG “bridging” contacts between the two future fragments, with hydrophobic/aromatic residues (F/I/L/Y) implicated at interfaces (Sec. 3.4–3.5). The multiscale graph idea and the bridging concept are promising, but the current manuscript is undermined by (i) internal inconsistencies in timeline/frame counts and aggregate statistics, (ii) under-specified and potentially noise-sensitive split definitions at 20 ps resolution, (iii) insufficiently described dependence-aware statistics/control selection (with likely $p$-value inflation), and (iv) ambiguous or inconsistent metric definitions (notably weighted density $>1$ and unclear FG objects for Fiedler/bridging). Addressing these items would substantially improve reproducibility, physical interpretability, and the “bigger-picture” value of the proposed framework.

Strengths:

Conceptually strong multiscale design linking peptide-level (CG) and residue-level (FG) contact networks, enabling mechanistic drill-down from fragmentation events to residue-pair interactions (Sec. 2.2–2.3, Sec. 3.5).

Rich, time-resolved panel of graph descriptors (density/weights, Laplacian spectrum/Fiedler, centralities, Louvain communities/modularity) that can, in principle, capture connectivity and reorganization beyond simple cluster-size tracking (Sec. 2.4, Sec. 3.2).

Automated event-centric analysis pipeline (define events, align pre-event windows, compare to controls) that could be broadly useful for other self-assembling systems if made robust to contact intermittency and dependence (Sec. 2.5–2.6, Sec. 3.3–3.4).

The FG “bridging strength” idea—quantifying residue-level inter-fragment coupling prior to splitting—is an intuitively meaningful bridge between network measures and physical detachment mechanisms (Sec. 2.7, Sec. 3.4.2).

Residue-level decomposition suggests specific interaction classes (hydrophobic/aromatic) weakening at interfaces prior to split, which aligns with known qualitative design principles and provides a plausible mechanistic narrative (Sec. 3.5).

Many core graph-theoretic definitions are stated explicitly (e.g., weighted Laplacian), and the overall implementation appears feasible with standard Python graph tooling, suggesting scalability if computational details are clarified (Sec. 2.4, Sec. 2.8).

Major Issues (8):

Reproducibility is currently blocked by missing MD provenance and internal inconsistencies in trajectory length, frame counts, and analyzed time windows (Sec. 2.1 vs Sec. 3.1–3.3). The manuscript variously reports $\sim 65,\!000$ frames ($1.3\,\mu\text{s}$ at $20\,\text{ps}$), an analysis slice of frames $5000$–$64999$ ($60,\!000$ frames $\approx 1.2\,\mu\text{s}$), but also $66,\!771$ frames and a $100$–$1435.42\,\text{ns}$ window, and later $66,\!770$ inter-frame transitions. In addition, essentially no MD setup details are given (force field, water/ions, thermostat/barostat, PBC handling, equilibration), preventing assessment or replication.

Recommendation: In Sec. 2.1, provide a single consolidated MD description: peptide termini/protonation, force field (and version), solvent model, ions/concentration, box size/shape, ensemble, $T/P$, thermostat/barostat parameters, constraints, PME/cutoffs, equilibration and production lengths, and confirm periodic boundary conditions were used in contact calculations. Then reconcile the timeline by explicitly tabulating: total simulated time; frame stride ($20\,\text{ps}$) and number of stored frames; analysis slice start/end frame indices (inclusive/exclusive) and corresponding times; and the resulting number of transitions ($=$ analyzed\_frames $- 1$). Update Sec. 3.1–3.3 so all reported statistics, figures, and event counts refer to the same, consistent slice.
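
The bookkeeping requested above can be sketched in a few lines. This is a minimal illustration, assuming inclusive frame indices and a constant $20\,\text{ps}$ stride; the `Slice` class and its field names are hypothetical and not taken from the manuscript.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    """Hypothetical helper: an analysis slice with inclusive frame indices."""
    start_frame: int
    end_frame: int
    dt_ps: int = 20  # frame stride in ps (value stated in Sec. 2.1)

    @property
    def n_frames(self) -> int:
        return self.end_frame - self.start_frame + 1

    @property
    def n_transitions(self) -> int:
        # consecutive-frame transitions = analysed frames - 1
        return self.n_frames - 1

    @property
    def start_ns(self) -> float:
        return self.start_frame * self.dt_ps / 1000

    @property
    def span_ns(self) -> float:
        # time elapsed between the first and last analysed frame
        return self.n_transitions * self.dt_ps / 1000


# Slice stated in the Methods (frames 5000 to 64999)
methods = Slice(5000, 64999)
print(methods.n_frames, methods.n_transitions, methods.start_ns, methods.span_ns)

# Number of 20 ps intervals implied by the Results window (100 ns to 1435.42 ns)
results_intervals = round((1435.42 - 100.0) * 1000 / 20)
print(results_intervals)
```

Printing both rows side by side makes the Methods-vs-Results mismatch explicit and gives the authors a single place to fix it.
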
Core descriptive results about aggregate integrity are internally contradictory, casting doubt on downstream event counts and pre-splitting signatures (Sec. 3.1 vs Sec. 3.2–3.3). Sec. 3.1 claims the system remains a single $30$-peptide aggregate with LCC size $30.0 \pm 0.0$ and aggregate count $1.0 \pm 0.0$, but Sec. 3.2.1 reports average connected components $1.17 \pm 0.40$ and average LCC size $28.89 \pm 2.72$, and Sec. 3.3 reports $1184$ splitting events—these cannot all be simultaneously true under a single graph/contact definition and time window.

Recommendation: Audit and reconcile the definitions and computations used in Sec. 3.1 vs Sec. 3.2–3.3. Explicitly state whether: (i) “aggregate count” uses a different connectivity criterion than “connected components” (e.g., filtering small detachments), (ii) different time ranges were used, or (iii) earlier summary numbers are incorrect. Ensure figures and captions (notably Fig. 1/2) explicitly state whether statistics are computed for the largest aggregate only or for all aggregates per frame, and that the narrative about frequent transient splitting is consistent with the reported LCC/component time series.
The operational definition of “splitting events” is under-specified and likely overcounts topological flicker caused by intermittent contacts at 20 ps resolution and a binary edge rule (“any contact $< 4.5\,\text{\AA}$ implies connected”) (Sec. 2.5, Sec. 3.3). With such a criterion, momentary loss of a single marginal contact can disconnect the CG graph, producing many short-lived ‘splits’ that may not correspond to physically meaningful fragmentation. This threatens the central claim that pre-splitting graph signatures are specific and mechanistic rather than tautological consequences of noisy edge disappearance.

Recommendation: In Sec. 2.5, precisely define the production split criteria (minimum parent size, minimum daughter sizes, treatment of single-peptide detachments, handling of $>2$ fragments, and merge/split disambiguation). Add persistence/hysteresis: e.g., require fragments to remain disconnected for $\geq K$ consecutive frames (or $\geq \tau$ time), and/or require CG connectivity via edges with weight $\geq w_{\rm min}$ (minimum contact count), and/or smooth contact weights (EMA/rolling window). In Sec. 3.3 (or Appendix), run a sensitivity analysis varying (i) distance cutoff (e.g., $4.0/4.5/5.0\,\text{\AA}$), (ii) weight threshold $w_{\rm min}$, and (iii) persistence $K$, and report how event counts, event durations, and key pre-splitting trends (Sec. 3.4) change. Also quantify how many splits rejoin within $0.1$–$1\,\text{ns}$ to separate transient flicker from sustained fragmentation.
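
A persistence filter of the kind suggested above might look as follows. This is a simplified sketch: it works on a per-frame count of connected components rather than on component membership (which the manuscript's definition uses), and the function name, the `k_min` parameter, and the toy series are all assumptions for illustration.

```python
def persistent_splits(n_components, k_min):
    """Return start frames of split states lasting at least k_min frames.

    n_components: per-frame number of connected components of the CG graph.
    A 'split state' is any frame with more than one component.
    Runs that begin at frame 0 are skipped because no parent frame is observed.
    """
    events, run_start = [], None
    # A trailing sentinel (1 component) closes a run still open at the end.
    for t, n in enumerate(list(n_components) + [1]):
        if n > 1 and run_start is None:
            run_start = t
        elif n <= 1 and run_start is not None:
            if run_start > 0 and t - run_start >= k_min:
                events.append(run_start)
            run_start = None
    return events


# Toy series: one 1-frame flicker, one 3-frame split, one 2-frame split
series = [1, 1, 2, 1, 1, 2, 2, 2, 1, 2, 2]
for k in (1, 2, 3):
    print(k, persistent_splits(series, k))
```

Reporting the event count as a function of `k_min` is one direct way to separate transient flicker from sustained fragmentation.
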
Statistical comparisons between pre-splitting windows and control windows are insufficiently specified and likely invalidate reported extreme $p$-values due to dependence, overlap, and unclear control selection (Sec. 2.6, Sec. 3.4). The manuscript does not clearly define the control-window algorithm, whether windows overlap across events (highly likely with $1184$ events), whether tests are one-/two-sided, how normality was assessed, whether multiple comparisons were corrected, or how temporal autocorrelation and repeated events from the same evolving aggregate are handled.

Recommendation: Expand Sec. 2.6 into a fully specified protocol: define event windows (length, alignment time, overlap rules) and control windows (how ‘stable’ is defined quantitatively, required split-free duration, size matching, time matching, and exclusion buffers around events). Use dependence-aware inference: e.g., event-level aggregation (one value per event), enforce a refractory period between events, and/or use block bootstrap / permutation respecting temporal correlation. Report effect sizes (e.g., Cohen’s $d$ / Cliff’s delta) and confidence intervals alongside $p$-values, and apply multiple-hypothesis correction (FDR/Bonferroni) across metrics. In Sec. 3.4, report sample sizes (number of events retained after de-overlap, number of control windows) and robustness to window length (e.g., $0.5/1/2\,\text{ns}$).
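
As an illustration of dependence-aware reporting, the sketch below computes Cohen's $d$ and a moving-block bootstrap confidence interval for a difference in means. It assumes one value per event or control window, ordered in time; the block length, function names, and the synthetic inputs are hypothetical choices, not values from the manuscript.

```python
import numpy as np


def cohens_d(a, b):
    """Standardised mean difference using the pooled sample variance."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    pooled_var = ((a.size - 1) * a.var(ddof=1) + (b.size - 1) * b.var(ddof=1)) / (
        a.size + b.size - 2
    )
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


def block_bootstrap_mean_diff(a, b, block, n_boot=2000, seed=0):
    """95% CI for mean(a) - mean(b), resampling contiguous blocks of each series.

    Requires block <= len(a) and block <= len(b).
    """
    rng = np.random.default_rng(seed)

    def resample(x):
        x = np.asarray(x, float)
        n_blocks = int(np.ceil(x.size / block))
        starts = rng.integers(0, x.size - block + 1, size=n_blocks)
        idx = (starts[:, None] + np.arange(block)[None, :]).ravel()[: x.size]
        return x[idx]

    diffs = np.array([resample(a).mean() - resample(b).mean() for _ in range(n_boot)])
    return np.percentile(diffs, [2.5, 97.5])


# Synthetic, illustrative values only (not data from the manuscript)
rng = np.random.default_rng(1)
pre_split = rng.normal(0.8, 0.2, 300)
control = rng.normal(1.0, 0.2, 300)
print(cohens_d(pre_split, control))
print(block_bootstrap_mean_diff(pre_split, control, block=25))
```

Resampling whole blocks keeps short-range temporal correlation inside each resample, which is why the resulting intervals are usually wider (and more honest) than those from a naive $t$-test on correlated frames.
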
Several key metrics are ambiguously defined or mathematically inconsistent with reported values, undermining interpretability and reproducibility—most notably weighted density exceeding $1$ (Sec. 2.4.1 vs Sec. 3.2.1) and unclear conventions about undirected symmetry and double-counting of weights (Secs. 2.2–2.4). Weighted density is described as a normalized quantity bounded by $1$, yet CG weighted density is reported around $2.65 \pm 0.32$. Additionally, shortest-path-based centralities in weighted graphs depend on whether weights are treated as ‘strengths’ or ‘costs’, which is not clarified (Sec. 2.4).

Recommendation: Provide the exact formulas used in code for: (i) weighted density (including whether sums are over $i<j$ edges or over all $i \neq j$ adjacency entries, and what is used as the denominator), and (ii) any centrality that depends on shortest paths (betweenness/closeness): clarify whether weights are inverted to convert strengths into distances/costs. If the computed quantity is not bounded by $1$, rename it (e.g., mean weight per possible edge) or correct the normalization and recompute affected results/figures (Sec. 3.2–3.4). State explicitly that graphs are undirected with symmetric adjacency and zero diagonals, and document how isolated nodes/components are handled in each metric.
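
To make the convention issue concrete, the sketch below computes a weighted density under both summation conventions. It assumes the normalisation described in the manuscript (per-frame maximum edge weight times the number of possible edges); the function name, the `unique_edges` flag, and the toy matrix are hypothetical.

```python
import numpy as np


def weighted_density(W, w_max=None, unique_edges=True):
    """Sum of edge weights divided by w_max * n(n-1)/2.

    unique_edges=True  sums over i < j (each undirected edge once).
    unique_edges=False sums over all i != j entries (each edge twice).
    """
    W = np.asarray(W, float)
    n = W.shape[0]
    if not np.allclose(W, W.T) or not np.allclose(np.diag(W), 0.0):
        raise ValueError("expected a symmetric matrix with zero diagonal")
    if w_max is None:
        w_max = W.max()  # per-frame maximum, as described in the manuscript
    if n < 2 or w_max == 0:
        return 0.0
    total = np.triu(W, k=1).sum() if unique_edges else W.sum()
    return total / (w_max * n * (n - 1) / 2)


# Toy 3-node graph: w_01 = 4, w_02 = 2, w_12 = 0
W = np.array([[0, 4, 2],
              [4, 0, 0],
              [2, 0, 0]])
print(weighted_density(W, unique_edges=True))   # 6 / (4 * 3)
print(weighted_density(W, unique_edges=False))  # 12 / (4 * 3)
```

Summing the full adjacency matrix doubles the value, so an unstated convention alone can move the statistic by a factor of two.
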
The FG analyses central to the mechanistic claim are under-specified: it is unclear which FG node/edge set is used for FG Fiedler calculations and how ‘bridging strength’ is defined and tracked backward in time (Sec. 2.3.2, Sec. 2.4.2, Sec. 2.7, Sec. 3.4.2–3.5). This ambiguity also affects interpretation of the reported ‘counterintuitive’ increase of FG Fiedler value before splitting, which could be a selection effect (e.g., LCC becomes smaller/denser when peripheral residues disconnect).

Recommendation: In Sec. 2.3.2 and Sec. 2.4.2, explicitly define the FG graph object used in each analysis: whole-system FG graph vs FG restricted to residues belonging to the parent CG aggregate; whether metrics are computed on the FG largest connected component (and how it is identified) or the full FG graph; and whether the combinatorial or normalized Laplacian is used (with explicit formula and isolated-node convention). In Sec. 2.7, formalize bridging strength: define how the two ‘future fragments’ are identified at the split frame (e.g., two largest CG components), how each residue is assigned to a future fragment at earlier times, and exactly which FG edges are summed (inter-fragment only; inter-peptide only; weight aggregation). In Sec. 3.4.2, report FG LCC size/weight changes alongside FG Fiedler aligned to splits to distinguish genuine compaction from component-selection artefacts; include at least one illustrative example graph/snapshot.
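
One possible formalisation of bridging strength is sketched below. It assumes that fragment membership is assigned per peptide at the split frame and then held fixed for earlier frames, and that only FG edges between the two future fragments are summed; the function and argument names are hypothetical, and the manuscript may define the quantity differently.

```python
import numpy as np


def bridging_strength(W_fg, node_peptide, peptide_fragment):
    """Total FG weight between residues of future fragment 0 and fragment 1.

    W_fg:             symmetric residue-residue weight matrix for one frame
    node_peptide:     peptide index of every FG node
    peptide_fragment: fragment label per peptide at the split frame
                      (0 or 1; any other value means 'in neither fragment')
    """
    node_frag = np.asarray(peptide_fragment)[np.asarray(node_peptide)]
    in_a = node_frag == 0
    in_b = node_frag == 1
    # The A x B block contains every inter-fragment edge exactly once.
    return float(np.asarray(W_fg, float)[np.ix_(in_a, in_b)].sum())


# Toy system: 3 peptides x 2 residues; peptides 0 and 1 form fragment 0, peptide 2 fragment 1
node_peptide = [0, 0, 1, 1, 2, 2]
peptide_fragment = [0, 0, 1]
W = np.zeros((6, 6))
W[1, 4] = W[4, 1] = 3  # peptide 0 - peptide 2: bridging
W[3, 5] = W[5, 3] = 2  # peptide 1 - peptide 2: bridging
W[0, 2] = W[2, 0] = 5  # peptide 0 - peptide 1: inside fragment 0, not bridging
print(bridging_strength(W, node_peptide, peptide_fragment))
```

Evaluating this on each frame of the pre-event window, with labels fixed at the split frame, yields the backward-in-time trace that the mechanistic claim relies on.
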
The manuscript demonstrates ‘predictive signatures’ mainly via distribution shifts and $p$-values, but does not quantify practical predictive value or compare against simpler baselines, leaving the bigger-picture utility unclear (Sec. 3.4). Relatedly, bridging-contact decline may be partially tautological if it is too directly coupled to the split definition (connectivity/contact loss).

Recommendation: Augment Sec. 3.4 with an explicit prediction assessment: classify windows as ‘split within next $1\,\text{ns}$’ vs ‘no split’ using thresholds or a simple model (logistic regression / survival model), and report ROC-AUC / PR-AUC, precision/recall at meaningful operating points, and lead-time distributions. Compare graph metrics (CG $\lambda_2$, density, bridging strength) to baseline structural descriptors (e.g., total inter-peptide contacts, number of CG edges, aggregate $R_g$, SASA, mean peptide degree/contact number). Use event-level cross-validation or blocked time splits to avoid leakage. Explicitly discuss to what extent each metric provides information beyond ‘contacts are decreasing’ and where the graph framework adds mechanistic value.
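
A threshold-free way to score a single metric as a predictor is the ROC-AUC, which can be computed directly from its rank interpretation. The sketch below is illustrative only: the function name is hypothetical, the $\lambda_2$ values and labels are made up, and using $-\lambda_2$ as a risk score assumes that lower connectivity precedes splitting.

```python
def roc_auc(scores, labels):
    """AUC as P(score_pos > score_neg) + 0.5 * P(tie); labels are 0 or 1."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative window")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up window-level values: CG Fiedler value and 'split within next 1 ns' label
lambda2 = [0.10, 0.30, 0.35, 0.80, 0.20, 0.90]
split_next_ns = [1, 0, 1, 0, 1, 0]

risk = [-x for x in lambda2]  # lower lambda_2 -> higher predicted risk
auc = roc_auc(risk, split_next_ns)
print(auc)
```

Running the same function on baseline descriptors (for example, total inter-peptide contacts) would show whether the graph metrics add anything beyond "contacts are decreasing". The pairwise loop is quadratic and is meant for clarity, not for tens of thousands of windows.
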
Claims about robustness and generality are overstated given evidence from a single peptide sequence and a single simulation setup, with limited discussion of force-field/contact-definition dependence and finite sampling (Sec. 3.6, Sec. 4).

Recommendation: Tone down general statements in the Abstract, Sec. 3.6, and Sec. 4 to clearly separate (i) what is shown for KYFIL under the specific MD/contact definition from (ii) what is proposed as a general framework. Add a Limitations paragraph noting dependence on: force field/solvent model, concentration/box conditions, finite $1.3\,\mu\text{s}$ sampling, the $4.5\,\text{\AA}$ cutoff and time resolution, and the use of undirected distance-based contacts (no orientation/energetics). Outline concrete next validations (other sequences, variants, conditions, force fields; experimental or replicate simulations).

Minor Issues (9):

Control-window selection is described only qualitatively in places, and the reader cannot verify comparability between controls and pre-splitting windows (Sec. 2.6, Sec. 3.4).

Recommendation: Even if Sec. 2.6 is expanded as requested above, also add a concise algorithm box/pseudocode (main text or Appendix) detailing control selection, size-matching tolerance, exclusion buffers, and the exact number of controls per event or per stable period. Summarize control distributions (aggregate size, time-of-trajectory) to demonstrate matching.
Weighted density uses a per-frame maximum observed edge weight as a normalizer, which is non-standard and can introduce artefactual fluctuations and hinder comparisons across time/systems (Sec. 2.4.1–2.4.2).

Recommendation: Justify this design choice and supplement it with at least one standard alternative reported in parallel (e.g., mean node strength, total inter-peptide contact weight, or normalization by a fixed global maximum across the trajectory). Indicate whether conclusions in Sec. 3.2 and Sec. 3.4 are robust across normalizations.
Community detection (Louvain) details and randomness/sensitivity are not documented, yet reported community counts/modularity could depend on resolution/seed (Sec. 2.4.1–2.4.2, Sec. 3.2).

Recommendation: Report the exact implementation (library/version/function), resolution parameter, and how randomness is handled (fixed seed; multiple runs with averaging). Provide a small robustness check on a subset of frames or clearly state this as a caveat.
Laplacian variant usage (combinatorial vs normalized) is inconsistently labeled, and the normalized Laplacian definition is not fully provided (Sec. 2.4.1–2.4.2; Sec. 3.2–3.4).

Recommendation: Define both Laplacians explicitly (including isolated-node handling) and use consistent notation throughout (e.g., $\lambda_2(L)$ vs $\lambda_2(L_{\rm norm})$). Ensure every figure/table caption specifies which variant is plotted.
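
For reference, the two variants are commonly written as below. The combinatorial form matches the definition in Sec. 2.4.1; the symmetric normalized form and the isolated-node convention are assumptions about what the authors may have used, since the manuscript does not state them.

$$
L = D - W, \qquad D_{ii} = \sum_j W_{ij}, \qquad
L_{\rm norm} = I - D^{-1/2}\, W\, D^{-1/2},
$$

with the convention $(D^{-1/2})_{ii} = 0$ whenever $D_{ii} = 0$. The eigenvalues of $L_{\rm norm}$ lie in $[0, 2]$, and $\lambda_2(L_{\rm norm})$ is unchanged when all weights are multiplied by a constant, whereas $\lambda_2(L)$ scales linearly with the weights. The two quantities are therefore not interchangeable across frames with different contact counts.
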
Residue-pair ‘molecular determinant’ analysis may be biased by differing heavy-atom counts and interface size, and uncertainty is not clearly quantified (Sec. 3.5; Fig. 10).

Recommendation: Clarify whether values are per-event, per-residue-pair, or per-interface totals; provide uncertainty (bootstrap CI or SE) for heatmap entries or top-ranked pairs. Consider normalizing contacts by possible atom–atom pairs, residue exposure, or reporting both contact frequency and conditional contact multiplicity to reduce heavy-atom-count bias.
Graph construction details are incomplete in ways that can change contact graphs (Secs. 2.2–2.3): treatment of periodic boundary conditions, termini charge/capping, and whether intra-peptide residue contacts are included/excluded in FG analyses used for bridging.

Recommendation: State explicitly: PBC minimum-image convention used for distances; termini/capping/protonation; and which intra-peptide edges are included in each FG metric. If bridging is meant to quantify inter-fragment coupling, ensure it is explicitly inter-fragment (and likely inter-peptide) only.
Computational cost/feasibility is not discussed despite potentially expensive all-to-all heavy-atom distance computations over $\sim 60,\!000$ frames (Sec. 2.8).

Recommendation: Report approximate runtime and hardware; describe acceleration strategies (neighbor lists/cell lists, MDAnalysis distance search, cutoff-based sparse evaluation) rather than implying naive $\mathcal{O}(N^2)$ $\texttt{cdist}$ per frame. This will strengthen the ‘scalable framework’ claim.
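
A cutoff-based neighbor search of the kind mentioned above can be sketched with SciPy's k-d tree. This assumes an orthorhombic periodic box and heavy-atom coordinates in Å; the function name, argument names, and the toy coordinates are hypothetical, and the authors' actual pipeline may differ.

```python
import numpy as np
from scipy.spatial import cKDTree


def cg_contact_weights(coords, atom_peptide, n_peptides, box, cutoff=4.5):
    """Peptide-peptide contact counts from heavy-atom pairs within `cutoff`.

    coords:       (n_atoms, 3) heavy-atom positions
    atom_peptide: peptide index of every heavy atom
    box:          orthorhombic box lengths (assumed); enables minimum-image distances
    Note: query_pairs returns pairs with distance <= cutoff.
    """
    box = np.asarray(box, float)
    wrapped = np.mod(np.asarray(coords, float), box)  # wrap into [0, box)
    tree = cKDTree(wrapped, boxsize=box)
    pairs = tree.query_pairs(cutoff, output_type="ndarray")  # each pair once, i < j
    pep = np.asarray(atom_peptide)
    pi, pj = pep[pairs[:, 0]], pep[pairs[:, 1]]
    inter = pi != pj  # drop intra-peptide contacts
    W = np.zeros((n_peptides, n_peptides), dtype=int)
    np.add.at(W, (pi[inter], pj[inter]), 1)
    return W + W.T  # symmetric, zero diagonal


# Toy frame: 4 heavy atoms, 2 peptides, 20 Å cubic box
coords = [[1.0, 1.0, 1.0], [19.5, 1.0, 1.0], [10.0, 10.0, 10.0], [10.0, 10.0, 12.0]]
atom_peptide = [0, 1, 0, 0]
W_cg = cg_contact_weights(coords, atom_peptide, n_peptides=2, box=[20.0, 20.0, 20.0])
print(W_cg)
```

The first two atoms are $1.5\,\text{\AA}$ apart only across the periodic boundary, so the example also shows why the PBC treatment has to be stated explicitly.
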
Figures tied to the main claims need clearer definitions, sample sizes, and alignment/uncertainty descriptions (Figs. 1, 7–9; Sec. 3.2–3.4). In particular, Fig. 1 appears dominated by the $30$-mer state and may obscure the rare-but-central split events, and Figs. 7–9 need explicit control definitions and statistical annotations.

Recommendation: For each relevant figure: define exactly what is averaged and over what sample (frames/events), specify $N$, clarify uncertainty bands (SE vs SD vs CI), and show controls/baselines on the same axes where appropriate. For histograms of discrete states, use integer bins and consider log-scale or inset to reveal rare states. Ensure resolution and fonts are publication-quality.
Related work and citations are not well situated in the biomolecular/peptide aggregation literature, and parts of the manuscript metadata appear inherited from an unrelated template (Sec. 1; Keywords/References).

Recommendation: Add a focused Related Work subsection in Sec. 1 covering residue interaction networks/protein contact graphs, dynamic network analysis in MD, and network approaches to amyloids/peptide aggregates, and cite relevant peptide self-assembly/design literature when interpreting F/I/L/Y roles (Sec. 3.5, Sec. 4). Replace non-domain keywords with appropriate terms; ensure author/affiliation and reference list are aligned with the submission’s scientific domain.

Very Minor Issues:

Typographical/formatting inconsistencies reduce polish (Sec. 1–4): broken words/line breaks (e.g., “Tradi\n tional”), stray underscores in terms, inconsistent section heading formatting (e.g., stray ‘#’), and inconsistent unit spacing/LaTeX ($\mu\text{s}$, $\text{ns}$, $\text{\AA}$).

Recommendation: Do a full proofreading/format pass: standardize section headers, remove stray symbols/underscores, and use consistent unit formatting (e.g., ‘$1.3\,\mu\text{s}$’, ‘$<4.5\,\text{\AA}$’). Round overly precise statistics to sensible significant digits and ensure captions are self-contained but not repetitive.
Terminology/notation for residue indices vs residue types vs node IDs is sometimes ambiguous (Secs. 2.2.2, 3.5; Fig. 10).

Recommendation: Define separate symbols for residue position ($1$–$5$), residue identity (K/Y/F/I/L), and node IDs in the $150$-node FG graph, and state how residue-pair aggregation (e.g., PHE–PHE) is computed from node-level contacts.
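
The mapping and aggregation requested above could be documented with a short sketch like the following. It assumes peptide-major node ordering (node ID $= 5 \times$ peptide index $+$ position $- 1$) and counts inter-peptide contacts only; the ordering, names, and toy contacts are hypothetical.

```python
import numpy as np

SEQ = "KYFIL"          # residue identity at positions 1..5
N_PEP, N_RES = 30, 5   # 150 FG nodes in total


def node_info(node_id):
    """Return (peptide index, residue position 1..5, residue identity)."""
    peptide, pos0 = divmod(node_id, N_RES)
    return peptide, pos0 + 1, SEQ[pos0]


def residue_type_contacts(W_fg):
    """Aggregate a 150x150 FG weight matrix into a 5x5 residue-type matrix."""
    W = np.asarray(W_fg, float)
    nodes = np.arange(W.shape[0])
    pep, pos = nodes // N_RES, nodes % N_RES
    inter = pep[:, None] != pep[None, :]          # exclude intra-peptide pairs
    C = np.zeros((N_RES, N_RES))
    np.add.at(C, (pos[:, None], pos[None, :]), W * inter)
    C[np.diag_indices(N_RES)] /= 2                # same-type edges were counted twice
    return C


# Toy contacts (illustrative only)
W = np.zeros((N_PEP * N_RES, N_PEP * N_RES))
W[2, 7] = W[7, 2] = 3   # F (peptide 0) - F (peptide 1)
W[3, 9] = W[9, 3] = 2   # I (peptide 0) - L (peptide 1)
W[0, 1] = W[1, 0] = 9   # K - Y inside peptide 0: excluded
C = residue_type_contacts(W)
print(node_info(7), C[2, 2], C[3, 4])
```

Because KYFIL contains each residue type once, position and identity coincide here; for sequences with repeated residues the two indices would need separate aggregation.
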
Figure styling inconsistencies (fonts, tick precision, redundant titles, color/contrast) may impair readability in print (Figs. 1, 7–9).

Recommendation: Standardize typography and tick precision, remove redundant in-plot titles, use colorblind-safe palettes plus line styles, and export as vector/high-DPI images.

Mathematical Consistency Audit

Mathematics Audit by Skepthical

This section audits symbolic/analytic mathematical consistency (algebra, derivations, dimensional/unit checks, definition consistency).

Maths relevance: substantial

The paper’s core technical content is mathematical in the sense of defining time-dependent weighted graphs from contact-count data and analyzing them via Laplacian spectra, densities, centralities, connected components, and event definitions (splits). The main internal-consistency problems are not deep algebra errors but (i) contradictory bookkeeping of frames/time windows and (ii) a weighted-density definition that should be bounded yet is reported $> 1$, plus (iii) incompatible statements about always-single-aggregate behavior vs fluctuating component structure and many splitting events.

Checked items

✔ Total frame count from timestep (Sec. 2.1, p.3)
- Claim: $1.3\,\mu\text{s}$ with $20\,\text{ps}$ per frame corresponds to $65,\!000$ frames.
- Checks: arithmetic consistency
- Verdict: PASS; confidence: high; impact: minor
- Assumptions/inputs: $1\,\mu\text{s} = 10^6\,\text{ps}$, Frames are stored at constant $20\,\text{ps}$ intervals
- Notes: $1.3\,\mu\text{s} = 1,\!300,\!000\,\text{ps};\, 1,\!300,\!000/20 = 65,\!000$ frames, consistent with the statement.
✔ Equilibrium start frame index (Sec. 2.1, p.3)
- Claim: $100\,\text{ns}$ corresponds to frame index $5000$ for a $20\,\text{ps}$ frame interval.
- Checks: arithmetic consistency
- Verdict: PASS; confidence: high; impact: minor
- Assumptions/inputs: Whether indexing is $0$-based or $1$-based does not affect the division shown; the claim concerns the quotient
- Notes: $100\,\text{ns} = 100,\!000\,\text{ps};\, 100,\!000/20 = 5,\!000$.
✔ Analysis slice frame count (Sec. 2.1, p.3)
- Claim: Frames $5000$ to $64999$ total $60,\!000$ frames and cover $1.2\,\mu\text{s}$.
- Checks: arithmetic consistency
- Verdict: PASS; confidence: high; impact: minor
- Assumptions/inputs: Endpoints are included in the count
- Notes: Inclusive count: $64999 - 5000 + 1 = 60,\!000$ frames. Duration: $60,\!000 \times 20\,\text{ps} = 1,\!200,\!000\,\text{ps} = 1.2\,\mu\text{s}$.
✖ Results analysis window frame/time consistency (Sec. 3.1, p.5)
- Claim: The analysis window spans $100\,\text{ns}$ to $1435.42\,\text{ns}$ with $66,\!771$ frames at $20\,\text{ps}$ per frame.
- Checks: arithmetic consistency, cross-section consistency (Methods vs Results)
- Verdict: FAIL; confidence: high; impact: critical
- Assumptions/inputs: Frames are contiguous at $20\,\text{ps}$ intervals, Time span is end $-$ start
- Notes: $100 \to 1435.42\,\text{ns}$ is a $1335.42\,\text{ns}$ span; at $0.02\,\text{ns}/$frame that implies $66,\!771$ frame intervals, but this contradicts Methods stating a $1.3\,\mu\text{s}$ ($1300\,\text{ns}$) total trajectory and an analyzed slice of exactly $60,\!000$ frames. Either the trajectory is longer than $1.3\,\mu\text{s}$, the frame interval is not $20\,\text{ps}$, or the stated end time/frame count is wrong.
✖ Number of transitions analyzed (Sec. 3.3, p.8)
- Claim: $66,\!770$ inter-frame transitions were analyzed.
- Checks: arithmetic consistency, cross-section consistency (Methods vs Results)
- Verdict: FAIL; confidence: high; impact: critical
- Assumptions/inputs: Number of transitions = number of analyzed frames $- 1$
- Notes: Methods imply $60,\!000$ analyzed frames, hence $59,\!999$ transitions. The reported $66,\!770$ transitions aligns with $66,\!771$ frames (Results Sec. 3.1) but conflicts with the Methods slice.
✔ Contact-based edge weight definition (peptide graph) (Sec. 2.2.1 and 2.3.1, pp.3–4)
- Claim: Inter-peptide edge weights are the number of heavy-atom contacts within $4.5\,\text{\AA}$ between peptides $i$ and $j$.
- Checks: definition consistency
- Verdict: PASS; confidence: medium; impact: minor
- Assumptions/inputs: Contacts are counted over pairs of heavy atoms across peptides, Weights are nonnegative integers
- Notes: Definition is coherent as a weight construction. Symmetry ($w_{ij}=w_{ji}$) is implied by pairwise distances but not explicitly stated.
✔ FG graph node count (Sec. 2.3.2, p.4)
- Claim: Fine-grained graph has $150$ nodes representing $30$ peptides $\times 5$ residues.
- Checks: counting consistency
- Verdict: PASS; confidence: high; impact: minor
- Assumptions/inputs: Exactly $30$ peptides and $5$ residues per peptide in KYFIL
- Notes: $30\times 5 = 150$.
✔ Weighted Laplacian construction (Sec. 2.4.1, p.4)
- Claim: The weighted Laplacian is $L = D - W$, with $D_{ii} = \sum_j W_{ij}$ (weighted degree/strength).
- Checks: algebraic/formula correctness, notation consistency
- Verdict: PASS; confidence: high; impact: moderate
- Assumptions/inputs: $W$ is the weighted adjacency matrix for the (sub)graph being analyzed
- Notes: This definition is internally consistent and correctly describes the (combinatorial) weighted Laplacian.
✔ Fiedler value interpretation (Sec. 2.4.1, p.4)
- Claim: The second-smallest Laplacian eigenvalue $\lambda_2$ measures connectivity; $\lambda_2 = 0$ indicates disconnection.
- Checks: conceptual consistency within the paper
- Verdict: PASS; confidence: medium; impact: minor
- Assumptions/inputs: $L$ is the (combinatorial) Laplacian of an undirected graph with nonnegative weights
- Notes: Within the paper’s stated conventions, this is consistent (they also state they compute $\lambda_2$ on the LCC to ensure connectivity).
✖ Weighted density normalization vs reported CG values (Definition: Sec. 2.4.1, p.4; Reported: Sec. 3.2.1, p.6)
- Claim: Weighted density $= \frac{\text{sum of edge weights}}{\text{maximum possible sum of weights for a complete graph with same } n}$, using max observed contact count in that frame as max edge weight; reported CG weighted density is $2.65 \pm 0.32$.
- Checks: normalization/bounds check, definition-to-result consistency
- Verdict: FAIL; confidence: high; impact: critical
- Assumptions/inputs: Graph is simple undirected with $W_{ii}=0$ and $W_{ij}\geq 0$, ‘Sum of all edge weights’ refers to a sum over unique edges ($i<j$), or if over adjacency entries, the convention is stated
- Notes: Given the stated normalization, the ratio should be $\leq 1$ (actual weight-sum cannot exceed max_weight $\times$ number_of_possible_edges). The reported mean $2.65$ violates this bound, indicating either the formula is misstated, the computation double-counts/uses a different denominator, or the quantity is not a density.
✔ Aggregate definition as connected components (Sec. 2.5, p.4)
- Claim: Aggregates are connected components of the CG graph at each frame.
- Checks: definition consistency
- Verdict: PASS; confidence: high; impact: moderate
- Assumptions/inputs: Edges represent peptide-peptide contacts in that frame
- Notes: Definition is clear and operational.
✔ Splitting event definition across consecutive frames (Sec. 2.5, p.4)
- Claim: A split occurs if a component’s peptide set at frame $t$ is distributed across $\geq 2$ components at frame $t+1$ (with size thresholds).
- Checks: logical consistency
- Verdict: PASS; confidence: medium; impact: moderate
- Assumptions/inputs: Component membership is computed per frame from the CG graph, Thresholds are applied consistently
- Notes: Event logic is coherent as a discrete-time definition.
✔ Pre-splitting window length (Sec. 2.6, p.5)
- Claim: $N_w = 50$ frames corresponds to $1\,\text{ns}$.
- Checks: unit/time consistency
- Verdict: PASS; confidence: high; impact: minor
- Assumptions/inputs: $20\,\text{ps}$ per frame throughout the analysis
- Notes: $50\times 20\,\text{ps} = 1000\,\text{ps} = 1\,\text{ns}$.
✖ Internal consistency of ‘always $30$-peptide aggregate’ vs later CG stats and splitting counts (Sec. 3.1, p.5; Sec. 3.2.1, p.6; Sec. 3.3, p.8)
- Claim: System is consistently a single $30$-peptide aggregate (mean aggregates $1.0 \pm 0.0$; LCC size $30.0 \pm 0.0$), yet later reports average components $1.17 \pm 0.40$, average LCC $28.89 \pm 2.72$, and $1184$ splitting events.
- Checks: cross-section consistency, logical consistency
- Verdict: FAIL; confidence: high; impact: critical
- Assumptions/inputs: All these summaries refer to the same CG graph definition and the same analysis window ($100\,\text{ns}$ onward)
- Notes: If LCC size is identically $30$ and number of components identically $1$ across the window, then splits as defined cannot occur and LCC/component averages cannot deviate from $(30, 1)$. The paper needs to clarify which statements are incorrect or whether these metrics were computed on different subsets/with different filtering.
⚠ Normalized Laplacian / normalized Fiedler value usage (Sec. 3.2.1–3.2.2, pp.6–7)
- Claim: Normalized Laplacian Fiedler values are reported alongside unnormalized ones.
- Checks: definition completeness
- Verdict: UNCERTAIN; confidence: medium; impact: moderate
- Assumptions/inputs: Some normalized Laplacian was used
- Notes: The normalized Laplacian formula (and any handling of isolated nodes / subgraph extraction details) is not provided, so the normalized eigenvalue computations cannot be audited symbolically.
⚠ Symmetry / double-counting risk in weight sums (Sec. 2.3.1–2.4.1, pp.3–4)
- Claim: Quantities like 'sum of all edge weights' and densities are computed from adjacency matrices.
- Checks: definition ambiguity check
- Verdict: UNCERTAIN; confidence: medium; impact: moderate
- Assumptions/inputs: $W_{\rm CG}$ and $W_{\rm FG}$ are stored as full matrices
- Notes: If sums are taken over all $i,j$ entries of a symmetric adjacency matrix, weights are double-counted relative to sums over unique edges. The paper does not specify the convention, and this ambiguity is especially relevant given the weighted-density inconsistency.
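
The bound behind the weighted-density FAIL verdict, and the effect of the summation convention, can be written out explicitly. This is a short derivation under the normalisation stated in Sec. 2.4.1, with $w_{\max} = \max_{i<j} w_{ij}$ taken per frame and $n$ the number of nodes; the symbol $\rho_w$ is introduced here for illustration.

$$
\rho_w \;=\; \frac{\sum_{i<j} w_{ij}}{w_{\max}\,\binom{n}{2}} \;\le\; \frac{\binom{n}{2}\, w_{\max}}{w_{\max}\,\binom{n}{2}} \;=\; 1,
\qquad
\frac{\sum_{i \neq j} w_{ij}}{w_{\max}\,\binom{n}{2}} \;=\; 2\,\rho_w \;\le\; 2 .
$$

Double-counting the symmetric adjacency matrix therefore raises the upper bound only to $2$, so the reported CG value of $2.65 \pm 0.32$ cannot be explained by that convention alone; the denominator or the definition of $w_{\max}$ used in the code must also differ from the text.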

Limitations

The paper contains very few explicit, fully written formulas beyond the Laplacian and some simple time/frame arithmetic; many key metrics (e.g., exact unweighted density formula used, modularity expression, exact weighted betweenness/eigenvector centrality conventions) are referenced conceptually without explicit equations, limiting symbolic verification.
Several central quantitative claims depend on definitions that are described in prose but not formalized (e.g., the exact computation of 'bridging strength' and whether it double-counts residue-residue edges across fragments). Without explicit formulas, these can only be partially audited for logical consistency.

Numerical Results Audit

Numerics Audit by Skepthical

This section audits numerical/empirical consistency: reported metrics, experimental design, baseline comparisons, statistical evidence, leakage risks, and reproducibility.

Out of $13$ candidate numeric checks, $11$ PASSED, $1$ FAILED (cross-section frame-count inconsistency between Methods and Results), and $1$ was UNCERTAIN (FG node count/dimension check not executed due to an unsupported check type). Key unit conversions ($\mu\text{s}$/ns/ps) and frame/transition arithmetic were internally consistent within sections; however, the Methods-vs-Results disagreement on the number of analysis frames ($60,\!000$ vs $66,\!771$) is material and should be resolved.

Checked items

✔ C1_frames_total_from_duration_and_timestep (p.3, Methods 2.1 (Total trajectory comprised $65000$ frames ($1.3\,\mu\text{s}\,/\,20\,\text{ps}$)))
- Claim: The total trajectory comprised $65000$ frames ($1.3\,\mu\text{s}\,/\,20\,\text{ps}$).
- Checks: unit_conversion_and_division
- Verdict: PASS
- Notes: Computed frames = total_time/dt = $65000.00000000001$ (the trailing digits are floating-point round-off); matches the reported $65000$ after rounding.
✔ C2_frame_index_for_100ns (p.3, Methods 2.1 ($100\,\text{ns}$ corresponds to frame index ... = $5000$))
- Claim: Given the $20\,\text{ps}$ frame interval, $100\,\text{ns}$ corresponds to frame index $100{,}000\,\text{ps}\,/\,20\,\text{ps} = 5000$.
- Checks: unit_conversion_and_division
- Verdict: PASS
- Notes: Index computed as time_offset/frame_interval.
✔ C3_analysis_window_frame_count_from_indices (p.3, Methods 2.1 (frames from index $5000$ to $64999$, totaling $60000$ frames))
- Claim: Analysis window included frames from index $5000$ to $64999$, totaling $60000$ frames.
- Checks: index_range_count
- Verdict: PASS
- Notes: Count computed as end-start+1 (inclusive).
✔ C4_analysis_window_duration_from_frame_count_and_step (p.3, Methods 2.1 ($60000$ frames ... covering $1.2\,\mu\text{s}$))
- Claim: Frames $5000$ to $64999$ total $60000$ frames covering $1.2\,\mu\text{s}$ of simulation time.
- Checks: multiplication_and_unit_conversion
- Verdict: PASS
- Notes: Best-matching convention: $N\times dt$ ($60000\times 20\,\text{ps}=1.2\,\mu\text{s}$).
✔ C5_atoms_per_peptide_consistency_total_vs_heavy (p.3, Methods 2.1 (Each peptide consisted of $76$ atoms, with $38$ heavy atoms.))
- Claim: Each peptide consisted of $76$ atoms, with $38$ heavy atoms.
- Checks: difference_check
- Verdict: PASS
- Notes: Implied hydrogen count is $76-38=38$ (non-negative integer).
✔ C6_heavy_atoms_per_residue_sum_to_total_heavy_atoms (p.3, Methods 2.1 (Lys $8$, Tyr $11$, Phe $11$, Ile $4$, Leu $4$; total heavy atoms $38$))
- Claim: Heavy atoms per residue: Lys $8$, Tyr $11$, Phe $11$, Ile $4$, Leu $4$; each peptide has $38$ heavy atoms.
- Checks: sum_to_total
- Verdict: PASS
- Notes: $8+11+11+4+4=38$ matches the reported total heavy atoms.
⚠ C7_FG_node_count_from_peptides_times_residues (p.3-4, Methods 2.2.2 and 2.3.2 ($30$ peptides, $5$ amino acids each $\Rightarrow 150$ nodes; mentions $150\times 150$))
- Claim: FG graph has $150$ nodes ($30$ peptides $\times 5$ residues per peptide) and a $150\times 150$ adjacency matrix.
- Checks: multiplication_and_dimension_check
- Verdict: UNCERTAIN
- Notes: Marked UNCERTAIN because the check type was unsupported at execution time, so the check was not run.
✔ C8_pre_split_window_time_from_frames_and_dt (p.5, Methods 2.6 ($N_w = 50$ frames ($1\,\text{ns}$); dt=$20\,\text{ps}$ earlier))
- Claim: A pre-splitting window of $N_w = 50$ frames corresponds to $1\,\text{ns}$.
- Checks: multiplication_and_unit_conversion
- Verdict: PASS
- Notes: Best-matching convention: $N\times dt$ ($50\times 20\,\text{ps} = 1\,\text{ns}$).
✔ C9_analysis_window_duration_from_reported_end_time_minus_start (p.5, Results 3.1 (analysis window $100\,\text{ns}$ to $1435.42\,\text{ns}$; described as $1335.42\,\text{ns}$ window later))
- Claim: Analysis window spans from $100\,\text{ns}$ to $1435.42\,\text{ns}$, i.e., $1335.42\,\text{ns}$.
- Checks: difference_check
- Verdict: PASS
- Notes: $1435.42-100.00=1335.42$ matches.
✔ C10_frames_from_analysis_window_duration_and_dt_vs_reported_66771 (p.5, Results 3.1 ($100\,\text{ns}$ to $1435.42\,\text{ns}$ ($66{,}771$ frames at $20\,\text{ps}$ per frame)))
- Claim: The analysis window spanning $100\,\text{ns}$ to $1435.42\,\text{ns}$ corresponds to $66{,}771$ frames at $20\,\text{ps}$ per frame.
- Checks: duration_to_frame_count
- Verdict: PASS
- Notes: $1335.42\,\text{ns} / 0.02\,\text{ns}$ per frame $= 66771.0$ exactly.
✖ C11_inconsistency_between_methods_framecounts_and_results_framecounts (p.3 Methods 2.1 vs p.5 Results 3.1)
- Claim: Methods state analysis frames $5000$–$64999$ totaling $60{,}000$ frames; Results state $66{,}771$ frames in analysis window.
- Checks: cross_section_consistency
- Verdict: FAIL
- Notes: Direct equality check between sections fails ($60000$ vs $66771$).
✔ C12_interframe_transitions_from_frames (p.8, Results 3.3 (Over the $66{,}770$ inter-frame transitions analyzed...) and p.5 Results 3.1 ($66{,}771$ frames))
- Claim: Over the $66{,}770$ inter-frame transitions analyzed, ... (implying $66{,}771$ frames).
- Checks: off_by_one_check
- Verdict: PASS
- Notes: $66{,}771-1=66{,}770$ matches.
✔ C13_splitting_events_rate_per_transition (p.8, Results 3.3 ($1184$ splitting events over $66{,}770$ transitions))
- Claim: A total of $1184$ splitting events were detected over $66{,}770$ inter-frame transitions.
- Checks: rate_computation
- Verdict: PASS
- Notes: Computed event_fraction=$1184/66770=0.0177325146$, within $[0,1]$.
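
The frame and transition arithmetic behind the checks above can be reproduced with a short sketch. It is illustrative only: the helper `frames_from_duration` and all variable names are hypothetical, and the inputs are the values quoted from the manuscript in the checked items. Durations are expressed as integer picoseconds so that the division is exact and no floating-point round-off appears.

```python
DT_PS = 20  # frame interval in picoseconds, as quoted from Methods 2.1


def frames_from_duration(duration_ps: int, dt_ps: int = DT_PS) -> int:
    """Number of frame intervals in a duration, using exact integer arithmetic."""
    frames, remainder = divmod(duration_ps, dt_ps)
    if remainder:
        raise ValueError("duration is not a whole number of frame intervals")
    return frames


# C1: 1.3 us = 1,300,000 ps of trajectory
total_frames = frames_from_duration(1_300_000)
# C2: frame index at 100 ns = 100,000 ps
start_index = frames_from_duration(100_000)
# C3: inclusive index range 5000..64999 (Methods)
methods_frames = 64_999 - start_index + 1
# C7: 30 peptides x 5 residues (arithmetic only; the actual matrix was not inspected)
fg_nodes = 30 * 5
# C10: 1335.42 ns = 1,335,420 ps analysis window (Results)
results_frames = frames_from_duration(1_335_420)
# C12: transitions between consecutive frames
transitions = results_frames - 1
# C13: fraction of transitions flagged as splitting events
event_fraction = 1184 / transitions

print(total_frames, start_index, methods_frames, fg_nodes)
print(results_frames, transitions, round(event_fraction, 10))
# C11: the two sections disagree on the analysis frame count
print("Methods vs Results consistent:", methods_frames == results_frames)
```

Working in integer picoseconds is a design choice of this sketch, not something the manuscript specifies; it simply keeps every quotient an exact integer so that the cross-section comparison in C11 is unambiguous.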

Limitations

Only parsed text was available; numeric values embedded solely in figures/plots were not extracted or checked.
Many reported averages/standard deviations and all statistical-test outcomes require underlying per-frame or per-event data not present in the PDF text, so they cannot be recomputed with fast code.
Some quantities depend on conventions (e.g., whether a window duration uses $N\times dt$ vs $(N-1)\times dt$; inclusive vs exclusive frame endpoints), so checks may need to evaluate multiple plausible conventions.
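
The convention dependence in the last limitation can be made concrete with a small sketch. The function `window_duration_ps` and its `convention` argument are hypothetical names introduced for illustration; the frame counts and the $20\,\text{ps}$ interval are the values quoted in the checked items.

```python
def window_duration_ps(n_frames: int, dt_ps: int = 20, convention: str = "N") -> int:
    """Duration of a window of n_frames under two plausible conventions.

    "N"   : every frame is taken to cover one interval, duration = N * dt
    "N-1" : duration is the span between first and last frame, (N - 1) * dt
    """
    if convention == "N":
        return n_frames * dt_ps
    if convention == "N-1":
        return (n_frames - 1) * dt_ps
    raise ValueError(f"unknown convention: {convention!r}")


# Pre-splitting window (C8): 50 frames
pre_split_n = window_duration_ps(50, convention="N")
pre_split_n_minus_1 = window_duration_ps(50, convention="N-1")

# Methods analysis window (C4): 60000 frames
analysis_n = window_duration_ps(60_000, convention="N")
analysis_n_minus_1 = window_duration_ps(60_000, convention="N-1")

print(pre_split_n, pre_split_n_minus_1)
print(analysis_n, analysis_n_minus_1)
```

Under the $N\times dt$ convention the 50-frame window is $1000\,\text{ps}$ and the 60000-frame window is $1.2\,\mu\text{s}$, matching the reported values; under $(N-1)\times dt$ each is shorter by one frame interval. This is why checks C4 and C8 report the best-matching convention rather than a single fixed one.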