-
The title and some phrasing in the Abstract, Introduction, and Conclusions emphasize "QTT-Based Compression" and assembly bias, which can be read as implying a substantive evaluation of QTT-based compression, even though only baseline models and dummy-QTT failures are quantitatively analyzed (Title; Abstract; Introduction; Sec. 4.3.1; Conclusions).
Recommendation: Adjust the Title and key summary statements to make the scope unambiguous, for example by explicitly mentioning "pipeline" and/or "dummy implementation" (e.g., "Pipeline for QTT-Based Compression… with Dummy Implementation"). In the Abstract and Conclusions, state early that QTT is only implemented via a placeholder and that no conclusions about real QTT performance are drawn, while foregrounding the baseline and pipeline contributions.
-
The description of the dataset and merger trees is relatively generic and omits important contextual details such as the specific simulation used, cosmological parameters, mass resolution, and selection criteria for the 1000-tree subset out of 5099 available trees (Sec. 2.1; Sec. 4.1).
Recommendation: Expand Sec. 2.1 / Sec. 4.1 to specify: (i) which N-body or hydrodynamical simulation provides the merger trees, including basic cosmology and mass resolution; (ii) how the 1000 trees were selected from the 5099 available (e.g., mass range, redshift range, random subsample); and (iii) any quality cuts applied, such as minimum number of snapshots or progenitors. State explicitly whether all selected trees yielded valid trajectories or whether some were discarded.
-
The explanation of main progenitor trajectory extraction is high-level and does not fully describe how branching, multiple progenitors, missing links, or ambiguous mask_main indices are handled, which could affect the resulting trajectories and potential biases (Sec. 2.2.1; Sec. 4.1).
Recommendation: In Sec. 2.2.1, provide a more precise description of the main progenitor algorithm: e.g., specify whether at each step you follow the most massive progenitor, use mask_main indices directly, or apply another tie-breaking rule; clarify how missing or ambiguous indices are handled; and state whether any halos were removed due to inconsistencies. If all 1000 trees produced valid trajectories (as suggested in Sec. 4.1), explicitly mention that and why.
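To illustrate the level of detail requested, the following minimal sketch follows the most massive progenitor at each step with a deterministic tie-break. The data layout (a `progenitors` dictionary mapping a halo id to its progenitor ids, and a `mass` dictionary) and the function name are hypothetical and do not reflect the authors' tree format or their use of mask_main; the sketch only shows the kind of rule that should be stated explicitly.

```python
def main_progenitor_branch(root, progenitors, mass):
    """Return halo ids along the main branch, from the root back in time.

    Hypothetical inputs: `progenitors[h]` lists the progenitor ids of halo h
    (missing or empty means the branch ends), and `mass[h]` is the halo mass.
    """
    branch = [root]
    seen = {root}
    current = root
    while progenitors.get(current):
        # Rule: most massive progenitor; ties go to the smallest halo id.
        current = max(progenitors[current], key=lambda h: (mass[h], -h))
        if current in seen:
            raise ValueError(f"cycle detected at halo {current}")
        seen.add(current)
        branch.append(current)
    return branch


# Toy tree: halo 0 has progenitors 1 and 2; halo 2 has progenitors 3 and 4.
progenitors = {0: [1, 2], 2: [3, 4]}
mass = {0: 10.0, 1: 3.0, 2: 6.0, 3: 2.5, 4: 2.5}
print(main_progenitor_branch(0, progenitors, mass))
```

Stating the rule in this form makes the handling of ties and of missing links (here, the branch simply terminates) unambiguous to the reader.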
-
Citation formatting is inconsistent and sometimes malformed, with repeated years such as "*(Ye and Loureiro, 2024, 2024)*" and "*(Ye and Loureiro, 2024b, 2024b)*", and mixed citation styles in the References (e.g., bracketed entries like "[Elahi, P. J., …]") (Introduction; Sec. 2.3.2; References).
Recommendation: Standardize citation formatting across the manuscript: remove duplicated years so each in-text reference appears as, for example, "(Ye and Loureiro 2024)", and only use suffixes (2024a/2024b) if genuinely distinct works exist and are clearly listed. In the References, adopt a single consistent style (e.g., unbracketed author-year) and ensure one-to-one correspondence between in-text citations and reference-list entries.
-
The description of trajectory padding and reshaping for QTT in Sec. 2.3.1–2.3.2 is ambiguous and not fully consistent with Sec. 4.1–4.2, which mention padding first to length 98 and then to 128 and give a tensor shape such as $(2,2,2,2,2,2,2,4)$. It is unclear how many padding steps occur and how the feature dimension is incorporated (Sec. 2.3.1–2.3.2; Sec. 4.1–4.2).
Recommendation: Clarify Sec. 2.3.1–2.3.2 to match Sec. 4.1–4.2 by explicitly stating: (i) the original trajectory length range; (ii) that trajectories are first padded to a uniform length of 98 and then further padded to 128 ($2^7$) for QTT; and (iii) how the four features are included in the tensor shape (e.g., as a separate mode to obtain $(2,2,2,2,2,2,2,4)$). Ensure the text is self-consistent and reflects the actual implementation.
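A short sketch such as the following would remove the ambiguity. It assumes trailing zero-padding, a per-trajectory array of shape (T, 4), and a feature-last layout; the function name `to_qtt_tensor` and these choices are assumptions made for illustration, not a description of the authors' implementation.

```python
import numpy as np


def to_qtt_tensor(traj, uniform_len=98, qtt_len=128):
    """Pad a (T, n_features) trajectory with zeros and reshape it for QTT.

    Assumptions (not confirmed by the manuscript): padding is appended at
    the end, time is the leading axis, and features form the last mode.
    """
    traj = np.asarray(traj, dtype=float)
    n_steps, n_feat = traj.shape
    if n_steps > uniform_len:
        raise ValueError("trajectory longer than the uniform length")
    # Step 1: pad to the uniform length (98 in Sec. 4.1).
    padded = np.zeros((uniform_len, n_feat))
    padded[:n_steps] = traj
    # Step 2: pad further to a power of two (128 = 2**7).
    padded = np.pad(padded, ((0, qtt_len - uniform_len), (0, 0)))
    # Step 3: split the time axis into log2(qtt_len) binary modes.
    n_modes = qtt_len.bit_length() - 1
    return padded.reshape((2,) * n_modes + (n_feat,))


tensor = to_qtt_tensor(np.ones((60, 4)))
print(tensor.shape)
```

With seven binary time modes and one feature mode of size 4, the tensor holds $128 \times 4 = 512$ entries, which matches the shape $(2,2,2,2,2,2,2,4)$ quoted in Sec. 4.1–4.2.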
-
Section 2.4.4 (Baseline Comparison) appears truncated in the provided text (e.g., line break around "Wang et al., 2023"), and the description of how baselines are trained/evaluated is slightly unclear (Sec. 2.4.4).
Recommendation: Review Sec. 2.4.4 in the source manuscript to ensure that all sentences are complete and that the citation to Wang et al. (2023) is correctly formatted on a single line. If any details of the baseline training/evaluation procedure were inadvertently omitted, reinsert them so that the section reads as a coherent paragraph.
-
Some interpretive statements about unexplained variance being "potentially attributable to assembly bias" do not distinguish assembly history from other sources of scatter, since no conditioning on mass or other covariates is performed (Abstract; Sec. 4.3.1; Sec. 4.6; Conclusions).
Recommendation: Qualify these statements throughout by explicitly acknowledging that the unexplained variance reflects a combination of assembly history, environmental effects, stochasticity, and modeling noise. Where feasible, add a mass-conditioned or mass-binned $R^2$ analysis in Sec. 4.3.1 to better isolate potential assembly-bias-related variance, and then refer to those results when discussing what might be attributable to assembly bias.
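A mass-binned $R^2$ could be computed along the following lines. This is a sketch under stated assumptions: quantile bins in log halo mass, and hypothetical array names `log_mass`, `y_true`, and `y_pred` that do not come from the manuscript.

```python
import numpy as np


def binned_r2(log_mass, y_true, y_pred, n_bins=4):
    """R^2 of predictions within quantile bins of log halo mass.

    Returns the bin edges and one R^2 value per bin (NaN if the target
    has zero variance inside a bin).
    """
    log_mass, y_true, y_pred = (
        np.asarray(a, dtype=float) for a in (log_mass, y_true, y_pred)
    )
    edges = np.quantile(log_mass, np.linspace(0.0, 1.0, n_bins + 1))
    # Use interior edges only, so bin indices run from 0 to n_bins - 1.
    idx = np.digitize(log_mass, edges[1:-1])
    scores = []
    for b in range(n_bins):
        sel = idx == b
        ss_res = np.sum((y_true[sel] - y_pred[sel]) ** 2)
        ss_tot = np.sum((y_true[sel] - y_true[sel].mean()) ** 2)
        scores.append(1.0 - ss_res / ss_tot if ss_tot > 0 else np.nan)
    return edges, np.array(scores)


# Synthetic illustration: a target driven by mass plus unrelated scatter.
rng = np.random.default_rng(0)
log_mass = rng.uniform(11.0, 14.0, size=2000)
y_true = log_mass + rng.normal(scale=0.5, size=2000)
edges, scores = binned_r2(log_mass, y_true, y_pred=log_mass)
print(scores)
```

Within a narrow mass bin, mass alone explains little of the target variance, so a per-bin $R^2$ that remains high would point to information beyond mass; it would still not separate assembly history from environment or stochastic scatter.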
-
The latent-space visualization section currently describes mainly what is "expected" from PCA and t-SNE but provides limited explicit description of what is actually observed in the plots (e.g., gradients, clustering, or lack thereof for dummy QTT features) (Sec. 4.4).
Recommendation: Revise Sec. 4.4 to report concrete qualitative or simple quantitative observations from the PCA and t-SNE plots: for example, note whether baseline features show a visible gradient of the target along a principal component, whether t-SNE reveals clustering, and confirm that dummy-QTT projections appear structureless with respect to the target. Optionally, include simple summary statistics such as correlations between principal components and the target.
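The optional summary statistic could be as simple as the Pearson correlation between each leading principal component and the target. The sketch below uses a NumPy-only PCA via SVD; the function name and the inputs `features` and `target` are hypothetical placeholders for the authors' feature matrix and target vector.

```python
import numpy as np


def pc_target_correlations(features, target, n_components=2):
    """Pearson correlation of each leading principal component with the target."""
    x = np.asarray(features, dtype=float)
    y = np.asarray(target, dtype=float)
    x = x - x.mean(axis=0)
    # PCA via SVD of the centered matrix; rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    scores = x @ vt[:n_components].T
    return np.array(
        [np.corrcoef(scores[:, k], y)[0, 1] for k in range(n_components)]
    )


# Synthetic illustration: the target equals the dominant feature direction.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.2])
target = features[:, 0]
corr = pc_target_correlations(features, target)
print(np.abs(corr))
```

The sign of a principal component is arbitrary, so absolute correlations should be reported. For the dummy-QTT features, values near zero would support the statement that the projections are structureless with respect to the target.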
-
The computational-cost discussion is conceptual and includes specific numbers such as "$< 0.001$ ms" and "4–5 ms per tree" without stating whether these are measured from the dummy implementation or theoretical estimates, and without specifying hardware or software context (Sec. 4.5).
Recommendation: In Sec. 4.5, explicitly distinguish measured timings from rough estimates, and, for measured numbers, specify the hardware (CPU/GPU model, RAM) and software environment. If some values are extrapolations for a real QTT implementation, label them as such and present them as order-of-magnitude expectations rather than precise benchmarks.
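Measured timings could be reported with a small harness like the one below, which records the software and hardware context alongside the numbers. It uses only the Python standard library; `time_per_call_ms` and `environment_summary` are hypothetical helper names, and the timed function here is a stand-in rather than the authors' feature extraction.

```python
import platform
import statistics
import time


def time_per_call_ms(fn, *args, repeats=200, warmup=10):
    """Median, minimum, and maximum wall-clock time per call, in milliseconds."""
    for _ in range(warmup):
        fn(*args)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples), min(samples), max(samples)


def environment_summary():
    """Basic context to report next to any measured timing."""
    return {
        "python": platform.python_version(),
        "system": platform.system(),
        "machine": platform.machine(),
        "processor": platform.processor(),
    }


median_ms, min_ms, max_ms = time_per_call_ms(sum, range(1000))
print(f"median {median_ms:.4f} ms (min {min_ms:.4f}, max {max_ms:.4f})")
print(environment_summary())
```

Values obtained this way can be labeled as measured, while figures for a real QTT implementation that has not been run should remain marked as order-of-magnitude estimates.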
-
Many figures omit secondary but important details, such as sample sizes (e.g., Figure 1, Figure 7, Figure 9), summary statistics overlays (Figure 1), explicit whisker conventions (Figures 3, 4), target normalization or units (Figures 5, 6, 7, 8, 9), and evaluation protocols (Figures 5, 6, 7). Some figures lack clear legends, concise labeling, or panel identification, and several do not specify preprocessing steps or random seeds for reproducibility.
Recommendation: Add sample sizes, summary statistics, and whisker conventions to captions or plots; annotate target normalization/units and evaluation protocols; provide clear legends and panel labels; and specify preprocessing steps and random seeds for all relevant figures.
-
Visual clarity and accessibility are sometimes reduced by small or inconsistent typography, cramped layouts, overplotting, non-uniform axis limits, and insufficient annotation of colorbars or colormaps. Some figures do not harmonize style or formatting with the rest of the manuscript.
Recommendation: Increase font and marker sizes, adjust plot margins and spacing, use transparency or density overlays to address overplotting, enforce consistent axis limits and tick formatting, clearly label colorbars with variable names and units, and harmonize visual style across all figures.
-
The paper interprets a “relative reconstruction error $\approx 1.0$” as meaning that the reconstruction differs from the original as much as an all-zero trajectory would, but it gives no definition of the reconstruction error, so this interpretation cannot be verified from the paper alone.
Recommendation: Explicitly define the reconstruction error used (e.g., $\|X-\hat X\|/\|X\|$ with specified norm and whether computed per-trajectory, per-feature, including/excluding padded zeros), then re-check the interpretation that error $\approx 1$ corresponds to a zero reconstruction or uncorrelated output.
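The sketch below implements one possible definition, the per-trajectory relative Frobenius error, and shows how the interpretation can be checked. The choice of norm, the per-trajectory scope, and the optional `valid_len` argument for excluding padded rows are assumptions, since the manuscript does not state them.

```python
import numpy as np


def relative_error(x, x_hat, valid_len=None):
    """Relative Frobenius error ||X - X_hat||_F / ||X||_F for one trajectory.

    x and x_hat have shape (T, n_features). If valid_len is given, only the
    first valid_len time steps are used, which excludes zero-padded rows.
    """
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    if valid_len is not None:
        x, x_hat = x[:valid_len], x_hat[:valid_len]
    return np.linalg.norm(x - x_hat) / np.linalg.norm(x)


x = np.array([[1.0, 0.0], [0.0, 0.0]])
err_zero = relative_error(x, np.zeros_like(x))       # all-zero reconstruction
err_orth = relative_error(x, np.array([[0.0, 1.0], [0.0, 0.0]]))  # orthogonal, equal norm
print(err_zero, err_orth)
```

Under this definition an all-zero reconstruction gives an error of exactly 1, whereas a reconstruction of equal norm that is orthogonal to the original gives $\sqrt{2}$, so the two readings of "error $\approx 1$" are distinguishable once the definition is fixed.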
-
“Compression ratio” is discussed and compared across ranks, but no analytic definition is given (e.g., a parameter-count formula or the storage cost of the QTT cores relative to the original tensor). This prevents internal verification of claims such as “ratios equal the rank” being an artifact.
Recommendation: Define compression ratio precisely (e.g., original element count divided by number of stored parameters in TT/QTT cores) and specify how core sizes and ranks are counted, including the handling of the final feature dimension.
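For concreteness, the sketch below counts parameters under the standard tensor-train convention, in which core $k$ has shape $(r_{k-1}, n_k, r_k)$ with $r_0 = r_d = 1$. The uniform interior rank is a simplifying assumption and is not capped at the largest rank each unfolding can support; the function names are illustrative.

```python
from math import prod


def tt_param_count(mode_sizes, ranks):
    """Number of stored entries in TT cores of shape (r_{k-1}, n_k, r_k)."""
    if len(ranks) != len(mode_sizes) + 1 or ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("need len(mode_sizes) + 1 ranks with boundary ranks 1")
    return sum(ranks[k] * n * ranks[k + 1] for k, n in enumerate(mode_sizes))


def compression_ratio(mode_sizes, rank):
    """Original element count divided by TT parameter count (uniform rank)."""
    ranks = [1] + [rank] * (len(mode_sizes) - 1) + [1]
    return prod(mode_sizes) / tt_param_count(mode_sizes, ranks)


shape = (2, 2, 2, 2, 2, 2, 2, 4)  # seven binary time modes plus the feature mode
for r in (1, 2, 4):
    ranks = [1] + [r] * (len(shape) - 1) + [1]
    print(r, tt_param_count(shape, ranks), round(compression_ratio(shape, r), 2))
```

For this shape the parameter count is $12r^2 + 6r$ against 512 original entries, so under this definition the ratio falls as the rank grows rather than equaling it, which gives a direct way to test the "ratios equal the rank" claim.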