-
Mismatch between advertised aims and delivered results: the title, Abstract, Introduction, and Conclusions foreground a diffusion‑MRI microstructural biomarker (NDDV) linking brain integrity, DNAmAge, and cognition, but NDDV cannot be computed because the diffusion data are unusable (3D not 4D) and an atlas is missing (Sec. 1; Sec. 2.1.1; Sec. 2.3; Sec. 3.3; Sec. 4.1–4.2). This creates a substantial positioning problem: readers may expect an imaging biomarker study, whereas the empirical contribution is limited to a behavioral–epigenetic analysis plus a data-QC cautionary note.
Recommendation: Reframe the manuscript explicitly around what is actually supported by the data: (i) behavioral phenotype extraction and CPI construction; (ii) effect-size/uncertainty bounds on DNAmAge–performance association in this cohort; (iii) heterogeneity (e.g., colony) and limitations; and (iv) a concise, actionable “DTI data QC” case study. Revise the title and Abstract to de-emphasize NDDV/microstructural biomarkers (or clearly label NDDV as conceptual/unimplemented). Move most NDDV derivation/pipeline detail to an Appendix and keep only a brief, internally consistent conceptual description in the main text (Sec. 1; Sec. 2.3; Sec. 3.3).
-
Insufficient quantitative reporting and template/metadata problems undermine credibility and reproducibility: Table 1 is unfilled (Sec. 2.4.1/Sec. 3.1), Table 2 (GLM output) is referenced but missing, and at least one reported statistic appears as a placeholder/formatting error (e.g., adjusted R-squared “00.00” in Sec. 3.4). The manuscript also contains clear template mismatches (astronomy-related keywords after the Abstract) and (per the unstructured report) non-scientific placeholder affiliation text; such issues can signal to readers that other parts may be similarly unvetted.
Recommendation: Fully populate Table 1 ($N$, mean, SD, min, max for DNAmAge and CPI; counts for Sex and Origin) and include Table 2 with complete regression output (coefficients, SEs, $95\%$ CIs, test statistics, df, exact $p$-values, $R^2$/adjusted $R^2$) (Sec. 2.4.1; Sec. 3.1; Sec. 3.4). Remove placeholder/incorrect values (e.g., “00.00”) and verify all reported statistics against the analysis output. Replace astronomy keywords with appropriate terms and ensure affiliations and other front-matter metadata are correct and publication-ready (Abstract/front matter).
-
CPI construction is underspecified and insufficiently validated for its central role. CPI is an unweighted sum of z-scored components spanning heterogeneous types (continuous latencies, counts, and binary perseveration indicators), potentially across multiple phases, without demonstrating dimensional coherence, reliability, or robustness (Sec. 2.2.2–2.2.3; Sec. 3.2). The exact CPI component set is not enumerated (e.g., whether Visit_Efficiency enters once or per phase), preventing replication and making the CPI scale hard to interpret (Sec. 2.4.1; Sec. 3.2).
Recommendation: In Sec. 2.2.3 and Sec. 3.2, explicitly list all CPI components (with phase indexing), indicate sign flips, and provide the CPI formula as a sum over a clearly defined metric set. Add a component-level table: distribution summaries (mean/SD/range), missingness, and directionality. Report a correlation matrix among components and at least one reliability/structure check (e.g., Cronbach’s $\alpha$; simple PCA/factor analysis) to justify aggregation. Provide sensitivity analyses showing whether the DNAmAge association remains similar when (i) excluding binary perseveration variables, (ii) using PCA-derived factor scores, and/or (iii) using domain sub-indices (learning/STM/LTM/efficiency) rather than a single CPI (Sec. 3.2–3.4).
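To illustrate the level of specification requested, the following is a minimal sketch of a CPI built as a sum of sign-aligned $z$-scores, together with Cronbach’s $\alpha$ computed on the aligned components. The component names other than Visit_Efficiency, the phase suffixes, the sign conventions, and all numeric values are hypothetical placeholders, not the manuscript’s actual metric set or data.

```python
from statistics import mean, stdev, variance

def zscores(values):
    """Standardize one component across animals (sample SD)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def build_cpi(components, signs):
    """Return sign-aligned z-scores per component and their per-animal sum.

    components: dict of name -> list of raw values (one per animal)
    signs:      dict of name -> +1 (higher is better) or -1 (lower is better)
    """
    aligned = {
        name: [signs[name] * z for z in zscores(vals)]
        for name, vals in components.items()
    }
    n_animals = len(next(iter(aligned.values())))
    cpi = [sum(aligned[name][i] for name in aligned) for i in range(n_animals)]
    return aligned, cpi

def cronbach_alpha(aligned, total):
    """Cronbach's alpha from item variances and the variance of the sum."""
    k = len(aligned)
    item_var = sum(variance(v) for v in aligned.values())
    return k / (k - 1) * (1 - item_var / variance(total))

# Toy values for six animals, invented purely for illustration.
components = {
    "latency_to_correct_p1": [12.0, 30.5, 8.2, 45.0, 20.1, 15.3],  # seconds
    "Visit_Efficiency_p1": [0.80, 0.55, 0.90, 0.40, 0.65, 0.75],
    "perseveration_p2": [0, 1, 0, 1, 1, 0],                        # binary
}
signs = {
    "latency_to_correct_p1": -1,  # assumed: shorter latency is better
    "Visit_Efficiency_p1": +1,    # assumed: higher efficiency is better
    "perseveration_p2": -1,       # assumed: perseveration is worse
}

aligned, cpi = build_cpi(components, signs)
alpha = cronbach_alpha(aligned, cpi)
print("CPI per animal:", [round(c, 2) for c in cpi])
print("Cronbach's alpha:", round(alpha, 2))
```

Listing the components and signs in one explicit mapping like this also fixes the number of components $k$, which is what makes the CPI scale (“sum of $k$ $z$-scores”) interpretable.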
-
Behavioral task description is not sufficiently operational for readers outside the immediate project, limiting interpretability of CPI components and the cognitive domains claimed. The multi-phase spatial foraging task needs more concrete details (arena/box layout, number of boxes, cues, timing, how phases are separated, what defines “correct,” and how/when it changes), and explicit handling of key edge cases (e.g., never visiting the correct box, censored times) (Sec. 2.1–2.2; Sec. 2.2.2). Without this, ceiling/floor effects and what the task truly measures (learning vs flexibility vs inhibition) are hard to assess.
Recommendation: Expand Sec. 2.1–2.2 with a step-by-step task specification: number and arrangement of boxes, cue availability, phase definitions and timing (within-day vs across days), trial/session duration, inter-phase interval, and rule for “correct” location across phases. In Sec. 2.2.2, define metric computation with explicit edge-case rules: censoring/maximum time if correct is never visited, treatment of missing timestamps or simultaneous events, and how perseveration is coded when entries are zero. Report how often such edge cases occurred and whether they affect CPI distribution (Sec. 3.2).
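As an example of what explicit edge-case rules could look like, the sketch below encodes one possible convention: latencies are censored at a session maximum when the correct box is never visited, and perseveration is left undefined when an animal makes no entries at all. The 600 s cap, the function names, and the perseveration definition are assumptions made for illustration; the authors should substitute their actual rules.

```python
MAX_TRIAL_SECONDS = 600.0  # hypothetical session cap, not taken from the manuscript

def latency_to_correct(first_correct_visit_s, max_s=MAX_TRIAL_SECONDS):
    """Return (latency_seconds, censored_flag).

    A missing visit (None) or a visit after the cap is recorded as the cap
    and flagged as censored so it can be counted and reported.
    """
    if first_correct_visit_s is None or first_correct_visit_s > max_s:
        return max_s, True
    return first_correct_visit_s, False

def perseveration_flag(entries_previous_box, total_entries):
    """Return 1/0 for perseveration, or None when there were no entries.

    Assumed definition: any entry into the previously correct box counts
    as perseveration. With zero entries the indicator is undefined.
    """
    if total_entries == 0:
        return None
    return 1 if entries_previous_box > 0 else 0

# Toy first-visit times (seconds) for four animals; None = never visited.
first_visits = [35.0, None, 120.4, 610.0]
results = [latency_to_correct(t) for t in first_visits]
n_censored = sum(censored for _, censored in results)
print("latencies:", [lat for lat, _ in results])
print("censored animals:", n_censored, "of", len(results))
```

Returning the censoring flag alongside the value makes it straightforward to report how often each edge case occurred, as requested above.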
-
Interpretation of the null DNAmAge–CPI result is overstated relative to the design and reporting. The manuscript sometimes implies “lack of age-related cognitive decline” or “remarkable resilience,” but the study is cross-sectional (not longitudinal), modest in size ($N=32$), and spans a limited mid-to-late adult epigenetic-age range (Sec. 2.4.2; Sec. 3.1; Sec. 3.4; Sec. 3.6; Sec. 4.3–4.4). Absence of statistical significance is not evidence of absence; without effect sizes, CIs, and detectable-effect analyses, readers cannot judge what magnitude of decline is ruled out.
Recommendation: Recast conclusions in the Abstract, Sec. 3.6, and Sec. 4.3–4.4 to: “no detectable association in this sample/age range,” rather than broad claims of preserved cognition across aging. Report standardized effect sizes (e.g., CPI SD change per DNAmAge year) with $95\%$ CIs and interpret the CI as an upper bound on plausible decline. Add sensitivity checks for nonlinearity (quadratic DNAmAge term; or spline if feasible) and robustness (Spearman correlation; robust regression or permutation test) (Sec. 2.4.2; Sec. 3.4). Optionally include a detectable-effect-size or post-hoc power analysis to clarify what effects this design could reasonably detect.
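A rough sense of the detectable effect size can be obtained with the Fisher $z$ approximation for a Pearson correlation. The sketch below is a simplification under stated assumptions (bivariate correlation, two-sided $\alpha = 0.05$, 80% power, no covariates); a covariate-adjusted GLM would need a dedicated power calculation. The function names are illustrative.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist

def min_detectable_r(n, alpha=0.05, power=0.80):
    """Smallest |r| detectable with the given power (Fisher z approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)
    return tanh((z_alpha + z_power) / sqrt(n - 3))

def r_confidence_interval(r, n, level=0.95):
    """Approximate CI for a Pearson r via the Fisher z transform."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z_crit / sqrt(n - 3)
    return tanh(atanh(r) - half_width), tanh(atanh(r) + half_width)

r_min = min_detectable_r(32)  # N = 32 as reported in the manuscript
print("minimum detectable |r| at N=32:", round(r_min, 2))
```

Under these assumptions the minimum detectable correlation at $N=32$ is roughly 0.48, so weaker DNAmAge–CPI associations could easily go undetected in this design. The same transform gives a confidence interval for the observed correlation, which can be read as the bound on plausible decline requested above.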
-
The “Cognitive Resilience Score” (residual from $\text{CPI} \sim \text{DNAmAge}$) is presented as a substantive phenotype, but when the DNAmAge slope is near zero the residual will be almost identical to CPI (up to mean-centering), and the term “resilience” may mislead readers into inferring that an age-related decline has been demonstrated and that some individuals resist it (Sec. 2.4.3; Sec. 3.5–3.6). The score’s distribution, outlier sensitivity, and dependence on model choice are not characterized.
Recommendation: If retained, explicitly label it as “age-adjusted performance (residual)” and clarify its interpretation when no age effect is detected. Report its mean/SD/range, identify influential points (Cook’s distance/leverage), and report the correlation between CPI and the residual score to show what is gained by the transformation (Sec. 3.5–3.6). Include sensitivity to alternative specifications (e.g., adding Origin, Sex, or nonlinearity in the baseline model) and note that residualization is model-dependent (Sec. 2.4.3).
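The near-redundancy of the residual score can be made concrete: for a simple linear regression of CPI on DNAmAge with Pearson correlation $r$, the correlation between CPI and its residual equals $\sqrt{1 - r^2}$, which approaches 1 as $r$ approaches 0. The sketch below checks this identity numerically; the data are toy values invented for illustration and are not from the manuscript.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation computed from centered sums of products."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ols_residuals(x, y):
    """Residuals from the least-squares line y ~ intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

# Toy data: eight animals, weak age-performance association.
dnam_age = [8.1, 9.4, 10.2, 11.0, 12.3, 13.5, 14.1, 15.0]
cpi = [0.4, -1.2, 0.9, -0.3, 1.1, -0.8, 0.2, -0.3]

r = pearson(dnam_age, cpi)
resid = ols_residuals(dnam_age, cpi)
print("r(DNAmAge, CPI):      ", round(r, 3))
print("r(CPI, residual):     ", round(pearson(cpi, resid), 3))
print("sqrt(1 - r^2):        ", round(sqrt(1 - r ** 2), 3))
```

Reporting the CPI–residual correlation alongside $r$ would let readers see directly how little the residualization changes the ranking of animals when the age effect is negligible.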
-
Potential confounding and heterogeneity (especially colony of origin) are not analyzed deeply enough. Origin shows a trend-level association with CPI (Sec. 3.4) that could reflect housing/rearing differences, genetic structure/relatedness, prior task exposure, sensory/health status, or different DNAmAge distributions across colonies, potentially masking or mimicking aging patterns. Interactions (DNAmAge$\times$Origin, DNAmAge$\times$Sex) are not discussed.
Recommendation: In Sec. 3.1 and Sec. 3.4–3.6, describe what is known to differ between colonies (environmental history, husbandry, capture/rearing, testing context). Provide descriptive comparisons: DNAmAge by Origin and CPI by Origin (means/SDs, plots) and check for DNAmAge–Origin confounding. If power permits, test DNAmAge$\times$Origin (and DNAmAge$\times$Sex) interactions or clearly justify why they are omitted. Regardless, treat the Origin effect as exploratory (report estimate and $95\%$ CI; avoid “trend toward significance” framing) and discuss plausible mechanisms and future controls (Sec. 3.6; Sec. 4.4).
-
Reproducibility and data provenance details are incomplete. The Methods lack software/package versions, explicit inclusion/exclusion criteria, handling of missing/anomalous events, categorical coding choices, and model diagnostics (Sec. 2.1–2.2; Sec. 2.4; Sec. 3.4). Additionally, inclusion of absolute internal filesystem paths (as noted in the unstructured report) is not appropriate for a publication and may expose private infrastructure while not helping others reproduce the work.
Recommendation: Add a reproducibility subsection (Sec. 2.4 or end of Sec. 2) specifying: languages and package versions; data cleaning rules; subject/trial exclusion criteria; handling of missing/censored metrics; and exact coding of Sex and Origin (reference levels). Summarize model diagnostics (residual plots, influence checks) supporting GLM assumptions (Sec. 3.4). Remove any absolute internal file paths and replace with portable descriptions; state code/data availability (repository/DOI) or, if not shareable, provide a detailed analysis workflow and synthetic example inputs (Sec. 2; Data availability statement).
-
Figures and terminology currently limit standalone interpretability and may mislead: Figure 1’s overlap depiction is hard to read and may imply near-total overlap without numeric labels; Figures 2–7 often lack units, sample sizes, and key numeric/statistical annotations; and there is at least one documented inconsistency in the color mapping for the resilience score (Sec. 3.5; Fig. 7). Some terminology is inconsistent (e.g., CPI vs. CAI) and CPI scale is hard to interpret without the number of components.
Recommendation: Revise Figure 1 to an UpSet plot or an area-proportional Euler diagram with explicit counts for every region/overlap; increase resolution and use a colorblind-safe palette. For Figs. 2–7, add axis units (DNAmAge in years; clarify CPI units as “sum of $k$ $z$-scores”), annotate $N$ per panel/group, and include key statistics (slope/CI/$R^2$) on the regression plot (Sec. 3.2–3.5). Fix the Fig. 7 color convention so text/caption/colorbar all match (positive residual = higher age-adjusted performance) and standardize CPI terminology throughout.