-
Internal inconsistencies in dataset sizes, family counts, and which subsets are used for which analyses undermine clarity and reproducibility (Abstract; Sec. 2.2; Sec. 3.1–3.3; Sec. 4). The manuscript variously cites (i) an aggregated dataset of $16{,}774$ asteroids and $37$ well-populated families (Abstract), (ii) $3{,}872$ asteroids in $89$ families with $15$ well-populated families (Sec. 2.2), and (iii) $32$ families used in age correlations (Sec. 3.3), while figures/text discuss additional families beyond the listed “well-populated” set (Sec. 3.1–3.2). It is not currently possible to tell which families enter the quantile fits, the $k/C$ distributions, and the correlation tests.
Recommendation: In Sec. 2.2 and at the start of Sec. 3, define and use a consistent set of named samples (e.g., $N_{\rm raw}$, $N_{\rm merged}$, $N_{\rm core}=\{a,D,P,{\rm fam}\}$, $N_{\rm fit}$ where V-fits attempted, $N_{\rm robust}$ where fits pass QA, $N_{\rm age}$ where ages exist). Add a single flow table/diagram with counts at each merge/filter step and a table listing every family that appears in any figure or enters any statistic, including: family name, $N$ used, whether it meets $N\geq50$, whether both wings pass fit-quality criteria, and whether an age is available. Ensure the Abstract and Sec. 4 summarize the same final analysis sample (or explicitly state multiple tiers, e.g., “$37$ families plotted; $15$ primary families used for quantitative summaries; $32$ used for age correlations”).
-
Key quantitative outputs are missing or mis-referenced, preventing verification and reuse (Sec. 2.2; Sec. 3.2; Sec. 3.4). Table numbering appears inconsistent (Table 1 used as a template/summary but later referenced as containing full fit results), and the manuscript does not provide a complete per-family results table with ($a_c, m_L, m_R, k, C$) and fit status/flags. As a result, claims about ranges of $k/C$, which families have NaN slopes (e.g., Karin), and which families drive “inverted” behavior cannot be checked.
Recommendation: Populate the data-summary table in Sec. 2.2 with actual numbers and create a separate, correctly numbered results table (main text, appendix, or supplement) listing for each fitted family: $N$, $a_c$ definition used, wing $N$s, $m_L$ and $m_R$ with uncertainties, $k$ (and any asymmetry metric), $C$, and a fit-quality flag (pass/fail with reason). Fix all in-text references (Sec. 3.2, Sec. 3.4) to point to the correct table(s).
-
Physical justification for the specific choice of $P\times D$ (and the assumed linear envelope in $a$–log space) is currently ambiguous and at points logically inconsistent with the stated drift scaling (Sec. 1; Sec. 2.3.1; Sec. 3.1–3.2). The Introduction cites an idealized scaling resembling $\mathrm{d}a/\mathrm{d}t \propto 1/(P\cdot D^2)$, then concludes “Consequently” that $P\times D$ is a comprehensive proxy—this does not follow algebraically. More broadly, Yarkovsky’s dependence on rotation period is regime-dependent (thermal parameter, diurnal/seasonal components, obliquity), so the expected mapping from Yarkovsky drift limits to an envelope in $(a, \log(P\times D))$ needs either a derivation or a clear statement that the metric is empirical/heuristic.
Recommendation: Revise Sec. 1 and Sec. 2.3.1 to either (a) provide a short, explicit scaling argument linking a boundary in $|a-a_c|$ to a boundary in a chosen spin–size variable (and justify why $D^2$ is reduced to $D$, if that is intended), including assumptions/regime; or (b) explicitly frame $P\times D$ as an empirical proxy selected for exploratory morphology, and soften causal language (“Consequently”). In either case, state what physical behavior would produce an upright vs inverted V-shape under your plotting convention, and what deviations might plausibly indicate (e.g., selection bias, resonance truncation, heterogeneous thermophysics, spin-state asymmetries).
-
Sign conventions and axis definitions are inconsistent across Methods and Results, making the meaning of “upright/inverted” V-shapes and negative $k$ unclear (Sec. 2.3.1; Sec. 3.1–3.2; figure axes). The Methods define $Y' = \log_{10}(P\times D)$, while many figures/Results use $\log_{10}(1/(P\times D)) = -Y'$ to obtain a visually upright V. This interacts with the choice of fitting an upper ($95$th) quantile and with the expected signs of $m_L$ and $m_R$; without a single fixed convention, some negative-$k$ cases may be artifacts of plotting/definition rather than physical inversions.
Recommendation: Choose one dependent variable definition for fitting and interpretation and enforce it everywhere: either $Y = \log_{10}(1/(P\times D))$ or $Y = \log_{10}(P\times D)$. Then: (i) explicitly state whether you fit the upper or lower envelope ($\tau=0.95$ vs $\tau=0.05$) and why; (ii) derive the expected sign pattern for $m_L$ and $m_R$ under an “ideal” family; (iii) update the definition/interpretation of $k$ accordingly. Add a short note in Sec. 3.2 clarifying which negative-$k$ cases remain negative under the unified convention.
-
The definitions of $k$ and $C$, and their interpretation as “steepness” and “consistency/sharpness,” are not statistically robust as currently used (Sec. 2.3.3; Sec. 3.2). $k=(m_R-m_L)/2$ is described as an “average magnitude” in places, which is only true under strict sign assumptions; when slopes change sign (a core result here), $k$ conflates opening, asymmetry, and sign convention. Meanwhile, $C$ (fraction of points below a fitted $95$th-quantile line) is close to $0.95$ largely by construction and can deviate due to finite-sample effects or model misfit, so treating small departures as morphological “filled-in vs sharp” is risky.
Recommendation: In Sec. 2.3.3, redefine or supplement the metrics so they remain meaningful under sign violations: e.g., report an opening metric $k_{\rm open}=(|m_L|+|m_R|)/2$ and an asymmetry metric $k_{\rm asym}=|m_R|-|m_L|$ (or a ratio), while separately tracking sign anomalies as flags. For “sharpness/consistency,” either (i) treat $C$ as a diagnostic of quantile-regression calibration with uncertainty (bootstrap CI for $C-0.95$), or (ii) replace/supplement $C$ with a dispersion-based envelope metric (e.g., median distance to the fitted envelope; density contrast in a thin band below the envelope; or quantile spacing between $\tau=0.95$ and $\tau=0.80$). Update Sec. 3.2 interpretations accordingly.
-
Quantile-regression methodology, fit-quality criteria, and uncertainty estimates are under-specified, which jeopardizes claims about extreme/inverted $k$ and any downstream correlations (Sec. 2.3.2–2.3.3; Sec. 3.2–3.4). The manuscript does not specify the software/solver/options used, how NaNs/duplicates are handled, whether any robustness options are used, or explicit criteria for declaring a wing/family “fitable” (minimum $N$ per wing, minimum $a$-span). Uncertainties on $m_L$, $m_R$, $k$, and derived statistics are not reported.
Recommendation: Expand Sec. 2.3.2–2.3.3 to specify the quantile-regression implementation (package, solver, convergence criteria), preprocessing for each wing, and explicit fit-quality thresholds (e.g., $N_{\rm left}$ and $N_{\rm right}$ minima; minimum $\Delta a$ per wing; handling of leverage points). Use bootstrap/resampling within each family to compute uncertainty intervals for $m_L$, $m_R$, and derived metrics, and propagate these into any family-level comparisons. In Sec. 3.2, clearly mark and exclude failed/unstable fits from summary histograms and from age correlations, and report results for both the “all attempted” and “robust-only” sets.
-
Age-correlation analyses (age–IQR($a$) and especially age–$k$) lack clear sample definition, treatment of missing/uncertain values, and robustness checks (Sec. 2.5; Sec. 3.3). It is unclear whether semimajor axis is osculating or proper (family work typically uses proper elements), which age compilation is used, how age uncertainties are handled, and whether outliers or families with unstable $k$ dominate the correlation results. The mismatch between counts with age data (e.g., $41$ vs $32$) further clouds interpretation.
Recommendation: In Sec. 2.5 and Sec. 3.3: (i) specify whether $a$ is proper or osculating and justify; if possible, redo using proper semimajor axis for the family analysis; (ii) provide a table of the exact families used in each correlation, with age values and references (plus uncertainties/ranges when available), $N$, $\mathrm{IQR}(a)$, and $k$ (with uncertainty); (iii) state how NaN/failed fits are handled; (iv) add robustness checks (Spearman with/without extreme $|k|$, Kendall’s $\tau$, and sensitivity to plausible age uncertainties). Report $n$ used in each test and interpret “non-correlation” cautiously when uncertainty is large.
-
Selection effects from requiring measured spin periods (and the joint availability of $D$ and $P$) are acknowledged but not quantified; they may materially shape the observed envelopes and the apparent “improvement” of $P\times D$ (Sec. 3.4; implications for Sec. 3.1–3.2). Because spin-period measurements are highly incomplete and biased toward larger/brighter objects and certain observing campaigns, the sampled boundary at small $D$ (where drift is largest) may be truncated or distorted, potentially creating spurious sharpness or inversions.
Recommendation: Quantify selection bias per family: show, for a few representative families and/or in an appendix, (i) the $D$ distribution of all cataloged family members versus the subset with $P$, and (ii) how the fitted envelope changes when restricting to size ranges with more uniform completeness. Where possible, compare against the full family distribution in a standard family catalog (even if $P$ is missing) to assess how representative the $P$-available subset is. Temper Sec. 3.1 claims about systematic improvement of $P\times D$ unless supported under these checks.
-
Reproducibility is limited by incomplete data provenance and processing documentation (Sec. 2.1; Sec. 2.2). Listing local CSV filenames is insufficient: sources, versions/dates, units, and conflict-resolution rules (multiple $P$ solutions, diameter choices, family membership definitions) are not specified. This prevents independent replication.
Recommendation: In Sec. 2.1–2.2, provide full provenance for each input (catalog/survey name, citation, version/date, access link/DOI if available), the units used ($P$ in hours/days; $D$ in km; $a$ in au), and rules for resolving duplicates/conflicting measurements. Add a short reproducibility statement indicating whether code and processed tables will be archived (recommended), and at minimum include enough detail that a reader can reconstruct the merged dataset.