Yarkovsky Drift Fidelity: Unveiling Dynamical Boundaries in Asteroid Family Dispersal and Implications for Spin Evolution

2508.00065-R1 📅 15 Apr 2026 🔍 Reviewed by Skepthical GitHub

Official Review

Official Review by Skepthical 15 Apr 2026
Overall: 4.4/10
Soundness
3
Novelty
6
Significance
5
Clarity
5
Evidence Quality
3
While the paper proposes a novel diagnostic diagram and surfaces a potentially interesting steep-wall morphology across many families, core methodological issues undermine the validity of the conclusions. The Mathematical Audit flags ambiguity in the central drift-proxy formula (UNCERTAIN, critical) and internal inconsistencies in the envelope-fitting specification (coverage target FAIL; slope-grid bounds FAIL), and the reported age–YDFI correlation is explicitly invalidated by a data-merge error. Moreover, the envelope-fitting design mechanically saturates slopes/YDFI, and the resonance-barrier interpretation lacks quantitative validation or tests to rule out pipeline artifacts. The result is an intriguing idea with insufficiently supported evidence and presentation inconsistencies that require substantial revision.
  • Paper Summary: This manuscript introduces a new diagnostic representation for asteroid-family dispersal by plotting $\log_{10}(\dot{a}_{\mathrm{YK,relative}})$ versus $\log_{10}(a)$ (Secs. 2.3–2.4). The per-object $\dot{a}_{\mathrm{YK,relative}}$ is computed from a simplified thermophysical/Yarkovsky model using diameter and spin period, with fixed thermal parameters and a single effective obliquity factor (Sec. 2.3). A lower-envelope boundary-fitting algorithm is then applied (Sec. 2.4), and the fitted left/right slopes are combined into a Yarkovsky Drift Fidelity Index (YDFI) intended to quantify how well present-day spin states preserve a simple Yarkovsky-driven dispersal pattern, with an intended (but currently invalid) correlation to independently estimated family ages (Secs. 2.5–2.6, 3.4). Using a large compiled dataset (570,405 asteroids across 62 families; Sec. 2.1), the main empirical outcome is that most families show extremely steep, “bucket/U”-like lower boundaries in the new phase space, with fitted slopes frequently saturating at the limits of the slope-search grid (about $\pm50$; Sec. 3.3, Table 2, Figs. 13–16). This suggests that family extents in this diagram may often be truncated by sharp dynamical boundaries (e.g., resonances) rather than by drift potential alone (Secs. 3.5, 4.2). However, because the fitting procedure tends to select maximal steepness, the derived YDFI saturates near its maximum for most families (Sec. 3.4, Table 3), undermining its intended discriminative role. In addition, the reported age–YDFI correlation is explicitly acknowledged as invalid due to a severe data-merging error that duplicates families and misassigns ages (Sec. 3.4, Fig. 17). Overall, the diagnostic diagram and the qualitative “steep-wall” morphology observation could be valuable. To make the results physically interpretable and statistically defensible, the paper needs (i) an unambiguous and validated definition of the Yarkovsky drift proxy, (ii) a theoretical motivation for what envelope shape is expected in the proposed $(a, \dot{a})$ plane, (iii) a revised envelope-fitting/metric design that does not trivially saturate, (iv) quantitative validation of the resonance-barrier interpretation (preferably in proper elements), and (v) corrected and reproducible data handling for any age-related inference.
Strengths:
Proposes a novel diagnostic phase space that explicitly incorporates spin-rate information (via a drift proxy) alongside size and orbit, potentially adding insight beyond traditional $(1/D, a)$ V-shape plots (Secs. 2.3–2.4, 3.3).
Large sample size (570,405 objects; 62 families) provides strong statistical leverage for identifying robust morphological patterns (Sec. 2.1, Sec. 3.1).
The manuscript is unusually transparent about key shortcomings discovered during the study (YDFI saturation; invalid age-correlation due to a merge error), which helps readers assess reliability (Sec. 3.4, Sec. 4.2).
The cross-family figure set (e.g., Figs. 13–16) makes the steep-walled boundary morphology visually apparent and encourages dynamical interpretation and follow-up.
The boundary-fitting approach is described step-by-step (Sec. 2.4) and tables report fitted parameters (Tables 1–2), which is a good basis for reproducibility once core definition/implementation inconsistencies are resolved.
Major Issues (9):
  • The definition of $\dot{a}_{\mathrm{YK,relative}}$ is not mathematically unambiguous and may be physically inconsistent as presented. In Eqs. (1)–(2) (Sec. 2.3), the thermal-function factor is typeset/parsed ambiguously (e.g., “$\Theta$ 2 + 2$\Theta$ + $\Theta^2$” without clear parentheses), preventing verification. More broadly, the quantity is labeled as a drift rate while constants/dependencies are dropped; it is unclear whether it is meant to be a dimensioned drift rate, a proportional surrogate, or a purely ad hoc “drift potential proxy”. This ambiguity propagates directly into the log–log diagram (Sec. 2.4), envelope fits (Sec. 3.3), and any interpretation of family morphology as dynamical rather than definitional.
    Recommendation: In Sec. 2.3, rewrite Eqs. (1)–(2) with explicit fraction structure and parentheses (e.g., $\frac{\Theta}{2+2\Theta+\Theta^2}$ if that is intended), correct any typographical corruption, and cite the exact reference formula used (diurnal vs seasonal Yarkovsky; rapid-rotator approximation, etc.). Explicitly state whether $\dot{a}_{\mathrm{YK,relative}}$ is (i) a physically dimensioned estimate of $\dot{a}$ or (ii) a proxy score; use $\equiv$/$\propto$ consistently. If it is a proxy, list exactly which parameters are held fixed/omitted (density, albedo/Bond albedo, thermal inertia variation, roughness/beaming, etc.) and justify why comparing objects/families in this proxy is meaningful.
  • The theoretical motivation for expecting informative “envelopes” in the proposed plane $\big(\log_{10}(a), \log_{10}(\dot{a}_{\mathrm{YK,relative}})\big)$ is currently underdeveloped. Traditional family V-shapes arise from integrated drift producing $|a-a_c|$ scaling with $1/D$ (and time), whereas the new diagram plots an (instantaneous/proxy) drift quantity against current $a$. Without a short derivation or conceptual model, it is unclear what the lower envelope represents physically, and thus what a “bucket/U” boundary means under pure Yarkovsky evolution versus resonance truncation (Secs. 1, 2.4, 3.3, 3.5).
    Recommendation: Add a concise theoretical subsection (end of Sec. 1 or in Sec. 2) that derives/argues what boundary shape is expected in $(a, \dot{a}_{\mathrm{YK,relative}})$ under: (i) Yarkovsky-only drift without barriers, (ii) drift plus hard truncation at specific $a$ (resonances), and (iii) realistic mixtures of prograde/retrograde and size/spin distributions. Clearly state what the lower envelope is intended to approximate (e.g., a drift-feasibility limit, a selection-induced floor, or a resonance-imposed wall), and how this differs from or complements the traditional $(1/D, a)$ V-shape analysis (Sec. 3.2).
  • The envelope-fitting algorithm (Sec. 2.4) is structured to select the steepest slope meeting a one-sided coverage constraint on a bounded slope grid. This design can mechanically drive fits to the grid limits ($\pm50$ in Sec. 3.3/Table 2), especially when the data cloud has any sharp cutoff in $a$, and thus can produce the observed near-vertical “bucket walls” and force YDFI saturation (Secs. 3.3–3.4). As implemented, it is therefore unclear how much of the “universal bucket” morphology is intrinsic vs an optimizer artifact.
    Recommendation: Revise Sec. 2.4 and the fitting pipeline to estimate the envelope rather than maximize steepness. For example: (i) fit the desired lower quantile directly via quantile regression (with slope free) instead of selecting maximal $|m|$; (ii) use a hull/alpha-shape envelope extraction and measure local tangents; or (iii) explicitly compare parametric models (V-shape vs bucket-with-walls) using an objective criterion. In Sec. 3.3, add robustness tests varying (a) slope bounds (e.g., $\pm50$, $\pm100$, effectively unbounded), (b) slope step size, and (c) coverage quantile (e.g., 90/95/99%) for several benchmark families, and report whether steep walls persist.
  • Core internal inconsistencies in the stated/implemented coverage criterion undermine confidence in the boundary fits. Sec. 2.4 states a 95% coverage target and uses a 5th-percentile intercept rule, but Table 2 reports coverage$_1 =$ coverage$_2 = 0.9$, and Table 1 reports coverages often below 0.95. This mismatch affects fitted lines and any downstream metric based on slope/coverage.
    Recommendation: Make the target coverage parameter explicit and consistent throughout: define $q$ (e.g., $q = 0.95$) and set the intercept to the $(1-q)$ quantile (e.g., 5% for 0.95, 10% for 0.90). Regenerate Tables 1–2 under the declared $q$, or revise the text to match the implemented $q$, and add one sentence clarifying how “coverage” is computed in the tables.
  • YDFI (Sec. 2.5) is not currently a stable or physically interpretable index: it saturates for most families because it is driven by extreme fitted slopes (Sec. 3.4, Table 3), and its “sharpness” component is inherently dependent on arbitrary choices (log scaling, slope-grid bounds, and units/normalization of the logged quantities). As a result, the paper’s original central hypothesis (age $\rightarrow$ spin evolution $\rightarrow$ degraded fidelity) is not meaningfully testable with the current YDFI design.
    Recommendation: Either (a) redesign YDFI to be bounded and less sensitive to arbitrary slope caps (e.g., using curvature, wall verticality measured via derivative distribution, residual dispersion from an envelope model, or a likelihood ratio comparing “Yarkovsky-only” vs “Yarkovsky+barrier” models), and re-run Sec. 3.4; or (b) reframe the paper so YDFI is explicitly presented as an exploratory metric that failed (due to saturation), moving the main contribution to the diagnostic diagram and the boundary morphology. Update the Abstract, Sec. 1, Sec. 2.6, Sec. 3.4, and Sec. 4.1–4.2 accordingly.
  • The age–YDFI correlation analysis (Secs. 2.6, 3.4; Fig. 17) is explicitly acknowledged to be invalid due to a severe merge error (duplicated rows; misassigned ages). Reporting a numerical Spearman $\rho$ and p-value in the Results despite invalid data risks confusing readers and undermines statistical credibility.
    Recommendation: Fix the merge procedure (document join keys and row counts before/after), rebuild the correct 62-family analysis table, and re-run the correlation (even if null). If a corrected analysis cannot be completed, remove Fig. 17 and the numerical correlation from the main Results, and confine the failed attempt to an Appendix with a clear “invalid” label. Ensure the Abstract/Conclusions do not imply a completed age test.
  • The manuscript attributes the steep-walled boundaries to “hard dynamical barriers” such as mean-motion or secular resonances (Secs. 3.5, 4.2–4.3), but this is not quantitatively demonstrated. Additionally, it is unclear whether $a$ is osculating or proper semimajor axis; resonance/family analyses typically require proper elements, and mixing osculating elements can blur resonance alignment. Without systematic resonance mapping, the central dynamical interpretation remains largely qualitative.
    Recommendation: Clarify in Sec. 2.1/2.4 whether semimajor axis is osculating or proper, and strongly consider repeating the analysis in proper $a_p$. For representative families (e.g., Vesta, Flora, Eunomia, Themis), overplot the locations of major resonances on the $a$-axis in Figs. 13–16 and/or provide a table listing: (i) inferred wall locations (in $a$), (ii) nearby strong resonances, and (iii) a qualitative match score. Clearly separate “supported by overlays/table” from “hypothesis” in Secs. 3.5 and 4.2–4.3.
  • The simplified thermophysical/spin assumptions used to compute $\dot{a}_{\mathrm{YK,relative}}$ (single $\Gamma$, fixed emissivity/albedo handling, rapid-rotator assumption, and a single effective obliquity factor $\cos(30^\circ)$; Sec. 2.3) are strong enough to potentially reshape the distribution in the diagnostic plane. In particular, using a fixed positive $\cos(\epsilon)$ removes sign information and ignores known bimodal obliquity distributions in families, which is central to interpreting drift-driven morphology and asymmetry.
    Recommendation: Expand Sec. 2.3 with justification and citations for the chosen parameter values and approximations, and add sensitivity tests (Sec. 3.3 or Appendix) varying $\Gamma$, obliquity distributions (e.g., isotropic, bimodal at $\pm1$), and rotation-regime assumptions for a few benchmark families. Explicitly discuss how losing drift-direction (prograde vs retrograde) affects the meaning of “lower envelope” in the new diagram, and whether separating objects by inferred drift sign is possible or required for interpretation.
  • Key validation steps are missing to distinguish physical discovery from pipeline-induced structure. Given the novelty of the diagnostic plane and the envelope-fitting rule, the current analysis would benefit greatly from synthetic tests demonstrating that the pipeline (i) recovers known shapes when they are present and (ii) does not create buckets from smooth V-shaped data clouds.
    Recommendation: Add a validation subsection (end of Sec. 2 or in Sec. 3) using synthetic/mocked families: generate ensembles with known Yarkovsky-only evolution (producing traditional V-shapes) and with imposed resonance truncations, then run the full pipeline (drift proxy $\rightarrow$ diagram $\rightarrow$ envelope fit $\rightarrow$ YDFI). Report whether and when the method yields bucket walls and slope saturation. This single test would substantially strengthen the credibility of Sec. 3.3–3.4 outcomes.
Minor Issues (6):
  • Family-center definition: the vertex/apex split point $a_c$ is treated as fixed (Sec. 2.4), and appears to be taken from the largest member in at least some discussion. This choice is not validated and may bias branch assignment and envelope fits for families with offsets, substructure, or interlopers.
    Recommendation: In Sec. 2.4 and/or Sec. 3.3, justify the chosen $a_c$ definition and test alternatives (median $a$, density peak in $(a,e,i)$, literature family center). Report sensitivity of fitted slopes/walls to $a_c$ for a few families.
  • Data provenance and reproducibility are incomplete (Sec. 2.1, Sec. 3.1): the six CSV inputs are not fully described with source catalogs, versions/retrieval dates, identifiers used for joins, and rules for resolving conflicting diameter/spin measurements. Selection biases from requiring measured spin periods and diameters are likely strong and may affect inferred envelopes.
    Recommendation: Expand Sec. 2.1 with explicit sources/versions (e.g., LCDB for spin periods; WISE/NEOWISE for diameters, etc., if applicable), join keys, deduplication rules, and an estimate of completeness/selection bias (e.g., distributions of $H$, $D$, or family member fractions retained). Add a data/code availability statement (repository/DOI) to enable replication.
  • Ambiguity in the log transforms of dimensioned quantities (e.g., $\log_{10}(a)$, $\log_{10}(\dot{a}_{\mathrm{YK,relative}})$) and unit conventions (meters vs AU) appears in Sec. 2.1/2.4 and figure labels; this affects intercepts and comparability across plots (even if slopes are unchanged).
    Recommendation: Define normalized variables explicitly (e.g., $X = \log_{10}(a/\mathrm{AU})$) and ensure all figures/tables match that convention. If $\dot{a}_{\mathrm{YK,relative}}$ is a proxy, define its reference normalization (or state the unit convention used before logging).
  • Resonance and element-choice discussion would benefit from clearer integration with the Results narrative: Sec. 3.5/4.2 discuss barriers, but the figures do not annotate where the walls are in $a$ nor provide resonance overlays, making the interpretation hard to check visually.
    Recommendation: Add annotations to representative panels (Figs. 13–16) marking inferred wall locations and nearby major resonances; include a small table summarizing these values for the plotted families.
  • Figures and captions need more self-contained clarity (notably Figs. 1–4, 9–17): missing/unclear panel labels, inconsistent axis-unit/log indications, lack of sample sizes and selection criteria, and overplotting reduce interpretability.
    Recommendation: Revise captions to include (i) variable definitions and units/log base, (ii) $N$ per panel/family, (iii) key selection criteria, and (iv) fit parameters/coverage used. Consider density/hexbin rendering for large-$N$ scatter plots and use colorblind-safe palettes.
  • YORP/spin-evolution framing is conceptually plausible but not quantitatively supported (Secs. 1, 4.3). Without order-of-magnitude YORP timescale context, it is unclear what strength of any age dependence one should expect for any prospective fidelity metric.
    Recommendation: Add brief YORP timescale estimates versus size (with citations) in Sec. 1 or Sec. 4.3 and relate them to the age range in Table 3. Use this to set expectations and to motivate why an age trend might be weak even with a good metric.
Very Minor Issues:
  • Equation and notation consistency: occasional inconsistencies between symbols for the drift proxy (e.g., dot/hat usage) and typographical issues in exponents/parentheses (Secs. 2.3, 2.5) reduce readability and impede verification.
    Recommendation: Proofread equations for consistent notation and unambiguous typography (parentheses, superscripts). Use a consistent symbol for $\dot{a}_{\mathrm{YK,relative}}$ throughout and define it once in Sec. 2.3 and again at first use in Sec. 3.3.
  • Acronym reuse: YDFI is not always re-expanded at first appearance in later major sections (e.g., Sec. 3.4, Sec. 4.1), which mildly hinders standalone reading.
    Recommendation: Expand “Yarkovsky Drift Fidelity Index (YDFI)” at first occurrence in each major section.
  • Minor textual/formatting issues: inconsistent SI formatting, quotation usage around “V-shapes”/“bucket”, and minor typos (e.g., p-value spacing) occur across sections.
    Recommendation: Perform a final style pass to standardize SI units, hyphenation, capitalization (semimajor axis), and quotation/italics conventions; correct minor typographical errors.

Mathematical Consistency Audit

Mathematics Audit by Skepthical

This section audits symbolic/analytic mathematical consistency (algebra, derivations, dimensional/unit checks, definition consistency).

Maths relevance: substantial

The paper defines a per-asteroid surrogate Yarkovsky drift-rate expression from solar flux, mean motion, sub-solar temperature, and a thermal parameter; it then constructs log–log diagnostic diagrams and fits two one-sided linear lower-envelope boundaries per family under a coverage constraint. A composite index (YDFI) is defined from the fitted slopes via a sharpness and symmetry term. Core internal-consistency concerns arise from ambiguity in the central drift-rate formula’s algebraic structure and mismatches between stated and reported fitting-coverage parameters and slope-grid limits.

Checked items

  1. Diameter-to-radius conversion (Sec. 2.1, p.2 (bullet list))

    • Claim: Convert diameter $D$ (km) to radius $R$ (m) via $R = D \times 1000/2$.
    • Checks: algebra, units/dimensions, definition consistency
    • Verdict: PASS; confidence: high; impact: minor
    • Assumptions/inputs: $D$ is provided in kilometers and represents diameter.
    • Notes: Conversion km$\rightarrow$m introduces factor 1000 and radius is half the diameter; consistent.
  2. Spin period to angular frequency (Sec. 2.1, p.3 (bullet list))

    • Claim: Convert spin period $P$ (hours) to spin rate $\omega$ (rad/s) via $\omega = 2\pi/(P \times 3600)$.
    • Checks: algebra, units/dimensions
    • Verdict: PASS; confidence: high; impact: minor
    • Assumptions/inputs: $P$ is a rotation period in hours.
    • Notes: Hours$\rightarrow$seconds conversion and $\omega = 2\pi/P$ are consistent.
  3. Semimajor axis AU-to-meters conversion (Sec. 2.1, p.2 (bullet list))

    • Claim: Convert semimajor axis $a_{\rm AU}$ (AU) to $a$ (m) via $a = a_{\rm AU} \times 1.496 \times 10^{11}$.
    • Checks: units/dimensions, definition consistency
    • Verdict: PASS; confidence: high; impact: minor
    • Assumptions/inputs: $a_{\rm AU}$ is expressed in astronomical units.
    • Notes: A linear scaling maintains dimensional correctness; later usage of $a$ vs $a_{\rm AU}$ is, however, ambiguous (see separate item).
  4. Solar flux scaling with distance (Eq. (3), Sec. 2.3, p.3)

    • Claim: Solar flux at heliocentric distance scales as $\Phi = 1365\cdot(a_{\rm AU})^{-2}\ [{\rm W/m}^2]$.
    • Checks: algebra, units/dimensions, symbol consistency
    • Verdict: PASS; confidence: high; impact: moderate
    • Assumptions/inputs: 1365 is the solar constant at 1 AU., $a_{\rm AU}$ is in AU.
    • Notes: Inverse-square dependence is algebraically consistent with scaling by $(a_{\rm AU})^{-2}$; units are carried as W/m$^2$.
  5. Mean motion formula and units (Eq. (4), Sec. 2.3, p.3)

    • Claim: Mean motion $n = \sqrt{GM_\odot/a^3}$ [rad/s].
    • Checks: units/dimensions, symbol consistency
    • Verdict: PASS; confidence: high; impact: moderate
    • Assumptions/inputs: $a$ is in meters (as stated in Sec. 2.1)., Radians are treated as dimensionless.
    • Notes: $GM$ has units m$^3$/s$^2$ and dividing by $a^3$ yields $1/{\rm s}^2$; square root yields $1/{\rm s}$.
  6. Sub-solar temperature expression (Eq. (5), Sec. 2.3, p.3)

    • Claim: $T_{\rm ss} = \left(\Phi/(f_e \sigma)\right)^{1/4}$ [K].
    • Checks: units/dimensions, symbol consistency
    • Verdict: PASS; confidence: high; impact: moderate
    • Assumptions/inputs: $f_e$ is dimensionless emissivity factor., $\sigma$ has units W m$^{-2}$ K$^{-4}$.
    • Notes: $\Phi/(\sigma)$ has units K$^4$, so the 1/4 power gives Kelvin.
  7. Thermal parameter $\Theta$ dimensionless check (Eq. (6), Sec. 2.3, p.3)

    • Claim: $\Theta = \Gamma\sqrt{\omega}/(f_e\sigma T_{\rm ss}^3)$ is dimensionless.
    • Checks: units/dimensions, symbol consistency
    • Verdict: PASS; confidence: high; impact: moderate
    • Assumptions/inputs: $\Gamma$ is given as J m$^{-2}$ s$^{-0.5}$ K$^{-1}$., $\omega$ has units s$^{-1}$., $\sigma$ has units W m$^{-2}$ K$^{-4}$.
    • Notes: $\Gamma\sqrt{\omega}$ yields W m$^{-2}$ K$^{-1}$; $f_e\sigma T^3$ also yields W m$^{-2}$ K$^{-1}$; ratio is dimensionless.
  8. Core drift-rate algebra is unambiguous (Eqs. (1)–(2), Sec. 2.3, p.3)

    • Claim: Define a relative drift-rate proxy $a\dot{}_{\mathrm{YK,relative}} \propto \left(\Phi/(Rn)\right) \cdot \left(\frac{\Theta}{2+2\Theta+\Theta^2}\right) \cdot \cos\epsilon$, then drop $\cos\epsilon$ and constants for a proportional quantity.
    • Checks: notation/parentheses, symbol consistency, dimensional reasoning
    • Verdict: UNCERTAIN; confidence: medium; impact: critical
    • Assumptions/inputs: The intended expression is a product of $\Phi/(Rn)$ and a rational function of $\Theta$., Obliquity factor $\cos\epsilon$ is treated as constant under the averaging assumption.
    • Notes: As rendered, the $\Theta$-dependent factor appears as "$\Theta$ 2 + 2$\Theta$ + $\Theta^2$" without clear fraction structure, so it is not verifiable whether the intent is $\Theta/(2+2\Theta+\Theta^2)$ or another polynomial expression. Additionally Eq. (2) switches from proportionality language to an equality sign, making the definition unclear. This blocks an unambiguous audit of the central derived quantity used in all subsequent log plots and fits.
  9. Dropping the obliquity factor after averaging (Sec. 2.3, p.3 (text between Eqs. (1) and (2)))

    • Claim: Assume $\cos\epsilon \approx \cos 30^\circ$ and ignore it (and other constants) when computing a relative drift quantity.
    • Checks: algebra, definition consistency
    • Verdict: PASS; confidence: medium; impact: minor
    • Assumptions/inputs: $\cos\epsilon$ is treated as constant for all objects in the proxy.
    • Notes: Given the stated goal (shape in phase space) and treating $\cos\epsilon$ as constant, removing the factor is algebraically consistent; physical appropriateness is out of scope.
  10. Log-space variable definitions vs earlier unit conversion (Sec. 2.4, p.3–4)

    • Claim: Define $X = \log_{10}(a)$ and $Y = \log_{10}(a\dot{}_{\mathrm{YK,relative}})$ for boundary fitting.
    • Checks: definition consistency, units/dimensions, notation clarity
    • Verdict: UNCERTAIN; confidence: medium; impact: moderate
    • Assumptions/inputs: $a$ refers to semimajor axis after preprocessing (unclear whether in meters or AU)., $a\dot{}{\mathrm{YK,relative}}$ is positive so $\log$ is defined.
    • Notes: Sec. 2.1 states $a$ is converted to meters, but solar flux uses $a_{\rm AU}$ and embedded figure axes appear to label $\log_{10}(a/\mathrm{AU})$. The paper does not explicitly state which version of $a$ is logged. This ambiguity mainly affects fitted intercepts ($c_1,c_2$) rather than slopes.
  11. Coverage enforcement via intercept quantile (Sec. 2.4, p.4 (steps 2–3))

    • Claim: For a fixed slope $m$, set intercept $c$ to the 5th percentile of $c_i = Y_i - mX_i$ so that 95% of points lie above $Y = mX + c$.
    • Checks: algebra/inequality logic, method consistency
    • Verdict: PASS; confidence: high; impact: moderate
    • Assumptions/inputs: Percentile is computed over $c_i$ values in the branch.
    • Notes: If $c$ is the 5th percentile of $(Y_i - mX_i)$, then for $\sim95\%$ of points, $Y_i - mX_i \geq c$, equivalently $Y_i \geq mX_i + c$.
  12. Stated 95% coverage vs reported 0.9 coverage (Sec. 2.4, p.4; Tables 1–2, pp.6 and 9)

    • Claim: The fit lines are chosen to ensure 95% of points lie above them in each branch.
    • Checks: definition consistency, constraint consistency
    • Verdict: FAIL; confidence: high; impact: critical
    • Assumptions/inputs: Tables 1–2 report the actual achieved/target coverages.
    • Notes: Sec. 2.4 explicitly specifies 95% coverage and uses a 5th-percentile rule, but Table 2 lists coverage$_1 = $ coverage$_2 = 0.9$ and Table 1 lists coverages $< 0.95$. If 0.90 is the true target, step 3’s percentile should be 10%, not 5%. This inconsistency affects the mathematical definition of the fitted boundary and thus YDFI.
  13. Slope-grid bounds inconsistency (Sec. 2.4, p.4; Sec. 3.3 and Table 2, p.9)

    • Claim: Candidate slopes are searched over a predefined grid; results saturate at grid bounds ($\pm50$).
    • Checks: definition consistency, algorithm specification consistency
    • Verdict: FAIL; confidence: high; impact: moderate
    • Assumptions/inputs: Sec. 2.4’s slope ranges are meant to describe the actual grid used.
    • Notes: Sec. 2.4 gives an example left-branch grid around $-20$ to $-0.1$ (and positive for right), but Sec. 3.3/Table 2 discuss saturation at $\pm50$. The exact grid bounds used are a key mathematical specification because they cap the fitted slopes and directly cap the Sharpness and hence YDFI.
  14. YDFI sharpness definition (Eq. (8), Sec. 2.5, p.4)

    • Claim: Sharpness $= (|m_1| + m_2)/2$, with $m_1$ negative and $m_2$ positive by construction.
    • Checks: algebra, definition consistency
    • Verdict: PASS; confidence: high; impact: moderate
    • Assumptions/inputs: Left-branch slopes are negative; right-branch slopes are positive.
    • Notes: Under the stated sign convention, $|m_1|$ and $m_2$ are both nonnegative, so Sharpness is well-defined and increases with boundary steepness.
  15. YDFI symmetry definition and range behavior (Eq. (9), Sec. 2.5, p.4)

    • Claim: Symmetry $= 1 - \frac{|m_1+m_2|}{|m_1|+m_2}$, equaling $1$ when $m_2=-m_1$ and approaching $0$ when one slope dominates.
    • Checks: algebra, sanity/limiting cases
    • Verdict: PASS; confidence: high; impact: moderate
    • Assumptions/inputs: Denominator $|m_1|+m_2 > 0$ (non-degenerate fit)., $m_2 \geq 0$ as per the fitting setup.
    • Notes: If $m_2=-m_1$ then numerator is 0 so symmetry=1. If $m_2\rightarrow0$ with $m_1<0$, ratio$\rightarrow|m_1|/|m_1|=1$ so symmetry$\rightarrow0$. Range can fall outside [0,1] only if sign assumptions are violated.
  16. YDFI saturation at slope-grid maximum (Sec. 3.4 and Table 3, p.10 (with Eqs. (7)–(9)))

    • Claim: If fits saturate at $m_1\approx-50$ and $m_2\approx+50$, then Sharpness$\approx50$, Symmetry$\approx1$, so YDFI$\approx50$ (maximum).
    • Checks: algebra, definition consistency
    • Verdict: PASS; confidence: high; impact: minor
    • Assumptions/inputs: Sharpness and Symmetry are computed exactly as in Eqs. (8)–(9)., Slope-grid bounds are $\pm50$.
    • Notes: With $m_1=-50$ and $m_2=+50$: Sharpness$=(50+50)/2=50$; Symmetry$=1-|0|/100=1$; YDFI$=50$.

Limitations

  • Audit is restricted to the provided PDF text/images; some equations (notably Eqs. (1)–(2)) appear ambiguously rendered in the extracted text, and the PDF does not include derivation steps sufficient to disambiguate the intended algebra.
  • No external theoretical formulae were consulted; only internal symbolic consistency was assessed.
  • Where the paper’s narrative appears inconsistent with tables/figures (coverage and slope-grid bounds), the audit flags internal inconsistency but cannot determine which version reflects the actual implementation.

Numerical Results Audit

Numerics Audit by Skepthical

This section audits numerical/empirical consistency: reported metrics, experimental design, baseline comparisons, statistical evidence, leakage risks, and reproducibility.

18 numeric candidates were checked: 17 PASS and 1 FAIL. The single FAIL is a cross-section inconsistency between a stated 95% coverage requirement and Table 2’s reported 0.9 coverage values. All other spot-checks of constants, conversions, inverse-square scaling, and Table 3 internal formula relationships (sharpness, symmetry, and YDFI products/identities) were internally consistent within the specified tolerances.

Checked items

  1. C1_cos30_approx (p.3, Sec. 2.3 (text near Eq. 1))

    • Claim: Assumed an average value of $\cos(30^\circ)\approx0.866$.
    • Checks: constant_evaluation
    • Verdict: PASS
    • Notes: Stated value matches $\cos(30^\circ)$ to 3 decimals within rounding tolerance.
  2. C2_AU_to_m (p.2, Sec. 2.1 (unit conversion bullet for semimajor axis))

    • Claim: Semimajor axis conversion: $a = a_{\rm AU} \times 1.496 \times 10^{11}$ (meters).
    • Checks: unit_conversion_constant_consistency
    • Verdict: PASS
    • Notes: Parser/notation check passed: raw '1.496 × 1011' interpreted as $1.496\times10^{11}$ (not $1.496\times1011$).
  3. C3_diameter_to_radius (p.2, Sec. 2.1 (unit conversion bullet for diameter/radius))

    • Claim: Converted diameter $D$ (km) to radius $R$ (m): $R = D \times 1000 / 2$.
    • Checks: algebraic_equivalence
    • Verdict: PASS
    • Notes: Algebraic simplification confirms $R({\rm m}) = 500\cdot D({\rm km})$.
  4. C4_spin_period_to_omega (p.3, Sec. 2.1 (spin conversion bullet))

    • Claim: Spin period $P$ (hours) converted to angular spin rate $\omega$: $\omega = 2\pi/(P\times3600)$.
    • Checks: formula_sanity_check
    • Verdict: PASS
    • Notes: Spot-check at $P=1$ hr reproduces $\omega = 2\pi/3600$ exactly within tolerance.
  5. C5_solar_flux_scaling (p.3, Eq. (3))

    • Claim: Solar flux scaling: $\Phi = 1365 \cdot (a_{\rm AU})^{-2}$ [W/m$^2$].
    • Checks: inverse_square_scaling_spotcheck
    • Verdict: PASS
    • Notes: Arithmetic spot-check: $\Phi(1\ \mathrm{AU})=1365$ and $\Phi(2\ \mathrm{AU})=341.25$.
  6. C6_mean_motion_constants_parse (p.3, Eq. (4) and following line)

    • Claim: Uses $n = \sqrt{G M_{\odot} / a^3}$ with $G = 6.674 \times 10^{-11}$ and $M_\odot = 1.989 \times 10^{30}$.
    • Checks: constant_parsing_and_dimensional_sanity_spotcheck
    • Verdict: PASS
    • Notes: Multiplication of provided parsed constants yields $GM$ consistent with the internally targeted order-of-magnitude; check is intended to catch exponent/OCR parsing errors.
  7. C7_Tss_expression_internal_consistency (p.3, Eq. (5) and following line)

    • Claim: $T_{\rm ss} = \left(\Phi/(f_e \sigma)\right)^{1/4}$ with $f_e=0.9$ and $\sigma=5.67\times10^{-8}$.
    • Checks: numeric_recomputation_from_given_constants
    • Verdict: PASS
    • Notes: Computed finite positive value ($T_{\rm ss}=404.4145486431211\ \mathrm{K}$) from stated constants; no paper-stated numeric target was available for direct comparison.
  8. C8_YDFI_sharpness_formula_check (p.4, Eq. (8))

    • Claim: Sharpness $= (|m_1| + m_2) / 2$.
    • Checks: formula_application_table_check
    • Verdict: PASS
    • Notes: Recomputed sharpness matches the provided Table 3 sharpness (example values shown for Vesta).
  9. C9_YDFI_symmetry_formula_check (p.4, Eq. (9))

    • Claim: Symmetry $= 1 - \frac{|m_1 + m_2|}{|m_1| + m_2}$.
    • Checks: formula_application_table_check
    • Verdict: PASS
    • Notes: Maria symmetry computed as $0.989817\dots$, consistent with Table 3 value 0.990 under 3-decimal rounding.
  10. C10_YDFI_product_check (p.4, Eq. (7) and Table 3)

    • Claim: YDFI $=$ Sharpness $\times$ Symmetry; values in Table 3 should match this product.
    • Checks: table_internal_multiplication
    • Verdict: PASS
    • Notes: Example row (Hoffmeister): product differs by ~0.00784, consistent with expected rounding at 3 decimals.
  11. C11_Maria_YDFI_equals_abs_m1_claim (p.10, Table 3 row 'Maria' and p.9 Table 2 row 'Maria')

    • Claim: For Maria, Table 3 reports YDFI $=48.992$; given the method often pins slopes near $\pm50$, check whether YDFI numerically equals $|m_1|$ from Table 2 when $m_2=50$ and symmetry computed accordingly.
    • Checks: derived_identity_check
    • Verdict: PASS
    • Notes: Eqs. (7)-(9) reproduce Table 3 YDFI exactly for Maria with the provided $m_1$, $m_2$; computed YDFI also equals $|m_1|$ for this case.
  12. C12_Table1_Themis_asymmetry_ratio (p.6, Table 1 and p.6 text describing Themis)

    • Claim: Themis has $m_1 = -3.124$ and $m_2 = 0.604$ in Table 1, described as markedly shallower right branch.
    • Checks: cheap_recompute_ratio
    • Verdict: PASS
    • Notes: Computed $|m_2/m_1| = 0.1933418693982074$ (quantifies right-branch shallowness using the stated numbers).
  13. C13_Table1_coverage_within_0_1 (p.6, Table 1)

    • Claim: Coverage values in Table 1 should be valid proportions (0 to 1).
    • Checks: range_check
    • Verdict: PASS
    • Notes: All listed Table 1 coverage values provided in the candidate are within $[0,1]$.
  14. C14_Table2_coverage_mismatch_vs_method (p.4 Sec. 2.4 (95% criterion) vs p.9 Table 2 (coverage_1/2 = 0.9))

    • Claim: Method states boundary lines must ensure 95% of points above the boundary; Table 2 reports coverage$_1 = 0.9$ and coverage$_2 = 0.9$ for listed families.
    • Checks: cross_section_numeric_consistency
    • Verdict: FAIL
    • Notes: Strict consistency check fails: 0.90 is not equal to the stated 0.95 requirement, implying a definition mismatch or an error.
  15. C15_Table2_slope_at_grid_limit (p.8 text (grid limits mentioned) and p.9 Table 2)

    • Claim: Table 2 slopes are at/near the search grid limits (e.g., -50 for left and +50 for right).
    • Checks: boundary_limit_frequency_check
    • Verdict: PASS
    • Notes: For the shown sample (Vesta, Themis, Maria), all slopes are within $d=abs(|m|-50)\leq1.5$ of the $\pm50$ grid limits (fraction near limit = 1.0 for this sample).
  16. C16_Spearman_values_consistency (p.1 Abstract vs p.10 Sec. 3.4 vs p.11 Fig. 17 caption)

    • Claim: Spearman correlation reported as ($\rho = -0.0004$, $p = 0.989$) but Fig. 17 caption reports $p = 0.9887$ and $\rho = -0.0$.
    • Checks: repeated_number_consistency
    • Verdict: PASS
    • Notes: Differences are consistent with rounding ($\Delta p=0.0003$; $\rho=-0.0004$ rounds to -0.0 at 1 decimal).
  17. C17_family_count_duplication_factor (p.10, Sec. 3.4 (data merging error description))

    • Claim: Instead of 62 families, final correlation dataset contained 1,068 entries due to duplication.
    • Checks: ratio_check
    • Verdict: PASS
    • Notes: Computed entries per family $= 17.225806451612904 ~(>1)$, consistent with a duplication scenario as described.
  18. C18_Vesta_membercount_threshold_claim (p.5, Sec. 3.1)

    • Claim: Vesta family: 0.93 Gyr, $>92,000$ members; Themis family: 2.5 Gyr, $>47,000$ members.
    • Checks: internal_numeric_format_check
    • Verdict: PASS
    • Notes: Sanity/range check only: the stated lower bounds are positive and less than the stated total sample size 570,405.

Limitations

  • Only the PDF text was available; underlying CSV data, per-asteroid parameters, and code outputs are not included, preventing verification of dataset-dependent quantities (counts, fitted parameters, coverages, correlations).
  • Figures are not used for numeric extraction beyond captions/tables; no plot-pixel value reading was performed as requested.
  • Some numeric strings in the parsed text may reflect OCR/encoding ambiguities (e.g., '1011' vs '10^11'); candidates include parser checks to avoid misinterpretation.