Decisive Cosmological Evidence for the Normal Neutrino Mass Hierarchy from DESI Data Release 2

2605.00001-R1 · 01 May 2026 · Reviewed by Skepthical

Official Review

Official Review by Skepthical 01 May 2026
Overall: 5.0/10
Soundness: 4/10
Novelty: 6/10
Significance: 6/10
Clarity: 5/10
Evidence Quality: 4/10
While the topic is timely and the high-level Bayesian framing is appropriate, the Mathematical Audit flags a critical flaw: the DESI 1D marginalized posterior is treated as an effective likelihood without removing the DESI priors (FAIL, medium confidence), which risks prior double counting and undermines the Bayes factor interpretation. Additional uncertainties arise from the ambiguous incorporation of the oscillation $\Delta\chi^2$, under-specified priors (especially HS) and evidence computation, and limited robustness to cosmological extensions; these issues directly affect the central “decisive” claim. The Numerical Audit confirms the internal arithmetic (e.g., $K_{\rm full}$ from $K_{\rm base}$, truncated-normal ULs), and the DESI limit lying below the IH minimum does support a qualitative NH preference, but the current methodology and reporting leave substantial gaps in rigor and reproducibility.
  • Paper Summary: The manuscript performs a Bayesian model comparison between the Normal (NH) and Inverted (IH) neutrino mass hierarchies by combining (i) cosmological constraints on the summed neutrino mass $\Sigma m_\nu$ extracted from DESI DR2 (+ Planck CamSpec) chains and (ii) oscillation information from NuFIT 6.0. The DESI cosmological information is compressed to a 1D distribution in $\Sigma m_\nu$, approximated as a truncated Gaussian (Sec. 2.1, Sec. 3.1), and combined with an oscillation likelihood and two prior families on the neutrino masses: an exchangeable hierarchical log-normal prior (SJPV) and a Fisher-information/reference prior (HS) (Sec. 2.3). The resulting Bayes factors $K = P(D|{\rm NH})/P(D|{\rm IH})$ are reported to be very large in $\Lambda {\rm CDM}$ ($K_{\rm full} = 10231.4$ for SJPV and $464.6$ for HS) and still strong in $w_0 w_a{\rm CDM}$ ($K_{\rm full} = 42.6$ for HS). The paper further propagates the inferred mass posteriors to predictions for $m_{\beta\beta}$ and $0\nu\beta\beta$ half-lives (Sec. 3.5–3.6), and adds a historical reconstruction of $K$ from earlier cosmological $\Sigma m_\nu$ limits (Sec. 3.3, Fig. 4). The topic is timely and the overall Bayesian framing is appropriate, but several core steps are currently ambiguous or conceptually fragile: (1) the use of a marginalized DESI posterior as an effective likelihood, which risks prior double counting and is the most important concern; (2) the sensitivity of $K$ to tail modeling near the IH threshold; (3) incomplete specification of the priors and of the numerical evidence computation; and (4) limited discussion of cosmological-model and systematic dependence given the strength of the paper’s “decisive” claims. Addressing these points is necessary for the conclusions to be both robust and reproducible.
Strengths:
  • Addresses a high-impact question (cosmological discrimination of NH vs IH) using newly available DESI DR2 constraints (Sec. 1, Sec. 3).
  • Uses a clear Bayesian model-comparison target (evidence and Bayes factor) and attempts prior-robustness checks by considering two qualitatively different prior families (Sec. 2.2–2.3, Sec. 3.4).
  • Provides phenomenology-facing outputs (individual mass posteriors; $m_{\beta\beta}$ implications for $0\nu\beta\beta$) that are useful to experimental audiences (Sec. 3.5–3.6).
  • Includes some robustness exploration across dataset choices and an extension beyond $\Lambda {\rm CDM}$ ($w_0 w_a{\rm CDM}$), and provides a useful historical narrative of how constraints have evolved (Sec. 3.1–3.4, Sec. 3.7).
  • Figures are generally readable and the logical flow from $\Sigma m_\nu$ constraints to hierarchy preference is easy to follow at a high level.
Major Issues (8):
  • Using a marginalized DESI $\Sigma m_\nu$ posterior as a cosmological likelihood risks prior double counting and conceptual inconsistency (Sec. 2.1, Sec. 3.1). The manuscript states it extracts the 1D marginalized posterior for $\Sigma m_\nu$ from DESI DR2 MCMC chains and then uses it as $P(D_{\rm cosmo}|\Sigma m_\nu)$ in the evidence integral. But a chain-derived marginalized posterior is proportional to $L(\Sigma m_\nu,{\rm other})\times\pi_{\rm DESI}(\Sigma m_\nu,{\rm other})$ marginalized over other parameters using DESI priors, not a likelihood in $\Sigma m_\nu$. Treating it as a likelihood can silently import DESI priors and volume effects into the second-stage evidence, undermining the interpretation of differences between SJPV vs HS and potentially biasing $K$.
    Recommendation: In Sec. 2.1 (and where Eq. (1)–(2) are defined), explicitly distinguish likelihood vs posterior and state the DESI priors relevant to $\Sigma m_\nu$ and the parameters being marginalized. Then do one of the following: (i) reconstruct an approximate likelihood in $\Sigma m_\nu$ (e.g., from a profile likelihood if available, or by deconvolving a known $\Sigma m_\nu$ prior and carefully discussing what remains after marginalization), or (ii) if you intentionally use a posterior-based surrogate, rewrite the evidence expression to avoid double counting and clearly state the approximation and its limitations. Provide a quantitative comparison of $K$ obtained from (a) a likelihood-based surrogate (profile-likelihood or equivalent) versus (b) your posterior-derived surrogate to demonstrate stability.
  • The DESI cosmological information is overly compressed to a 1D truncated Gaussian in $\Sigma m_\nu$ without quantitative validation in the IH-critical tail and without accounting for degeneracies (Sec. 2.1, Sec. 3.1–3.2, Sec. 3.7). The Bayes factor is dominated by the likelihood/posterior tail near $\Sigma m_\nu \approx 0.09$–$0.12$ eV (around the IH minimum). Small mismodeling of the tail or ignoring correlations with parameters that broaden $\Sigma m_\nu$ (especially in $w_0 w_a{\rm CDM}$) can change $\ln K$ materially. The negative untruncated mean ($\mu_0 < 0$) further emphasizes that the effective distribution is boundary-dominated.
    Recommendation: In Sec. 2.1 and Sec. 3.1, validate the 1D surrogate specifically in the range $0.06$–$0.12$ eV: provide goodness-of-fit diagnostics in the tail (not only bulk), and compare to at least one alternative surrogate (e.g., KDE/interpolated density used directly, skew-normal/mixture model). Report how $\ln K$ changes across these alternatives for both $\Lambda {\rm CDM}$ and $w_0 w_a{\rm CDM}$ (Sec. 3.7). Where feasible, include at least a low-dimensional approximation that retains the main $\Sigma m_\nu$ degeneracies (e.g., $\Sigma m_\nu$ with $\Omega_m$ and $w_0$) or reweight the original chains under NH/IH priors to avoid the 1D collapse.
  • Oscillation information (NuFIT 6.0) and the hierarchy preference $\Delta\chi^2$ are incorporated ambiguously and may be double-counted or mis-applied (Sec. 2.1–2.2, Sec. 3.2). The manuscript mentions a NuFIT $\Delta\chi^2 \approx 6.1$ preference for NH and applies it via a multiplicative factor $\exp(\Delta\chi^2/2)$ to produce $K_{\rm full}$, but it is unclear whether $P(D_{\rm osc}|\theta)$ is also integrated over oscillation parameters in the evidence integral. The sign convention and the mapping from NuFIT’s global-fit $\Delta\chi^2$ to a Bayes factor between discrete hierarchies are not defined.
    Recommendation: In Sec. 2.1–2.2 and Sec. 3.2: (i) define $\Delta\chi^2$ explicitly (e.g., $\Delta\chi^2 \equiv \chi^2_{\rm IH} - \chi^2_{\rm NH}$) and therefore the likelihood ratio mapping; (ii) state precisely what enters the evidence integral for each hierarchy (full $P(D_{\rm osc}|\theta)$ surface vs approximations vs none); and (iii) if you keep the $K_{\rm base}/K_{\rm full}$ decomposition, make it mathematically explicit and prove/justify that the $\exp(\Delta\chi^2/2)$ factor does not double-count any oscillation information already included. Ideally, implement a single end-to-end evidence calculation that multiplies $P(D_{\rm cosmo}|\Sigma m_\nu(\theta))\times P(D_{\rm osc}|\theta)$ and integrates over $\theta$ for NH and IH, and tabulate the resulting $\ln K$ alongside the decomposed values.
  • Prior definitions (SJPV and HS) are insufficiently specified to reproduce results and to interpret prior dependence (Sec. 2.3, Sec. 3.4). The SJPV hierarchical log-normal prior needs explicit generative form, hyperpriors, bounds, and the exact procedure for mapping exchangeable draws to ordered eigenmasses under NH/IH (label switching/order constraints). The HS “reference” prior is given in a closed form (e.g., proportional to a polynomial in masses) without a transparent derivation, definition of the base measure, parameterization choices, or normalization domain; evidence values can depend strongly on those choices despite “reference” wording.
    Recommendation: Expand Sec. 2.3 (preferably with an Appendix) to fully define both priors: for SJPV, write the full hierarchical model (including hyperparameter priors and explicit bounds), clarify whether masses are treated as unordered then ordered, and show how NH vs IH mapping is done. For HS, provide a step-by-step derivation from the chosen observable set/Fisher information to the stated mass-space density, including the Jacobian, the measure with respect to which the density is defined, and the normalization domain (upper mass bounds). In Sec. 3.4, add sensitivity checks showing how $\ln K$ changes under reasonable variations of hyperprior bounds and HS normalization bounds.
  • Evidence computation methodology and numerical uncertainty are not documented at the level required for a paper that claims very large Bayes factors (Sec. 2.2, Sec. 3.2–3.4). The manuscript reports $K$ values with high apparent precision (e.g., $K=10231.4$), but does not specify the integral dimensionality, parameterization (three masses vs $m_{\rm lightest}+$splittings vs hyperparameters), numerical method (grid/MC/nested sampling/importance sampling), convergence checks, or uncertainty on $\ln K$. It is also unclear whether the “likelihood” surrogate is treated as normalized or up to a constant, which matters for evidences.
    Recommendation: In Sec. 2.2 and/or an Appendix, specify: (i) the exact integration variables for each hierarchy/prior; (ii) the numerical integration/sampling method; (iii) sample sizes / grid resolution; (iv) convergence diagnostics; and (v) an estimated numerical error bar on $\ln Z$ and $\ln K$ (report $\ln K \pm \sigma$). Reduce reported significant figures accordingly. Also state explicitly whether your $P(D_{\rm cosmo}|\Sigma m_\nu)$ surrogate is normalized and how any normalization constants are handled consistently across hierarchies.
  • Cosmological-model and dataset/systematic dependence is not explored or discussed in proportion to the strength of the “decisive” claim (Sec. 1, Sec. 3.1–3.3, Sec. 3.7, Conclusion). Only $\Lambda {\rm CDM}$ and $w_0 w_a{\rm CDM}$ are analyzed directly. However, $\Sigma m_\nu$ constraints can weaken in standard extensions (e.g., non-flat $\Lambda {\rm CDM}$, $\Lambda {\rm CDM}+N_{\rm eff}$, early dark energy, modified gravity), and $K$ is highly sensitive to any broadening of the $\Sigma m_\nu$ tail above the IH threshold. Additionally, the paper should more clearly separate “DESI+Planck (CamSpec) under a specific pipeline” from a general statement about cosmology.
    Recommendation: In Sec. 3.7 and the Conclusion, add a structured robustness discussion: (i) explicitly list which cosmological extensions are expected to most impact $\Sigma m_\nu$ and why; (ii) either run at least one or two additional representative extensions (e.g., $\Lambda {\rm CDM}+N_{\rm eff}$ and non-flat $\Lambda {\rm CDM}$) and report $\ln K$, or (if infeasible) use published constraints in those models to bound how much $\ln K$ could plausibly decrease; and (iii) provide a table of $\ln K$ across CMB likelihood choices (CamSpec vs alternatives) and dataset combinations (DESI-only, DESI+Planck, DESI+Planck+SN) using a consistent methodology. Soften global statements (“definitive”/“decisive”) to be explicitly conditional on the assumed cosmological model class and dataset combination.
  • $0\nu\beta\beta$ implications are presented too definitively without propagating key nuclear/theory uncertainties and mechanism dependence (Sec. 3.6, Conclusion). Translating $m_{\beta\beta}$ to half-life requires isotope choice, phase-space factors, and nuclear matrix elements (NMEs), which carry sizable spreads; moreover, the link assumes standard light Majorana neutrino exchange dominance.
    Recommendation: In Sec. 3.6 and the Conclusion, specify the isotope(s), phase-space factors, and NME set(s) used, and propagate NME uncertainties into a band of predicted $T_{1/2}$ for a given $m_{\beta\beta}$ (e.g., show ranges across multiple standard NME calculations). State explicitly that results are conditional on the light-Majorana exchange mechanism and that alternative mechanisms can decouple $m_{\beta\beta}$ from the rate. Rephrase detection-probability statements to make these conditions explicit.
  • Scholarly grounding and references need strengthening, and the framing risks overstating novelty/authority (Sec. 1, Sec. 4/References, Conclusion). Key primary references for DESI DR2 cosmology outputs, Planck likelihoods, NuFIT 6.0, prior Bayesian hierarchy studies, and $0\nu\beta\beta$ experimental sensitivities should be clearly and reliably cited. Any non-standard, placeholder, or hard-to-verify references undermine credibility, especially for a “decisive evidence” claim.
    Recommendation: Revise Sec. 4/References to ensure all inputs are traceable to primary, citable sources (DESI DR2 cosmology/likelihood papers, Planck likelihood documentation, NuFIT 6.0 global-fit papers/data release, and key prior work on cosmological hierarchy inference). Update the Introduction/Conclusion phrasing to position the contribution relative to that literature (what is new: DESI DR2-specific update, prior comparison, etc.), and remove/replace any references that are not verifiable or not appropriate for the venue.
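To make the tail-sensitivity concern in the second major issue concrete, the following is a minimal toy sketch, not the manuscript's calculation. It assumes a flat prior on the lightest mass over $[0, m_{\rm max}]$ (neither SJPV nor HS), rounded oscillation splittings, and a hypothetical doubled width standing in for a broader-tailed extension. All function and variable names are illustrative. Because the prior differs from those in the paper, the printed values are not expected to match the reported $K_{\rm base}$.

```python
from math import exp, sqrt

# Rounded mass-squared splittings in eV^2 (illustrative, not the NuFIT 6.0 best fit).
DM2_SOLAR = 7.5e-5
DM2_ATM = 2.5e-3


def sum_masses(m_lightest, hierarchy):
    """Sum of the three masses (eV) for a given lightest mass and hierarchy."""
    if hierarchy == "NH":  # m1 < m2 < m3
        m1 = m_lightest
        m2 = sqrt(m1**2 + DM2_SOLAR)
        m3 = sqrt(m1**2 + DM2_ATM)
    else:  # IH: m3 < m1 < m2
        m3 = m_lightest
        m2 = sqrt(m3**2 + DM2_ATM)
        m1 = sqrt(m2**2 - DM2_SOLAR)
    return m1 + m2 + m3


def surrogate(sigma_mnu, mu0, sigma):
    """Unnormalized Gaussian surrogate in Sigma m_nu; its constant cancels in K."""
    return exp(-0.5 * ((sigma_mnu - mu0) / sigma) ** 2)


def toy_evidence(hierarchy, mu0, sigma, m_max=0.3, n=20000):
    """Midpoint-rule evidence under a flat prior on the lightest mass (assumption)."""
    step = m_max / n
    total = 0.0
    for i in range(n):
        m0 = (i + 0.5) * step
        total += surrogate(sum_masses(m0, hierarchy), mu0, sigma)
    return total * step / m_max  # flat prior density is 1 / m_max


def toy_bayes_factor(mu0, sigma):
    """Toy K = Z_NH / Z_IH using the same surrogate for both hierarchies."""
    return toy_evidence("NH", mu0, sigma) / toy_evidence("IH", mu0, sigma)


# Baseline parameters quoted in the review; the doubled width is hypothetical.
k_narrow = toy_bayes_factor(-0.009, 0.036)
k_wide = toy_bayes_factor(-0.009, 0.072)
print(f"toy K (sigma = 0.036 eV): {k_narrow:.2f}")
print(f"toy K (sigma = 0.072 eV): {k_wide:.2f}")
```

The point of the comparison is only qualitative: broadening the surrogate moves probability above the IH minimum and lowers the toy $K$, which is why tail validation and an error bar on $\ln K$ matter for a "decisive" claim.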
Minor Issues (6):
  • Equation and notation clarity: Eq. (1) is rendered ambiguously (integral/evidence notation and the symbol $Z$), and mass-index notation alternates between $(m_1,m_2,m_3)$ and $(m_L,m_M,m_H)$ without an explicit mapping under NH vs IH (Sec. 2.2–2.4).
    Recommendation: Rewrite Eq. (1) unambiguously as $Z_M \equiv P(D|M)=\int P(D|\theta,M)P(\theta|M)\,d\theta$, and use consistent symbols thereafter. Add a small mapping table (Sec. 2.3 or 2.4): under NH, $(m_L,m_M,m_H)=(m_1,m_2,m_3)$; under IH, $(m_L,m_M,m_H)=(m_3,m_1,m_2)$, and state which basis each prior is defined on.
  • Construction of the truncated-Gaussian fit is under-described (Sec. 2.1, Sec. 3.1): KDE/binning choice, fit criterion, and how the 95% upper limit $0.0642$ eV is computed (from chains vs from the fit vs from DESI publication) are not clearly distinguished.
    Recommendation: Add a concise, reproducible description: chain selection, burn-in/thinning, method to estimate the 1D density (KDE bandwidth or binning), fitting procedure for $(\mu_0,\sigma)$ (e.g., MLE for truncated normal), and whether the reported UL is computed directly from the chain or from the parametric fit. Explicitly write the truncated-normal formula and normalization constant.
  • Historical reconstruction of Bayes factors lacks a reproducible mapping from published $\Sigma m_\nu$ upper limits to likelihood surrogates (Sec. 3.3, Fig. 4). It is unclear which experiments/datasets are used at each epoch, what cosmological model each limit assumes, and whether oscillation inputs are time-dependent or fixed to modern NuFIT.
    Recommendation: Add a table (main text or Appendix) listing each epoch: reference, dataset/model, quoted $\Sigma m_\nu$ constraint, the assumed likelihood surrogate (and how $\sigma$ is inferred from a 95% UL), and whether oscillation information is contemporaneous or held fixed. This will make Fig. 4 auditable.
  • Interpretation of $K$ relies heavily on Jeffreys-scale labels (“strong/decisive”) without sufficient caveats about prior/model dependence (Sec. 2.2, Sec. 3.2–3.4, Conclusion).
    Recommendation: Keep reporting numerical $\ln K$ as the main result; add a short note that Jeffreys-scale categories are heuristic. Consider translating headline $\ln K$ values into posterior model probabilities for a few plausible prior odds to help readers interpret the magnitude without categorical language.
  • $m_{\beta\beta}$ Monte Carlo propagation is not described in enough detail to reproduce (Sec. 2.4, Sec. 3.6): sample counts, correlation handling among oscillation parameters, convergence checks, and whether IH $m_{\beta\beta}$ curves are conditional on IH or hierarchy-marginalized are unclear.
    Recommendation: State the sampling procedure ($N$ samples, how correlated NuFIT parameters are drawn), show a simple convergence test (stability of key quantiles), and clarify in captions/text whether plotted distributions are conditional on each hierarchy or weighted by the hierarchy posterior odds.
  • Figures and captions: several plots mix likelihood/posterior language and do not always specify normalization, scale ($\ln K$ vs $\log_{10} K$), or credible-interval definitions; some legends/labels are hard to parse (Sec. 3; Figs. 1, 3–6, 8–9).
    Recommendation: Standardize figure labeling: explicitly state whether curves are posteriors or likelihood surrogates, include axis units and log base, and define the interval/UL construction used. Add a compact table in Sec. 3.2 summarizing $\ln K_{\rm base}$ and $\ln K_{\rm full}$ for each dataset/prior/model shown in figures.
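The suggestion above to translate headline $\ln K$ values into posterior model probabilities can be sketched as follows. This is a minimal illustration using the $K_{\rm full}$ values quoted in this review; the prior odds (1, 0.5, 0.1) and the function name are arbitrary choices made here, not values from the manuscript.

```python
from math import exp, log


def posterior_prob_nh(ln_k, prior_odds=1.0):
    """P(NH | D) from ln K and prior odds P(NH)/P(IH), in logistic form."""
    ln_post_odds = ln_k + log(prior_odds)
    # Branch on the sign so exp() never receives a large positive argument.
    if ln_post_odds >= 0:
        return 1.0 / (1.0 + exp(-ln_post_odds))
    odds = exp(ln_post_odds)
    return odds / (1.0 + odds)


# K_full values as reported in the review text.
reported = [
    ("SJPV, LCDM", 10231.4),
    ("HS, LCDM", 464.6),
    ("HS, w0waCDM", 42.6),
]

for label, k in reported:
    for prior_odds in (1.0, 0.5, 0.1):  # illustrative prior odds NH:IH
        p = posterior_prob_nh(log(k), prior_odds)
        print(f"{label:12s} prior odds {prior_odds:>4}: P(NH|D) = {p:.4f}")
```

Reporting probabilities alongside $\ln K$ lets readers see how much of the conclusion survives less favourable prior odds without relying on Jeffreys-scale labels.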
Very Minor Issues:
  • Terminology/typos and consistency: inconsistent formatting of model names (“$w_0 w_a{\rm CDM}$” vs “$w_0w_a{\rm CDM}$”), interval terminology (95% C.L./UL/credible), and occasional stray acronyms/spacing issues (Sec. 2–4).
    Recommendation: Proofread for consistent notation and terminology throughout; standardize model names and interval language; fix minor typographical issues in equations, captions, and references.
  • Truncated Gaussian parameter interpretation: $\mu_0$ and $\sigma$ are described as “central value” and standard deviation even though $\mu_0<0$ and truncation changes the mean/median (Sec. 2.1, Sec. 3.1).
    Recommendation: Clarify that $(\mu_0,\sigma)$ refer to the underlying untruncated normal parameters, and (optionally) report the truncated distribution mean/median for intuition.
  • Majorana phases in Eq. (3) are not explicit though the text says phases are sampled (Sec. 2.4, Eq. (3)).
    Recommendation: State explicitly whether the PMNS elements $U_{ei}$ in Eq. (3) include the Majorana phases, or write the phases as separate factors.
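As an illustration of the second option in the last recommendation, one common way to write the Majorana phases as separate factors is shown below. It assumes the standard PDG-style parameterization, with $c_{ij} = \cos\theta_{ij}$, $s_{ij} = \sin\theta_{ij}$, Dirac phase $\delta$, and Majorana phases $\alpha_{21}$, $\alpha_{31}$. These symbols are introduced here for illustration and may differ from the convention the authors intend.

$$
m_{\beta\beta} = \left| c_{12}^2 c_{13}^2\, m_1 + s_{12}^2 c_{13}^2\, m_2\, e^{i\alpha_{21}} + s_{13}^2\, m_3\, e^{i(\alpha_{31} - 2\delta)} \right|
$$

Written this way, it is explicit which phases must be sampled and that cancellations among the three terms are what allow small $m_{\beta\beta}$ values under NH.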

Mathematical Consistency Audit

Mathematics Audit by Skepthical

This section audits symbolic/analytic mathematical consistency (algebra, derivations, dimensional/unit checks, definition consistency).

Maths relevance: light

The paper’s analytic content centers on (i) defining Bayesian evidence and Bayes factors for NH vs IH, (ii) factorizing the likelihood into a cosmological component depending on $\Sigma m_\nu$ and an oscillation component, (iii) specifying two priors over neutrino masses (SJPV hierarchical and HS reference), and (iv) defining the effective Majorana mass $m_{\beta\beta}$. The main internal-consistency risks are definitional: whether the extracted DESI 1D $\Sigma m_\nu$ distribution is a likelihood or a posterior, ambiguity in the written evidence integral (Eq. (1)), and insufficient derivation detail for the HS prior and for the $\Delta\chi^2$-to-likelihood-ratio conversion.

Checked items

  1. Evidence (marginal likelihood) definition (Eq. (1), Sec. 2.2, p.3)

    • Claim: Defines the model evidence $P(D|M)$ as an integral of likelihood times prior over parameters $\theta$.
    • Checks: notation consistency, definition correctness
    • Verdict: UNCERTAIN; confidence: medium; impact: critical
    • Assumptions/inputs: $D$ denotes combined cosmological and oscillation data; $\theta$ denotes neutrino mass parameters under a fixed hierarchy $M$
    • Notes: As rendered, Eq. (1) appears to miss an integral sign and places '$Z$' as a multiplicative factor rather than an equality to the integral. If the intended equation is '$Z \equiv P(D|M) = \int ... d\theta$', it is fine; but the text alone is ambiguous and blocks strict verification.
  2. Likelihood factorization (Sec. 2.2, p.3 (unnumbered equation after Eq. (1)))

    • Claim: Uses $P(D|\theta,M) = P(D_{\rm cosmo}|\Sigma m_\nu(\theta)) \times P(D_{\rm osc}|\theta)$.
    • Checks: symbol consistency, assumption clarity
    • Verdict: PASS; confidence: high; impact: moderate
    • Assumptions/inputs: Conditional independence of cosmological and oscillation datasets given $\theta$ (and $M$); $\Sigma m_\nu$ is a deterministic function of $\theta$
    • Notes: Factorization is internally consistent with the narrative: cosmology depends only on $\Sigma m_\nu$ while oscillations depend on mass splittings and mixing parameters in $\theta$.
  3. Bayes factor definition (Eq. (2), Sec. 2.2, p.4)

    • Claim: Defines $K = P(D|{\rm NH})/P(D|{\rm IH})$.
    • Checks: definition correctness, notation consistency
    • Verdict: PASS; confidence: high; impact: moderate
    • Assumptions/inputs: Same dataset $D$ used for both evidences; NH and IH treated as competing models
    • Notes: Standard and consistent with later text ($K_{\rm base}, K_{\rm full}$).
  4. Use of DESI chain-derived $\Sigma m_\nu$ distribution as likelihood (Sec. 2.1, p.3; Sec. 3.1, p.5)

    • Claim: Extracts the 1D marginalized posterior for $\Sigma m_\nu$ from DESI DR2 MCMC chains and models it as a truncated Gaussian likelihood $P(D_{\rm cosmo}|\Sigma m_\nu)$.
    • Checks: definition/measure consistency, Bayesian coherence
    • Verdict: FAIL; confidence: medium; impact: critical
    • Assumptions/inputs: The chain-derived 1D distribution is proportional to the likelihood in $\Sigma m_\nu$ (or can be treated as such); any DESI prior on $\Sigma m_\nu$ is negligible/flat or appropriately accounted for
    • Notes: Within the paper text, no step is provided to justify replacing a marginalized posterior with a likelihood, nor is the DESI prior on $\Sigma m_\nu$ specified and removed. This is an internal mathematical inconsistency relative to Eq. (1), where $P(D_{\rm cosmo}|\Sigma m_\nu)$ must be a likelihood factor rather than a posterior density.
  5. Truncated Gaussian on $\Sigma m_\nu \geq 0$ (Sec. 2.1, p.3; Sec. 3.1, p.5)

    • Claim: Approximates the $\Sigma m_\nu$ distribution by a Gaussian with parameters $(\mu_0,\sigma)$ truncated to $\Sigma m_\nu \geq 0$.
    • Checks: domain/constraint consistency, notation clarity
    • Verdict: PASS; confidence: medium; impact: minor
    • Assumptions/inputs: $\Sigma m_\nu$ is physically nonnegative; truncation is implemented as conditional normalization on $[0,\infty)$
    • Notes: Truncating at $\Sigma m_\nu\geq 0$ is mathematically consistent even when $\mu_0<0$; however, calling $\mu_0$ a 'central value' of the truncated distribution is imprecise (it is the underlying normal’s location parameter).
  6. HS prior functional form (Sec. 2.3, p.4)

    • Claim: States the HS reference prior on $(m_L,m_M,m_H)$ is proportional to $m_L m_M + m_L m_H + m_M m_H$ as a Jacobian-derived reference prior.
    • Checks: derivation completeness, symbol/definition consistency
    • Verdict: UNCERTAIN; confidence: low; impact: moderate
    • Assumptions/inputs: Reference prior is derived from Fisher information of $(\Delta m^2_{21}, \Delta m^2_{3\ell})$; a specific parameter transformation is used, with ordering constraints
    • Notes: The claimed polynomial form is not derivable from the provided text alone: the mapping and Jacobian (and any additional parameter needed beyond two $\Delta m^2$’s) are not shown. Without these, the prior’s correctness cannot be audited.
  7. SJPV prior description vs parameterization (Sec. 2.3, p.4)

    • Claim: Assumes exchangeable $(m_1,m_2,m_3)$ drawn from a common log-normal distribution, inducing a 'geometric volume penalty' against IH.
    • Checks: conceptual consistency, notation consistency
    • Verdict: PASS; confidence: medium; impact: minor
    • Assumptions/inputs: Exchangeability in the prior over labeled eigenstates; hierarchy imposed as a constraint defining NH/IH subspaces
    • Notes: While no equations are given, the qualitative claim is internally consistent: imposing ordering/near-degeneracy constraints typically reduces allowed prior volume under an exchangeable prior.
  8. $m_{\beta\beta}$ definition (Eq. (3), Sec. 2.4, p.5)

    • Claim: Defines $m_{\beta\beta} = \left| \sum_{i=1}^3 U_{ei}^2 m_i \right|$.
    • Checks: definition correctness, notation consistency
    • Verdict: PASS; confidence: high; impact: minor
    • Assumptions/inputs: $U_{ei}$ are (possibly complex) PMNS elements in a convention compatible with $0\nu\beta\beta$; $m_i$ are neutrino mass eigenvalues
    • Notes: Expression is internally consistent with later description of sampling CP phases; however, the equation does not explicitly display Majorana phases, so the reader must infer they are included in $U$ or applied separately.
  9. Inclusion of oscillation $\Delta\chi^2$ as multiplicative factor (Sec. 3.2, p.6)

    • Claim: Computes $K_{\rm full} = K_{\rm base} \times \exp(\Delta\chi^2/2)$ using $\Delta\chi^2 = 6.1$ from NuFIT.
    • Checks: algebraic transformation, assumption clarity
    • Verdict: UNCERTAIN; confidence: medium; impact: moderate
    • Assumptions/inputs: $\Delta\chi^2$ corresponds to $-2 \log$ likelihood ratio between IH and NH (with a defined sign); oscillation preference can be separated multiplicatively from the cosmology-driven $K_{\rm base}$
    • Notes: Multiplying by $\exp(\Delta\chi^2/2)$ is only correct for a specific $\Delta\chi^2$ sign convention (typically $\Delta\chi^2 = \chi^2_{\rm IH}-\chi^2_{\rm NH}$). The paper does not define the convention, so the factor could be inverted.
  10. Parameter dependence of cosmological likelihood (Sec. 2.2–2.4, p.3–5)

    • Claim: Uses $\Sigma m_\nu(\theta)$ to connect neutrino mass parameters $\theta$ to the cosmological likelihood term.
    • Checks: mapping consistency
    • Verdict: PASS; confidence: medium; impact: minor
    • Assumptions/inputs: $\Sigma m_\nu$ is computed as a sum of mass eigenstates consistent with the hierarchy constraints; $\theta$ includes enough parameters to define the spectrum
    • Notes: Although $\Sigma m_\nu(\theta)$ is not explicitly written, the mapping is conceptually straightforward and consistent with the rest of the paper’s notation.
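The sign-convention point in item 9 can be written out explicitly. The lines below are a sketch under two stated assumptions: that $\chi^2 = -2\ln L + {\rm const}$ with the same constant for both hierarchies, and that $\Delta\chi^2$ is defined as IH minus NH. Neither convention is stated in the manuscript.

$$
\Delta\chi^2 \equiv \chi^2_{\rm IH,min} - \chi^2_{\rm NH,min}
\quad\Longrightarrow\quad
\frac{L_{\rm NH}^{\rm max}}{L_{\rm IH}^{\rm max}} = \exp\!\left(\frac{\Delta\chi^2}{2}\right),
\qquad
\exp\!\left(\frac{6.1}{2}\right) \approx 21.1 .
$$

With the opposite sign convention the same number would give a factor of about $1/21.1 \approx 0.047$, which is why the definition must be stated. This is also a ratio of maximized likelihoods; it equals an oscillation-only Bayes factor only if the prior volumes over the oscillation parameters are the same for NH and IH, which is the double-counting question raised above.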

Limitations

  • Only three explicit equations are provided; many central computations (evidence integrals, prior normalizations, Jacobians, mapping from oscillation parameters to masses) are described qualitatively without derivations, limiting the scope of symbolic verification.
  • The audit cannot verify any claims requiring omitted intermediate steps (e.g., the HS prior derivation from Fisher information, or the exact construction of $K_{\rm base}$ from the truncated Gaussian and priors).
  • The paper does not specify the prior used in the DESI MCMC analysis for $\Sigma m_\nu$, preventing a purely internal check that the extracted 1D distribution is a likelihood rather than a posterior.
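The distinction behind the last limitation and checked item 4 can be stated in one line. Writing $\phi$ for the other cosmological parameters and $\pi_{\rm DESI}$ for the priors used in the chains (notation introduced here for illustration), the chain-derived marginal posterior factorizes as

$$
P(\Sigma m_\nu \mid D_{\rm cosmo}) \;\propto\; \pi_{\rm DESI}(\Sigma m_\nu)\,
\underbrace{\int L(\Sigma m_\nu, \phi)\, \pi_{\rm DESI}(\phi \mid \Sigma m_\nu)\, d\phi}_{\bar{L}(\Sigma m_\nu)}
\quad\Longrightarrow\quad
\bar{L}(\Sigma m_\nu) \;\propto\; \frac{P(\Sigma m_\nu \mid D_{\rm cosmo})}{\pi_{\rm DESI}(\Sigma m_\nu)} .
$$

Under these assumptions, the quantity that can legitimately enter Eq. (1) is $\bar{L}$, not the posterior itself. The two coincide up to a constant only if $\pi_{\rm DESI}(\Sigma m_\nu)$ is flat over the relevant range, and even then $\bar{L}$ still carries the DESI priors on $\phi$. The proportionality constant is the same for NH and IH and cancels in $K$.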

Numerical Results Audit

Numerics Audit by Skepthical

This section audits numerical/empirical consistency: reported metrics, experimental design, baseline comparisons, statistical evidence, leakage risks, and reproducibility.

Out of 12 candidate checks, 11 PASS and 1 is UNCERTAIN. Recomputed Bayes-factor scaling with $\exp(\Delta\chi^2/2)$ matches reported $K_{\rm full}$ values closely for both SJPV and HS priors. Truncated-normal 95% upper limits computed from $(\mu_0,\sigma)$ parameters agree with the reported baseline UL and show the Feldman–Cousins parameterization yields a very similar UL under the same truncation convention. Mass-sum minima for NH and IH are consistent with the stated approximate component masses. The inequality UL$(\Sigma m_\nu) < {\rm IH}$ minimum holds with a $0.0348$ eV margin. The only unresolved item is a qualitative claim about being "almost entirely below" a $5$ meV sensitivity threshold, which cannot be established from endpoints alone.
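The two recomputations summarized above (the $\exp(\Delta\chi^2/2)$ scaling and the truncated-normal upper limits) can be reproduced with a short standard-library script. This is a minimal sketch, assuming the UL is the one-sided 95% quantile of a normal distribution conditioned on $\Sigma m_\nu \geq 0$ and that $\Delta\chi^2 = \chi^2_{\rm IH} - \chi^2_{\rm NH}$. Function names are illustrative, and the paper's own convention may differ.

```python
from math import exp
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, provides cdf() and inv_cdf()


def truncated_normal_quantile(mu0, sigma, q, lower=0.0):
    """Quantile q of N(mu0, sigma^2) conditioned on x >= lower."""
    cdf_lower = STD_NORMAL.cdf((lower - mu0) / sigma)
    # Map q into the probability mass that survives the truncation.
    target = cdf_lower + q * (1.0 - cdf_lower)
    return mu0 + sigma * STD_NORMAL.inv_cdf(target)


def k_full(k_base, delta_chi2):
    """K_full = K_base * exp(delta_chi2 / 2); sign convention is an assumption."""
    return k_base * exp(delta_chi2 / 2.0)


# (mu0, sigma) in eV as quoted in checks C3 and C4.
ul_baseline = truncated_normal_quantile(-0.009, 0.036, 0.95)
ul_fc = truncated_normal_quantile(-0.036, 0.043, 0.95)

# K_base values as quoted in checks C1 and C2, with delta_chi2 = 6.1.
k_full_sjpv = k_full(484.5, 6.1)
k_full_hs = k_full(22.0, 6.1)

print(f"UL95 baseline: {ul_baseline:.7f} eV (reported 0.0642)")
print(f"UL95 FC params: {ul_fc:.7f} eV")
print(f"K_full SJPV: {k_full_sjpv:.4f} (reported 10231.4)")
print(f"K_full HS: {k_full_hs:.4f} (reported 464.6)")
```

The same `truncated_normal_quantile` helper with `q=0.5` gives the median of the truncated distribution, which is one way to report the more intuitive summary requested in the very minor issues.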

Checked items

  1. C1_Kfull_from_Kbase_SJPV (Page 6, Section 3.2 (Bayes factors paragraph))

    • Claim: Under the SJPV prior, $K_{\rm base} = 484.5$. Including $\Delta\chi^2 = 6.1$ multiplicatively ($K_{\rm full} = K_{\rm base} \times \exp(6.1/2)$) gives $K_{\rm full} = 10231.4$.
    • Checks: recompute_formula
    • Verdict: PASS
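    • Notes: Computed $K_{\rm full,calc} = 10230.3844$ vs reported $10231.4$: an absolute difference of about $1.0$ (relative $\approx 10^{-4}$), consistent with $K_{\rm base}$ having been rounded to one decimal place before multiplication by $\exp(3.05) \approx 21.1$.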
  2. C2_Kfull_from_Kbase_HS (Page 6, Section 3.2 (HS prior paragraph))

    • Claim: Under the HS prior, $K_{\rm base} = 22.0$. Including the oscillation preference yields $K_{\rm full} = 464.6$.
    • Checks: recompute_formula
    • Verdict: PASS
    • Notes: Computed $K_{\rm full,calc} = 464.5376$ vs reported $464.6$; consistent with one-decimal rounding.
  3. C3_truncated_normal_UL_from_mu_sigma (Page 5, Section 3.1 and Figure 1 caption)

    • Claim: Baseline $\Lambda {\rm CDM}$ posterior for $\Sigma m_\nu$ modeled as truncated Gaussian with $\mu_0 = -0.009$ eV and $\sigma = 0.036$ eV, yielding 95% upper limit $\Sigma m_\nu < 0.0642$ eV.
    • Checks: recompute_quantile_truncated_distribution
    • Verdict: PASS
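    • Notes: Computed truncated-normal $\rm UL_{95} = 0.0648869$ eV vs reported $0.0642$ eV, a difference of about $0.0007$ eV ($\approx 1\%$); within tolerance given the rounding of $(\mu_0,\sigma)$ and unspecified convention details (e.g., whether the UL is taken from the chain or from the fit).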
  4. C4_truncated_normal_UL_from_FC_mu_sigma (Page 5, Section 3.1 and Figure 1 caption)

    • Claim: Feldman–Cousins profile likelihood parameters $\mu_0 = -0.036$ eV and $\sigma = 0.043$ eV are shown; the check computes the implied 95% upper limit under truncation at $\Sigma m_\nu \geq 0$ and tests the stated 'close agreement' with the baseline.
    • Checks: recompute_quantile_truncated_distribution
    • Verdict: PASS
    • Notes: Computed truncated-normal $\rm UL_{95}$ (FC params) $= 0.0639334$ eV vs baseline $0.0642$ eV; supports the stated qualitative 'close agreement'.
  5. C5_NH_mass_sum_vs_reported_min (Page 10, Section 3.5 and Figure 6 caption)

    • Claim: For NH, $m_2 \approx 0.008$ eV and $m_3 \approx 0.050$ eV, with $m_1$ nearly massless ($<0.01$ eV), forcing $\Sigma m_\nu$ near theoretical minimum $\sim 0.059$ eV.
    • Checks: sum_components_vs_total
    • Verdict: PASS
    • Notes: $m_2+m_3 = 0.058$ eV vs reported $\sim 0.059$ eV; consistent with rounding/approximation.
  6. C6_IH_mass_sum_vs_reported_min (Page 12, Figure 7 caption and Page 13, Figure 8 caption)

    • Claim: IH requires $m_1 \approx m_2 \approx 0.05$ eV with $m_3 \approx 0$, implying $\Sigma m_\nu$ minimum $\sim 0.099$ eV.
    • Checks: sum_components_vs_total
    • Verdict: PASS
    • Notes: $0.05+0.05+0 = 0.10$ eV vs reported $\sim 0.099$ eV; consistent given rounded inputs.
  7. C7_UL_below_IH_min_tension_margin (Page 1 abstract; Page 5 Section 3.1; Page 13 Figure 8 caption)

    • Claim: DESI DR2 gives 95% UL $\Sigma m_\nu < 0.0642$ eV while IH minimum is $\sim0.099$ eV (or $\geq 0.099$ eV), implying UL is below IH minimum.
    • Checks: inequality_consistency
    • Verdict: PASS
    • Notes: Computed gap ${\rm IH}_{\rm min} - {\rm UL} = 0.0348$ eV ($>0$); ratio ${\rm UL}/{\rm IH}_{\rm min} = 0.6485$.
  8. C8_w0wa_UL_relaxation_factor (Page 13, Section 3.7)

    • Claim: In $w_0 w_a{\rm CDM}$, 95% UL on $\Sigma m_\nu$ relaxes to $0.163$ eV from baseline $0.0642$ eV.
    • Checks: ratio_change
    • Verdict: PASS
    • Notes: Computed relaxation factor $= 2.53894$; absolute increase $= 0.0988$ eV (153.9% increase).
  9. C9_HS_Kfull_drop_factor_w0wa_vs_LCDM (Page 13, Section 3.7)

    • Claim: HS prior $K_{\rm full}$ drops from $464.6$ ($\Lambda {\rm CDM}$ baseline) to $42.6$ ($w_0 w_a{\rm CDM}$).
    • Checks: ratio_change
    • Verdict: PASS
    • Notes: Computed ratio $= 0.09169$; percent change $= -90.83\%$ (drop of $422.0$ in absolute terms).
  10. C10_mbb_NH_CI_contains_median (Page 11, Section 3.6; Figure 9 caption)

    • Claim: NH $m_{\beta\beta}$ 95% credible interval is $[0.95, 11.55]$ meV with median $3.28$ meV.
    • Checks: interval_contains_point
    • Verdict: PASS
    • Notes: Median lies within the stated 95% CI.
  11. C11_mbb_IH_CI_contains_median (Page 12, Section 3.6; Figure 9 caption)

    • Claim: IH median $m_{\beta\beta}$ is $37.03$ meV with 95% CI $[18.36, 49.51]$ meV.
    • Checks: interval_contains_point
    • Verdict: PASS
    • Notes: Median lies within the stated 95% CI.
  12. C12_compare_experimental_sensitivities_vs_CI_NH (Page 11-12, Section 3.6)

    • Claim: NH 95% CI $[0.95, 11.55]$ meV is 'far below' $\sim 20$ meV and 'almost entirely below' $\sim 5$ meV.
    • Checks: threshold_comparison
    • Verdict: UNCERTAIN
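    • Notes: The endpoints confirm that the upper bound of the interval ($11.55$ meV) is below $20$ meV, but it is not below $5$ meV. The phrase 'almost entirely below 5 meV' cannot be verified without distributional information. The reported median of $3.28$ meV does place at least half of the posterior mass below $5$ meV. As a crude guide only, the fraction of the interval width lying above $5$ meV is $(11.55-5)/(11.55-0.95) \approx 0.6179$; this treats the density as uniform across the interval and is not a posterior probability.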
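The simple arithmetic behind checks C5 to C9 and C12 can be replayed directly from the numbers quoted in the items above. This is a minimal sketch; variable names are illustrative, and the last quantity is a width fraction that assumes a uniform density, not a posterior probability.

```python
# C5 / C6: mass-sum minima from the rounded component masses (eV).
nh_sum = 0.008 + 0.050          # m2 + m3 with m1 ~ 0
ih_sum = 0.05 + 0.05 + 0.0      # m1 + m2 with m3 ~ 0

# C7: margin between the DESI DR2 95% UL and the IH minimum (eV).
ul_lcdm, ih_min = 0.0642, 0.099
gap = ih_min - ul_lcdm
ul_over_ih = ul_lcdm / ih_min

# C8: relaxation of the UL in w0waCDM relative to the LCDM baseline.
ul_w0wa = 0.163
relaxation = ul_w0wa / ul_lcdm
percent_increase = 100.0 * (ul_w0wa - ul_lcdm) / ul_lcdm

# C9: change in the HS K_full between LCDM and w0waCDM.
k_hs_lcdm, k_hs_w0wa = 464.6, 42.6
k_ratio = k_hs_w0wa / k_hs_lcdm
k_percent_change = 100.0 * (k_ratio - 1.0)

# C12: crude fraction of the NH 95% CI width above 5 meV (uniform-density assumption).
ci_lo, ci_hi, threshold = 0.95, 11.55, 5.0
width_fraction_above = (ci_hi - threshold) / (ci_hi - ci_lo)

print(f"NH sum {nh_sum:.3f} eV, IH sum {ih_sum:.3f} eV")
print(f"gap {gap:.4f} eV, UL/IH_min {ul_over_ih:.4f}")
print(f"relaxation x{relaxation:.5f} ({percent_increase:.1f}% increase)")
print(f"K ratio {k_ratio:.5f} ({k_percent_change:.2f}% change)")
print(f"width fraction above 5 meV: {width_fraction_above:.4f}")
```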

Limitations

  • Checks are limited to explicit numbers present in the provided PDF text; no external datasets, MCMC chains, or plot-value extraction are used.
  • Some claims use approximate symbols ($\approx$, $\sim$) and qualitative phrasing (e.g., 'close agreement', 'almost entirely'); only coarse numerical sanity checks can be performed without full distributions.
  • Upper limits and credible intervals may depend on one-sided vs two-sided conventions and on truncation; the PDF does not fully specify these conventions, so tolerances are provided.

Paper Ratings

Dimension Score
Overall 5/10 █████░░░░░
Soundness 4/10 ████░░░░░░
Novelty 6/10 ██████░░░░
Significance 6/10 ██████░░░░
Clarity 5/10 █████░░░░░
Evidence Quality 4/10 ████░░░░░░
