-
Identifiability vs. prediction is not cleanly separated in the claims and narrative. The paper correctly demonstrates severe non-uniqueness under collinearity (null-space exploitation) (Sec. 3.2–3.2.1), but still frequently refers to “the discovered PDEs,” “true dynamical operator,” and implies physical identification. Given redundancy and scaling/centering effects, the output is better described as learning a predictive operator $f(u) \approx \partial_t u$ on the data manifold, not a uniquely identified physical PDE decomposition (Sec. 3.3.2, Sec. 4).
Recommendation: Reframe key statements in Sec. 3.3.2 and Sec. 4 to explicitly distinguish (i) learning a predictive right-hand-side representation $\Theta\xi$ that matches $\partial_t u$ on the sampled data, from (ii) identifying a unique, physically interpretable PDE form. Add an explicit statement of an equivalence class: many $\xi$ yield nearly identical $\Theta\xi$ due to collinearity; therefore, individual coefficients/term selections are not identifiable without additional constraints. Moderate wording accordingly throughout (Abstract, Sec. 3.3.2, Sec. 4).
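To make the equivalence-class point concrete, a minimal numerical sketch (synthetic data, not the paper's library) shows two different coefficient vectors producing the identical predictive operator $\Theta\xi$ whenever a composite column exactly duplicates a combination of primitives:

```python
import numpy as np

# Illustrative only: a tiny library Theta whose last column is an exact
# composite of the first two primitives, so coefficients are not identifiable.
rng = np.random.default_rng(0)
u, v = rng.normal(size=(2, 100))
theta = np.column_stack([u, v, u + v])   # composite column = u + v

xi_a = np.array([1.0, 2.0, 0.0])         # weight on primitives only
xi_b = np.array([0.0, 1.0, 1.0])         # weight shifted onto the composite
assert np.allclose(theta @ xi_a, theta @ xi_b)  # same predictive operator

# The difference xi_a - xi_b lies in the null space of Theta:
s = np.linalg.svd(theta, compute_uv=False)
print("smallest singular value:", s[-1])  # numerically zero
```

Any statement about individual coefficients is therefore a statement about one arbitrary representative of this equivalence class.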
-
The provenance and physical nature of the dataset are insufficiently specified (Sec. 1, Sec. 2.1, Sec. 3.1). Without knowing whether the data come from DNS of a known PDE (e.g., compressible/incompressible Navier–Stokes, Boussinesq, etc.) and which parameters/forcing are used, it is hard to evaluate physical plausibility of the learned operators, interpret the density dynamics, or understand what “should” be recovered.
Recommendation: Augment Sec. 2.1 (and briefly Sec. 1) with a concise description of the data source: the generating equations (if known), solver type, forcing, viscosity/diffusivity, nondimensionalization/units, and boundary conditions beyond periodicity. If the generating PDE is unknown/proprietary, state this explicitly and enumerate what is known. In Sec. 3.2–3.3, briefly compare discovered dominant operator families against what one would expect for the stated physical system.
-
Library specification is qualitative rather than explicit, preventing replication and weakening the collinearity diagnosis (Sec. 2.5). The core claim (redundancy between composite operators and primitives) depends on exactly which terms (cross-products, derivative orders, mixed derivatives, component-wise curls/divergences, etc.) are included and how they are scaled.
Recommendation: In Sec. 2.5, report the total number of features (columns of $\Theta$) and counts per category (constant, linear, quadratic, derivatives by order, products-with-derivatives, composite operators). Provide an appendix table listing every feature with a symbolic definition, including which cross-terms and mixed derivatives are included/excluded. For each learned equation, report the number of nonzero terms and group them into interpretable operator families (e.g., net advection, net diffusion) where possible.
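The requested appendix table could be generated directly from the feature-construction code; the sketch below (category names and terms are placeholders, not the paper's actual library) shows the kind of per-category accounting intended:

```python
# Hypothetical sketch of the suggested appendix table: enumerate every library
# feature symbolically under its category, then report counts and the total.
features = {
    "constant":   ["1"],
    "linear":     ["u", "v", "w", "rho"],
    "quadratic":  ["u*u", "u*v", "u*w", "v*v", "v*w", "w*w"],
    "derivative": ["u_x", "u_y", "u_xx", "u_yy", "u_xy"],
    "composite":  ["div(u)", "lap(u)", "(u.grad)u_x"],
}
counts = {cat: len(terms) for cat, terms in features.items()}
total = sum(counts.values())
print(counts)
print("total columns of Theta:", total)
```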
-
The paper’s central collinearity analysis is largely descriptive and limited to a single highly redundant library and a single sparse regression scheme; quantitative diagnostics and mitigation/ablation experiments are missing (Sec. 3.2–3.2.1). As written, the paper strongly demonstrates the problem but only partially “addresses” it.
Recommendation: Strengthen Sec. 3.2–3.2.1 with (i) quantitative collinearity diagnostics: numerical rank, singular value spectrum of $\Theta$, condition number (or spectrum) of $\Theta^\top \Theta$, and representative correlations between composite operators and their constituent primitives; and (ii) at least one ablation/mitigation experiment, e.g.: remove composite operators (primitives-only) vs. composite-only, or apply ridge/elastic-net / sequentially-thresholded ridge / group sparsity with physically linked groups (e.g., the 3 advection components). Report impacts on sparsity, coefficient stability/magnitude, and predictive metrics. If additional runs are infeasible, narrow claims in Sec. 4 to “demonstrates” rather than “addresses,” and clearly state limitations.
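The diagnostics in (i) are a few lines of linear algebra; a minimal sketch on a synthetic $\Theta$ with one exactly redundant composite column (an assumption for illustration) shows the expected signatures:

```python
import numpy as np

# Synthetic design matrix with an exactly redundant last column.
rng = np.random.default_rng(1)
theta = rng.normal(size=(500, 6))
theta[:, -1] = theta[:, 0] + theta[:, 1]   # composite = sum of two primitives

s = np.linalg.svd(theta, compute_uv=False)             # singular spectrum
tol = s[0] * max(theta.shape) * np.finfo(float).eps
numerical_rank = int((s > tol).sum())                  # numerical rank
cond_gram = (s[0] / s[-1]) ** 2                        # cond(Theta^T Theta)
corr = np.corrcoef(theta, rowvar=False)                # feature correlations

print("singular values:", s)
print("numerical rank:", numerical_rank)
print("cond(Theta^T Theta): %.3g" % cond_gram)
print("corr(composite, first primitive): %.3f" % corr[-1, 0])
```

A collapsed trailing singular value, a rank below the column count, and an astronomically large Gram condition number together quantify the redundancy that the paper currently describes only qualitatively.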
-
Validation is too narrow to support claims about dynamical fidelity or “extended time horizons.” Forward integration is shown only for a single step ($t = 4 \rightarrow 5$) within a dataset of only 10 snapshots (Sec. 3.3.2, Sec. 4). High one-step $R^2$ can occur even when long-horizon dynamics drift, especially when the learned operator is non-identifiable and tuned on the same data manifold.
Recommendation: In Sec. 3.3.2, add multi-step rollouts over as many steps as the dataset permits (e.g., $t = 4\rightarrow 5\rightarrow 6\rightarrow 7\rightarrow 8\rightarrow 9$), reporting error growth ($R^2$/RMSE vs. horizon). Track basic physical diagnostics during rollout (e.g., mean density, kinetic energy, and divergence statistics) to assess drift. If you cannot add experiments, revise Sec. 3.3.2 and Sec. 4 to explicitly limit conclusions to short-horizon (one-step) predictive accuracy.
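The error-growth behavior at issue can be demonstrated on a toy problem; the sketch below (dynamics and mismatch are purely illustrative) rolls out a slightly mis-identified right-hand side with RK4 and records RMSE per horizon:

```python
import numpy as np

# Classical RK4 step for du/dt = f(u).
def rk4_step(f, u, dt):
    k1 = f(u); k2 = f(u + 0.5 * dt * k1)
    k3 = f(u + 0.5 * dt * k2); k4 = f(u + dt * k3)
    return u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

true_rhs    = lambda u: -u          # illustrative "true" dynamics
learned_rhs = lambda u: -1.02 * u   # slightly mis-identified operator

dt, u_true, u_pred = 0.1, np.ones(64), np.ones(64)
rmses = []
for horizon in range(1, 6):         # e.g. t = 4 -> 5 -> ... -> 9
    u_true = rk4_step(true_rhs, u_true, dt)
    u_pred = rk4_step(learned_rhs, u_pred, dt)
    rmses.append(np.sqrt(np.mean((u_pred - u_true) ** 2)))
    print(f"horizon {horizon}: RMSE = {rmses[-1]:.3e}")
```

Even a 2% operator error compounds visibly over a handful of steps, which is exactly why one-step $R^2$ alone cannot certify dynamical fidelity.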
-
The temporal-derivative construction and the train/test split are under-specified given only 10 time slices, creating potential train/test leakage (Sec. 2.4–2.6, Sec. 3.3.1). Central differencing uses neighboring times; if evaluation points share neighbors with training points, the derivative targets couple the train and test sets. Moreover, spatiotemporal correlation makes random point-wise splits overly optimistic.
Recommendation: Clarify in Sec. 2.4–2.6 and Sec. 3.3.1 exactly which time indices contribute to $\partial_t$ targets (central difference implies only 8 usable times) and how train/test splits are formed (spatial only, temporal blocks, or both). Prefer blocked temporal evaluation (e.g., train on a subset of time indices and test on held-out times whose $\partial_t$ targets do not use training-time neighbors). Report the number of sampled space–time points used for regression and evaluation and justify independence assumptions.
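The index bookkeeping for a leakage-free blocked split is simple to state explicitly; a minimal sketch (block boundaries chosen for illustration):

```python
# With 10 snapshots (t = 0..9), a central difference du/dt at time t uses
# slices t-1 and t+1, so only t = 1..8 carry derivative targets.
n_times = 10
usable = list(range(1, n_times - 1))       # t = 1..8

test_times = [7, 8]                         # held-out temporal block
test_slices = {t + d for t in test_times for d in (-1, 0, 1)}  # slices 6..9
# Keep only training times whose stencil shares no slice with the test block,
# so no snapshot contributes to both a training and a test derivative target:
train_times = [t for t in usable
               if not ({t - 1, t, t + 1} & test_slices)]
print("train:", train_times, "test:", test_times)   # train: [1, 2, 3, 4]
```

Note that naively excluding only the test *times* themselves would still leave shared stencil slices (here, slice 6 between training target $t=5$ and test target $t=7$); the exclusion must operate on slices, not times.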
-
Reproducibility-critical algorithmic details are missing or incomplete: spectral filter definition/cutoffs, FFT conventions and dealiasing, temporal-difference stencil specifics, sparse regression hyperparameters and stopping criteria, and integration/CFL settings (Sec. 2.3–2.6, Sec. 3.3.2).
Recommendation: Expand Sec. 2.3–2.6 with concrete specifications: (i) filter type (sharp/Gaussian/exponential), cutoff wavenumber(s), and whether filtering is per-field; (ii) FFT normalization and k-grid definition (including $2\pi/L$); (iii) aliasing/dealiasing treatment; (iv) temporal stencil used and how endpoints are handled; (v) sparse regression pseudocode (initial solve, threshold rule, refit procedure, stopping criteria, any ridge term); (vi) hyperparameter selection protocol (search range, validation criterion) and the chosen values per equation; and (vii) RK4/CFL details (CFL number, max $dt$, typical substeps).
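As a concrete template for item (v), here is a minimal sketch of sequentially thresholded least squares with a ridge term; the hyperparameter values are illustrative placeholders, not the paper's settings:

```python
import numpy as np

def stlsq(theta, dudt, threshold=0.1, ridge=1e-6, max_iter=10):
    """Sequentially thresholded (ridge) least squares, as in SINDy-style
    regression: initial solve, hard-threshold small coefficients, refit on
    the surviving support, and repeat until convergence or max_iter."""
    n = theta.shape[1]
    xi = np.linalg.solve(theta.T @ theta + ridge * np.eye(n),
                         theta.T @ dudt)            # initial ridge solve
    for _ in range(max_iter):
        small = np.abs(xi) < threshold              # threshold rule
        xi[small] = 0.0
        big = ~small
        if not big.any():
            break
        sub = theta[:, big]                         # refit surviving terms
        xi[big] = np.linalg.solve(sub.T @ sub + ridge * np.eye(big.sum()),
                                  sub.T @ dudt)
    return xi
```

Stating the algorithm at this level of detail (together with the chosen threshold, ridge strength, and stopping rule per equation) would make the regression fully reproducible.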
-
Feature scaling / intercept handling is internally unclear and affects both null-space claims and coefficient interpretation (Sec. 2.5, Sec. 3.2). The text states standard-scaling $\Theta$ (mean 0, std 1) while including a constant feature (1). A constant column has zero variance and cannot be standard-scaled without special handling; additionally, exact linear identities among raw features generally become affine relations after centering unless an intercept is handled consistently.
Recommendation: Explicitly state how the constant/intercept is handled: whether the constant column is excluded from scaling, whether an intercept is fit separately, and whether targets ($\partial_t$ fields) are centered/scaled. Provide the exact forward transform to scaled $\Theta$ and the back-transform for coefficients (including any intercept correction). Clarify whether null-space relations are claimed for the raw library or the scaled design matrix actually used in regression, and show the transformed dependence (linear vs affine) accordingly.
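One internally consistent convention (an assumption for illustration, not necessarily what the paper does) is to leave the constant column unscaled, center the target, and fold the means into an explicit intercept on the way back to raw coordinates:

```python
import numpy as np

def fit_scaled(theta_raw, y):
    """Standard-scale non-constant columns, fit on centered targets, and
    back-transform coefficients to the raw library with an intercept
    correction. Zero-variance (constant) columns are left unscaled; after
    centering they become zero columns and receive zero slope."""
    mu, sigma = theta_raw.mean(axis=0), theta_raw.std(axis=0)
    sigma[sigma == 0] = 1.0                      # guard constant columns
    theta_s = (theta_raw - mu) / sigma           # forward transform
    xi_s, *_ = np.linalg.lstsq(theta_s, y - y.mean(), rcond=None)
    xi_raw = xi_s / sigma                        # back-transform slopes
    intercept = y.mean() - mu @ xi_raw           # intercept correction
    return xi_raw, intercept
```

Whatever convention the paper actually uses, writing out the forward and inverse maps at this level removes the ambiguity about whether null-space identities hold for the raw or the scaled design matrix.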
-
Some numerical “cancellation” examples in the null-space discussion appear arithmetically inconsistent with the stated near-zero residual interpretation (Sec. 3.2–3.2.1). Reported residuals such as $-291.5015$, $-171.66317$, $-45.06$, and $+38.545$ are not “near zero” under typical tolerances, suggesting transcription/sign/grouping errors or a mismatch between what is being summed and what is being claimed.
Recommendation: Audit the cancellation examples in Sec. 3.2–3.2.1: confirm the exact grouping (e.g., composite operator coefficient vs sum of constituent coefficients), ensure consistent sign conventions, and update the numbers and/or the explanatory text so the arithmetic matches the intended point. If the cancellation is approximate rather than near-exact, quantify it (e.g., relative residual) and explain why it is not closer to zero (e.g., scaling/centering effects).
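A scale-aware residual of the kind suggested above is straightforward to report; the sketch below uses illustrative numbers, not the paper's coefficients:

```python
import numpy as np

# Claimed cancellation: composite coefficient plus constituent coefficients
# should sum to ~0. Report the residual relative to the coefficient scale.
coeffs = np.array([120.3, -85.1, -34.9])         # illustrative grouping
residual = coeffs.sum()                           # raw signed sum
relative = abs(residual) / np.abs(coeffs).sum()   # scale-aware residual
print(f"raw residual = {residual:.4f}, relative residual = {relative:.2e}")
```

A raw sum of 0.3 looks non-negligible in isolation, but a relative residual of order $10^{-3}$ supports a near-cancellation claim; the paper's reported values in the hundreds would need the same normalization (or a correction) to sustain the "near zero" interpretation.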