A researcher heading to their first ML conference is worried their stock index forecasting paper won't cut it — the model has weak predictive power and stumbles on regime shifts. The r/MachineLearning community's verdict: that's probably fine, and maybe even the point.
What's happening
The work uses a random forest with SHAP explainability to forecast a stock index from macroeconomic variables. The researcher handled non-stationarity correctly, but the model's predictive lift is modest. More interestingly, SHAP analysis revealed that the model fails on regime shifts (for example, oil flipping from asset to liability across different economic periods) because it never learned the inverted relationship. The researcher is questioning whether that's worth presenting at a local conference.
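The failure mode is easy to reproduce on synthetic data. A minimal sketch, with hypothetical variables and coefficients and assuming scikit-learn (the thread does not show the paper's actual features or SHAP pipeline): train a random forest in one regime, then evaluate it in a regime where the oil relationship has inverted.

```python
# Hypothetical illustration, not the paper's data or code: a random forest
# trained on one economic regime fails when a feature's sign flips.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 2000
oil = rng.normal(size=n)
rates = rng.normal(size=n)
X = np.column_stack([oil, rates])

# Regime A: oil is a positive driver of the index; regime B: the sign inverts.
y_a = 2.0 * oil - 0.5 * rates + rng.normal(scale=0.1, size=n)
y_b = -2.0 * oil - 0.5 * rates + rng.normal(scale=0.1, size=n)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:1500], y_a[:1500])  # trained only on regime A

# In-regime error is small; out-of-regime error blows up because the
# inverted oil relationship was never seen during training.
in_regime_err = np.mean((model.predict(X[1500:]) - y_a[1500:]) ** 2)
out_regime_err = np.mean((model.predict(X[1500:]) - y_b[1500:]) ** 2)
print(f"in-regime MSE:     {in_regime_err:.3f}")
print(f"out-of-regime MSE: {out_regime_err:.3f}")
```

Running SHAP on such a model (e.g. `shap.TreeExplainer`) would attribute the out-of-regime error concentration to the oil feature, which is the kind of diagnostic the thread treats as the paper's real contribution.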
Why it matters
This is a recurring tension in applied ML research: the field structurally rewards benchmark-beating results, while interpretability work, negative findings, and honest diagnostics remain genuinely underrepresented in the literature. A paper that clearly scopes its claims ("here's what this model reveals about market regimes, not a trading signal") is a legitimate contribution. Local and workshop-tier venues in particular tend to reward intellectual honesty over inflated benchmarks. The SHAP regime-shift finding is actually the most interesting result here, and it points directly toward future work on temporally aware or regime-conditional models.
What to watch
For early-career researchers, the lesson is that framing drives acceptance as much as raw results do. Reviewers at smaller venues are looking for clear problem definition, methodological rigor, and honest discussion of limitations — not state-of-the-art numbers. If the contribution is "we applied interpretability tools to a hard forecasting problem and found a specific, explainable failure mode," that's a defensible paper. Overselling it as a predictive breakthrough would be the actual mistake.