Can a model predict the market?
A common question: "What if we feed all of candlestick and indicator theory into a model and ask for the probability of a rise?" We did exactly that, and honestly: 19 price-action features plus derivatives-market features, logistic regression and gradient boosting, and walk-forward out-of-sample validation (train on the past, test on the future, no shuffling). Here's what came out.
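To make "train on the past, test on the future, no shuffling" concrete, here is a minimal sketch of an expanding-window walk-forward splitter. The function name `walk_forward_splits` and its parameters (`n_folds`, `min_train`) are illustrative assumptions, not the exact setup used for the table below.

```python
from typing import Iterator, List, Tuple


def walk_forward_splits(
    n_samples: int, n_folds: int, min_train: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs in chronological order.

    Each fold trains on every row before its test block (expanding window),
    so no future row ever leaks into training.
    """
    if n_folds < 1 or min_train < 1 or min_train >= n_samples:
        raise ValueError("need n_folds >= 1 and 0 < min_train < n_samples")
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        start = min_train + k * test_size
        # The last fold absorbs any remainder rows.
        stop = n_samples if k == n_folds - 1 else start + test_size
        yield list(range(0, start)), list(range(start, stop))


# Hypothetical sizes, chosen only to illustrate the shape of the splits.
splits = list(walk_forward_splits(n_samples=100, n_folds=4, min_train=40))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  ->  test {test_idx[0]}..{test_idx[-1]}")
```

Rows are assumed to be sorted by time already. The key property is that every training index precedes every test index in the same fold.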
Result: ≈coinflip
| Feature set | Samples | AUC (OOS) | Accuracy (OOS) |
|---|---|---|---|
| Candles only · Logistic | 20 923 | 0.489 | 50.0% |
| Candles only · Boosting | 20 923 | 0.491 | 50.3% |
| Candles + derivatives · Logistic | 4 819 | 0.499 | 50.5% |
An AUC of 0.5 is a coinflip. The trivial baseline (always predict 'down', since only 47.8% of samples rise) scores 52.2% accuracy, and no model beat it.
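The two yardsticks above can be sketched in a few lines. This is an illustrative pairwise AUC (quadratic in the sample count, so meant for small examples, not for 20 923 rows) plus the majority-class baseline. The function names and the toy labels are assumptions for the example.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Chance that a random 'rise' row scores above a random 'fall' row.

    Ties count as half a win, so a constant score gives exactly 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def majority_baseline_accuracy(rise_rate: float) -> float:
    """Accuracy of always predicting the more frequent class."""
    return max(rise_rate, 1.0 - rise_rate)


# Toy data: a score that carries no information vs. one that separates perfectly.
constant_auc = auc([1, 0, 1, 0], [0.3, 0.3, 0.3, 0.3])
perfect_auc = auc([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.2])

# Rise rate taken from the text: 47.8% of samples rise.
baseline = majority_baseline_accuracy(0.478)
print(constant_auc, perfect_auc, round(baseline, 3))
```

With a 47.8% rise rate, always answering 'down' is right 52.2% of the time, which is why 50.0-50.5% accuracy falls short of the baseline.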
The trap walk-forward exists to catch
In-sample (on the same history), single features look tempting, with AUC up to 0.55. Out-of-sample, it all collapses to about 0.49. That's overfitting: it's easy to 'find' an edge on the past that vanishes on new data. Had we shuffled the data or looked only in-sample, we'd have fooled ourselves with a '0.55 discovery'. That's why the only honest judge is testing on the future.
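The effect is easy to reproduce with pure noise. The sketch below generates random labels and random features (all synthetic, with hypothetical sizes), picks the feature that looks best on the first half, and scores that same feature on the second half. It illustrates the selection effect only and does not use the article's data.

```python
import random


def auc(labels, scores):
    """Pairwise AUC; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)  # fixed seed so the run is repeatable
n_train, n_test, n_features = 300, 300, 40  # hypothetical sizes

# Labels and features are independent noise: there is no real edge to find.
labels = [rng.randint(0, 1) for _ in range(n_train + n_test)]
features = [
    [rng.gauss(0.0, 1.0) for _ in range(n_train + n_test)]
    for _ in range(n_features)
]
y_train, y_test = labels[:n_train], labels[n_train:]


def train_auc(j):
    return auc(y_train, features[j][:n_train])


# "Discover" the feature (and direction) that looks best on the past.
best = max(range(n_features), key=lambda j: abs(train_auc(j) - 0.5))
sign = 1.0 if train_auc(best) >= 0.5 else -1.0

in_sample = auc(y_train, [sign * x for x in features[best][:n_train]])
out_of_sample = auc(y_test, [sign * x for x in features[best][n_train:]])
print(f"in-sample AUC {in_sample:.3f}, out-of-sample AUC {out_of_sample:.3f}")
```

Because the winner is chosen among many candidates, its in-sample AUC sits above 0.5 by construction. On the held-out half the same feature has no such advantage, so its AUC is expected to land near 0.5.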
Candles and derivatives added nothing
Candlestick patterns (engulfing, hammer, star) carry ≈0 weight in the model: no signal. Derivatives data (OI, funding, crowd skew), available only on a shorter window (4 819 samples), didn't help either (AUC 0.50 → 0.50). It's not about missing one magic feature. The predictable part of the market is simply tiny.
The only lead — positioning
What does show up: extreme crowd skew (crowd_z), traded as a contrarian signal, is right ~57% of the time in both directions. When the crowd has piled into longs, price more often goes down; when it has piled into shorts, price more often goes up. But that's measured on ~6 days: a lead, NOT proof. We journal it live as a 'crowd skew → contrarian' signal and honestly watch whether the edge holds. Selling flow (CVD), meanwhile, does not help, because buyers absorb it.
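One way to see why a ~57% hit rate on a short window is a lead rather than proof is to put a confidence interval around it. The sketch below uses the Wilson score interval. The signal counts (100 and 2 000) are hypothetical, since the number of crowd_z signals in the ~6 days is not stated here.

```python
import math
from typing import Tuple


def wilson_interval(hits: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a hit rate (z = 1.96 is roughly 95%)."""
    if n <= 0 or not 0 <= hits <= n:
        raise ValueError("need n > 0 and 0 <= hits <= n")
    p = hits / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical counts: the same 57% hit rate at two different sample sizes.
lo_small, hi_small = wilson_interval(hits=57, n=100)
lo_large, hi_large = wilson_interval(hits=1140, n=2000)

print(f"n=100:  [{lo_small:.3f}, {hi_small:.3f}]")
print(f"n=2000: [{lo_large:.3f}, {hi_large:.3f}]")
```

At the smaller count the interval still contains 0.5, so the result is compatible with a coinflip. Only with far more signals does the same 57% clear 0.5. The interval also assumes independent trials, which overlapping signals from a few consecutive days may not satisfy, so it is if anything optimistic.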
Takeaway
Even a model doesn't predict direction better than a coinflip. That's not a model defect but a property of a near-efficient market. Any 'holy grail of indicators' is 50/50 out-of-sample. The only thing worth measuring further is rare conditional states (positioning extremes), and only on future data, not on faith.