REGIME DETECTION
Part 5: Bull, Bear, or Sideways — Let the Models Decide
Omega Arena • February 2026 • IN PROGRESS
Abstract. Part 4 trained models to predict price direction. This part takes a different approach:
regime detection. Instead of asking "will price go up tomorrow?", the question becomes "what kind of market is this?"
Three models—HMM (unsupervised), Random Forest (supervised), and BiLSTM with Attention—work together to classify
market conditions as Bull, Bear, or Sideways. With 100% hindsight-accurate labels and 203 features,
these models learn from perfect historical truth.
1. WHY REGIME DETECTION?
Part 4's price prediction models achieved AUC scores around 0.52-0.57. Better than random, but modest. The problem? Markets behave differently in different conditions.
A strategy that works in a bull market may fail catastrophically in a bear market. The same signal that means "buy the dip" in an uptrend means "catch a falling knife" in a downtrend.
Regime detection provides context. Instead of one model trying to predict everything,
specialized models first identify the market state. Then other models can adapt their behavior accordingly.
The Three Regimes
| Regime | Daily Label | Weekly/Monthly Label | Definition |
| BULL | UP | BULL | Positive % change |
| BEAR | DOWN | BEAR | Negative % change |
| SIDEWAYS | SAME | SIDEWAYS | Zero % change |
2. THE DATASET: ENTERPRISE-GRADE
No corners were cut. This dataset represents months of data engineering.
Data Sources
| Source | Features | Description |
| Level 1 | 43 | RSI, MACD, Bollinger Bands, ATR, ADX, Ichimoku, etc. |
| Level 2 | 18 | Sharpe, Sortino, VaR, CVaR, Max Drawdown, etc. |
| Level 3 | 52 | Volatility regimes, trend strength, fear/greed, exhaustion |
| Level 5 | 19 | VIX, DXY, SPY correlation, macro signals |
| Level 6 | 21 | COT data, yield curve, credit spreads |
| Level 7 | 15 | Context-aware: ATH %, 52-week range, halving cycle |
| FRED | 13 | Economic indicators from Federal Reserve |
| TOTAL | 203 | After processing: 235 (with one-hot encoding) |
Temporal Split
| Split | Date Range | Rows | Purpose |
| Train | 2014-09-17 → 2023-12-31 | 163,574 | Model learning |
| Validation | 2024-01-01 → 2024-12-31 | 34,402 | Hyperparameter tuning |
| Test | 2025-01-01 → 2026-01-26 | 35,531 | Final evaluation |
No data leakage. Strict temporal ordering ensures models never see future data during training.
Test data is truly unseen—from a year the models have never encountered.
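Under the hood, a strict temporal split is just date filtering after sorting. A minimal sketch, assuming a pandas DataFrame with a datetime `date` column (the column name is illustrative):

```python
import pandas as pd

def temporal_split(df: pd.DataFrame, date_col: str = "date"):
    """Split strictly by date so no future rows ever leak into training."""
    df = df.sort_values(date_col)
    train = df[df[date_col] <= "2023-12-31"]
    val = df[(df[date_col] >= "2024-01-01") & (df[date_col] <= "2024-12-31")]
    test = df[df[date_col] >= "2025-01-01"]
    return train, val, test
```

Because the boundaries are dates rather than row indices, every asset is cut at the same point in time, which is what prevents cross-asset leakage.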
Why Level 7 Context Features?
Standard technical indicators (Levels 1-3) capture short-term patterns: RSI, MACD, Bollinger Bands operate on windows of 14-50 days. But regime detection requires longer-term context. A price at $50K means something completely different when it's the all-time high versus when it's 50% below ATH.
The problem: Existing features couldn't answer questions like "Where is the market in the bigger picture?"
Level 7 was engineered specifically to provide this missing context for regime classification.
| Feature Category | What It Captures | Why It Matters for Regimes |
| ATH Context | Distance from all-time high, days since ATH | Bull markets push ATHs; bear markets drift away from them |
| 52-Week Range | Position within yearly high/low range | Near yearly lows = possible accumulation; near highs = possible distribution |
| Period Returns | YTD return, yearly return, multi-period momentum | Regime persistence: bull years stay bullish, bear years stay bearish |
| Seasonality | Day of week, month, quarter, quarter-end flags | Historical patterns: "Sell in May", Q4 rallies, weekend effects |
| Halving Cycle | Days since/until Bitcoin halving, cycle position % | Crypto-specific: halvings historically correlate with bull market onsets |
Without Level 7, models would see identical feature patterns in completely different market contexts. With Level 7, a -5% daily drop at the all-time high looks different from a -5% drop at yearly lows, because it IS different.
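The ATH and 52-week features reduce to a few rolling operations over the price series. A sketch in pandas, assuming daily close prices (feature names are illustrative, not the dataset's actual column names):

```python
import pandas as pd

def level7_context(prices: pd.Series) -> pd.DataFrame:
    """Sketch of two Level 7 context features: distance from the running
    all-time high, and position within the trailing 52-week range."""
    ath = prices.cummax()                              # running all-time high
    pct_from_ath = prices / ath - 1.0                  # 0 at ATH, negative below
    lo52 = prices.rolling(365, min_periods=1).min()    # 52-week low (daily bars)
    hi52 = prices.rolling(365, min_periods=1).max()    # 52-week high
    span = (hi52 - lo52).replace(0, float("nan"))      # avoid divide-by-zero
    range_pos = (prices - lo52) / span                 # 0 = at low, 1 = at high
    return pd.DataFrame({
        "pct_from_ath": pct_from_ath,
        "range_pos_52w": range_pos,
    })
```

Both features are computed only from past prices (cumulative max, trailing windows), so they respect the no-leakage rule above.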
3. THE LABELS: 100% HINDSIGHT ACCURACY
This is the key insight that makes regime detection different from price prediction.
Hindsight is 20/20. When labeling historical data, exactly what happened is known.
If the price went up, it's UP. If it went down, it's DOWN. No thresholds. No guessing.
The model's job is to learn which feature patterns correspond to which outcomes.
Label Distribution (Train Set)
| Label Type | UP/BULL | DOWN/BEAR | SAME/SIDEWAYS |
| Daily Direction | ~50% | ~49% | ~1% |
| Weekly Regime | ~51% | ~48% | ~1% |
| Monthly Regime | ~52% | ~47% | ~1% |
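The hindsight labeling rule above can be stated in a few lines. A sketch, assuming a close-price series; `horizon` would be 1 for daily, 7 for weekly, and roughly 30 for monthly labels (trailing rows whose future is still unknown get a NaN return here and would be dropped in practice):

```python
import numpy as np
import pandas as pd

def hindsight_labels(close: pd.Series, horizon: int = 1) -> pd.Series:
    """Label each day by the sign of the realized future % change.
    No thresholds: any positive change is UP, any negative is DOWN."""
    fwd_ret = close.shift(-horizon) / close - 1.0
    return pd.Series(
        np.select([fwd_ret > 0, fwd_ret < 0], ["UP", "DOWN"], default="SAME"),
        index=close.index,
    )
```

Exact-zero changes are rare in float prices, which is why the SAME/SIDEWAYS class sits near 1% in the distribution table above.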
4. MODEL 1: HIDDEN MARKOV MODEL (HMM)
Unsupervised learning. HMM doesn't see the labels. It discovers hidden states purely from the feature patterns.
Why HMM?
- Natural regime switching: Markets transition between states—HMM models this explicitly
- No label bias: Discovers patterns that no labeling scheme anticipated
- Interpretable: Each state has clear statistical properties
Configuration
| Parameter | Value | Reason |
| Features | 219 (numeric only) | HMM requires continuous features |
| States to try | 2, 3, 4, 5 | Find optimal number via BIC |
| Covariance | Diagonal | Numerical stability with many features |
| Iterations | 300 | Ensure convergence |
| Initializations | 10 | Avoid local minima |
HMM Status
| Step | Status |
| Data preparation | DONE |
| Training script | DONE |
| Training execution | PENDING |
| State analysis | PENDING |
5. MODEL 2: RANDOM FOREST
Supervised learning. Random Forest sees the hindsight-accurate labels and learns to predict them from features.
Why Random Forest?
- Handles tabular data excellently: Gold standard for structured data
- Feature importance: Tells us which metrics matter most
- Robust to noise: Ensemble of trees averages out errors
- No scaling required: Tree-based models don't need normalized data
Configuration
| Parameter | Search Range |
| n_estimators | 500 - 1,250 |
| max_depth | 15 - 35, None |
| min_samples_split | 2 - 15 |
| min_samples_leaf | 1 - 6 |
| max_features | sqrt, log2, 0.3, 0.5 |
| class_weight | balanced, balanced_subsample |
Training Approach
- RandomizedSearchCV: 100 hyperparameter combinations
- 5-fold cross-validation: Robust performance estimation
- Scoring metric: F1-macro (balanced across classes)
- Three separate models: Daily, Weekly, Monthly predictions
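Wired together with scikit-learn, the search could look like this sketch. The specific grid values inside the stated ranges are assumptions; one such search would be run per horizon (daily, weekly, monthly):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Candidate values drawn from the search ranges in the table above
param_distributions = {
    "n_estimators": [500, 750, 1000, 1250],
    "max_depth": [15, 25, 35, None],
    "min_samples_split": [2, 5, 10, 15],
    "min_samples_leaf": [1, 2, 4, 6],
    "max_features": ["sqrt", "log2", 0.3, 0.5],
    "class_weight": ["balanced", "balanced_subsample"],
}

def make_search(n_iter=100, cv=5):
    """100 random combinations, 5-fold CV, F1-macro scoring."""
    return RandomizedSearchCV(
        RandomForestClassifier(n_jobs=-1, random_state=0),
        param_distributions,
        n_iter=n_iter,
        scoring="f1_macro",
        cv=cv,
        random_state=0,
    )
```

F1-macro averages F1 equally across UP/DOWN/SAME, so the ~1% SAME class is not drowned out by accuracy on the majority classes.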
Random Forest Status
| Step | Status |
| Data preparation | DONE |
| Training script | DONE |
| Training execution | PENDING |
| Feature importance analysis | PENDING |
6. MODEL 3: BIDIRECTIONAL LSTM + ATTENTION
Deep learning for sequences. LSTM processes 90 consecutive days and predicts the next day's regime.
Why LSTM with Attention?
- Sequence modeling: Learns temporal patterns across 90 days
- Bidirectional: Reads sequences forward and backward
- Attention mechanism: Learns which days in the sequence matter most
- Multi-task: One model predicts daily, weekly, AND monthly simultaneously
Architecture
| Component | Configuration |
| Input | (batch, 90 days, 235 features) |
| LayerNorm | Normalize inputs |
| BiLSTM | 3 layers, hidden_size=256 |
| Attention | Learn important timesteps |
| Shared Dense | 256 → 128 with dropout |
| Output Heads | 3 separate heads (daily/weekly/monthly) |
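A PyTorch sketch of this architecture, with sizes from the table as defaults. The attention form (simple additive scoring over timesteps) is an assumption, since the article does not specify it:

```python
import torch
import torch.nn as nn

class RegimeBiLSTM(nn.Module):
    """BiLSTM with attention pooling and three classification heads
    (daily / weekly / monthly), one forward pass for all horizons."""
    def __init__(self, n_features=235, hidden=256, layers=3,
                 n_classes=3, dropout=0.3):
        super().__init__()
        self.norm = nn.LayerNorm(n_features)
        self.lstm = nn.LSTM(n_features, hidden, num_layers=layers,
                            batch_first=True, bidirectional=True,
                            dropout=dropout)
        self.attn = nn.Linear(2 * hidden, 1)          # score per timestep
        self.shared = nn.Sequential(                  # shared dense 256 -> 128
            nn.Linear(2 * hidden, 256), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(256, 128), nn.ReLU(), nn.Dropout(dropout),
        )
        self.heads = nn.ModuleDict({
            h: nn.Linear(128, n_classes) for h in ("daily", "weekly", "monthly")
        })

    def forward(self, x):                             # x: (batch, 90, features)
        out, _ = self.lstm(self.norm(x))              # (batch, 90, 2*hidden)
        w = torch.softmax(self.attn(out), dim=1)      # attention over timesteps
        ctx = (w * out).sum(dim=1)                    # weighted context vector
        z = self.shared(ctx)
        return {h: head(z) for h, head in self.heads.items()}
```

The multi-task loss would sum cross-entropy over the three heads, letting the daily, weekly, and monthly objectives regularize one another through the shared trunk.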
Training Configuration
| Parameter | Value |
| Sequence length | 90 days |
| Batch size | 128 |
| Epochs | 150 (with early stopping) |
| Learning rate | 1e-3 |
| Dropout | 0.3 |
| Early stopping patience | 20 epochs |
| Optimizer | AdamW with weight decay |
| Scheduler | ReduceLROnPlateau |
| Class weighting | Inverse frequency |
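Inverse-frequency class weighting reduces to one formula: each class gets weight total / (n_classes × count), so rare classes (like the ~1% SAME label) contribute proportionally more to the loss. A minimal sketch:

```python
import numpy as np

def inverse_frequency_weights(labels: np.ndarray) -> np.ndarray:
    """Per-class weights inversely proportional to class frequency,
    suitable for e.g. CrossEntropyLoss(weight=torch.tensor(w))."""
    _, counts = np.unique(labels, return_counts=True)
    return counts.sum() / (len(counts) * counts)
```

With this normalization every class contributes the same total weight (weight × count is constant across classes).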
LSTM Data
| Split | Sequences | Shape | Size |
| Train | 154,844 | (154K, 90, 235) | 13.1 GB |
| Validation | 25,942 | (26K, 90, 235) | 2.2 GB |
| Test | 27,020 | (27K, 90, 235) | 2.3 GB |
LSTM Status
| Step | Status |
| Data preparation (90-day sequences) | DONE |
| Model architecture | DONE |
| Training script | DONE |
| Training execution (GPU) | PENDING |
7. DATA PIPELINE: 10 VERIFICATION CHECKS
Enterprise-grade data preparation means verifying everything multiple times.
| # | Check | Result |
| 1 | DB columns match dataset | ✓ PASS |
| 2 | All table features present | ✓ PASS |
| 3 | Sample data matches DB | ✓ PASS |
| 4 | Row counts match | ✓ PASS |
| 5 | No empty columns | ✓ PASS |
| 6 | Data types correct | ✓ PASS |
| 7 | No duplicates, price integrity | ✓ PASS |
| 8 | Date continuity | ✓ PASS |
| 9 | Random sample cross-validation | ✓ PASS |
| 10 | Label consistency (sign = direction) | ✓ PASS |
10/10 checks passed. The dataset is verified, cleaned, and ready for training.
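Two of these checks are easy to illustrate. A sketch, assuming per-asset daily rows with hypothetical column names (a full continuity check would also compare dates against a trading calendar):

```python
import pandas as pd

def check_date_continuity(df, date_col="date", asset_col="asset"):
    """Check 8 (simplified): within each asset, dates are unique."""
    ok = True
    for _, g in df.groupby(asset_col):
        ok &= pd.to_datetime(g[date_col]).is_unique
    return bool(ok)

def check_label_consistency(df, ret_col="daily_return", label_col="daily_label"):
    """Check 10: each label's sign matches the realized return's sign."""
    sign = df[ret_col].apply(
        lambda r: "UP" if r > 0 else "DOWN" if r < 0 else "SAME")
    return bool((sign == df[label_col]).all())
```

Check 10 is the critical one for this dataset: if it fails, the "100% hindsight accuracy" premise of Section 3 is broken.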
8. INFRASTRUCTURE
Training requires significant compute resources.
| Model | Compute | Memory | Est. Time |
| HMM | CPU | ~4 GB RAM | 1-2 hours |
| Random Forest | CPU (all cores) | ~8-16 GB RAM | 2-4 hours |
| LSTM | GPU (A100/L40) | ~30-40 GB VRAM | 4-8 hours |
9. CURRENT STATUS
| Component | Status | Notes |
| Dataset | DONE | 233K rows × 203 features × 97 assets |
| ML Final Datasets | DONE | train/val/test splits, ~18 GB total |
| HMM Data | DONE | 219 numeric features, 137 MB |
| RF Data | DONE | 235 features, 212 MB |
| LSTM Data | DONE | 90-day sequences, 17.6 GB |
| HMM Training | PENDING | Ready to run on RunPod CPU |
| RF Training | PENDING | Ready to run on RunPod CPU |
| LSTM Training | PENDING | Ready to run on RunPod GPU |
| Ensemble | PENDING | After individual models complete |
10. NEXT STEPS
- Upload 18 GB dataset to RunPod network volume
- Train HMM (CPU pod, ~2 hours)
- Train Random Forest (CPU pod, ~4 hours)
- Train LSTM (GPU pod A100/L40, ~6 hours)
- Analyze results and compare models
- Build ensemble voting system
- Integrate with Part 6 (Claude Opus 4.5 decision layer)
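The ensemble voting system is not yet specified; a plain majority vote over the three models' per-day regime calls could start as simply as this sketch (the tie-breaking rule, falling back to the first model, is an assumption):

```python
import numpy as np

def majority_vote(*predictions: np.ndarray) -> np.ndarray:
    """Combine per-model regime predictions by simple majority;
    ties fall back to the first model's vote."""
    stacked = np.stack(predictions)            # (n_models, n_samples)
    out = []
    for col in stacked.T:                      # one column per sample
        vals, counts = np.unique(col, return_counts=True)
        winners = vals[counts == counts.max()]
        out.append(col[0] if len(winners) > 1 else winners[0])
    return np.array(out)
```

A weighted variant (e.g. weighting each model by its validation F1-macro) would be the natural next refinement once individual results are in.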
Part 5 Status: IN PROGRESS
Data preparation complete. Training scripts ready. Awaiting execution on RunPod infrastructure.
© 2026 Omega Arena