5 ONNX Models for MT5 Trading: LSTM, XGBoost, CNN, Random Forest, Transformer
Five machine learning architectures, five real trading use cases, all exportable to ONNX and consumable in MT5 Expert Advisors. For each: what it’s for, when it makes sense, how to export, minimal example.
⚠️ Honest disclaimer: No model, however sophisticated, guarantees profit. What separates a professional from an amateur trader is not the model; it is feature engineering, rigorous validation, risk management, and discipline.
Quick comparison
| Model | When to use | Complexity | ONNX export |
|---|---|---|---|
| Random Forest | Baseline, simple classification | Low | skl2onnx |
| XGBoost | Rich tabular features | Medium | onnxmltools |
| 1D CNN | Local patterns in time series | Medium | torch.onnx.export |
| LSTM/GRU | Long temporal dependencies | High | torch.onnx.export |
| Transformer | Multi-asset, multi-feature | Very high | torch.onnx.export |
1. Random Forest — the honest baseline
Framework: Sklearn · Training: minutes · Model size: ~1-10 MB · Inference: < 1 ms
Binary direction classification (up/down next candle) or multi-class. The first model every AI-in-trading project should try: if a Random Forest with 200 trees doesn’t beat chance (about 50% on a balanced up/down target), your problem is in the features, not the model. It trains in about a minute and has few hyperparameters to lose sleep over.
```python
from sklearn.ensemble import RandomForestClassifier
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# X_train, y_train: feature matrix and labels prepared beforehand
model = RandomForestClassifier(n_estimators=200, max_depth=8)
model.fit(X_train, y_train)

# Declare a float32 input of shape (batch, n_features)
n_features = X_train.shape[1]
initial_type = [('input', FloatTensorType([None, n_features]))]
onnx_model = convert_sklearn(model, initial_types=initial_type, target_opset=15)

with open("rf.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
```
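Whether a Random Forest really "beats random" depends on the class balance and on how many test samples you have. The sketch below is one way to check it, not a definitive test: `beats_baseline` is a hypothetical helper, the baseline is the accuracy of always predicting the most common class, and the 1.96 standard-error margin (a normal approximation to the binomial) is an assumption you may want to tighten.

```python
import math

def beats_baseline(y_true, y_pred, z=1.96):
    """Return (accuracy, baseline, is_better).

    baseline is the accuracy of always predicting the most common class.
    is_better is True only if accuracy exceeds the baseline by more than
    z standard errors (normal approximation, an assumption of this sketch).
    """
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    counts = {}
    for t in y_true:
        counts[t] = counts.get(t, 0) + 1
    baseline = max(counts.values()) / n
    std_err = math.sqrt(baseline * (1 - baseline) / n)
    return accuracy, baseline, accuracy > baseline + z * std_err

# Toy labels: 1 = up, 0 = down; 520 of 1000 predictions are correct
y_true = [1, 0] * 500
y_pred = [1, 0] * 260 + [0, 1] * 240
acc, base, ok = beats_baseline(y_true, y_pred)
print(acc, base, ok)
```

In this toy case 52% accuracy on 1,000 balanced samples does not clear the margin, which is why a small edge measured on a short test set should be treated with suspicion.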
2. XGBoost — Kaggle winner applied to markets
Framework: XGBoost · Training: minutes-hours · Inference: ~1-3 ms
Covers the same use cases as Random Forest but is typically 2-5% more accurate on tabular problems. Best when you have many features (50+ technical indicators, order book data, sentiment). Warning: it is more prone to overfitting on time series, so use early_stopping_rounds with a validation set that comes after the training data in time.
```python
import xgboost as xgb
from onnxmltools.convert import convert_xgboost
from onnxconverter_common.data_types import FloatTensorType

# Training stops once validation logloss has not improved for 20 rounds
model = xgb.XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.05,
                          early_stopping_rounds=20, eval_metric='logloss')
# X_val, y_val must come after X_train, y_train in time
model.fit(X_train, y_train, eval_set=[(X_val, y_val)])

n_features = X_train.shape[1]
initial_type = [('input', FloatTensorType([None, n_features]))]
onnx_model = convert_xgboost(model, initial_types=initial_type)

with open("xgb.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
```
3. 1D CNN — finds visual patterns in series
Framework: PyTorch/TF · Inference: ~1-5 ms
Identifies local patterns in time series: like a ‘scanner that looks for candle formations’, except that it learns the patterns itself instead of relying on hard-coded rules. Works well on short-term patterns spanning 5-30 candles, especially on synthetic indices (Deriv V75). Faster to train than an LSTM, and exports cleanly to ONNX.
```python
import torch
import torch.nn as nn

class CNN1D(nn.Module):
    def __init__(self, in_channels=4, seq_len=30, n_classes=2):
        super().__init__()
        self.conv1 = nn.Conv1d(in_channels, 32, kernel_size=5, padding=2)
        self.conv2 = nn.Conv1d(32, 64, kernel_size=3, padding=1)
        self.pool = nn.MaxPool1d(2)
        # Two pooling layers each halve the sequence length
        self.fc = nn.Linear(64 * (seq_len // 4), n_classes)

    def forward(self, x):
        # x: (batch, in_channels, seq_len)
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        return self.fc(x.flatten(1))

model = CNN1D()  # train the model before exporting
model.eval()
dummy_input = torch.randn(1, 4, 30)  # (batch, channels, seq_len)
torch.onnx.export(model, dummy_input, "cnn1d.onnx",
                  input_names=['input'], output_names=['output'], opset_version=15)
```
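The CNN above expects input shaped (batch, channels, seq_len). Here is a minimal sketch of building such windows from candle data with NumPy. It assumes the four channels are open, high, low, and close, which the article does not specify, and `make_windows` is a hypothetical helper name. Scaling or normalizing the windows is not shown.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(ohlc, seq_len=30):
    """Turn an (n_candles, 4) array into (n_windows, 4, seq_len) float32.

    Window i holds candles i .. i + seq_len - 1, oldest first.
    """
    # sliding_window_view appends the window axis last: (n_windows, 4, seq_len)
    windows = sliding_window_view(ohlc, window_shape=seq_len, axis=0)
    # Copy into a contiguous float32 array, the dtype ONNX float inputs use
    return np.ascontiguousarray(windows, dtype=np.float32)

# Synthetic stand-in for real candles (assumption: columns are O, H, L, C)
rng = np.random.default_rng(0)
ohlc = rng.random((100, 4))
X = make_windows(ohlc)
print(X.shape)
```

Each row of `X` can be passed to the model as a batch of one, or the whole array can serve as a training set once labels are aligned with the candle that follows each window.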
4. LSTM / GRU — memory for long dependencies
Framework: PyTorch/TF · Training: hours · Inference: ~2-10 ms
The ‘standard’ architecture for time series before Transformers. An LSTM keeps an internal state (memory) that carries information across dozens or hundreds of timesteps. Useful when distant context matters, such as volatility regimes that change slowly. Good for swing trading on H1/H4/D1. Warning: it easily suffers from data leakage if trained carelessly, so always use a strict temporal split and never shuffle. GRU is LSTM’s simpler cousin and often performs as well or better with fewer parameters.
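A minimal PyTorch sketch of an LSTM classifier and its ONNX export, in the same style as the CNN example. The class name `LSTMClassifier`, the helper `export_lstm`, the layer sizes, and the 60-candle window are illustrative assumptions, not values from the article.

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    """Hypothetical LSTM classifier; sizes are illustrative assumptions."""
    def __init__(self, n_features=4, hidden_size=32, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, n_classes)

    def forward(self, x):
        # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        # Classify from the output at the last timestep
        return self.fc(out[:, -1, :])

def export_lstm(model, seq_len=60, n_features=4, path="lstm.onnx"):
    """Export with a fixed (1, seq_len, n_features) input shape."""
    model.eval()
    dummy_input = torch.randn(1, seq_len, n_features)
    torch.onnx.export(model, dummy_input, path,
                      input_names=['input'], output_names=['output'],
                      opset_version=15)

model = LSTMClassifier()
with torch.no_grad():
    logits = model(torch.randn(8, 60, 4))  # 8 windows of 60 candles each
print(logits.shape)
```

After training, calling `export_lstm(model)` writes the file. Replacing `nn.LSTM` with `nn.GRU` needs no other change here, because both return the per-timestep outputs as the first element.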
5. Transformer — when you need more
Framework: PyTorch · Training: hours-days · Inference: ~10-50 ms
The architecture that dominated NLP, adapted for time series (Informer, Autoformer, PatchTST). Learns ‘attention’ relations across all timesteps simultaneously rather than processing them sequentially like an LSTM. Best for multi-asset problems (for example, predicting EURUSD from EURUSD + USD index + gold + bonds + VIX simultaneously).
⚠️ Reality check: The paper ‘Are Transformers Effective for Time Series Forecasting?’ (Zeng et al., AAAI 2023) showed that simple one-layer linear models can outperform Transformer-based models on long-term forecasting benchmarks. Don’t use a Transformer for the hype; use one when the problem really justifies it.
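In the spirit of that reality check, it is worth fitting a linear baseline before training anything larger. The sketch below is not a reproduction of the paper's models: it fits an ordinary least-squares model on lagged values and compares it with a naive "repeat the last value" forecast on a chronological hold-out. The function name `compare_linear_to_naive` and the synthetic AR(1) series are assumptions made for illustration; real market returns are far noisier.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def compare_linear_to_naive(series, n_lags=20, train_frac=0.8):
    """Return (mse_linear, mse_naive) on the chronological test split."""
    X = sliding_window_view(series[:-1], n_lags)   # row i: series[i : i + n_lags]
    y = series[n_lags:]                            # target: the value right after
    X_b = np.hstack([X, np.ones((len(X), 1))])     # add an intercept column
    split = int(len(y) * train_frac)               # no shuffling
    w, *_ = np.linalg.lstsq(X_b[:split], y[:split], rcond=None)
    pred = X_b[split:] @ w
    mse_linear = float(np.mean((pred - y[split:]) ** 2))
    mse_naive = float(np.mean((X[split:, -1] - y[split:]) ** 2))
    return mse_linear, mse_naive

# Synthetic AR(1) series as a stand-in for real data
rng = np.random.default_rng(42)
noise = rng.standard_normal(5000)
series = np.zeros(5000)
for t in range(1, 5000):
    series[t] = 0.5 * series[t - 1] + noise[t]

mse_linear, mse_naive = compare_linear_to_naive(series)
print(mse_linear, mse_naive)
```

If a Transformer cannot clearly beat `mse_linear` on the same split, the extra training time and inference latency are hard to justify.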
Which to choose for your project?
| Your situation | Start with |
|---|---|
| Never trained AI before | Random Forest |
| Have RF, want improvement | XGBoost |
| Suspect local visual patterns | 1D CNN |
| Swing trading, long context | LSTM or GRU |
| Multi-asset, complex features | Transformer |
| In doubt | Random Forest (always) |
Standard pipeline for any model
- Data collection via mt5.copy_rates_from_pos()
- Feature engineering: lagged returns, indicators, temporal context
- Temporal split 80/20 — never shuffle
- Train on the training split, evaluate on the held-out test split
- Export to ONNX with framework exporter
- Validate with onnxruntime in Python
- Embed in EA via #resource + OnnxCreateFromBuffer
- Backtest in MT5 Strategy Tester
- Demo trade 30+ days with risk management
- Live with minimal stake, gradual scaling
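The feature-engineering and temporal-split steps above can be sketched with NumPy alone. This is a minimal illustration under stated assumptions: `build_dataset` and `temporal_split` are hypothetical helpers, the features are just lagged log returns, and the synthetic `close` array stands in for the `close` field of the array returned by mt5.copy_rates_from_pos().

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def build_dataset(close, n_lags=5):
    """Features: the last n_lags log returns. Label: 1 if the next return is up."""
    returns = np.diff(np.log(close))
    X = sliding_window_view(returns[:-1], n_lags)   # row i: returns[i : i + n_lags]
    y = (returns[n_lags:] > 0).astype(np.int64)     # return that follows each row
    return X.astype(np.float32), y

def temporal_split(X, y, train_frac=0.8):
    """Chronological split: earliest rows train, latest rows test, no shuffling."""
    split = int(len(X) * train_frac)
    return X[:split], X[split:], y[:split], y[split:]

# Synthetic random-walk prices as a stand-in for real candles
rng = np.random.default_rng(7)
close = 100.0 * np.exp(np.cumsum(0.001 * rng.standard_normal(1001)))

X, y = build_dataset(close)
X_train, X_test, y_train, y_test = temporal_split(X, y)
print(X_train.shape, X_test.shape)
```

Every feature row uses only returns that precede its label, and every test row comes after every training row in time, which is what the "never shuffle" rule is protecting.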
Universal principles:
1. Simple working model > complex model that might work
2. Feature engineering beats architecture 9/10 times
3. Temporal validation is sacred — never shuffle financial series
4. A professional backtest uses 6+ months of out-of-sample data
5. Poor risk management kills more bots than bad models
