Overview
This application helps uncover potential predictive relationships between commodities and financial assets by analyzing historical lag correlations and testing them for predictive strength. The user selects a target asset (such as coffee) and a set of feature assets (like oil, gold, or gas); the tool then explores whether past price movements of those features, lagged by 3, 6, 9, or 12 months, show meaningful statistical relationships with the current price of the target asset.
Data Pipeline
Asset Pool Setup:
A predefined asset pool includes commodities and financial instruments such as gold, oil, gas, and coffee. The user selects a target asset and a custom set of feature assets from this pool.
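For illustration, the pool can be thought of as a mapping from asset names to data-source tickers. The names and Yahoo Finance futures symbols below are assumptions for the sketch, not the app's actual configuration:

```python
# Hypothetical asset pool; the tickers (Yahoo Finance futures symbols) and
# the choice of data source are illustrative assumptions.
ASSET_POOL = {
    "Gold": "GC=F",
    "Crude Oil": "CL=F",
    "Natural Gas": "NG=F",
    "Coffee": "KC=F",
}

# The user picks one target and any subset of the rest as features.
target = "Coffee"
feature_assets = [name for name in ASSET_POOL if name != target]
```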
Data Preparation & Lagging:
For each selected feature asset, the weekly price series is shifted to create lagged versions at 3, 6, 9, and 12 months. The lagged features are merged with the target asset’s current price data, creating a combined feature set that allows investigation of time-shifted relationships.
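A minimal sketch of the lagging step, assuming a pandas DataFrame of weekly closing prices with one column per asset (the column names and the month-to-week conversion are assumptions):

```python
import pandas as pd

def add_lagged_features(prices: pd.DataFrame, target: str,
                        lags_months=(3, 6, 9, 12)) -> pd.DataFrame:
    """Shift each feature column by 3/6/9/12 months (approximated in weeks)
    and merge the lagged columns with the target's current price."""
    weeks_per_month = 52 / 12
    out = prices[[target]].copy()
    for col in prices.columns.drop(target):
        for months in lags_months:
            shift = round(months * weeks_per_month)   # e.g. 3 months ~ 13 weeks
            out[f"{col}_lag{months}m"] = prices[col].shift(shift)
    return out.dropna()   # drop rows that lack a full set of lagged values
```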
Correlation Analysis:
The tool analyzes correlations between the target asset and each feature asset at each lag interval, highlighting which assets and time delays show potential statistical relationships.
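A sketch of how that ranking might be computed, assuming the merged frame produced above and a hypothetical `weekly_prices` DataFrame:

```python
# Rank lagged features by the absolute strength of their correlation with the
# target; `weekly_prices` and the "Coffee" target are illustrative names.
merged = add_lagged_features(weekly_prices, target="Coffee")
correlations = (
    merged.corr()["Coffee"]
          .drop("Coffee")
          .sort_values(key=lambda s: s.abs(), ascending=False)
)
print(correlations.head(10))   # strongest asset/lag pairs
```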
Granger Causality Testing:
Correlations are only part of the story — the application then applies Granger causality tests to these relationships to assess whether lagged price movements of one asset may help predict future movements of the target asset. The strongest Granger outcomes (with their respective optimal lag structures) are identified and incorporated into the feature set.
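One way to run such a test is with statsmodels' grangercausalitytests; the lag range scanned below and the use of the SSR F-test p-value are assumptions about the app's setup, not confirmed details:

```python
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

def best_granger_lag(target_series: pd.Series, feature_series: pd.Series,
                     max_lag_weeks: int = 13):
    """Test whether past values of `feature_series` help predict
    `target_series`, scanning lags 1..max_lag_weeks, and return the lag
    with the smallest SSR F-test p-value."""
    data = pd.concat([target_series, feature_series], axis=1).dropna()
    results = grangercausalitytests(data, maxlag=max_lag_weeks, verbose=False)
    pvalues = {lag: res[0]["ssr_ftest"][1] for lag, res in results.items()}
    best_lag = min(pvalues, key=pvalues.get)
    return best_lag, pvalues[best_lag]
```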
Model Architecture & Training
Once features are finalized and scaled, a custom-built neural network (MLP) is trained to model the target asset’s price movement, using the identified lag-based relationships as predictive signals. A code sketch of the architecture follows the list below.
- Layer Structure: The model uses a multi-layer perceptron with dense layers of 128, 64, and 32 neurons, each employing ReLU activations to capture complex, non-linear patterns between features and the target.
- Batch Normalization: Batch normalization layers follow each dense layer, ensuring stable, balanced activations — much like constantly adjusting the clarity and brightness of a screen so that learning remains sharp and consistent.
- Dropout Regularization: Dropout layers are applied after each batch normalization step to reduce overfitting by randomly deactivating nodes during training, forcing the model to learn deeper, more generalised structures rather than simply memorising noise.
- Output Layer: The model concludes with a single dense output unit for regression, predicting the target asset’s price.
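A sketch of that architecture in Keras; the dropout rate, optimizer, and loss below are assumptions, since the text only specifies the layer sizes and ordering:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_mlp(n_features: int, dropout_rate: float = 0.2) -> tf.keras.Model:
    """Dense 128 -> 64 -> 32 stack with batch normalization and dropout
    after each hidden layer, ending in a single regression output."""
    model = tf.keras.Sequential([
        layers.Input(shape=(n_features,)),
        layers.Dense(128, activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(dropout_rate),
        layers.Dense(64, activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(dropout_rate),
        layers.Dense(32, activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(dropout_rate),
        layers.Dense(1),   # single unit: predicted price of the target asset
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model
```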
Hyperparameters, dropout rates, and layer sizing have been tuned through iterative testing for stable convergence and reliable predictive outcomes. The app then displays an actual-versus-predicted price chart to show how closely the model’s predictions track the observed prices.
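The chart itself might be rendered with Streamlit's built-in line chart; the function and variable names here are illustrative, not the app's actual code:

```python
import pandas as pd
import streamlit as st

def show_actual_vs_predicted(actual: pd.Series, predicted) -> None:
    """Overlay actual and model-predicted prices on the app's results page."""
    chart_data = pd.DataFrame(
        {"Actual": actual.to_numpy(), "Predicted": predicted.ravel()},
        index=actual.index,
    )
    st.line_chart(chart_data)
```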
Model Evaluation & Success
This tool is not built for historical backtesting, as the sheer number of possible asset-lag combinations makes that impractical. Instead, it’s designed as a discovery engine to open your mind to new possibilities — revealing hidden relationships between asset prices and lagged market movements that might otherwise go unnoticed.
It’s not an oracle, but if it were, we'd be retired on a yacht, sipping cocktails, having cornered the commodities market! Until then, this app exists to help you explore and think differently about predictive market relationships.
Deployment
The app is fully deployed on Streamlit Cloud. Users can select their target asset, choose feature assets, run correlation and Granger tests, scale and train the model, and see actual-versus-predicted results — all via a clean, interactive interface.
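The interactive flow might look something like this in Streamlit; the widget labels, page title, and ASSET_POOL mapping are assumptions carried over from the earlier sketches:

```python
import streamlit as st

ASSET_POOL = {"Gold": "GC=F", "Crude Oil": "CL=F",
              "Natural Gas": "NG=F", "Coffee": "KC=F"}   # illustrative pool

st.title("Lagged Correlation & Granger Causality Explorer")  # hypothetical title
target = st.selectbox("Target asset", list(ASSET_POOL))
features = st.multiselect("Feature assets",
                          [a for a in ASSET_POOL if a != target])

if st.button("Run analysis"):
    # Placeholder: lagging, correlation, Granger tests, scaling, and model
    # training would be wired in here.
    st.write(f"Analysing {target} against {len(features)} feature assets...")
```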
Scalability
Currently focused on a core set of commodities and financial instruments, the system is designed to grow — allowing for easy expansion into additional asset classes and markets as demand and curiosity evolve.