Whitepaper v1.4

energyOS Technical Whitepaper

An overview of the infrastructure that powers cross-commodity correlation insight and market transparency. energyOS is a quantitative machine learning intelligence system designed to generate effective insight across energy relationships, data sources, and market conditions — producing actionable output for natural gas, heating oil, electricity, and adjacent commodity markets.

Model infrastructure: Neurosymbolic and ensemble architecture for cross-commodity correlation. Built with the intent to organize, empower, and increase the efficiency, accuracy, and transparency of energy markets.

Abstract

energyOS combines expansive multi-domain datasets with deep learning methods, including neural networks and autoencoders, alongside supervised machine learning and a large language model frontend. Together these layers filter noise, surface signals, and turn analytical output into a form that is immediately useful for market practitioners.

capability / 01
Cross-Commodity Correlation
Identifies and quantifies non-linear relationships between natural gas, crude oil, heating oil, power, and adjacent data domains — including relationships that do not appear under standard linear analysis.
capability / 02
Regime Detection
Bottleneck autoencoder trained on normal historical conditions detects when current market behavior diverges from baseline, flagging structural breaks and regime shifts as they emerge.
capability / 03
Fair Value Analysis
Estimates fundamental value across natural gas futures, heating oil futures, crude oil, and electricity using complementary Huber Regression and Quantile Gradient Boosting models with calibrated uncertainty bands.
capability / 04
Autonomous Operation
The entire data ingestion, normalisation, model inference, and signal synthesis pipeline runs without human intervention. All major models are retrained weekly to account for data drift and to adapt quickly to market changes.
capability / 05
LLM Intelligence Layer
A large language model frontend ingests all model outputs in parallel, reducing noise, resolving conflicting signals, and translating quantitative output into high-context, interpretable market insight.
capability / 06
Satellite Data Integration
Satellite imagery APIs track radiation changes at specific oil refinery coordinates, enabling upstream insight into production changes and refined product supply shifts before they appear in conventional market data.
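
To illustrate what "relationships that do not appear under standard linear analysis" can mean in practice (capability / 01), the following is a minimal sketch, not the platform's method. It compares a linear (Pearson) correlation with a rank-based (Spearman) correlation on toy data. The series names and the exponential relationship are assumptions chosen purely for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: measures linear association only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """Rank values 1..n (assumes no ties, which holds for the toy data below)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = float(rank)
    return out

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical toy data: series_b responds exponentially to series_a.
series_a = [float(i) for i in range(1, 11)]
series_b = [math.exp(x) for x in series_a]

linear_corr = pearson(series_a, series_b)
rank_corr = spearman(series_a, series_b)
print(f"Pearson:  {linear_corr:.3f}")
print(f"Spearman: {rank_corr:.3f}")
```

The relationship here is perfectly monotonic, so the rank correlation is at its maximum, while the linear coefficient understates the dependence. A production system would presumably use richer non-linear measures; this only shows why a linear coefficient alone can mislead.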

Introduction

Energy markets are difficult to predict when viewed in isolation. Market structure is shaped by a combination of economics, physics, and national and global events, which makes it both hard to understand and hard to forecast. Energy types carry specialized characteristics and interconnected relationships that span natural gas, refined products, electricity, and more. Infrastructure differences, subcategory dynamics, and market microstructure multiply the complexity further.

The challenge is not just the number of moving parts; it is the pace of change. Relationships in energy markets are not static; they evolve. The transition of ERCOT from Real-Time Dispatch (RTD) to Real-Time Co-Optimization (RTC) in December 2025 is a concrete example: a single ISO infrastructure change fundamentally altered how scarcity pricing, battery dispatch, and qualified scheduling entity economics interact. Any system without adaptive capabilities would have degraded immediately.

Design premise

An isolated system without adaptive capabilities does not last long in the energy domain. The sheer volume of domain-specific changes makes consistent model performance under a static architecture nearly impossible. energyOS is designed around this reality from the ground up.

The energyOS architecture addresses this through adaptive machine learning, continuous retraining, regime detection, and a multi-source data infrastructure that spans physical, financial, satellite, and news domains — with the explicit goal of building a system that adapts to market evolution rather than assuming market stationarity.

Data infrastructure

A cross-commodity system that can understand, capitalize on, and predict market dynamics requires large quantities of high-quality data that is updated consistently, fed into the model infrastructure, and turned into systematic output. The data pipeline is built around three principles: breadth, autonomy, and differentiation.

Data domains

domain / 01
Commodity & market data
Coal, natural gas, petroleum & refined products, electricity markets, market prices & curve structure, storage reports
↓ batched at publication interval (hourly · daily · weekly · monthly · annually)
domain / 02
Physical & environmental data
Satellite imagery (NASA), heating degree days (HDDs), weather & climate data, refinery utilization, grid stress metrics
↓ ingested via scheduled API calls · AWS transforms and stores
domain / 03
Unstructured & NLP sources
International news outlets, energy-specific media signals, geopolitical event feeds, regulatory filings

Satellite differentiation

The platform's satellite integration tracks radiation changes at specific oil refinery coordinate locations. This provides upstream insight into production changes — and by extension, refined product supply — before these shifts appear in conventional market reporting. Radiation quantities correlate with production volumes of heating oil, gasoline, and other refined products.

Autonomous integration

Data ingestion is batched at intervals that correspond to each source's publication frequency. The autonomy of this implementation is one of the most valuable pieces of infrastructure in the system. AWS functions handle the full pipeline (ingestion, transformation, and storage), enabling model training, learning, adaptation, and prediction without manual data operations.
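
Batching by publication frequency can be sketched as a simple due-check that a scheduled job might run. This is an illustrative sketch only: the source names, intervals, and the `sources_due` helper are hypothetical and do not describe the actual AWS implementation.

```python
from datetime import datetime, timedelta

# Hypothetical sources mapped to their assumed publication intervals.
PUBLICATION_INTERVALS = {
    "power_prices": timedelta(hours=1),
    "weather_hdd": timedelta(days=1),
    "gas_storage_report": timedelta(weeks=1),
}

def sources_due(last_ingested, now):
    """Return the sources whose publication interval has elapsed since the last pull."""
    due = []
    for name, interval in PUBLICATION_INTERVALS.items():
        last = last_ingested.get(name)
        # A source that has never been ingested is always due.
        if last is None or now - last >= interval:
            due.append(name)
    return sorted(due)

now = datetime(2026, 4, 8, 13, 0)
last_ingested = {
    "power_prices": datetime(2026, 4, 8, 12, 0),         # one hour ago
    "weather_hdd": datetime(2026, 4, 8, 6, 0),           # seven hours ago
    "gas_storage_report": datetime(2026, 4, 2, 14, 30),  # under a week ago
}
due_now = sources_due(last_ingested, now)
print(due_now)
```

Keeping the interval table as data rather than code means a new source only needs a new entry, which fits the stated goal of running without manual data operations.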

Machine learning architecture

Interpreting data whilst understanding gradual or rapid pattern shifts requires several complementary machine learning methods. The architecture runs three parallel model layers, each designed to capture a different dimension of market behavior.

Model layers

layer / 01
Current State Intelligence
Expanding window analysis, normalized values, historical ranges, and market structure assessment (backwardation / contango). Learns current state from historical factors and gradual change.
layer / 02
Fair Value Analysis
Huber Regressor (outlier-resistant) + Quantile Gradient Boosting (calibrated uncertainty bands). Trained on full multi-year history. Evaluates storage deviation, curve structure, satellite scores, and HDDs.
layer / 03
Deep Learning (Non-Linear)
LSTM price direction model for non-linear pattern detection. Bottleneck autoencoder trained on normal conditions for regime shift detection. GBMs for curve structure evolution and market stress classification.
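
The two loss functions behind layer / 02 explain why the pairing is described as outlier-resistant with uncertainty bands. The sketch below writes both out by hand under stated assumptions: `delta` and the 0.9 quantile are illustrative values, not the platform's settings. scikit-learn exposes comparable estimators as `HuberRegressor` and `GradientBoostingRegressor(loss="quantile")`, though the source does not state which implementation is used.

```python
def huber_loss(residual, delta=1.0):
    """Quadratic for small residuals, linear beyond delta, so outliers pull less."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual ** 2
    return delta * (a - 0.5 * delta)

def pinball_loss(y_true, y_pred, quantile):
    """Quantile (pinball) loss: an asymmetric penalty that targets a chosen quantile."""
    diff = y_true - y_pred
    if diff >= 0:
        return quantile * diff          # under-prediction
    return (quantile - 1.0) * diff      # over-prediction

# A small residual is penalised like squared error; a large one grows only linearly.
small_penalty = huber_loss(0.5)
outlier_penalty = huber_loss(10.0)

# For the 0.9 quantile, predicting too low costs more than predicting too high.
under_penalty = pinball_loss(3.0, 2.0, quantile=0.9)
over_penalty = pinball_loss(2.0, 3.0, quantile=0.9)

print(small_penalty, outlier_penalty)
print(under_penalty, over_penalty)
```

Fitting one model per quantile (for example a low, median, and high quantile) yields a band around the fair value estimate rather than a single point.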

Model components

Model / 01
LSTM Neural Network
Long Short-Term Memory architecture for price direction modeling. Recurrent layers with gated memory cells learn non-linear patterns across data domains, adjusting internal weights during training to reduce prediction error.
Model / 02
Bottleneck Autoencoder
Trained exclusively on normal historical market conditions. Attempts to reconstruct current market data through the bottleneck learned from the normal-conditions baseline; high reconstruction error indicates out-of-norm market behavior and a potential regime shift.
Model / 03
Quantile Gradient Boosting
Applied across contract horizons to capture expected curve structure evolution. Also used for market stress prediction and regime transition classification. Uses forecasted HDDs for weather-driven pricing periods (e.g. ISO-NE gas vs. heating oil).
Model / 04
Signal Synthesis Engine
Interprets scoring across direction, conviction, unpredictability, edge quality, and cycle phase classification. Translates raw quantitative model outputs into interpretable, actionable market signals.
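
The reconstruction-error idea behind Model / 02 can be shown with a deliberately simplified stand-in. The sketch below replaces the neural autoencoder with a one-dimensional linear bottleneck (the top principal direction, found by power iteration), fits it on toy "normal" data, and flags points that reconstruct poorly. The data, the `3.0 * max` threshold rule, and all function names are assumptions for illustration; the platform's actual architecture and thresholds are not described in the source.

```python
import math

def fit_linear_bottleneck(rows, iters=100):
    """Fit a 1-D linear bottleneck: the top principal direction via power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    w = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * w[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return mean, w

def reconstruction_error(row, mean, w):
    """Encode to one number, decode back, and return the squared error."""
    centered = [x - m for x, m in zip(row, mean)]
    code = sum(c * v for c, v in zip(centered, w))   # encode through the bottleneck
    recon = [code * v for v in w]                    # decode
    return sum((c - r) ** 2 for c, r in zip(centered, recon))

# Toy "normal" history: the second feature is roughly twice the first.
normal_history = [(float(x), 2.0 * x + (0.1 if x % 2 == 0 else -0.1))
                  for x in range(1, 9)]
mean, direction = fit_linear_bottleneck(normal_history)

train_errors = [reconstruction_error(r, mean, direction) for r in normal_history]
threshold = 3.0 * max(train_errors)   # hypothetical flagging rule

def is_regime_shift(row):
    return reconstruction_error(row, mean, direction) > threshold

print(is_regime_shift((4.0, 8.0)))   # relationship intact
print(is_regime_shift((4.0, 2.0)))   # relationship broken
```

The first point follows the learned relationship and reconstructs well; the second breaks it and is flagged. A neural autoencoder generalises the same principle to non-linear structure across many features.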

Training standardisation

Each model bundle is saved with a corresponding StandardScaler, feature list, performance metadata, and input dimensions — ensuring identical preprocessing between training and inference with no feature alignment errors and minimal drift. The StandardScaler prevents raw numerical magnitude from dominating feature importance: without it, a temperature value of 70 would numerically outweigh a spread value of 0.4, despite the spread potentially carrying greater analytical significance.
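
The temperature-versus-spread example can be worked through directly. The sketch below standardises each feature as z = (x - mean) / std using the population standard deviation, which is the same formula scikit-learn's StandardScaler applies. The training values, the bundle layout, and the `preprocess` helper are hypothetical and only illustrate how saving the scaler and feature list together keeps training and inference aligned.

```python
import math

def fit_scaler(column):
    """Return mean and population standard deviation for one feature column."""
    mean = sum(column) / len(column)
    var = sum((v - mean) ** 2 for v in column) / len(column)
    return {"mean": mean, "std": math.sqrt(var)}

# Hypothetical training history for two features on very different scales.
training = {
    "temperature": [50.0, 60.0, 70.0, 80.0, 90.0],
    "spread": [0.1, 0.2, 0.3, 0.4, 0.5],
}

# Illustrative bundle: scaler parameters travel with the feature order and input size.
bundle = {
    "feature_names": ["temperature", "spread"],
    "scalers": {name: fit_scaler(col) for name, col in training.items()},
    "input_dim": 2,
}

def preprocess(raw, bundle):
    """Scale one observation with the saved parameters, in the saved feature order."""
    out = []
    for name in bundle["feature_names"]:
        s = bundle["scalers"][name]
        out.append((raw[name] - s["mean"]) / s["std"])
    return out

# Input keys arrive in a different order; the saved feature list fixes the alignment.
scaled = preprocess({"spread": 0.4, "temperature": 70.0}, bundle)
print(scaled)
```

With this toy history, a temperature of 70 is exactly average and scales to zero, while a spread of 0.4 sits above its average and scales to a larger value, so the raw magnitude of 70 no longer dominates.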

Retraining cadence

All major deep learning models are retrained or recalibrated weekly to account for data drift and rapid market change, which the system terms "data currents." This cadence enables adaptive pattern recognition as market regimes and structural relationships shift.

LLM intelligence layer

The large language model layer provides transparency into data interpretation and synthesizes outputs from the quantitative model stack. Each user query triggers a parallel injection of machine learning output data into the LLM context window, including trend context, market prices, storage data, satellite imagery signals, news overviews, electricity anomalies, and all model outputs.

Primary functions

Function | Description
Noise reduction | Filters conflicting or low-conviction signals before surfacing them to the analyst. The LLM layer handles the contexts that pure machine learning methods cannot define, including geopolitical interpretation and qualitative market narrative.
Output cleaning | Translates raw quantitative model outputs (scores, z-scores, reconstruction errors, uncertainty bands) into plain-language market insight without sacrificing the underlying precision.
Signal conflict resolution | When data values overlap or contradict each other, the LLM layer aggregates and interprets them rather than returning conflicting outputs. High-context interpretation of ambiguous signal combinations is a core function.
Grounded reasoning | All LLM responses are grounded in the current signal context. The system explicitly refuses to speculate beyond available data, stating when a signal is unavailable or stale rather than generating low-confidence answers.

On prompt design

Internal prompt design is a critical component of the LLM layer. The prompt architecture organises all model output data and provides sufficient data context so the model can interpret values correctly — including the relationship between different model outputs and the conditions under which specific signals should take precedence.
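
One way to picture how model outputs could be organised for the LLM, with missing or stale signals stated explicitly, is the sketch below. It is an assumption-laden illustration, not the internal prompt architecture: the signal names, the `as_of` timestamps, the seven-day freshness window, and the `build_context` helper are all hypothetical.

```python
from datetime import datetime, timedelta

def build_context(signals, now, max_age):
    """Render signals as prompt lines, marking missing or stale ones explicitly."""
    lines = []
    for name in sorted(signals):
        sig = signals[name]
        if sig is None:
            # State unavailability rather than leaving the model to guess.
            lines.append(f"{name}: UNAVAILABLE")
            continue
        age = now - sig["as_of"]
        status = "STALE" if age > max_age else "FRESH"
        lines.append(
            f"{name}: {sig['value']} (as of {sig['as_of'].isoformat()}, {status})"
        )
    return "\n".join(lines)

now = datetime(2026, 4, 8, 13, 0)
# Illustrative signals; timestamps are invented for the example.
signals = {
    "ercot_stress_probability": {"value": 0.0, "as_of": datetime(2026, 4, 8, 12, 30)},
    "ng_storage_bcf": {"value": 3117, "as_of": datetime(2026, 3, 27, 14, 30)},
    "satellite_refinery_score": None,
}
context = build_context(signals, now, max_age=timedelta(days=7))
print(context)
```

Putting freshness and availability into the context itself gives the model something concrete to cite when it declines to answer, which matches the grounded-reasoning behaviour described above.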

Market efficiency applications

The combination of multi-domain machine learning methods and an autonomous intelligence pipeline enables a broad set of efficiency improvements across the energy market stack — from grid operations to trading desk analytics.

application / 01
Grid Reliability
As grid operations become increasingly dependent on renewable energy and battery dispatch — both inherently intermittent — energyOS provides forecasting and management tools for grid operators navigating weather-driven supply variability and human dispatch error.
application / 02
Resource Allocation
Physical bottlenecks and transportation constraints are difficult to anticipate. The system enables better prediction of logistics constraints, supporting rerouting of power and fuel supplies before scarcity pricing triggers emergency conditions.
application / 03
Accelerated Price Discovery
The system processes unstructured data — including NASA satellite imagery, international news, and physical market data — and translates it into actionable insight via the LLM layer, forcing market prices to reflect underlying reality faster than conventional information channels allow.
application / 04
Volatility Smoothing
Traditional energy markets are exposed to violent short-term price spikes driven by human error and by panic when linear assumptions break down. Deep learning and non-linear analysis price assets on fair value (geopolitical insight, data transparency, and physics) rather than pure momentum.
application / 05
Hedge Fund Operations
Advanced cross-commodity arbitrage, regime shift trading, information arbitrage via the intelligence layer, dynamic position sizing using uncertainty bands, and front-running of supply chain disruptions. All adaptable to market policy changes, geopolitical exposure, and physical bottlenecks.
application / 06
Information Arbitrage
By processing satellite, news, and physical data through the intelligence layer before that information is priced by conventional channels, the system surfaces edge in the gap between available information and current market pricing.

Risk considerations

As with any intelligence system operating in a high-stakes domain, energyOS carries a defined set of risk factors that practitioners should understand and account for in their workflows.

Risk category | Description | Mitigation
LLM hallucination | The LLM layer may generate plausible but unsupported inferences when signal context is incomplete or ambiguous. | The system is designed to state signal unavailability explicitly rather than speculate. All responses cite the signals used.
Prompt poisoning | Adversarial inputs designed to manipulate LLM output or override system prompt constraints. | Internal prompt architecture is hardened. System should not be sole source for execution decisions.
Adversarial noise | Deliberately misleading signals in data sources, e.g. coordinated misinformation in news feeds. | Multi-source data architecture limits the impact of any single compromised feed on model output.
Data latency | Lag between real-world events and data availability within the pipeline can create signal staleness. | Freshness timestamps are visible on all signals. The system explicitly flags stale data rather than presenting it as current.
Liquidity risk | Model outputs are based on fundamental relationships; execution-level liquidity in target instruments is not assessed. | energyOS is a primary resource for validation and verification, not the sole basis for trade execution.

Usage guidance

This system should not be the sole source regarding execution of trades or any operations. It is designed to act as a primary resource for the validation and verification of interpreted results — not a replacement for practitioner judgment, risk management infrastructure, or regulatory compliance processes.

Benchmark comparisons

The following comparisons were conducted on April 8, 2026 at 13:00 UTC against Claude Sonnet 4.6 and ChatGPT 5.3, using standardised prompts targeting different analytical dimensions. The results illustrate the differentiated strengths of a domain-specific intelligence system versus general-purpose LLMs.

Methodology

All prompts were tested simultaneously. Responses were limited to 2–5 sentences per the prompt specification. URL citations were removed from displayed outputs for formatting. No post-processing was applied.

Prompt 1 — General market state

Tell me what is happening in the energy market (3–5 sentences)

Response comparison General market state · April 8, 2026
Claude Sonnet 4.6
Covered Iran conflict impact on Strait of Hormuz flows, cited specific barrel figures and fuel prices, referenced diplomatic signals and their effect on futures. Contextually accurate but generalized in framing.
ChatGPT 5.3
High-level overview of oil volatility, geopolitical risk, natural gas supply conditions, and electricity demand trends. Accurate in direction but thin on specificity and current data points.
energyOS
Provided specific quantitative framework output: WTI above $112/bbl, PADD3 refineries at 97.4% utilization, HO crack spreads at $75/bbl. Read BALANCED/DO_NOTHING with only 2/12 triggers active — and explained precisely why the ceasefire made either directional trade dangerous.

Prompt 2 — Historical divergence analysis

Review the historical 24-month price divergence between ERCOT and PJM (2–3 sentences)

Response comparison ERCOT vs PJM · 24-month historical
Claude Sonnet 4.6
Detailed historical narrative — ERCOT price halving driven by solar/battery, PJM capacity explosion from $29 to $270/MW-day. Strong on structural explanation and market design contrast.
ChatGPT 5.3
Described PJM structural price rises and ERCOT volatility dynamics. Accurate framing but less precise on specific data points than Claude's response.
energyOS
Acknowledged absence of 24-month PJM/ERCOT price history in session. Provided current ISO-NE LMP z-score (1.82σ) and ERCOT NG burn figures, then offered to run the spread and z-score analysis if historical data was uploaded.

Prompt 3 — ERCOT scarcity pricing forecast

From ERCOT ISO management perspective, are they likely to implement scarcity pricing and use natural gas peakers within the next week? (2–3 sentences)

Response comparison ERCOT scarcity probability · 7-day forward
Claude Sonnet 4.6
Correctly identified April shoulder season characteristics, low demand, and renewable abundance. Cited general conditions that make scarcity unlikely — accurate but qualitative and not forward-quantified.
ChatGPT 5.3
Explained that scarcity pricing is automatically triggered, not discretionarily deployed. Noted gas peakers would remain available but not systematically relied upon absent a supply shock. Accurate framing, but similarly qualitative.
energyOS
Returned P(stress) = 0% across all horizons (3d/5d/7d), ERCOT NG burn score 1.1/100, HDD composite at -0.06σ (marginally below average), forward HDDs declining from 18.8 to 14 over 5 days. Explicit: scarcity requires simultaneous heat/cold stress and depleted reserve margins, neither of which is present or forecast.

Prompt 4 — Non-linear WTI / Henry Hub correlation

What is the non-linear correlation between WTI crude and Henry Hub natural gas prices? (3–5 sentences)

Response comparison WTI–Henry Hub non-linear relationship
Claude Sonnet 4.6
Thorough academic treatment: time-varying asymmetric nonlinearity, historical 6:1 BTU ratio breakdown, shale revolution decoupling, regime-switching behavior in bull vs bear conditions. Well-sourced and structurally rigorous.
ChatGPT 5.3
Covered copula modeling, tail dependencies, and shale supply divergence. Accurate framing of the non-linear problem and appropriate hedging language. Less specific than Claude's response on historical data points.
energyOS
WTI at $112.95/bbl vs. Henry Hub at $2.87/MMBtu — HO/NG BTU ratio at extreme 10.92x vs. 1.1x switching threshold. NG storage at 3,117 Bcf (score 100/100) absorbing fuel-switching pressure. Correlation in normal regimes: 0.3–0.5; current regime: asymmetric — crude can rally further without pulling gas, but gas disruption would transmit into structurally tight distillate markets.
Overall benchmark finding

General-purpose language models are trained to synthesize the world's published knowledge — they excel at historical context, academic framing, and broad market narrative. What they cannot do is tell you that ERCOT's P(stress) is 0% across a 7-day horizon, that the HO/NG BTU ratio is sitting at 10.92x against a 1.1x fuel-switching threshold, or that NG storage at 3,117 Bcf is absorbing crude shock transmission in real time. That gap — between contextual awareness and real quantitative precision — is exactly where energyOS operates.

The platform is built around the physics of energy markets, not just their price history. Fair value is estimated from storage deviation, satellite-scored refinery utilization, heating degree day composites, and curve structure — not from pattern-matching on published research. Regime detection is driven by autoencoder reconstruction error against a baseline of normal market conditions, not by keyword recognition in news headlines. When energyOS flags a regime shift or returns a conviction score, it is doing so from a grounded physical and statistical model of the market — not from a language model's probabilistic interpolation of what sounds right.

For trading desks and grid operators, the distinction is not academic. A generalist LLM will tell you that April is a shoulder season and scarcity is unlikely. energyOS will return a stress probability of 0%, a burn score of 1.1/100, and a forward HDD composite that declines from 18.8 to 14 over the next five days — and flag precisely which physical preconditions are absent. One gives you context. The other gives you a defensible, quantified position.
