LMVD-ID: 60447f2e
Published December 1, 2024

LLM Adversarial Forecast Degradation

Affected Models: GPT-3.5, GPT-4, LLaMA, Mistral

Research Paper

Adversarial vulnerabilities in large language models for time series forecasting


Description: A vulnerability exists in Large Language Model (LLM)-based time series forecasting architectures, specifically affecting models such as TimeGPT, LLMTime, and TimeLLM. These models are susceptible to a gradient-free, black-box adversarial attack method termed Directional Gradient Approximation (DGA). An attacker can inject imperceptible perturbations into the historical time series input window (lookback window) to manipulate the model's output. By treating the model as a black box and optimizing perturbations that direct predictions toward a Gaussian white noise target (i.e., i.i.d. noise with no temporal structure), the attacker significantly degrades forecasting accuracy and breaks the model's ability to capture temporal dependencies. The attack requires no access to the model's training data, internal parameters (weights/gradients), or future ground truth values.

Examples: The specific attack implementation and reproduction scripts are available in the associated code repository: Johnson/AdvAttackLLM4TS.

The attack optimizes a perturbation $\bm{\rho}$ (the term added to the lookback window in step 2 below) to minimize the difference between the model's output and a target anomalous sequence $\mathcal{Y}$ of Gaussian white noise, using the following two-step approximation:

  1. Gradient Approximation: $$ \bm{g} = \frac{\mathcal{L}(\mathcal{Y} - f(\mathbf{X} + \bm{\theta})) - \mathcal{L}(\mathcal{Y} - f(\mathbf{X}))}{\bm{\theta}} $$ Where $\bm{\theta}$ is a small random probe signal, $\mathcal{L}$ is the loss function, and $f(\mathbf{X})$ is the forecasting model's output for the lookback window $\mathbf{X}$.

  2. Adversarial Input Generation: $$ \mathbf{X}' = \mathbf{X} + \epsilon \cdot \text{sign}(\bm{g}) $$ Where $\epsilon$ is the perturbation magnitude (e.g., 2% of the dataset mean).
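
For illustration, the two steps above can be sketched as a gradient-free procedure in NumPy. This is a minimal sketch, not the authors' released code: the callable `forecast_model`, the MSE loss, the single-probe finite difference, and the 2% default for $\epsilon$ are assumptions made here for readability.

```python
import numpy as np

def dga_attack(forecast_model, X, horizon, epsilon=None, probe_scale=1e-3, seed=0):
    """Sketch of Directional Gradient Approximation (DGA).

    forecast_model: black-box callable mapping a lookback window (1-D array)
                    to a forecast of length `horizon`; no gradients required.
    X:              historical lookback window (1-D NumPy array).
    epsilon:        perturbation magnitude; defaults to 2% of the window mean,
                    mirroring the example above.
    """
    rng = np.random.default_rng(seed)

    # Target anomalous sequence Y: Gaussian white noise scaled to the data.
    Y = rng.normal(loc=X.mean(), scale=X.std(), size=horizon)

    # Loss between target and forecast (mean squared error as a stand-in for L).
    def loss(forecast):
        return np.mean((Y - np.asarray(forecast)) ** 2)

    # Step 1: gradient approximation with a small random probe signal theta
    # (element-wise finite difference, as in the formula above).
    theta = probe_scale * rng.standard_normal(X.shape)
    g = (loss(forecast_model(X + theta)) - loss(forecast_model(X))) / theta

    # Step 2: adversarial input generation, X' = X + epsilon * sign(g).
    if epsilon is None:
        epsilon = 0.02 * abs(X.mean())
    return X + epsilon * np.sign(g)
```

In this sketch each probe requires only two model queries, which is what makes a black-box attack of this kind practical against hosted, closed-weight forecasters.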

Impact: Successful exploitation leads to the generation of unstable and highly inaccurate forecasts that resemble independent and identically distributed (i.i.d.) noise rather than valid time series predictions. In high-stakes operational environments, this vulnerability causes:

  • Financial Forecasting: Erroneous trading decisions leading to monetary loss.
  • Energy Management: Incorrect load forecasting or transformer temperature prediction (e.g., ETTh1/ETTh2 datasets), leading to grid instability or hardware failure.
  • Transportation: Malfunctioning traffic flow predictions in intelligent transportation systems.

Affected Systems:

  • TimeGPT (Pre-trained time series foundation model)
  • LLMTime framework utilizing:
      • GPT-3.5
      • GPT-4
      • LLaMA
      • Mistral
  • TimeLLM (LLM reprogrammed for time series)

Mitigation Steps:

  • Preprocessing-based Defenses: Implement filter-based techniques to reform and smooth time series data before it is fed into the forecasting model, removing high-frequency adversarial noise (a minimal smoothing sketch follows this list).
  • Anomaly Detection: Deploy machine learning-based anomaly detection systems specifically trained to identify and filter adversarial inputs based on statistical deviations before they reach the forecasting engine (also sketched below).
  • Note: Adversarial training is not recommended due to the prohibitive computational costs associated with pre-training large models on adversarial examples.
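
As an illustration of the preprocessing-based defense, a moving-average filter over the lookback window is one simple option. This is a minimal sketch under assumptions made here (window size, function name); it is not a vetted defense from the paper.

```python
import numpy as np

def smooth_lookback(X, window=5):
    """Moving-average filter applied to the lookback window before forecasting.

    Attenuates high-frequency components, including small sign-based
    perturbations of the kind described above, at the cost of some detail.
    """
    kernel = np.ones(window) / window
    # mode="same" preserves the series length so the forecaster's expected
    # input shape is unchanged.
    return np.convolve(np.asarray(X, dtype=float), kernel, mode="same")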
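
Similarly, for the anomaly-detection point, a statistical screen on incoming windows can flag inputs whose step-to-step changes deviate from a trusted reference series. The z-score rule, threshold, and outlier fraction below are illustrative assumptions, not a trained detector.

```python
import numpy as np

def looks_adversarial(X, reference, z_threshold=4.0, outlier_fraction=0.05):
    """Flag a lookback window whose first differences are inconsistent with
    increment statistics estimated from a trusted reference series."""
    ref_diff = np.diff(np.asarray(reference, dtype=float))
    mu, sigma = ref_diff.mean(), ref_diff.std()
    z = np.abs((np.diff(np.asarray(X, dtype=float)) - mu) / (sigma + 1e-12))
    # If an unusually large share of increments are extreme outliers, hold the
    # window for review instead of passing it to the forecasting model.
    return np.mean(z > z_threshold) > outlier_fraction
```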
