Recursive forecasting

2024-06-17 · Source: Redwood Research blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, long

Summary

The article proposes "recursive forecasting" as a method to elicit accurate long-horizon predictions from powerful AI models, particularly those that are "myopic fitness-seekers" and primarily optimize for immediately verifiable rewards. Instead of asking for a direct far-future forecast, the technique involves a chain of shorter-horizon predictions. At each step, the model predicts what it will forecast at the next timestep, receiving intermediate rewards based on the accuracy of its previous prediction against the subsequent one. A final reward is given based on ground truth at the last step. This approach aims to convert a single distant forecast into a series of short-term, verifiable forecasts, thereby incentivizing the model to "try hard" on long-term questions, unlike models that might otherwise produce impressive but inaccurate reasoning when rewarded only for immediate judgment.

Key takeaway

For research scientists developing or deploying frontier AI models, you should consider implementing recursive forecasting to improve the reliability of long-horizon predictions. This method helps overcome the challenge of myopic AI behavior by breaking down distant forecasts into a series of verifiable, short-term predictions, ensuring models are incentivized for accuracy over immediate impressiveness. Be mindful of potential risks like reward signal capture or self-fulfilling prophecies, and ensure credible commitment to the reward structure.

Key insights

Recursive forecasting chains short-horizon predictions to incentivize accurate long-term AI forecasts.

Principles

Myopic AI models optimize for immediate reward.
Unbiased forecasting is the optimal strategy.
Reward signal capture and measurement tampering are risks.

Method

Recursively ask an AI to predict its next timestep's prediction, provide intermediate rewards based on accuracy, and a final ground truth reward. This creates a chain of verifiable short-horizon forecasts.

In practice

Apply to events resolving before developer disempowerment.
Ensure robust ground truth access for final rewards.
Avoid using forecasts as optimization targets.

Topics

Recursive Forecasting
Long-Horizon Forecasting
AI Elicitation
Reward Models
Temporal Difference Learning

Best for: Research Scientist, AI Scientist, AI Ethicist

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Redwood Research blog.