Documentation
Bunsen is an experiment runner for agentic systems: give an agent an environment, run it reproducibly, capture everything, and evaluate the result. New here? Start with Introduction, then Getting Started.
Start Here
Orientation and the fastest path to your first run and its result.
Concepts
How a run is composed: experiments, agents, and the container.
Authoring
Write the config files and recipes that define experiments and agents.
Evaluation
Define scoring criteria and choose where and how scorers run.
Suites
Consume published benchmark suites and author your own.
Reference
Command, project-config, run-output, platform, and cost references.