# Internal Build & Test Swarm

For every branch, a builder swarm performs rapid prototyping, unit tests, and self-play evaluation. Disagreement among builders triggers secondary forks, tightening the search around promising local optima.

The builder swarm operates in parallel across isolated task containers, executing a suite of reproducible harnesses tailored to each supported domain. Its purpose is not only to provide first-pass validation of agent output but also to establish a confidence-weighted consensus across heterogeneous builder variants. Agents that behave inconsistently across builds are flagged for rerun or split into divergent branches to maximize exploration around unstable regions of the solution space.
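One way to picture the confidence-weighted consensus described above is the minimal sketch below. It is illustrative only: the `BuildResult` type, the `weighted_consensus` and `next_action` helpers, and the 0.8 / 0.6 thresholds are all assumptions made for this example and are not taken from Darwin's implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildResult:
    """Hypothetical record of one builder variant's verdict on an agent."""
    builder: str       # name of the builder variant (illustrative)
    verdict: str       # e.g. "pass" or "fail"
    confidence: float  # self-reported confidence in [0, 1]


def weighted_consensus(results):
    """Return (winning verdict, share of total confidence behind it)."""
    weight = defaultdict(float)
    for r in results:
        weight[r.verdict] += r.confidence
    total = sum(weight.values())
    if total == 0:
        raise ValueError("need at least one result with non-zero confidence")
    verdict = max(weight, key=weight.get)
    return verdict, weight[verdict] / total


def next_action(results, accept_at=0.8, rerun_at=0.6):
    """Map agreement to an action; both thresholds are assumed values."""
    verdict, agreement = weighted_consensus(results)
    if agreement >= accept_at:
        return f"accept:{verdict}"
    if agreement >= rerun_at:
        return "rerun"   # mild disagreement: run the builds again
    return "fork"        # strong disagreement: split into divergent branches


results = [
    BuildResult("variant-a", "pass", 0.9),
    BuildResult("variant-b", "pass", 0.8),
    BuildResult("variant-c", "fail", 0.3),
]
print(next_action(results))
```

Weighting each verdict by confidence rather than counting votes means a single low-confidence dissenter does not force a fork, while an even split between confident builders does.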

In practice, this system enables Darwin to catch non-deterministic bugs, subtle logic regressions, and overlooked edge-case behavior before GPU-intensive benchmarking.
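As a rough illustration of how a cheap pre-benchmark check might surface non-determinism, the sketch below reruns a task with the same seed and compares output hashes. The helper names (`output_fingerprint`, `is_reproducible`) and the two toy tasks are hypothetical and stand in for whatever the actual harnesses run.

```python
import hashlib
import itertools
import random


def output_fingerprint(task, seed):
    """Run `task` with a freshly seeded RNG and hash its output."""
    rng = random.Random(seed)
    output = task(rng)
    return hashlib.sha256(repr(output).encode("utf-8")).hexdigest()


def is_reproducible(task, seed=0, runs=3):
    """True if every run with the same seed yields the same fingerprint."""
    fingerprints = {output_fingerprint(task, seed) for _ in range(runs)}
    return len(fingerprints) == 1


def stable_task(rng):
    # All randomness flows through the seeded generator.
    return [rng.randint(0, 9) for _ in range(5)]


_hidden_state = itertools.count()


def leaky_task(rng):
    # State that survives between runs makes the output drift.
    return next(_hidden_state)


print(is_reproducible(stable_task))  # same seed, same output each run
print(is_reproducible(leaky_task))   # output changes between runs
```

A task that fails this check would be a natural candidate for the rerun or fork handling described earlier, before any expensive benchmarking is scheduled.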

**v0 Supported Task Domains**

* **Code-fix & code-gen** — SWE-bench, HumanEval-plus, and domain-specific CI test harnesses.
* **Natural-language summarization** — Multi-length summaries across CNN/DailyMail, GovReport-long, and synthetic policy briefings.
* **Agent planning / tool-use** — Action-sequence orchestration for multi-step tool-calling agents, in tasks such as HotPotQA-Tools and WebShop.
* **Mathematical reasoning** — Format-compliant multi-step solvers for GSM-Hard and MATH-QA, with correctness verified via symbolic checker engines.

Each domain is bundled with reproducibility harnesses and integration test templates, ensuring agents don’t overfit via prompt leakage or temporary scoring hacks.

