# Initialisation & Evaluation

At task launch, we randomly seed an initial population P₀ of N agents (default N = 64), varying each agent along four axes:

* **Model heterogeneity** – random choice among the registry's base checkpoints.
* **Prompt genotype** – shuffled system + user templates.
* **Hyper-gene vector** – temperature, top-p, context-window, RAG-source toggles.
* **Structural genes** – optional tool-use abilities enabled/disabled.
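
The seeding step above can be sketched as follows. This is a minimal illustration, not the actual implementation: the `Genome` fields, the checkpoint, template, RAG-source and tool names, and the sampling ranges are all hypothetical placeholders, and "shuffled" templates are interpreted here as independently sampled system/user pairs.

```python
import random
from dataclasses import dataclass

# Hypothetical pools; the real registry and template sets are not specified in the text.
BASE_CHECKPOINTS = ["ckpt-a", "ckpt-b", "ckpt-c"]
SYSTEM_TEMPLATES = ["sys-0", "sys-1", "sys-2"]
USER_TEMPLATES = ["user-0", "user-1", "user-2"]
RAG_SOURCES = ["docs", "web"]
TOOLS = ["search", "code_exec", "calculator"]


@dataclass
class Genome:
    checkpoint: str        # model heterogeneity
    system_template: str   # prompt genotype
    user_template: str     # prompt genotype
    temperature: float     # hyper-gene vector
    top_p: float           # hyper-gene vector
    context_window: int    # hyper-gene vector
    rag_sources: dict      # hyper-gene vector: source name -> enabled?
    tools: dict            # structural genes: tool name -> enabled?


def random_genome(rng: random.Random) -> Genome:
    """Sample one genome; all ranges below are illustrative assumptions."""
    return Genome(
        checkpoint=rng.choice(BASE_CHECKPOINTS),
        system_template=rng.choice(SYSTEM_TEMPLATES),
        user_template=rng.choice(USER_TEMPLATES),
        temperature=round(rng.uniform(0.0, 1.5), 2),
        top_p=round(rng.uniform(0.5, 1.0), 2),
        context_window=rng.choice([2048, 4096, 8192]),
        rag_sources={s: rng.random() < 0.5 for s in RAG_SOURCES},
        tools={t: rng.random() < 0.5 for t in TOOLS},
    )


def seed_population(n: int = 64, seed=None) -> list:
    """Build the initial population P0 of n independently sampled genomes."""
    rng = random.Random(seed)
    return [random_genome(rng) for _ in range(n)]
```

Passing a fixed `seed` makes the initial population reproducible, which helps when comparing runs.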

***

Evaluator agents benchmark each new agent on:

* Accuracy, coherence, novelty
* Task-specific performance
* Resource efficiency
* Goal generalization
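
One way to turn these criteria into a single fitness value is a weighted mean of per-criterion scores. The sketch below assumes each criterion has already been scored in [0, 1]; the metric keys, the equal default weights, and the aggregation rule itself are illustrative assumptions, since the text does not specify how evaluators combine their measurements.

```python
# Hypothetical metric names derived from the criteria listed above; equal weights by default.
DEFAULT_WEIGHTS = {
    "accuracy": 1.0,
    "coherence": 1.0,
    "novelty": 1.0,
    "task_performance": 1.0,
    "resource_efficiency": 1.0,
    "generalization": 1.0,
}


def fitness(scores: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted mean of per-criterion scores, each assumed to lie in [0, 1]."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[k] * scores[k] for k in weights) / total
```

Under this scheme an evaluator's "genome" could simply be its weight vector, so that evolving evaluators amounts to shifting how much each criterion counts.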

The evaluators are themselves subject to evolution, which closes the loop.

Evaluators therefore act as an adaptive fitness landscape that co-evolves with the agent population, the approach the genetic-algorithm (GA) literature describes for dynamic optimisation problems.
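
The co-evolutionary loop can be illustrated with a toy model. In this sketch agents are reduced to vectors of per-criterion scores and evaluators to weight vectors over those criteria; both simplifications are assumptions made for illustration. Truncation selection and Gaussian mutation stand in for whatever operators the system actually uses, and the evaluator fitness shown here (how widely an evaluator spreads the scores it assigns) is a hypothetical choice, not something the text specifies.

```python
import random
import statistics

METRICS = 4  # toy setting: one score per evaluation criterion


def mutate(vec, rng, sigma=0.1):
    """Gaussian perturbation of each gene, clipped to [0, 1]."""
    return [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in vec]


def score(agent, evaluator):
    """Weighted mean of the agent's metric scores under one evaluator's weights."""
    total = sum(evaluator)
    if total == 0:
        return 0.0
    return sum(a * w for a, w in zip(agent, evaluator)) / total


def next_generation(pop, fitnesses, rng):
    """Truncation selection: keep the top half, refill with mutated copies."""
    ranked = [p for _, p in sorted(zip(fitnesses, pop), key=lambda t: t[0], reverse=True)]
    survivors = ranked[: max(1, len(pop) // 2)]
    children = [mutate(rng.choice(survivors), rng) for _ in range(len(pop) - len(survivors))]
    return survivors + children


def coevolve(n_agents=16, n_evaluators=4, generations=10, seed=0):
    rng = random.Random(seed)
    agents = [[rng.random() for _ in range(METRICS)] for _ in range(n_agents)]
    evaluators = [[rng.random() for _ in range(METRICS)] for _ in range(n_evaluators)]
    for _ in range(generations):
        # Agent fitness: mean score across the current evaluator population,
        # so the landscape shifts whenever the evaluators change.
        agent_fit = [statistics.mean(score(a, e) for e in evaluators) for a in agents]
        # Evaluator fitness (assumption): variance of the scores it assigns,
        # i.e. how well it discriminates between agents.
        eval_fit = [statistics.pvariance([score(a, e) for a in agents]) for e in evaluators]
        agents = next_generation(agents, agent_fit, rng)
        evaluators = next_generation(evaluators, eval_fit, rng)
    return agents, evaluators
```

Because agent fitness is recomputed against the current evaluators every generation, an agent that ranked highly early on may fall behind once the evaluator population shifts its weights.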
