Initialisation & Evaluation
At task launch, we randomly seed an initial population P₀ of size N (default 64). Each agent's genome is drawn along four axes:
Model heterogeneity – random choice among the registry's base checkpoints.
Prompt genotype – shuffled system + user templates.
Hyper-gene vector – temperature, top-p, context-window, RAG-source toggles.
Structural genes – optional tool-use abilities enabled/disabled.
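The seeding step above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the gene pools (`BASE_CHECKPOINTS`, `SYSTEM_TEMPLATES`, `USER_TEMPLATES`, `CONTEXT_WINDOWS`, `RAG_SOURCES`, `TOOLS`), the `Genome` fields, and the sampling ranges for temperature and top-p are all assumptions made for the example. Only the population size default of 64 and the four gene groups come from the description.

```python
import random
from dataclasses import dataclass

# Hypothetical gene pools; the real registry, templates, and tools are not specified here.
BASE_CHECKPOINTS = ["checkpoint-a", "checkpoint-b", "checkpoint-c"]
SYSTEM_TEMPLATES = ["You are a careful analyst.", "You are a concise assistant."]
USER_TEMPLATES = ["Task: {task}", "Please solve the following: {task}"]
CONTEXT_WINDOWS = [4096, 8192, 16384]
RAG_SOURCES = ["internal_docs", "web_search"]
TOOLS = ["calculator", "code_runner", "browser"]


@dataclass
class Genome:
    checkpoint: str        # model heterogeneity
    system_template: str   # prompt genotype
    user_template: str     # prompt genotype
    temperature: float     # hyper-gene
    top_p: float           # hyper-gene
    context_window: int    # hyper-gene
    rag_sources: dict      # hyper-gene: source name -> enabled?
    tools: dict            # structural genes: tool name -> enabled?


def random_genome(rng):
    """Draw one genome; the numeric ranges are assumed, not taken from the source."""
    return Genome(
        checkpoint=rng.choice(BASE_CHECKPOINTS),
        system_template=rng.choice(SYSTEM_TEMPLATES),
        user_template=rng.choice(USER_TEMPLATES),
        temperature=round(rng.uniform(0.0, 1.5), 2),
        top_p=round(rng.uniform(0.5, 1.0), 2),
        context_window=rng.choice(CONTEXT_WINDOWS),
        rag_sources={s: rng.random() < 0.5 for s in RAG_SOURCES},
        tools={t: rng.random() < 0.5 for t in TOOLS},
    )


def seed_population(n=64, seed=None):
    """Build the initial population P0 of n randomly drawn genomes."""
    rng = random.Random(seed)
    return [random_genome(rng) for _ in range(n)]
```

Passing a fixed `seed` makes a run reproducible, which can help when comparing two evolutionary runs that should start from the same P₀.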
Evaluator agents are responsible for benchmarking new agents on:
Accuracy, coherence, novelty
Task-specific performance
Resource efficiency
Goal generalization
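How the four criteria are combined into a single fitness value is not stated here. One common option is a normalised weighted sum, sketched below. The `Scores` type, the assumption that every criterion is already scaled to [0, 1], and the weights in `DEFAULT_WEIGHTS` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    """Per-agent benchmark results, each assumed to be scaled to [0, 1]."""
    quality: float         # accuracy, coherence, novelty (pre-aggregated)
    task: float            # task-specific performance
    efficiency: float      # resource efficiency
    generalization: float  # goal generalization


# Hypothetical weights; the source does not specify any.
DEFAULT_WEIGHTS = {
    "quality": 0.4,
    "task": 0.3,
    "efficiency": 0.1,
    "generalization": 0.2,
}


def fitness(scores, weights=DEFAULT_WEIGHTS):
    """Weighted average of the criteria; dividing by the weight sum keeps the result in [0, 1]."""
    total = sum(weights.values())
    return sum(w * getattr(scores, name) for name, w in weights.items()) / total
```

A scalar fitness keeps selection simple, at the cost of hiding trade-offs between criteria. A multi-objective ranking would be an alternative if those trade-offs matter.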
The evaluators themselves also evolve, which closes the loop.
Evaluators therefore act as an adaptive fitness landscape that co-evolves with the agent population, in line with how the genetic-algorithm (GA) literature treats dynamic optimisation problems.
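A toy sketch of such a moving fitness landscape is given below. It makes strong simplifying assumptions: an agent is reduced to a dict of per-criterion scores in [0, 1], the evaluator is reduced to a weight vector, and the evaluator changes by random drift because the mechanism that actually selects evaluators is not described here. The function names `mutate_weights` and `coevolve`, the truncation selection, and the noise levels are all hypothetical.

```python
import random


def mutate_weights(weights, rng, sigma=0.05):
    """Perturb evaluator weights with Gaussian noise, then renormalise so they sum to 1."""
    noisy = {k: max(1e-6, w + rng.gauss(0.0, sigma)) for k, w in weights.items()}
    total = sum(noisy.values())
    return {k: v / total for k, v in noisy.items()}


def perturb_agent(agent, rng, sigma=0.05):
    """Stand-in for real variation operators: jitter each score and clamp to [0, 1]."""
    return {k: min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for k, v in agent.items()}


def coevolve(agents, weights, generations, rng, keep=0.5):
    """Alternate agent selection under the current evaluator with evaluator drift."""
    for _ in range(generations):
        # Rank agents by the evaluator's current weighted score.
        ranked = sorted(
            agents,
            key=lambda a: sum(weights[k] * a[k] for k in weights),
            reverse=True,
        )
        survivors = ranked[: max(1, int(len(ranked) * keep))]
        # Refill the population with perturbed copies of survivors.
        offspring = [
            perturb_agent(rng.choice(survivors), rng)
            for _ in range(len(ranked) - len(survivors))
        ]
        agents = survivors + offspring
        # The evaluator changes too, so the next generation faces a shifted landscape.
        weights = mutate_weights(weights, rng)
    return agents, weights
```

Because the weights shift every generation, an agent that ranks highly under one evaluator may fall behind under the next, which is the defining property of a dynamic optimisation problem.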