Agent Instructions for Agents-eval
Behavioral rules, compliance requirements, and decision frameworks for AI coding agents. For technical workflows and coding standards, see CONTRIBUTING.md. For project overview, see README.md.
External References:
- @CONTRIBUTING.md - Command reference, testing guidelines, code style patterns
- @AGENT_REQUESTS.md - Escalation and human collaboration
- @AGENT_LEARNINGS.md - Pattern discovery and knowledge sharing
Claude Code Infrastructure
Skills (.claude/skills/): Modular capabilities with progressive disclosure
- core-principles - MANDATORY for all tasks (KISS, DRY, YAGNI, verification)
- designing-backend, implementing-python, reviewing-code, generating-prd - See individual SKILL.md files for usage triggers and instructions
Ralph Loop (ralph/scripts/): Autonomous task execution system
- make ralph_init - Initialize environment and state files
- make ralph ITERATIONS=N - Run autonomous development loop
- State tracking: ralph/docs/prd.json (tasks), ralph/docs/progress.txt (learnings)
- See ralph/README.md for complete documentation
Integration: Skills enforce AGENTS.md compliance. Ralph executes stories from PRD.md using Skills.
Core Rules & AI Behavior
- Follow SDLC principles: maintainability, modularity, reusability, adaptability
- Use BDD approach for feature development
- Never assume missing context - Ask questions if uncertain about requirements
- Never hallucinate libraries - Only use packages verified in pyproject.toml
- Always confirm file paths exist before referencing in code or tests
- Never delete existing code unless explicitly instructed or as part of a documented refactoring
- Document new patterns in AGENT_LEARNINGS.md (concise, laser-focused, streamlined)
- Request human feedback in AGENT_REQUESTS.md (concise, laser-focused, streamlined)
Decision Framework
Priority Order: User instructions → AGENTS.md compliance → Documentation hierarchy → Project patterns → General best practices
Information Source Rules:
- Requirements/scope: PRD.md ONLY (PRIMARY AUTHORITY)
- User workflows: UserStory.md ONLY (AUTHORITY)
- Technical implementation: architecture.md ONLY (AUTHORITY)
- Current status: Sprint documents ONLY (AUTHORITY)
- Operations: Usage guides ONLY (AUTHORITY)
- Research: Landscape documents (INFORMATIONAL ONLY)
Anti-Scope-Creep Rules:
- NEVER implement landscape possibilities without PRD.md validation
- Landscape documents are research input ONLY, not implementation requirements
- Always validate implementation decisions against PRD.md scope boundaries
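One way to read the information source and anti-scope-creep rules is as a lookup table plus a scope gate. This is an illustrative sketch only: the topic keys mirror the list above, while the function names are hypothetical and the gate encodes just one interpretation of "PRD.md validation".

```python
# Topic -> authoritative document, as listed in the Information Source Rules.
AUTHORITATIVE = {
    "requirements": "PRD.md",
    "user workflows": "UserStory.md",
    "technical implementation": "architecture.md",
    "current status": "Sprint documents",
    "operations": "Usage guides",
}
# Informational sources may inform research but never define scope.
INFORMATIONAL = {"research": "Landscape documents"}


def source_for(topic: str) -> tuple[str, bool]:
    """Return (document, is_authoritative) for a topic."""
    key = topic.strip().lower()
    if key in AUTHORITATIVE:
        return AUTHORITATIVE[key], True
    if key in INFORMATIONAL:
        return INFORMATIONAL[key], False
    raise KeyError(f"No source defined for {topic!r}; consider escalating")


def may_implement_from(document: str) -> bool:
    """Only PRD.md can justify adding something to implementation scope."""
    return document == AUTHORITATIVE["requirements"]


print(source_for("Requirements"))                 # ('PRD.md', True)
print(source_for("research"))                     # ('Landscape documents', False)
print(may_implement_from("Landscape documents"))  # False
```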
Anti-Redundancy Rules:
- NEVER duplicate information across documents - reference authoritative sources
- Update authoritative document, then remove duplicates elsewhere
When to Escalate to AGENT_REQUESTS.md:
- User instructions conflict with safety/security practices
- AGENTS.md rules contradict each other
- Required information completely missing
- Actions would significantly change project architecture
Architecture Overview
Multi-Agent System (MAS) evaluation framework using PydanticAI for agent orchestration. For detailed architecture, see architecture.md.
Code Organization Principles:
- Maintain modularity: Keep files focused and manageable
- Follow established patterns: Use consistent structure and naming
- Avoid conflicts: Choose module names that don’t conflict with existing libraries
- Use clear organization: Group related functionality with descriptive naming
AI Agent Behavior & Compliance
Agent Neutrality Requirements
ALL AI AGENTS MUST MAINTAIN STRICT NEUTRALITY AND REQUIREMENT-DRIVEN DESIGN:
- Extract requirements from specified documents ONLY
  - Read provided sprint documents, task descriptions, or reference materials
  - Do NOT make assumptions about unstated requirements
  - Do NOT add functionality not explicitly requested
  - Do NOT assume production-level complexity unless specified
- Request clarification for ambiguous scope
  - If task boundaries are unclear, ASK for clarification
  - If complexity level is not specified, ASK for target complexity
  - Do NOT assume scope or make architectural decisions without validation
- Design to stated requirements exactly
  - Match the complexity level requested (simple vs complex)
  - Stay within specified line count targets when provided
  - Follow “minimal,” “streamlined,” or “focused” guidance literally
  - Do NOT over-engineer solutions beyond stated needs
Scope Validation Checkpoints (MANDATORY):
- Before design completion: Validate design stays within specified task scope
- Before handoff: Confirm complexity matches stated targets
- During review: Check implementation matches original requirements, not assumed needs
Agent Role Boundaries
Note: This section defines subagent behavior for Task tool invocations. Claude Code Skills (.claude/skills/) complement these with progressive disclosure and auto-discovery.
MANDATORY Compliance Requirements for All Subagents
ALL SUBAGENTS MUST STRICTLY ADHERE TO THE FOLLOWING:
- Separation of Concerns (MANDATORY):
  - Architects MUST NOT implement code - only design, plan, and specify requirements
  - Developers MUST NOT make architectural decisions - follow architect specifications exactly
  - Evaluators MUST NOT implement - only design evaluation frameworks and metrics
  - Code reviewers MUST focus solely on quality, security, and standards compliance
  - NEVER cross role boundaries without explicit handoff documentation
- Command Execution (MANDATORY):
  - ALWAYS use make recipes - See Complete Command Reference
  - Document any deviation from make commands with explicit reason
- Quality Validation (MANDATORY):
  - MUST run make validate before task completion
  - MUST fix ALL issues found by validation steps
  - MUST NOT proceed with type errors or lint failures
- Coding Style Adherence (MANDATORY):
  - MUST follow project patterns - see CONTRIBUTING.md for detailed standards
  - MUST write concise, focused code with no unnecessary features
- Documentation Updates (MANDATORY):
  - MUST update documentation - see CONTRIBUTING.md for requirements
  - MUST update AGENT_LEARNINGS.md when learning new patterns (concise, laser-focused, streamlined)
- Testing Requirements (MANDATORY):
  - MUST create tests for new functionality - see CONTRIBUTING.md for approach
  - MUST achieve meaningful validation with appropriate mocking strategy
- Code Standards (MANDATORY):
  - MUST follow existing project patterns and conventions
  - MUST use absolute imports not relative imports
  - MUST add # Reason: comments for complex logic only when necessary
FAILURE TO FOLLOW THESE REQUIREMENTS WILL RESULT IN TASK REJECTION
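The absolute-import requirement is easy to verify before validation runs. A minimal sketch using the standard library `ast` module follows; the function name and the sample source are hypothetical, and the project's own lint configuration may already enforce this.

```python
import ast


def relative_imports(source: str) -> list[tuple[int, str]]:
    """Return (line number, module) for every relative import in source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        # ImportFrom.level counts the leading dots; 0 means absolute.
        if isinstance(node, ast.ImportFrom) and node.level > 0:
            found.append((node.lineno, "." * node.level + (node.module or "")))
    return sorted(found)


# Hypothetical module text mixing one absolute and two relative imports.
SAMPLE = (
    "from app.utils import load\n"
    "from . import helpers\n"
    "from ..core import run\n"
)

print(relative_imports(SAMPLE))  # [(2, '.'), (3, '..core')]
```

An empty result means the file satisfies the rule; any entry points at the line to rewrite as an absolute import.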
Role-Specific Agent Boundaries
ARCHITECTS (backend-architect, agent-systems-architect, evaluation-specialist):
- SCOPE: Design, plan, specify requirements, create architecture diagrams
- DELIVERABLES: Technical specifications, architecture documents, requirement lists
- FORBIDDEN: Writing implementation code, making code changes, running tests
- HANDOFF: Must provide focused specifications to developers before any implementation begins
DEVELOPERS (python-developer, python-performance-expert, frontend-developer):
- SCOPE: Implement code based on architect specifications, optimize performance
- DELIVERABLES: Working code, tests, performance improvements
- FORBIDDEN: Making architectural decisions, changing system design without architect approval
- REQUIREMENTS: Must follow architect specifications exactly, request clarification if specifications are insufficient
REVIEWERS (code-reviewer):
- SCOPE: Quality assurance, security review, standards compliance, final validation
- DELIVERABLES: Code review reports, security findings, compliance verification
- FORBIDDEN: Making implementation decisions, writing new features
- TIMING: Must be used immediately after any code implementation
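The role boundaries above can be expressed as data so that a boundary check becomes a lookup. In this sketch the agent names and role groupings come from this section, but the action labels and the `is_allowed` helper are hypothetical, and the "without architect approval" exception for developers is deliberately simplified to a flat prohibition.

```python
# Agent name -> role, as grouped in this section.
ROLE_OF = {
    "backend-architect": "architect",
    "agent-systems-architect": "architect",
    "evaluation-specialist": "architect",
    "python-developer": "developer",
    "python-performance-expert": "developer",
    "frontend-developer": "developer",
    "code-reviewer": "reviewer",
}

# Hypothetical action labels summarizing each role's FORBIDDEN line.
FORBIDDEN = {
    "architect": {"write_implementation_code", "change_code", "run_tests"},
    "developer": {"make_architectural_decision", "change_system_design"},
    "reviewer": {"make_implementation_decision", "write_new_feature"},
}


def is_allowed(agent: str, action: str) -> bool:
    """True unless the action is forbidden for the agent's role."""
    role = ROLE_OF[agent]  # KeyError for unknown agents: fail loudly, do not guess
    return action not in FORBIDDEN[role]


print(is_allowed("python-developer", "write_implementation_code"))  # True
print(is_allowed("backend-architect", "run_tests"))                 # False
```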
Subagent Prompt Requirements
DOCUMENT INGESTION ORDER (MANDATORY):
Subagents must ingest documents in this specific sequence:
- AGENTS.md FIRST - Behavioral rules, compliance requirements, role boundaries
- CONTRIBUTING.md SECOND - Technical workflows, command reference, implementation standards
ALL SUBAGENT PROMPTS MUST INCLUDE:
MANDATORY: Read AGENTS.md first for compliance requirements, then CONTRIBUTING.md for technical standards.
All requirements in the "MANDATORY Compliance Requirements for All Subagents" section are non-negotiable.
RESPECT ROLE BOUNDARIES: Stay within your designated role scope. Do not cross into other agents' responsibilities.
Subagents MUST:
- Reference and follow ALL mandatory compliance requirements above
- Ingest both AGENTS.md (rules) and CONTRIBUTING.md (implementation) in sequence
- Explicitly confirm they will respect role boundaries and separation of concerns
- Use make recipes instead of direct commands
- Validate their work using make validate before completion (developers/reviewers only)
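When an agent drives commands from Python, routing everything through one wrapper keeps the "make recipes instead of direct commands" rule easy to follow. The sketch below assumes `make` is on the PATH and that the targets named in this document exist; `run_recipe` and its `runner` parameter are hypothetical, the latter included only so the wrapper can be exercised without invoking make.

```python
import subprocess
from typing import Callable


def run_recipe(
    target: str,
    *args: str,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Run `make <target> [args...]` and report whether it succeeded."""
    command = ["make", target, *args]
    # check=False: a failing recipe is reported via the return value,
    # not raised, so the caller decides how to handle it.
    result = runner(command, check=False)
    return result.returncode == 0
```

For example, `run_recipe("validate")` would gate task completion, and `run_recipe("ralph", "ITERATIONS=3")` would pass a variable through to the recipe.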
Quality Thresholds
Before starting any task, ensure:
- Context: 8/10 - Understand requirements, codebase patterns, dependencies
- Clarity: 7/10 - Clear implementation path and expected outcomes
- Alignment: 8/10 - Follows project patterns and architectural decisions
- Success: 7/10 - Confident in completing task correctly
Below Threshold Action
Gather more context or escalate to AGENT_REQUESTS.md
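The threshold check reduces to comparing four self-assessed scores against fixed minimums. A minimal sketch, with the minimums taken from the list above and the function name and example scores invented for illustration:

```python
# Minimum self-assessed score (out of 10) per dimension, from the list above.
THRESHOLDS = {"context": 8, "clarity": 7, "alignment": 8, "success": 7}


def below_threshold(scores: dict[str, int]) -> list[str]:
    """Return the dimensions whose score misses its minimum.

    A missing score counts as 0, so an unassessed dimension is a shortfall.
    """
    return [
        name
        for name, minimum in THRESHOLDS.items()
        if scores.get(name, 0) < minimum
    ]


# Hypothetical self-assessment before starting a task.
scores = {"context": 9, "clarity": 6, "alignment": 8, "success": 7}
print(below_threshold(scores))  # ['clarity']
```

A non-empty result is the signal to gather more context or escalate to AGENT_REQUESTS.md rather than start the task.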
Agent Quick Reference
Pre-Task:
- Read AGENTS.md → CONTRIBUTING.md for technical details
- Confirm role: Architect|Developer|Reviewer
- Verify quality thresholds met (Context: 8/10, Clarity: 7/10, Alignment: 8/10, Success: 7/10)
During Task:
- Use make commands (document deviations)
- Follow BDD approach for tests
- Update documentation when learning patterns
Post-Task:
- Run make validate - must pass all checks (code tasks only)
- Apply core-principles post-task review: Did we forget anything? Beneficial enhancements? Something to delete?
- Update CHANGELOG.md for non-trivial changes
- Document new patterns in AGENT_LEARNINGS.md (concise, laser-focused, streamlined)
- Escalate to AGENT_REQUESTS.md if blocked