Use Cases & Scenarios

See how Flezi AgentBox works in practice. The flow diagrams below show real-world agent scenarios, from personal dev mentors to enterprise team deployments.

DevMentor Kai — Agent with Soul

Personal coding mentor that remembers you, adapts to your mood, and grows with your skills

Interaction Flow

Developer | agentbox run devmentor-kai | CLI

PackageLoader | Load persona.md + memory + knowledge | AgentExecutor

MemorySelector | Recall: "Minh prefers TypeScript, uses Vim" | Prompt

EmotionDetector | Detect: frustrated (3 failed builds today) | PersonaEngine

Kai | "Hey Minh, rough day? Let me check your build..." | Terminal

ToolExecutor | bash: npm run build (in NemoClawLite) | Sandbox

Kai | "Found it: missing return type on line 42" | Terminal

MemoryManager | Save: "Minh struggles with TypeScript return types" | SQLite

Agent Package

devmentor-kai/
  agent.yaml    version: 1.0, tools: [bash, git, npm]
  persona.md    Senior dev, sarcastic, patient mentor
  knowledge/    TypeScript patterns, React best practices
  policies/     No rm -rf, no git push --force
  memory/       user_preferences, lessons_learned, relationship
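
The policies/ entry above forbids commands such as rm -rf and git push --force. A minimal sketch of how a denylist check of that kind might look follows. The `DENIED_PATTERNS` list and the `isCommandAllowed` function are hypothetical names for illustration; the actual AgentBox policy format is not shown on this page.

```typescript
// Hypothetical denylist mirroring "No rm -rf, no git push --force".
// The real AgentBox policy format is not shown here; this is an assumption.
const DENIED_PATTERNS: RegExp[] = [
  /\brm\s+-(?=\w*r)(?=\w*f)\w+/, // rm with both r and f in one flag group (-rf, -fr)
  /\bgit\s+push\b.*--force\b/,   // git push --force (also matches --force-with-lease)
];

// Returns true when the command matches none of the denied patterns.
function isCommandAllowed(command: string): boolean {
  return !DENIED_PATTERNS.some((pattern) => pattern.test(command));
}

console.log(isCommandAllowed("npm run build"));       // true
console.log(isCommandAllowed("rm -rf node_modules")); // false
console.log(isCommandAllowed("git push --force"));    // false
```

Pattern matching on command strings is easy to bypass (aliases, scripts, split flags), so a check like this would complement sandbox isolation rather than replace it.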

Memory Evolution

Day 1 | "Minh uses VS Code, prefers dark mode" | preference

Week 2 | "Fixed useEffect cleanup: teaching moment" | lesson

Month 1 | "Minh now handles hooks well: intermediate level" | growth

Month 3 | "Inside joke: the infamous useEffect incident" | relationship
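
The memory timeline above tags each entry with a category. A minimal sketch of category-tagged memory and recall by category is shown below. The `MemoryEntry` shape and the `recall` function are assumptions for illustration, and an in-memory array stands in for the SQLite store mentioned in the interaction flow.

```typescript
// Category names are taken from the timeline above; the field names are assumed.
type MemoryCategory = "preference" | "lesson" | "growth" | "relationship";

interface MemoryEntry {
  when: string;
  text: string;
  category: MemoryCategory;
}

const memories: MemoryEntry[] = [
  { when: "Day 1", text: "Minh uses VS Code, prefers dark mode", category: "preference" },
  { when: "Week 2", text: "Fixed useEffect cleanup: teaching moment", category: "lesson" },
  { when: "Month 1", text: "Minh now handles hooks well: intermediate level", category: "growth" },
  { when: "Month 3", text: "Inside joke: the infamous useEffect incident", category: "relationship" },
];

// Returns the text of every memory in the requested category, oldest first.
function recall(entries: MemoryEntry[], category: MemoryCategory): string[] {
  return entries.filter((e) => e.category === category).map((e) => e.text);
}

console.log(recall(memories, "relationship"));
```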

Full conversation session

Kai: Hey Minh! I see you just opened the project. How's it going?

Minh: Terrible. Build keeps failing and I can't figure out why.

Kai: Rough day, huh? I noticed you had 3 failed builds in the last hour. Let me take a look...

$ npm run build

Error: Type 'string | undefined' is not assignable to type 'string'

at src/components/UserCard.tsx:42

Kai: Found it! Line 42 in UserCard.tsx — you're passing `user.name` which can be undefined, but the prop expects a string. Quick fix:

name={user.name ?? "Anonymous"}

Kai: I remember you had a similar issue last week with optional chaining. Want me to add a section about TypeScript strict null checks to your learning notes?

Minh: Yes please. And thanks for not judging the 3 failed builds...

Kai: Ha! Remember the useEffect incident? We all have those days. You're improving fast though — last month you wouldn't have caught this was a type error at all.

Intelligent MCP Server — IDE Integration

Run agents as MCP servers inside VS Code / Cursor / Claude Code. Unlike raw tools, these have brain + memory.

MCP Server Flow

Developer | agentbox mcp serve --agent code-guardian | stdio

VS Code | MCP: tools/list → discovers agent tools | MCP Client

Claude Code | MCP: tools/call "security-guard:scan" | Agent

SecurityGuard | THINKS: analyze code, check OWASP patterns | LLM

ToolExecutor | bash: grep -r "eval(" src/ | NemoClawLite

Agent | Returns: 3 XSS risks found with fix suggestions | VS Code

IDE Configuration

// .vscode/mcp.json
{
  "mcpServers": {
    "security-guard": {
      "command": "agentbox",
      "args": ["mcp", "serve", "--agent", "code-guardian"]
    }
  }
}

Raw MCP vs Intelligent MCP

Raw MCP Tool

Stateless function call
No context or memory
grep "pattern" → raw output
Developer interprets results

AgentBox Intelligent MCP

Brain-powered tool calls
Remembers past scans + your codebase
THINKS before acting, explains findings
Suggests fixes based on your patterns

Supported IDEs

VS Code: stdio

Cursor: stdio

Claude Code: stdio

MCP tool interaction example

MCP Protocol Exchange:

IDE → Agent: tools/list

Response: ["security-guard:scan", "security-guard:explain", "security-guard:fix"]

IDE → Agent: tools/call "security-guard:scan"

Args: { "target": "src/routes/auth.ts", "category": "owasp" }

Agent's internal reasoning (invisible to IDE):

"This file handles authentication. I should check for: SQL injection in login query, timing attacks on password comparison, JWT token expiration, CSRF protection..."

"I remember from last scan: this team uses bcrypt correctly but has weak session timeout (8h). Let me check if that was fixed..."

Agent → IDE: tool result

{ findings: 3, severity: "medium", details: [...] }

Key insight: The agent THINKS, REMEMBERS, and REASONS — unlike a raw grep tool.
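
MCP messages are JSON-RPC 2.0 requests, so the exchange above can be sketched as plain objects. The tool name and arguments come from the example above. The helper names `listToolsRequest` and `callToolRequest` are hypothetical and are not part of AgentBox or any MCP SDK.

```typescript
// Minimal JSON-RPC 2.0 request shape as used by MCP.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Builds the discovery request an IDE sends first.
function listToolsRequest(id: number): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/list" };
}

// Builds a tool invocation; MCP carries the tool name and its arguments in params.
function callToolRequest(id: number, name: string, args: Record<string, unknown>): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const list = listToolsRequest(1);
const scan = callToolRequest(2, "security-guard:scan", {
  target: "src/routes/auth.ts",
  category: "owasp",
});

console.log(JSON.stringify(list));
console.log(JSON.stringify(scan));
```

Over the stdio transport, each request is written to the server process's standard input and the matching response, carrying the same id, is read from its standard output.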

Team Agent Deployment — Enterprise Workflow

Create once locally, pack, push to registry, deploy to team via NemoClawLite platform

Deployment Pipeline

1. Create: agentbox init code-reviewer

2. Test: agentbox test (local)

3. Pack: agentbox pack → .tar.gz

4. Push: agentbox push (registry)

5. Deploy: Team pulls & runs

Self-Host (Free)

Team member runs agentbox pull code-reviewer && agentbox run — local NemoClawLite sandbox, Ollama LLM

Platform (Pro)

Deploy to AgentBox Platform — cloud-hosted NemoClawLite, Claude LLM, team dashboard

Enterprise

Private registry, custom policies, audit logs, SSO integration, SIEM export

Full deployment workflow example

Step-by-step: Publishing "code-reviewer" to your team

$ agentbox init code-reviewer --template reviewer

Creates scaffold: agent.yaml, persona.md, knowledge/, policies/, tests/

$ agentbox test

Runs 12 test scenarios in NemoClawLite sandbox. All pass.

$ agentbox pack

Creates: code-reviewer-1.0.0.agentbox.tar.gz (4.2MB)

Includes: SHA256 checksum, signed manifest, dependency lockfile

$ agentbox push @acme/code-reviewer

Uploaded to registry. Runs automated trust checks:

✓ No dangerous syscalls in tools/

✓ Policies restrict network access

✓ Tests cover all declared capabilities

✓ Signed by verified publisher

Team member on the other side:

$ agentbox pull @acme/code-reviewer

$ agentbox run @acme/code-reviewer --context ./src

Agent runs locally with NemoClawLite isolation — can only access ./src

Agent Registry — The npm for AI Agents

Publish, discover, and install agent packages. Creators earn revenue. Enterprise gets private registries.

Creator Flow

Creator | agentbox init my-agent | Scaffold

Creator | Write persona.md + knowledge + tests | Package

CLI | agentbox lint (validate schema) | Check

CLI | agentbox test (run scenarios) | Verify

CLI | agentbox pack → .agentbox.tar.gz | Bundle

CLI | agentbox push (@creator/my-agent) | Registry

Consumer Flow

User | Browse registry on agentbox.dev | Discover

User | agentbox pull @creator/my-agent | Install

CLI | Verify signature + trust level | Security

User | agentbox run @creator/my-agent | Execute

Agent | Runs in NemoClawLite (isolated) | Sandbox

Agent | Remembers user across sessions | Memory

Trust & verification system

Agent Trust Levels:

L0 Untrusted — Unknown publisher. Runs with maximum sandbox restrictions: no network, readonly filesystem, 256MB memory limit.

L1 Community — Verified email. Passed automated security scan. Can access network (allowlisted endpoints only).

L2 Verified — Publisher identity verified. Code audited. Can access broader filesystem with declared permissions.

L3 Enterprise Certified — Full security audit passed. SOC 2 compliant. Custom policy integration. SIEM logging.

Revenue Model for Creators:

• Free agents: open source, community contributions

• Premium agents: creator sets price, platform takes 15% commission

• Enterprise agents: custom licensing, volume discounts, SLA support

• Creators receive monthly payouts via Stripe Connect
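
As a worked example of the 15% commission, the sketch below computes a creator's payout in integer cents to avoid floating-point drift. The price and sales count are made-up inputs, and `creatorPayoutCents` is a hypothetical helper, not a platform API. How the platform actually rounds is not stated here.

```typescript
const COMMISSION_RATE = 0.15; // platform commission stated above

// Returns what the creator keeps after commission, in cents.
// Rounding the commission to the nearest cent is an assumption.
function creatorPayoutCents(grossCents: number): number {
  const commission = Math.round(grossCents * COMMISSION_RATE);
  return grossCents - commission;
}

// Hypothetical month: a $25.00 agent sold 40 times.
const gross = 2500 * 40;                // 100000 cents
console.log(creatorPayoutCents(gross)); // 85000 cents, i.e. $850.00
```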

SDLC Use Cases

How AI agents integrate into every phase of the Software Development Lifecycle

Sprint Planning Agent — From PRD to Stories

An agent that reads your PRD, architecture docs, and codebase to break features into implementable stories with accurate estimates

Planning Flow

PM | agentbox run sprint-planner --context ./prd.md | CLI

Agent | Read PRD, architecture.md, existing epics | Knowledge

Agent | Scan codebase: identify affected files & modules | NemoClawLite

Agent | Break feature into stories with acceptance criteria | Output

Agent | Estimate complexity: LOC impact + dependency analysis | Estimate

Agent | Generate sprint-plan.yaml with priorities | File

MemoryManager | Save: "Team velocity ~8 stories/sprint" | SQLite

Output: Sprint Plan

sprint-plan.yaml

sprint: 14
goal: "User authentication + OAuth"
stories:
  - id: AUTH-001
    title: "Implement JWT token service"
    points: 5
    files: [auth/jwt.ts, middleware/auth.ts]
    acceptance_criteria: 4
  - id: AUTH-002
    title: "OAuth2 provider integration"
    points: 8
    depends_on: [AUTH-001]

Why This Matters

• Agent reads actual code, not just descriptions — estimates are grounded in reality

• Remembers past sprints — learns team velocity and adjusts over time

• Identifies hidden dependencies by scanning import graphs

• All analysis runs in NemoClawLite sandbox — agent reads your code safely

Full sprint planning session

PM: We need to add OAuth2 login. Can you plan the sprint?

Agent: Let me scan your codebase first...

✓ Found existing auth: src/lib/supabase/auth.ts (email/password)

✓ Found middleware: src/middleware.ts (JWT verification)

✓ Found 4 protected routes, 2 server actions using auth

Agent: Based on codebase analysis, I recommend 4 stories:

Story 1: JWT Token Service (5 pts)

Files: auth/jwt.ts, middleware/auth.ts | AC: 4 criteria

Story 2: OAuth2 Provider Config (3 pts)

Files: auth/oauth.ts, .env | AC: 3 criteria | Depends: Story 1

Story 3: OAuth Callback Handler (5 pts)

Files: routes/auth/callback.ts, auth/oauth.ts | AC: 5 criteria

Story 4: UI Login Buttons (3 pts)

Files: components/LoginForm.tsx | AC: 3 criteria | Depends: Story 2,3

Agent: Total: 16 points. Your team velocity is ~18 pts/sprint, so this fits with buffer. I also identified a risk: the existing session middleware needs refactoring to support OAuth tokens alongside email auth.
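
The session above combines two checks: ordering stories so each comes after its dependencies, and comparing total points with team velocity. A minimal sketch of both follows, using the four stories and the 18-point velocity from the session. The `Story` shape and `orderByDependencies` are assumptions for illustration, not the agent's actual planner.

```typescript
interface Story {
  id: number;
  title: string;
  points: number;
  dependsOn: number[];
}

const stories: Story[] = [
  { id: 1, title: "JWT Token Service", points: 5, dependsOn: [] },
  { id: 2, title: "OAuth2 Provider Config", points: 3, dependsOn: [1] },
  { id: 3, title: "OAuth Callback Handler", points: 5, dependsOn: [] },
  { id: 4, title: "UI Login Buttons", points: 3, dependsOn: [2, 3] },
];

// Orders stories in rounds: each round takes every story whose dependencies are done.
function orderByDependencies(input: Story[]): Story[] {
  const done: number[] = [];
  const ordered: Story[] = [];
  let remaining = input.slice();
  while (remaining.length > 0) {
    const ready = remaining.filter((s) => s.dependsOn.every((d) => done.indexOf(d) !== -1));
    if (ready.length === 0) throw new Error("Dependency cycle or missing dependency");
    for (const s of ready) {
      ordered.push(s);
      done.push(s.id);
    }
    remaining = remaining.filter((s) => done.indexOf(s.id) === -1);
  }
  return ordered;
}

const totalPoints = stories.reduce((sum, s) => sum + s.points, 0);
const velocity = 18; // points per sprint, from the session above

console.log(orderByDependencies(stories).map((s) => s.id)); // [1, 3, 2, 4]
console.log(totalPoints, totalPoints <= velocity);          // 16 true
```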

Code Review Agent — CI/CD Integrated

Automated, intelligent code review that runs on every PR. Understands your codebase patterns, catches bugs, and suggests improvements.

CI/CD Integration Flow

Developer | git push → opens Pull Request | GitHub

CI Pipeline | Trigger: agentbox run code-reviewer --diff PR#42 | Agent

Agent | Load project conventions from knowledge/ | Context

Agent | Analyze diff: patterns, security, performance | NemoClawLite

MemorySelector | Recall: "This team prefers explicit error handling" | Context

Agent | Post inline comments on PR with suggestions | GitHub API

MemoryManager | Save: "PR#42 had SQL injection risk in user input" | SQLite

Review Categories

Security: SQL injection, XSS, auth bypass

Performance: N+1 queries, missing index, memory leak

Quality: dead code, naming, DRY violations

Convention: style guide, error handling, testing gaps

vs Traditional Linters

• Linters check syntax — this agent understands intent and architecture

• Learns your codebase patterns over time via persistent memory

• Can reason about cross-file impacts (e.g., changing an API contract)

• Explains why something is problematic instead of just flagging it

PR review example

PR #42: Add user search endpoint

🔴 Critical: SQL Injection Risk

File: src/routes/search.ts:28

const result = db.query(`SELECT * FROM users WHERE name LIKE '%${q}%'`)

Fix: Use parameterized query to prevent SQL injection:

const result = db.query('SELECT * FROM users WHERE name LIKE $1', [`%${q}%`])

🟠 Performance: Missing Database Index

The name column is used in a LIKE query but has no index.

For 50k+ users, this will cause a full table scan (~200ms → ~2ms with index).

Suggest: CREATE INDEX idx_users_name ON users USING gin (name gin_trgm_ops); (requires the pg_trgm extension)

🔵 Convention: Missing Error Handling

Your team convention (from knowledge/conventions.md): all route handlers should have try/catch with structured error responses. This handler returns raw database errors to the client.

Summary: 1 critical, 1 performance, 1 convention issue. Blocking merge until critical is fixed.
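
The summary line above implies a simple merge gate: any critical finding blocks the PR. A minimal sketch of that rule follows, using the three findings from this review. The `Finding` shape and `shouldBlockMerge` are hypothetical; the agent's real output format is not documented here.

```typescript
type Severity = "critical" | "performance" | "convention";

interface Finding {
  severity: Severity;
  title: string;
}

const findings: Finding[] = [
  { severity: "critical", title: "SQL Injection Risk" },
  { severity: "performance", title: "Missing Database Index" },
  { severity: "convention", title: "Missing Error Handling" },
];

// Counts how many findings have the given severity.
function countBySeverity(items: Finding[], severity: Severity): number {
  return items.filter((f) => f.severity === severity).length;
}

// Blocks the merge while at least one critical finding remains.
function shouldBlockMerge(items: Finding[]): boolean {
  return countBySeverity(items, "critical") > 0;
}

console.log(shouldBlockMerge(findings)); // true
console.log(shouldBlockMerge(findings.filter((f) => f.severity !== "critical"))); // false
```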

Testing Agent — Automated QA Pipeline

Generate, execute, and maintain tests. The agent reads your code, writes test cases, runs them in a sandbox, and reports coverage gaps.

Testing Pipeline

1. Analyze: Read source code + existing tests

2. Plan: Identify untested paths

3. Generate: Write test cases

4. Execute: Run in NemoClawLite

5. Report: Coverage + failures

6. Fix: Auto-fix flaky tests

Agent Flow

QA Lead | agentbox run test-agent --target src/auth/ | CLI

Agent | Parse AST: find functions, branches, edge cases | Analysis

Agent | Generate test files: unit + integration | Output

ToolExecutor | npm test (runs inside NemoClawLite sandbox) | Sandbox

Agent | 3 tests failed → analyze errors → fix assertions | Self-heal

Agent | Final report: 94% coverage, 2 bugs found | Report

Test Types Generated

Unit Tests

Individual functions with mocked dependencies

auth.service.test.ts

Integration Tests

Module interactions with real DB (test container)

auth.integration.test.ts

Edge Cases

Boundary values, null inputs, race conditions

auth.edge-cases.test.ts

Regression Tests

Tests for previously reported bugs

auth.regression.test.ts

All test execution happens inside NemoClawLite sandbox — test code cannot access files outside the project or make network calls unless explicitly allowed.

Generated test example

// auth.service.test.ts (generated by Testing Agent)
// Assumes a Jest- or Vitest-style runner with global describe/it/expect.
import { authService } from './auth.service'; // module path assumed from the file name

// Helper: resolves to the elapsed milliseconds of an async call, ignoring rejections.
async function measure(fn: () => Promise<unknown>): Promise<number> {
  const start = performance.now();
  await fn().catch(() => undefined);
  return performance.now() - start;
}

describe('AuthService', () => {
  describe('signIn', () => {
    it('should return JWT for valid credentials', async () => {
      const result = await authService.signIn('test@mail.com', 'pass123');
      expect(result.token).toMatch(/^eyJ/);
      expect(result.user.email).toBe('test@mail.com');
    });

    it('should throw on invalid password', async () => {
      await expect(authService.signIn('test@mail.com', 'wrong'))
        .rejects.toThrow('Invalid credentials');
    });

    // Edge case: timing attack prevention
    it('should take constant time regardless of user existence', async () => {
      const t1 = await measure(() => authService.signIn('exists@mail.com', 'x'));
      const t2 = await measure(() => authService.signIn('noone@mail.com', 'x'));
      expect(Math.abs(t1 - t2)).toBeLessThan(50); // <50ms variance
    });
  });
});

// Agent identified 14 test cases across 3 files:
// 8 unit + 4 integration + 2 edge cases, reaching 94% coverage

Documentation Agent — Living Docs from Code

Automatically generate and maintain documentation that stays in sync with your codebase. API docs, architecture diagrams, changelogs.

Documentation Flow

Developer | agentbox run doc-agent --watch src/ | CLI

Agent | Scan: exports, types, JSDoc, README, CHANGELOG | Analysis

Agent | Diff against existing docs, find stale sections | Compare

Agent | Generate: API reference, architecture overview, guides | Output

Agent | Create Mermaid diagrams from code structure | Diagrams

Agent | Update CHANGELOG.md from git log | Changelog

MemoryManager | Save: "Last full sync: 2026-03-26, 42 pages" | SQLite

Output Types

API Reference

Auto-generated from TypeScript types + JSDoc. Includes request/response examples.

Architecture Diagrams

Mermaid C4/sequence diagrams generated from import graphs and service dependencies.

Developer Guides

How-to guides inferred from common patterns in the codebase.

Changelog

Semantic changelog from git commits, grouped by feature/fix/breaking.
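
Grouping commits by feature, fix, and breaking change can be sketched as below. This assumes the repository follows Conventional Commits-style prefixes ("feat:", "fix:", and "!" for breaking changes), which this page does not confirm. The sample messages and the `classify` and `groupCommits` helpers are hypothetical.

```typescript
type Group = "feature" | "fix" | "breaking" | "other";

// Classifies one commit subject line by its prefix, e.g. "feat(search): ...".
function classify(message: string): Group {
  const match = /^(\w+)(\([^)]*\))?(!)?:/.exec(message);
  if (!match) return "other";
  if (match[3] === "!") return "breaking";
  if (match[1] === "feat") return "feature";
  if (match[1] === "fix") return "fix";
  return "other";
}

// Buckets commit messages into changelog sections, preserving input order.
function groupCommits(messages: string[]): Record<Group, string[]> {
  const groups: Record<Group, string[]> = { feature: [], fix: [], breaking: [], other: [] };
  for (const m of messages) {
    groups[classify(m)].push(m);
  }
  return groups;
}

// Made-up commit subjects for illustration.
const changelog = groupCommits([
  "feat(search): add full-text search endpoint",
  "fix: handle missing user name",
  "feat!: remove legacy find endpoint",
  "update readme",
]);
console.log(changelog);
```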

Watch Mode

With --watch, the agent monitors file changes and incrementally updates only affected documentation pages. Runs as an MCP server so your IDE shows real-time doc previews.

Generated documentation example

Auto-generated API Reference (from Fastify routes + Zod schemas):

POST /api/agents

Create a new agent in the marketplace.

Request Body:

{ name: string, description: string, category: "dev" | "ops" | "qa", visibility: "public" | "private" }

Response (201):

{ id: uuid, name: string, created_at: timestamp, trust_level: 0 }

Errors:

400 — Validation error (name too short, invalid category)

401 — Not authenticated

409 — Agent name already exists for this creator

Auto-generated Mermaid diagram:

sequenceDiagram
    Client->>API: POST /api/agents
    API->>Auth: Verify JWT
    API->>DB: INSERT INTO agents
    DB-->>API: agent record
    API-->>Client: 201 Created

Incident Response Agent — On-Call Assistant

Monitors alerts from PagerDuty/Grafana, reads logs, correlates with past incidents, identifies root cause, and suggests a fix — all within seconds.

Response Flow

PagerDuty | Webhook: API latency spike (p99 > 2s) | Agent

Agent | Query Grafana: identify affected service | Metrics

ToolExecutor | kubectl logs api-server --since=5m | NemoClawLite

Agent | Pattern: OOM kills on pod api-server-3 | Analysis

MemorySelector | Recall: "Similar OOM 2 weeks ago: memory leak in cache" | Context

Agent | Root cause: unbounded Redis cache + new feature PR#128 | Diagnosis

Agent | Suggest: kubectl rollout restart + add cache eviction | Fix

Agent | Create incident report in Confluence + Slack alert | Report

Agent Capabilities

Log Analysis

Reads structured logs (JSON), unstructured logs, and stack traces. Identifies error patterns across distributed services.

Metric Correlation

Queries Prometheus/Grafana. Correlates latency spikes with CPU/memory/network metrics to narrow root cause.

Incident Memory

Remembers past incidents. When a similar pattern appears, suggests the same fix that worked before — with caveats about what changed since.

Runbook Execution

Has access to team runbooks in knowledge/. Can execute safe remediation steps (restart, scale, toggle feature flags) in sandbox.

Full incident response example

Timeline:

00:00 — PagerDuty alert: API p99 latency > 2000ms

00:03 — Agent queries Grafana, identifies api-server pods

00:08 — Agent reads last 5min of logs, finds 12 OOMKilled events

00:15 — Memory search: similar incident 14 days ago (cache leak)

00:20 — Agent diffs recent deployments: PR#128 merged 45min ago

00:25 — Agent reads PR#128 diff: new endpoint without cache TTL

00:30 — Diagnosis: unbounded cache growth from new /search endpoint

00:35 — Agent executes: kubectl rollout restart deploy/api-server

00:40 — Metrics confirm: latency dropping, pods healthy

00:45 — Agent posts incident report to #incidents Slack channel

00:50 — Agent creates Jira ticket: "Add cache TTL to /search endpoint"

Total resolution time: 50 seconds (vs 15-30 min manual)

Database Migration Agent — Safe Schema Evolution

Reads your current schema, generates migration SQL, validates with dry-run, creates rollback scripts, and checks for breaking changes.

Migration Flow

Developer | agentbox run db-agent "add soft-delete to users" | CLI

Agent | Read: schema.prisma / supabase migrations/ | Context

Agent | Analyze: foreign keys, indexes, existing data volume | Analysis

Agent | Generate: migration SQL (up + down) | Output

ToolExecutor | Dry-run migration on test DB (NemoClawLite) | Validate

Agent | Check: backwards compatibility with current API | Compat

Agent | Create PR with migration + rollback + docs | PR

Generated migration SQL

-- migrate_up.sql
ALTER TABLE users
  ADD COLUMN deleted_at TIMESTAMPTZ DEFAULT NULL;

-- Add partial index for active users (performance)
CREATE INDEX idx_users_active
  ON users (id) WHERE deleted_at IS NULL;

-- Update RLS policies for soft-delete awareness
CREATE POLICY users_visible ON users
  FOR SELECT USING (deleted_at IS NULL);

-- Agent also generates: data backfill, API filter updates,
-- cascade rules, and test queries for each scenario.

-- migrate_down.sql (rollback)
DROP POLICY IF EXISTS users_visible ON users;
DROP INDEX IF EXISTS idx_users_active;
ALTER TABLE users DROP COLUMN deleted_at;

Safety Checks

Schema Diff Analysis

Compares before/after schema. Identifies columns added, removed, renamed. Flags data loss risks (DROP COLUMN) with warnings.
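
A schema diff of this kind can be sketched as a comparison of column maps. The `Schema` type, the `diffSchema` function, and the `email` column are assumptions for illustration. Only `id` and `deleted_at` appear in the migration example on this page.

```typescript
// Column name -> SQL type for one table.
type Schema = Record<string, string>;

interface SchemaDiff {
  added: string[];
  removed: string[];     // data loss risk: dropping a column discards its values
  typeChanged: string[];
}

function diffSchema(before: Schema, after: Schema): SchemaDiff {
  const beforeCols = Object.keys(before);
  const afterCols = Object.keys(after);
  return {
    added: afterCols.filter((c) => !(c in before)),
    removed: beforeCols.filter((c) => !(c in after)),
    typeChanged: beforeCols.filter((c) => c in after && before[c] !== after[c]),
  };
}

const currentSchema: Schema = { id: "uuid", email: "text" };
const proposedSchema: Schema = { id: "uuid", email: "text", deleted_at: "timestamptz" };

console.log(diffSchema(currentSchema, proposedSchema)); // added: ["deleted_at"]
console.log(diffSchema(proposedSchema, currentSchema)); // removed: ["deleted_at"]
```

A name-based diff like this reports a rename as one removal plus one addition, so detecting renames would need extra information such as migration history.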

Backwards Compatibility

Checks if existing API queries still work. Detects: column renames that break SELECT *, type changes that lose precision, NOT NULL on populated columns.

Dry-Run Validation

Runs the migration on a test database inside NemoClawLite. Verifies: syntax correctness, index creation time, lock duration, data integrity.

Rollback Script

Auto-generates matching DOWN migration. Validates rollback independently. Ensures zero data loss on revert.

Compatibility Report:

✓ API GET /users — compatible (deleted_at has DEFAULT NULL)

✓ API POST /users — compatible (column is nullable)

⚠ API GET /users?active=true — needs filter update

✓ Supabase RLS — new policy handles soft-delete

✓ Foreign keys — CASCADE rules verified

✓ Rollback — tested successfully, 0 data loss

Estimated migration time: ~200ms (table has 12,340 rows)

Lock duration: ~5ms (in PostgreSQL 11+, ADD COLUMN with a constant DEFAULT does not rewrite the table, so its exclusive lock is held only briefly)

Security Audit Agent — Continuous Vulnerability Scanning

Runs OWASP Top 10 checks, scans dependencies for CVEs, detects hardcoded secrets, and generates compliance reports with remediation guidance.

Audit Flow

CI Pipeline | Trigger on PR merge to main | Agent

Agent | npm audit + snyk test (in NemoClawLite) | Deps

Agent | Scan source: regex + AST for OWASP patterns | Code

Agent | Detect: eval(), innerHTML, SQL string concat | Vulns

Agent | Scan .env, configs for hardcoded secrets | Secrets

Agent | Cross-reference with known CVE database | CVE

Agent | Generate report: severity, location, fix suggestion | Report

MemoryManager | Track: "3 XSS vulns fixed since last audit" | Trend

Scan Categories

OWASP Top 10

Injection (SQL/XSS/Command), Broken Auth, Sensitive Data Exposure, XXE, Broken Access Control, Security Misconfiguration, Insecure Deserialization.

Dependency Vulnerabilities

Scans package-lock.json/yarn.lock against NVD + GitHub Advisory. Reports CVE ID, severity (CVSS), affected version range, and fixed version.

Secret Detection

Regex + entropy analysis for API keys, passwords, tokens in code and config files. Checks .env files, CI configs, and committed history.
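
The entropy half of secret detection can be sketched with Shannon entropy over a token's characters: random keys use many distinct characters evenly, so they score higher than ordinary words. The length and entropy thresholds below are illustrative assumptions, not the agent's actual settings, and `looksLikeSecret` is a hypothetical helper.

```typescript
// Shannon entropy in bits per character; 0 for an empty or single-character-repeated string.
function shannonEntropy(text: string): number {
  const counts: Record<string, number> = {};
  for (const ch of text.split("")) {
    counts[ch] = (counts[ch] || 0) + 1;
  }
  let entropy = 0;
  for (const key of Object.keys(counts)) {
    const p = counts[key] / text.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Flags long, high-entropy tokens. Both thresholds are assumed values.
function looksLikeSecret(token: string, minLength = 20, minEntropy = 4.0): boolean {
  return token.length >= minLength && shannonEntropy(token) > minEntropy;
}

console.log(shannonEntropy("aaaa")); // 0
console.log(looksLikeSecret("aaaaaaaaaaaaaaaaaaaaaaaa")); // false
console.log(looksLikeSecret("abcdefghijklmnopqrstuvwxyzABCDEF")); // true
```

Entropy alone produces false positives on hashes and minified code, which is why it is usually paired with regex rules for known key formats, as the paragraph above describes.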

Compliance Mapping

Maps findings to SOC 2 / GDPR / HIPAA requirements. Generates audit-ready reports with evidence links.

Sample audit report

Security Audit Report — 2026-03-26

CRITICAL (1)

SQL Injection in src/routes/search.ts:42

Pattern: db.query(`SELECT * FROM users WHERE name = '${input}'`)

Fix: Use parameterized query: db.query('SELECT * FROM users WHERE name = $1', [input])

HIGH (2)

1. XSS via dangerouslySetInnerHTML in Comment.tsx:18

2. CVE-2025-1234 in lodash@4.17.20 (prototype pollution)

MEDIUM (4)

Missing rate limiting on /api/login, CORS wildcard, 2 missing CSP headers

Total: 1 Critical, 2 High, 4 Medium, 8 Low | Trend: -3 vs last audit

DevOps Agent — Infrastructure as Code

Generates Terraform/Kubernetes manifests, validates with dry-run, monitors deployments, and rolls back automatically on failure.

Deployment Pipeline

1. Describe: Natural language → IaC

2. Generate: Terraform + K8s YAML

3. Validate: terraform plan (dry-run)

4. Deploy: Apply + monitor rollout

5. Observe: Health checks + alerts

Agent Flow

DevOps | "Deploy Redis cluster for caching layer" | Agent

Agent | Read existing infra: VPC, subnets, security groups | Context

Agent | Generate: Terraform module + K8s StatefulSet | IaC

ToolExecutor | terraform plan (in NemoClawLite sandbox) | Validate

Agent | Report: 3 resources to create, estimated $12/mo | Review

Agent | After approval: terraform apply + kubectl apply | Deploy

Agent | Monitor: health checks every 30s for 10min | Observe

Capabilities

Multi-Cloud Support

Generates Terraform for AWS, GCP, Azure. Reads existing provider configs and follows team conventions.

Cost Estimation

Estimates monthly cost of proposed infrastructure changes before deployment. Compares with current spend.

Auto-Rollback

Monitors deployment health metrics. If error rate exceeds threshold, automatically rolls back to last known good state.
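
The rollback decision described above reduces to comparing an observed error rate with a threshold. A minimal sketch follows. The `HealthSample` shape, the 5% threshold, and the sample numbers are assumptions for illustration; the agent's real metrics source and threshold are not stated here.

```typescript
interface HealthSample {
  requests: number;
  errors: number;
}

// Aggregate error rate across all samples; 0 when there was no traffic.
function errorRate(samples: HealthSample[]): number {
  const requests = samples.reduce((n, s) => n + s.requests, 0);
  const errors = samples.reduce((n, s) => n + s.errors, 0);
  return requests === 0 ? 0 : errors / requests;
}

// True when the observed error rate exceeds the allowed threshold.
function shouldRollBack(samples: HealthSample[], threshold: number): boolean {
  return errorRate(samples) > threshold;
}

const THRESHOLD = 0.05; // assumed: roll back above 5% errors

const healthy: HealthSample[] = [{ requests: 1000, errors: 10 }];
const degraded: HealthSample[] = [
  { requests: 1000, errors: 10 },
  { requests: 1000, errors: 120 },
];

console.log(shouldRollBack(healthy, THRESHOLD));  // false (1% errors)
console.log(shouldRollBack(degraded, THRESHOLD)); // true (6.5% errors)
```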

Generated Terraform output

terraform plan output:

+ aws_elasticache_cluster.redis
    engine: "redis"
    node_type: "cache.t3.medium"
    num_cache_nodes: 3
    subnet_group: "app-private"

+ aws_security_group.redis_sg
    ingress: port 6379 from app-sg only

+ kubernetes_stateful_set.redis
    replicas: 3, persistent_volume: 10Gi

Plan: 3 to add, 0 to change, 0 to destroy

Estimated cost: $38.40/month

API Design Agent — Contract-First Development

Generates OpenAPI specs from your code, validates backwards compatibility, creates SDK clients, and runs contract tests.

Design Flow

Developer | agentbox run api-agent --scan src/routes/ | CLI

Agent | Parse route handlers: params, body, response types | AST

Agent | Generate OpenAPI 3.1 spec with examples | Spec

Agent | Diff with previous spec: detect breaking changes | Compat

Agent | Generate TypeScript SDK client from spec | SDK

Agent | Create contract tests (request/response validation) | Tests

MemoryManager | Save: "v2.3.0 added /search, deprecated /find" | History

Generated OpenAPI spec

openapi: 3.1.0
paths:
  /api/agents:
    get:
      summary: List marketplace agents
      parameters:
        - name: category
          in: query
          schema: { type: string, enum: [dev, ops, qa] }
      responses:
        '200':
          description: Agent list
          content:
            application/json:
              schema: { $ref: '#/components/schemas/AgentList' }

# Auto-generated from Fastify route + Zod schemas

Capabilities

Breaking Change Detection

Compares current spec with previous version. Flags: removed endpoints, changed response shapes, narrowed parameter types, required field additions.
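
One of the checks listed above, removed endpoints, can be sketched by comparing the operations in two spec versions. The `Spec` shape is a simplified stand-in for a parsed OpenAPI document, `removedOperations` is a hypothetical helper, and the HTTP methods in the sample data are assumed for illustration.

```typescript
// Simplified view of a spec: path -> HTTP methods defined on it.
type Spec = Record<string, string[]>;

// Lists every "METHOD path" present in the previous spec but absent from the current one.
function removedOperations(previous: Spec, current: Spec): string[] {
  const removed: string[] = [];
  for (const path of Object.keys(previous)) {
    const currentMethods = current[path] || [];
    for (const method of previous[path]) {
      if (!currentMethods.some((m) => m === method)) {
        removed.push(`${method} ${path}`);
      }
    }
  }
  return removed;
}

// Sample data; the methods are assumptions.
const previousSpec: Spec = {
  "/api/agents": ["GET", "POST"],
  "/api/agents/find": ["GET"],
};
const currentSpec: Spec = {
  "/api/agents": ["GET", "POST"],
  "/api/agents/search": ["POST"],
};

const breaking = removedOperations(previousSpec, currentSpec);
console.log(breaking); // ["GET /api/agents/find"]
```

Running the same function with the arguments swapped lists additions, which are non-breaking. Changed response shapes and narrowed parameter types need a deeper schema comparison than this sketch covers.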

SDK Generation

Generates typed clients (TypeScript, Python, Go) from OpenAPI spec. Includes request/response types, error handling, and auth configuration.

Contract Testing

Creates test suites that validate API responses match the spec. Catches schema drift before it reaches production.

Documentation

Generates interactive API docs (Swagger UI / Redoc). Includes live examples, auth flows, and error catalogs.

Breaking change report

API Compatibility Report (v2.2 → v2.3)

BREAKING (1):

DELETE /api/agents/find — endpoint removed (use /api/agents/search)

DEPRECATED (2):

GET /api/agents?sort=rating → use ?order_by=rating

Response field "agent.owner" → renamed to "agent.creator_id"

ADDITIONS (3):

POST /api/agents/search — new full-text search endpoint

GET /api/agents/:id/versions — version history

Response field "agent.trust_level" added

Recommendation: Bump major version (v3.0) due to breaking change

Onboarding Agent — New Developer Guide

Walks new team members through the codebase, explains architecture, helps with environment setup, and guides their first contribution.

Onboarding Flow

New Dev | agentbox run onboarding-buddy | CLI

Agent | Load: architecture.md, README, team conventions | Knowledge

Agent | "Welcome! Let me explain our project structure..." | Terminal

Agent | Walk through: monorepo layout, key modules, data flow | Guide

ToolExecutor | Run setup: npm install, supabase start, seed DB | NemoClawLite

Agent | Verify: all services running, tests passing | Health

Agent | Suggest first task: good-first-issue from backlog | Task

Agent | Guide through first PR: branch, commit, push, review | Mentor

Onboarding Phases

Phase 1: Architecture Overview

Explains: project structure, key directories, tech stack, database schema, deployment model. Uses diagrams from knowledge/ and real code examples.

Phase 2: Environment Setup

Runs setup scripts in NemoClawLite sandbox. Installs dependencies, starts services, seeds test data. Troubleshoots common issues (port conflicts, missing env vars).

Phase 3: Codebase Walkthrough

Interactive tour of key code paths. Explains: how a request flows from frontend → API → database. Shows: authentication, error handling, state management.

Phase 4: First Contribution

Picks a good-first-issue. Guides: create branch, make changes, write tests, open PR. Reviews code before submission. Explains team review process.

Full onboarding session transcript

Agent: Welcome to the team! I'm your onboarding buddy. Let me walk you through our project.

Agent: We're a monorepo with 3 main packages:

• apps/frontend/ — Next.js 16, Clean Architecture

• apps/orchestrator/ — Fastify 5, Plugin Architecture

• packages/shared/ — Zod schemas shared between both

New Dev: How does auth work?

Agent: Great question! Auth flows through Supabase. Let me show you:

1. User submits login form → app/login/page.tsx

2. Calls supabase.auth.signInWithPassword()

3. Middleware in middleware.ts checks JWT on protected routes

4. Server components use createServerClient() for user context

Agent: I've set up your local environment. All services are running. Here's a good first task: adding a tooltip to the sidebar icons. Want me to guide you through it?