Flezi AgentBox

Use Cases & Scenarios

See how Flezi AgentBox works in practice. The flow diagrams below walk through real-world agent scenarios, from personal dev mentors to enterprise team deployments.

DevMentor Kai — Agent with Soul

Personal coding mentor that remembers you, adapts to your mood, and grows with your skills

Interaction Flow

Developer | agentbox run devmentor-kai | CLI
PackageLoader | Load persona.md + memory + knowledge | AgentExecutor
MemorySelector | Recall: "Minh prefers TypeScript, uses Vim" | Prompt
EmotionDetector | Detect: frustrated (3 failed builds today) | PersonaEngine
Kai | "Hey Minh, rough day? Let me check your build..." | Terminal
ToolExecutor | bash: npm run build (in NemoClawLite) | Sandbox
Kai | "Found it: missing return type on line 42" | Terminal
MemoryManager | Save: "Minh struggles with TypeScript return types" | SQLite
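
The Recall and Save steps in this flow can be pictured with a small in-memory store. This is only a sketch: the `MemoryStore` class, its method names, and the keyword-overlap ranking are assumptions made for illustration, and the document does not describe how AgentBox actually stores or selects memories beyond naming SQLite.

```typescript
// Hypothetical shapes: AgentBox's real memory schema is not shown in this document.
type MemoryKind = "preference" | "lesson" | "growth" | "relationship";

interface MemoryEntry {
  text: string;
  kind: MemoryKind;
  savedAt: Date;
}

class MemoryStore {
  private entries: MemoryEntry[] = [];

  // Counterpart of the MemoryManager "Save" step.
  save(text: string, kind: MemoryKind): void {
    this.entries.push({ text, kind, savedAt: new Date() });
  }

  // Counterpart of the MemorySelector "Recall" step: rank entries by how many
  // query words they contain and return the best matches.
  recall(query: string, limit = 3): MemoryEntry[] {
    const words = query.toLowerCase().split(/\W+/).filter((w) => w.length > 2);
    return this.entries
      .map((entry) => ({
        entry,
        score: words.filter((w) => entry.text.toLowerCase().includes(w)).length,
      }))
      .filter((scored) => scored.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, limit)
      .map((scored) => scored.entry);
  }
}

const kaiMemory = new MemoryStore();
kaiMemory.save("Minh prefers TypeScript, uses Vim", "preference");
kaiMemory.save("Minh struggles with TypeScript return types", "lesson");

const recalled = kaiMemory.recall("typescript build failing on return types");
console.log(recalled.map((m) => m.text));
```

A production selector would more likely rank by embeddings or recency; keyword overlap is used here only to keep the sketch self-contained.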

Agent Package

devmentor-kai/
  agent.yaml
  persona.md
  knowledge/
  policies/
  memory/

Memory Evolution

Day 1 | "Minh uses VS Code, prefers dark mode" | preference
Week 2 | "Fixed useEffect cleanup: teaching moment" | lesson
Month 1 | "Minh now handles hooks well, intermediate level" | growth
Month 3 | "Inside joke: the infamous useEffect incident" | relationship

View full conversation session

Kai: Hey Minh! I see you just opened the project. How's it going?

Minh: Terrible. Build keeps failing and I can't figure out why.

Kai: Rough day, huh? I noticed you had 3 failed builds in the last hour. Let me take a look...

$ npm run build

Error: Type 'string | undefined' is not assignable to type 'string'

at src/components/UserCard.tsx:42

Kai: Found it! Line 42 in UserCard.tsx — you're passing `user.name` which can be undefined, but the prop expects a string. Quick fix:

name={user.name ?? "Anonymous"}

Kai: I remember you had a similar issue last week with optional chaining. Want me to add a section about TypeScript strict null checks to your learning notes?

Minh: Yes please. And thanks for not judging the 3 failed builds...

Kai: Ha! Remember the useEffect incident? We all have those days. You're improving fast though — last month you wouldn't have caught this was a type error at all.

Intelligent MCP Server — IDE Integration

Run agents as MCP servers inside VS Code / Cursor / Claude Code. Unlike raw MCP tools, these agents reason with an LLM and keep memory between calls.

MCP Server Flow

Developer | agentbox mcp serve --agent code-guardian | stdio
VS Code | MCP: tools/list → discovers agent tools | MCP Client
Claude Code | MCP: tools/call "security-guard:scan" | Agent
SecurityGuard | THINKS: analyze code, check OWASP patterns | LLM
ToolExecutor | bash: grep -r "eval(" src/ | NemoClawLite
Agent | Returns: 3 XSS risks found with fix suggestions | VS Code

IDE Configuration

// .vscode/mcp.json
{
  "mcpServers": {
    "security-guard": {
      "command": "agentbox",
      "args": ["mcp", "serve", "--agent", "code-guardian"]
    }
  }
}

Raw MCP vs Intelligent MCP

Raw MCP Tool

  • Stateless function call
  • No context or memory
  • grep "pattern" → raw output
  • Developer interprets results

AgentBox Intelligent MCP

  • Brain-powered tool calls
  • Remembers past scans + your codebase
  • THINKS before acting, explains findings
  • Suggests fixes based on your patterns

Supported IDEs

  • VS Code: stdio
  • Cursor: stdio
  • Claude Code: stdio

View MCP tool interaction example

MCP Protocol Exchange:

IDE → Agent: tools/list

Response: ["security-guard:scan", "security-guard:explain", "security-guard:fix"]

IDE → Agent: tools/call "security-guard:scan"

Args: { "target": "src/routes/auth.ts", "category": "owasp" }

Agent's internal reasoning (invisible to IDE):

"This file handles authentication. I should check for: SQL injection in login query, timing attacks on password comparison, JWT token expiration, CSRF protection..."

"I remember from last scan: this team uses bcrypt correctly but has weak session timeout (8h). Let me check if that was fixed..."

Agent → IDE: tool result

{ findings: 3, severity: "medium", details: [...] }

Key insight: The agent THINKS, REMEMBERS, and REASONS — unlike a raw grep tool.
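
On the wire, the `tools/list` and `tools/call` steps above are JSON-RPC 2.0 requests. The sketch below builds the two messages from the exchange; the helper names `listTools` and `callTool` are made up for this example, and the tool name and arguments are copied from the exchange above rather than from any published AgentBox schema.

```typescript
// Minimal JSON-RPC 2.0 request shape as used by MCP.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

let nextRequestId = 1;

// Hypothetical helper: ask the server which tools it exposes.
function listTools(): JsonRpcRequest {
  return { jsonrpc: "2.0", id: nextRequestId++, method: "tools/list" };
}

// Hypothetical helper: invoke one tool by name with its arguments.
function callTool(toolName: string, args: Record<string, unknown>): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const listRequest = listTools();
const scanRequest = callTool("security-guard:scan", {
  target: "src/routes/auth.ts",
  category: "owasp",
});

// Over the stdio transport, each message is written as one line of JSON.
console.log(JSON.stringify(listRequest));
console.log(JSON.stringify(scanRequest));
```

The agent's internal reasoning never appears in these messages; the IDE only sees the request and the eventual tool result.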

Team Agent Deployment — Enterprise Workflow

Create once locally, pack, push to registry, deploy to team via NemoClawLite platform

Deployment Pipeline

1. Create: agentbox init code-reviewer
2. Test: agentbox test (local)
3. Pack: agentbox pack → .tar.gz
4. Push: agentbox push (registry)
5. Deploy: Team pulls & runs

Self-Host (Free)

Team member runs agentbox pull code-reviewer && agentbox run — local NemoClawLite sandbox, Ollama LLM

Platform (Pro)

Deploy to AgentBox Platform — cloud-hosted NemoClawLite, Claude LLM, team dashboard

Enterprise

Private registry, custom policies, audit logs, SSO integration, SIEM export

View full deployment workflow example

Step-by-step: Publishing "code-reviewer" to your team

$ agentbox init code-reviewer --template reviewer

Creates scaffold: agent.yaml, persona.md, knowledge/, policies/, tests/

$ agentbox test

Runs 12 test scenarios in NemoClawLite sandbox. All pass.

$ agentbox pack

Creates: code-reviewer-1.0.0.agentbox.tar.gz (4.2MB)

Includes: SHA256 checksum, signed manifest, dependency lockfile

$ agentbox push @acme/code-reviewer

Uploaded to registry. Runs automated trust checks:

✓ No dangerous syscalls in tools/
✓ Policies restrict network access
✓ Tests cover all declared capabilities
✓ Signed by verified publisher

Team member on the other side:

$ agentbox pull @acme/code-reviewer

$ agentbox run @acme/code-reviewer --context ./src

Agent runs locally with NemoClawLite isolation — can only access ./src
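
The pack step above mentions a SHA256 checksum shipped with the package. A consumer-side check could look roughly like the following. The function names are hypothetical, and how AgentBox stores or verifies its checksum is not described here; the sketch only demonstrates the comparison using Node's built-in `node:crypto` module.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of the given bytes or string.
function sha256Hex(data: Uint8Array | string): string {
  return createHash("sha256").update(data).digest("hex");
}

// Hypothetical check: compare the computed digest with the published one.
function verifyChecksum(data: Uint8Array | string, expectedHex: string): boolean {
  return sha256Hex(data) === expectedHex.trim().toLowerCase();
}

// Demo with the well-known SHA-256 test vector for "abc";
// a real caller would pass the bytes of the downloaded .tar.gz instead.
const demoPayload = "abc";
const demoDigest = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad";

console.log(verifyChecksum(demoPayload, demoDigest));
```

A checksum only detects corruption or tampering relative to the published digest; the signed manifest mentioned above is what ties that digest to a publisher.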

Agent Registry — The npm for AI Agents

Publish, discover, and install agent packages. Creators earn revenue. Enterprise gets private registries.

Creator Flow

Creator | agentbox init my-agent | Scaffold
Creator | Write persona.md + knowledge + tests | Package
CLI | agentbox lint (validate schema) | Check
CLI | agentbox test (run scenarios) | Verify
CLI | agentbox pack → .agentbox.tar.gz | Bundle
CLI | agentbox push (@creator/my-agent) | Registry

Consumer Flow

User | Browse registry on agentbox.dev | Discover
User | agentbox pull @creator/my-agent | Install
CLI | Verify signature + trust level | Security
User | agentbox run @creator/my-agent | Execute
Agent | Runs in NemoClawLite (isolated) | Sandbox
Agent | Remembers user across sessions | Memory

View trust & verification system

Agent Trust Levels:

L0 Untrusted — Unknown publisher. Runs with maximum sandbox restrictions: no network, readonly filesystem, 256MB memory limit.

L1 Community — Verified email. Passed automated security scan. Can access network (allowlisted endpoints only).

L2 Verified — Publisher identity verified. Code audited. Can access broader filesystem with declared permissions.

L3 Enterprise Certified — Full security audit passed. SOC 2 compliant. Custom policy integration. SIEM logging.
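
One way to read the trust levels is as a lookup from level to sandbox restrictions. The table below is a sketch under stated assumptions: the type and field names are invented, higher levels are assumed to keep the grants of lower ones, and only L0 has a memory limit because that is the only limit the text gives.

```typescript
// Hypothetical policy table derived from the trust-level descriptions above.
type TrustLevel = 0 | 1 | 2 | 3;

interface SandboxPolicy {
  label: string;
  network: "none" | "allowlist";
  filesystem: "readonly" | "declared-permissions";
  memoryLimitMb: number | null; // null: the text states no limit for this level
  siemLogging: boolean;
}

const TRUST_POLICIES: Record<TrustLevel, SandboxPolicy> = {
  0: { label: "Untrusted", network: "none", filesystem: "readonly", memoryLimitMb: 256, siemLogging: false },
  // Assumption: L1 keeps the read-only filesystem and only gains allowlisted network access.
  1: { label: "Community", network: "allowlist", filesystem: "readonly", memoryLimitMb: null, siemLogging: false },
  2: { label: "Verified", network: "allowlist", filesystem: "declared-permissions", memoryLimitMb: null, siemLogging: false },
  3: { label: "Enterprise Certified", network: "allowlist", filesystem: "declared-permissions", memoryLimitMb: null, siemLogging: true },
};

function sandboxPolicyFor(level: TrustLevel): SandboxPolicy {
  return TRUST_POLICIES[level];
}

console.log(sandboxPolicyFor(0));
console.log(sandboxPolicyFor(2).filesystem);
```

Keeping the mapping in one table makes it easy to audit which capabilities each level unlocks.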

Revenue Model for Creators:

• Free agents: open source, community contributions

• Premium agents: creator sets price, platform takes 15% commission

• Enterprise agents: custom licensing, volume discounts, SLA support

• Creators receive monthly payouts via Stripe Connect
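
As a worked example of the 15% commission on premium agents, the sketch below computes a creator's payout in integer cents. The price and sales figures are invented, and the rounding rule is an assumption, since the document does not say how fractional cents are handled.

```typescript
const COMMISSION_RATE_PERCENT = 15; // platform commission stated above

// Hypothetical payout calculation; amounts are in cents to avoid float drift.
function creatorPayoutCents(priceCents: number, unitsSold: number): number {
  const grossCents = priceCents * unitsSold;
  // Assumption: commission is rounded to the nearest cent.
  const commissionCents = Math.round((grossCents * COMMISSION_RATE_PERCENT) / 100);
  return grossCents - commissionCents;
}

// Illustrative numbers: a $10.00 agent sold 40 times in a month.
// Gross is 40000 cents, commission is 6000 cents, payout is 34000 cents.
const monthlyPayoutCents = creatorPayoutCents(1000, 40);
console.log(monthlyPayoutCents);
```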

SDLC Use Cases

How AI agents integrate into every phase of the Software Development Lifecycle

Sprint Planning Agent — From PRD to Stories

An agent that reads your PRD, architecture docs, and codebase to break features into implementable stories with accurate estimates

Planning Flow

PM | agentbox run sprint-planner --context ./prd.md | CLI
Agent | Read PRD, architecture.md, existing epics | Knowledge
Agent | Scan codebase: identify affected files & modules | NemoClawLite
Agent | Break feature into stories with acceptance criteria | Output
Agent | Estimate complexity: LOC impact + dependency analysis | Estimate
Agent | Generate sprint-plan.yaml with priorities | File
MemoryManager | Save: "Team velocity ~8 stories/sprint" | SQLite

Output: Sprint Plan

sprint-plan.yaml

sprint: 14
goal: "User authentication + OAuth"
stories:
  - id: AUTH-001
    title: "Implement JWT token service"
    points: 5
    files: [auth/jwt.ts, middleware/auth.ts]
    acceptance_criteria: 4
  - id: AUTH-002
    title: "OAuth2 provider integration"
    points: 8
    depends_on: [AUTH-001]

Why This Matters

• Agent reads actual code, not just descriptions — estimates are grounded in reality

• Remembers past sprints, learns team velocity, and adjusts over time

• Identifies hidden dependencies by scanning import graphs

• All analysis runs in NemoClawLite sandbox — agent reads your code safely

View full sprint planning session

PM: We need to add OAuth2 login. Can you plan the sprint?

Agent: Let me scan your codebase first...

✓ Found existing auth: src/lib/supabase/auth.ts (email/password)

✓ Found middleware: src/middleware.ts (JWT verification)

✓ Found 4 protected routes, 2 server actions using auth

Agent: Based on codebase analysis, I recommend 4 stories:

Story 1: JWT Token Service (5 pts)

Files: auth/jwt.ts, middleware/auth.ts | AC: 4 criteria

Story 2: OAuth2 Provider Config (3 pts)

Files: auth/oauth.ts, .env | AC: 3 criteria | Depends: Story 1

Story 3: OAuth Callback Handler (5 pts)

Files: routes/auth/callback.ts, auth/oauth.ts | AC: 5 criteria

Story 4: UI Login Buttons (3 pts)

Files: components/LoginForm.tsx | AC: 3 criteria | Depends: Story 2,3

Agent: Total: 16 points. Your team velocity is ~18 pts/sprint, so this fits with buffer. I also identified a risk: the existing session middleware needs refactoring to support OAuth tokens alongside email auth.
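
The dependency and capacity reasoning in this session can be sketched as a topological sort followed by a points check. The `Story` shape, the `S1` to `S4` identifiers, and the function name are assumptions for illustration; the points and dependencies are the ones listed in the session.

```typescript
interface Story {
  id: string;
  points: number;
  dependsOn: string[];
}

// Depth-first topological sort: every story appears after its dependencies.
function orderStories(stories: Story[]): Story[] {
  const byId = new Map(stories.map((s) => [s.id, s] as const));
  const state = new Map<string, "visiting" | "done">();
  const ordered: Story[] = [];

  const visit = (story: Story): void => {
    const mark = state.get(story.id);
    if (mark === "done") return;
    if (mark === "visiting") throw new Error(`Dependency cycle at ${story.id}`);
    state.set(story.id, "visiting");
    for (const depId of story.dependsOn) {
      const dep = byId.get(depId);
      if (!dep) throw new Error(`${story.id} depends on unknown story ${depId}`);
      visit(dep);
    }
    state.set(story.id, "done");
    ordered.push(story);
  };

  stories.forEach((story) => visit(story));
  return ordered;
}

// Stories from the session, deliberately listed out of order.
const sessionStories: Story[] = [
  { id: "S4", points: 3, dependsOn: ["S2", "S3"] },
  { id: "S2", points: 3, dependsOn: ["S1"] },
  { id: "S3", points: 5, dependsOn: [] },
  { id: "S1", points: 5, dependsOn: [] },
];

const plannedOrder = orderStories(sessionStories).map((s) => s.id);
const totalPoints = sessionStories.reduce((sum, s) => sum + s.points, 0);
const teamVelocity = 18; // points per sprint, as quoted in the session

console.log(plannedOrder.join(" -> "));
console.log(`${totalPoints} of ${teamVelocity} points, fits: ${totalPoints <= teamVelocity}`);
```

The cycle check matters in practice: a plan where two stories depend on each other should fail loudly rather than produce an arbitrary order.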

Code Review Agent — CI/CD Integrated

Automated, intelligent code review that runs on every PR. Understands your codebase patterns, catches bugs, and suggests improvements.

CI/CD Integration Flow

Developer | git push → opens Pull Request | GitHub
CI Pipeline | Trigger: agentbox run code-reviewer --diff PR#42 | Agent
Agent | Load project conventions from knowledge/ | Context
Agent | Analyze diff: patterns, security, performance | NemoClawLite
MemorySelector | Recall: "This team prefers explicit error handling" | Context
Agent | Post inline comments on PR with suggestions | GitHub API
MemoryManager | Save: "PR#42 had SQL injection risk in user input" | SQLite

Review Categories

Security

  • SQL injection
  • XSS
  • Auth bypass

Performance

  • N+1 queries
  • Missing index
  • Memory leak

Quality

  • Dead code
  • Naming
  • DRY violations

Convention

  • Style guide
  • Error handling
  • Testing gaps

vs Traditional Linters

• Linters check syntax — this agent understands intent and architecture

• Learns your codebase patterns over time via persistent memory

• Can reason about cross-file impacts (e.g., changing an API contract)

• Explains why something is problematic, not just flagging it

View PR review example

PR #42: Add user search endpoint

🔴 Critical: SQL Injection Risk

File: src/routes/search.ts:28

const result = db.query(`SELECT * FROM users WHERE name LIKE '%${q}%'`)

Fix: Use parameterized query to prevent SQL injection:

const result = db.query('SELECT * FROM users WHERE name LIKE $1', [`%${q}%`])

🟠 Performance: Missing Database Index

The name column is used in LIKE query but has no index.

For 50k+ users, this will cause full table scan (~200ms → ~2ms with index)

Suggest: CREATE INDEX idx_users_name ON users USING gin(name gin_trgm_ops)

🔵 Convention: Missing Error Handling

Your team convention (from knowledge/conventions.md): all route handlers should have try/catch with structured error responses. This handler returns raw database errors to the client.

Summary: 1 critical, 1 performance, 1 convention issue. Blocking merge until critical is fixed.
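
The summary line and the merge gate can be derived mechanically from the list of findings. The following is a minimal sketch; the `Finding` shape and the rule that any critical finding blocks the merge are inferred from this example, not taken from a documented AgentBox API.

```typescript
type Severity = "critical" | "performance" | "convention";

interface Finding {
  severity: Severity;
  title: string;
  file?: string; // only the first finding above names a file
}

interface ReviewSummary {
  counts: Record<Severity, number>;
  blockMerge: boolean;
}

function summarizeReview(findings: Finding[]): ReviewSummary {
  const counts: Record<Severity, number> = { critical: 0, performance: 0, convention: 0 };
  for (const finding of findings) {
    counts[finding.severity] += 1;
  }
  // Assumed gate: any critical finding blocks the merge.
  return { counts, blockMerge: counts.critical > 0 };
}

const pr42Findings: Finding[] = [
  { severity: "critical", title: "SQL Injection Risk", file: "src/routes/search.ts:28" },
  { severity: "performance", title: "Missing Database Index" },
  { severity: "convention", title: "Missing Error Handling" },
];

const pr42Summary = summarizeReview(pr42Findings);
console.log(pr42Summary);
```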

Testing Agent — Automated QA Pipeline

Generate, execute, and maintain tests. The agent reads your code, writes test cases, runs them in a sandbox, and reports coverage gaps.

Testing Pipeline

1. Analyze: Read source code + existing tests
2. Plan: Identify untested paths
3. Generate: Write test cases
4. Execute: Run in NemoClawLite
5. Report: Coverage + failures
6. Fix: Auto-fix flaky tests

Agent Flow

QA Lead | agentbox run test-agent --target src/auth/ | CLI
Agent | Parse AST: find functions, branches, edge cases | Analysis
Agent | Generate test files: unit + integration | Output
ToolExecutor | npm test (runs inside NemoClawLite sandbox) | Sandbox
Agent | 3 tests failed → analyze errors → fix assertions | Self-heal
Agent | Final report: 94% coverage, 2 bugs found | Report

Test Types Generated

Unit Tests

Individual functions with mocked dependencies

auth.service.test.ts

Integration Tests

Module interactions with real DB (test container)

auth.integration.test.ts

Edge Cases

Boundary values, null inputs, race conditions

auth.edge-cases.test.ts

Regression Tests

Tests for previously reported bugs

auth.regression.test.ts

All test execution happens inside NemoClawLite sandbox — test code cannot access files outside the project or make network calls unless explicitly allowed.

View generated test example

// auth.service.test.ts (generated by Testing Agent)
// `measure` is assumed to be a test helper that resolves to elapsed milliseconds.
describe('AuthService', () => {
  describe('signIn', () => {
    it('should return JWT for valid credentials', async () => {
      const result = await authService.signIn('test@mail.com', 'pass123');
      expect(result.token).toMatch(/^eyJ/);
      expect(result.user.email).toBe('test@mail.com');
    });

    it('should throw on invalid password', async () => {
      await expect(authService.signIn('test@mail.com', 'wrong'))
        .rejects.toThrow('Invalid credentials');
    });

    // Edge case: timing attack prevention
    it('should take constant time regardless of user existence', async () => {
      const t1 = await measure(() => authService.signIn('exists@mail.com', 'x'));
      const t2 = await measure(() => authService.signIn('noone@mail.com', 'x'));
      expect(Math.abs(t1 - t2)).toBeLessThan(50); // <50ms variance
    });
  });
});

// Agent identified 14 test cases across 3 files:
// 8 unit + 4 integration + 2 edge cases, reaching 94% coverage
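
The timing test calls a `measure` helper that the generated file does not define. One possible shape is sketched below, assuming it should return elapsed milliseconds and must tolerate rejections, since both sign-in calls in that test use a wrong password and are expected to fail.

```typescript
// Hypothetical helper: time an async operation in milliseconds.
// Rejections are swallowed on purpose, because the timing test measures
// failed sign-ins and only cares how long they take.
async function measure(operation: () => Promise<unknown>): Promise<number> {
  const start = performance.now();
  try {
    await operation();
  } catch {
    // Ignore the error; the elapsed time is the result.
  }
  return performance.now() - start;
}

// Demo with a stand-in operation instead of authService.signIn.
const pause = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

measure(() => pause(20)).then((elapsedMs) => {
  console.log(`elapsed: ${elapsedMs.toFixed(1)} ms`);
});
```

A single pair of measurements is noisy; a sturdier version of the timing test would average several runs before comparing.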

Documentation Agent — Living Docs from Code

Automatically generate and maintain documentation that stays in sync with your codebase. API docs, architecture diagrams, changelogs.

Documentation Flow

Developer | agentbox run doc-agent --watch src/ | CLI
Agent | Scan: exports, types, JSDoc, README, CHANGELOG | Analysis
Agent | Diff against existing docs, find stale sections | Compare
Agent | Generate: API reference, architecture overview, guides | Output
Agent | Create Mermaid diagrams from code structure | Diagrams
Agent | Update CHANGELOG.md from git log | Changelog
MemoryManager | Save: "Last full sync: 2026-03-26, 42 pages" | SQLite

Output Types

API Reference

Auto-generated from TypeScript types + JSDoc. Includes request/response examples.

Architecture Diagrams

Mermaid C4/sequence diagrams generated from import graphs and service dependencies.

Developer Guides

How-to guides inferred from common patterns in the codebase.

Changelog

Semantic changelog from git commits, grouped by feature/fix/breaking.

Watch Mode

With --watch, the agent monitors file changes and incrementally updates only affected documentation pages. Runs as an MCP server so your IDE shows real-time doc previews.
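
Incremental updates need some mapping from source files to the pages that document them. The sketch below assumes such an index exists as a plain object; the page paths, source paths, and the `affectedPages` function are all hypothetical.

```typescript
// Hypothetical index: documentation page -> source files it is generated from.
const docIndex: Record<string, string[]> = {
  "docs/api/agents.md": ["src/routes/agents.ts", "src/schemas/agent.ts"],
  "docs/architecture.md": ["src/server.ts"],
  "docs/guides/auth.md": ["src/routes/auth.ts", "src/middleware.ts"],
};

// Return only the pages whose sources include at least one changed file.
function affectedPages(changedFiles: string[], index: Record<string, string[]>): string[] {
  return Object.keys(index)
    .filter((page) => (index[page] ?? []).some((source) => changedFiles.includes(source)))
    .sort();
}

const pagesToRegenerate = affectedPages(["src/schemas/agent.ts", "src/middleware.ts"], docIndex);
console.log(pagesToRegenerate);
```

Pages such as the architecture overview, which depend on the whole import graph, would need a coarser trigger than a per-file lookup.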

View generated documentation example

Auto-generated API Reference (from Fastify routes + Zod schemas):

POST /api/agents

Create a new agent in the marketplace.

Request Body:

{ name: string, description: string, category: "dev" | "ops" | "qa", visibility: "public" | "private" }

Response (201):

{ id: uuid, name: string, created_at: timestamp, trust_level: 0 }

Errors:

400 — Validation error (name too short, invalid category)

401 — Not authenticated

409 — Agent name already exists for this creator

Auto-generated Mermaid diagram:

sequenceDiagram
    Client->>API: POST /api/agents
    API->>Auth: Verify JWT
    API->>DB: INSERT INTO agents
    DB-->>API: agent record
    API-->>Client: 201 Created

Incident Response Agent — On-Call Assistant

Monitors alerts from PagerDuty/Grafana, reads logs, correlates with past incidents, identifies root cause, and suggests a fix — all within seconds.

Response Flow

PagerDuty | Webhook: API latency spike (p99 > 2s) | Agent
Agent | Query Grafana: identify affected service | Metrics
ToolExecutor | kubectl logs api-server --since=5m | NemoClawLite
Agent | Pattern: OOM kills on pod api-server-3 | Analysis
MemorySelector | Recall: "Similar OOM 2 weeks ago: memory leak in cache" | Context
Agent | Root cause: unbounded Redis cache + new feature PR#128 | Diagnosis
Agent | Suggest: kubectl rollout restart + add cache eviction | Fix
Agent | Create incident report in Confluence + Slack alert | Report

Agent Capabilities

Log Analysis

Reads structured logs (JSON), unstructured logs, and stack traces. Identifies error patterns across distributed services.

Metric Correlation

Queries Prometheus/Grafana. Correlates latency spikes with CPU/memory/network metrics to narrow root cause.

Incident Memory

Remembers past incidents. When a similar pattern appears, suggests the same fix that worked before — with caveats about what changed since.

Runbook Execution

Has access to team runbooks in knowledge/. Can execute safe remediation steps (restart, scale, toggle feature flags) in sandbox.

View full incident response example

Timeline:

00:00 — PagerDuty alert: API p99 latency > 2000ms

00:03 — Agent queries Grafana, identifies api-server pods

00:08 — Agent reads last 5min of logs, finds 12 OOMKilled events

00:15 — Memory search: similar incident 14 days ago (cache leak)

00:20 — Agent diffs recent deployments: PR#128 merged 45min ago

00:25 — Agent reads PR#128 diff: new endpoint without cache TTL

00:30 — Diagnosis: unbounded cache growth from new /search endpoint

00:35 — Agent executes: kubectl rollout restart deploy/api-server

00:40 — Metrics confirm: latency dropping, pods healthy

00:45 — Agent posts incident report to #incidents Slack channel

00:50 — Agent creates Jira ticket: "Add cache TTL to /search endpoint"

Total resolution time: 50 seconds (vs 15-30 min manual)
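
The log-reading step in this timeline (finding OOMKilled events) amounts to pattern counting. The sketch below assumes a simple `<timestamp> <pod> <message>` line format, which is invented for the example; real cluster events and logs are structured differently and would need their own parser.

```typescript
// Count OOMKilled events per pod from lines shaped like
// "<timestamp> <pod> <message>" (a hypothetical format for this sketch).
function countOomKillsByPod(lines: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const line of lines) {
    const match = /^(\S+)\s+(\S+)\s+(.*)$/.exec(line);
    if (!match) continue;
    const pod = match[2];
    const message = match[3];
    if (pod === undefined || message === undefined) continue;
    if (message.includes("OOMKilled")) {
      counts[pod] = (counts[pod] ?? 0) + 1;
    }
  }
  return counts;
}

const sampleEvents = [
  "2026-03-26T10:00:01Z api-server-3 container terminated: OOMKilled",
  "2026-03-26T10:00:09Z api-server-1 readiness probe succeeded",
  "2026-03-26T10:01:30Z api-server-3 container terminated: OOMKilled",
  "2026-03-26T10:02:12Z api-server-1 container terminated: OOMKilled",
];

const oomCounts = countOomKillsByPod(sampleEvents);
console.log(oomCounts);
```

Counts like these are what the agent would then correlate with recent deployments and past incidents.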

Database Migration Agent — Safe Schema Evolution

Reads your current schema, generates migration SQL, validates with dry-run, creates rollback scripts, and checks for breaking changes.

Migration Flow

Developer | agentbox run db-agent "add soft-delete to users" | CLI
Agent | Read: schema.prisma / supabase migrations/ | Context
Agent | Analyze: foreign keys, indexes, existing data volume | Analysis
Agent | Generate: migration SQL (up + down) | Output
ToolExecutor | Dry-run migration on test DB (NemoClawLite) | Validate
Agent | Check: backwards compatibility with current API | Compat
Agent | Create PR with migration + rollback + docs | PR

View generated migration SQL

-- migrate_up.sql
ALTER TABLE users
  ADD COLUMN deleted_at TIMESTAMPTZ DEFAULT NULL;

-- Add partial index for active users (performance)
CREATE INDEX idx_users_active
  ON users (id) WHERE deleted_at IS NULL;

-- Update RLS policies for soft-delete awareness
CREATE POLICY users_visible ON users
  FOR SELECT USING (deleted_at IS NULL);

-- Agent also generates: data backfill, API filter updates,
-- cascade rules, and test queries for each scenario.

-- migrate_down.sql (rollback)
DROP POLICY IF EXISTS users_visible ON users;
DROP INDEX IF EXISTS idx_users_active;
ALTER TABLE users DROP COLUMN deleted_at;

Safety Checks

Schema Diff Analysis

Compares before/after schema. Identifies columns added, removed, renamed. Flags data loss risks (DROP COLUMN) with warnings.

Backwards Compatibility

Checks if existing API queries still work. Detects: column renames that break SELECT *, type changes that lose precision, NOT NULL on populated columns.

Dry-Run Validation

Runs the migration on a test database inside NemoClawLite. Verifies: syntax correctness, index creation time, lock duration, data integrity.

Rollback Script

Auto-generates matching DOWN migration. Validates rollback independently. Ensures zero data loss on revert.
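
The schema diff and compatibility checks can be approximated by comparing two column maps. This sketch is illustrative only: the `Column` and `Risk` shapes are hypothetical, and it covers just three of the risks named above (dropped columns, type changes, and NOT NULL constraints that existing rows could violate).

```typescript
interface Column {
  type: string;
  nullable: boolean;
  hasDefault: boolean;
}

type TableSchema = Record<string, Column>;

interface Risk {
  column: string;
  kind: "dropped" | "type-changed" | "not-null-added";
}

function findMigrationRisks(before: TableSchema, after: TableSchema): Risk[] {
  const risks: Risk[] = [];
  for (const column of Object.keys(before)) {
    const oldCol = before[column];
    const newCol = after[column];
    if (!oldCol) continue;
    if (!newCol) {
      risks.push({ column, kind: "dropped" }); // data loss
      continue;
    }
    if (oldCol.type !== newCol.type) risks.push({ column, kind: "type-changed" });
    if (oldCol.nullable && !newCol.nullable) risks.push({ column, kind: "not-null-added" });
  }
  // A new NOT NULL column without a default fails on populated tables.
  for (const column of Object.keys(after)) {
    const newCol = after[column];
    if (newCol && !before[column] && !newCol.nullable && !newCol.hasDefault) {
      risks.push({ column, kind: "not-null-added" });
    }
  }
  return risks;
}

const usersBefore: TableSchema = {
  id: { type: "uuid", nullable: false, hasDefault: true },
  email: { type: "text", nullable: false, hasDefault: false },
};
// The soft-delete migration only adds a nullable column.
const usersAfter: TableSchema = {
  ...usersBefore,
  deleted_at: { type: "timestamptz", nullable: true, hasDefault: true },
};

console.log(findMigrationRisks(usersBefore, usersAfter)); // no risks for the soft-delete change
console.log(findMigrationRisks(usersAfter, usersBefore)); // the rollback drops deleted_at
```

The second call shows why the rollback deserves its own check: reverting the migration discards any `deleted_at` values written in the meantime.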

View compatibility report

Compatibility Report:

✓ API GET /users — compatible (deleted_at has DEFAULT NULL)

✓ API POST /users — compatible (column is nullable)

⚠ API GET /users?active=true — needs filter update

✓ Supabase RLS — new policy handles soft-delete

✓ Foreign keys — CASCADE rules verified

✓ Rollback — tested successfully, 0 data loss

Estimated migration time: ~200ms (table has 12,340 rows)

Lock duration: ~5ms (in PG 11+, ADD COLUMN with a constant DEFAULT skips the table rewrite, so the lock is held only briefly)

Security Audit Agent — Continuous Vulnerability Scanning

Runs OWASP Top 10 checks, scans dependencies for CVEs, detects hardcoded secrets, and generates compliance reports with remediation guidance.

Audit Flow

CI Pipeline | Trigger on PR merge to main | Agent
Agent | npm audit + snyk test (in NemoClawLite) | Deps
Agent | Scan source: regex + AST for OWASP patterns | Code
Agent | Detect: eval(), innerHTML, SQL string concat | Vulns
Agent | Scan .env, configs for hardcoded secrets | Secrets
Agent | Cross-reference with known CVE database | CVE
Agent | Generate report: severity, location, fix suggestion | Report
MemoryManager | Track: "3 XSS vulns fixed since last audit" | Trend

Scan Categories

OWASP Top 10

Injection (SQL/XSS/Command), Broken Auth, Sensitive Data Exposure, XXE, Broken Access Control, Security Misconfiguration, Insecure Deserialization.

Dependency Vulnerabilities

Scans package-lock.json/yarn.lock against NVD + GitHub Advisory. Reports CVE ID, severity (CVSS), affected version range, and fixed version.

Secret Detection

Regex + entropy analysis for API keys, passwords, tokens in code and config files. Checks .env files, CI configs, and committed history.

Compliance Mapping

Maps findings to SOC 2 / GDPR / HIPAA requirements. Generates audit-ready reports with evidence links.
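
The entropy half of the Secret Detection category above is usually Shannon entropy over a token's characters: random keys score high, ordinary words and repeated characters score low. The sketch below shows the calculation; the threshold of 4 bits per character and the minimum length of 20 are illustrative assumptions, not values taken from AgentBox.

```typescript
// Shannon entropy in bits per character.
function shannonEntropy(text: string): number {
  if (text.length === 0) return 0;
  const counts: Record<string, number> = {};
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    counts[ch] = (counts[ch] ?? 0) + 1;
  }
  let entropy = 0;
  for (const ch of Object.keys(counts)) {
    const p = (counts[ch] ?? 0) / text.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Hypothetical heuristic: long, high-entropy tokens are candidate secrets.
function looksLikeSecret(token: string, minEntropy = 4, minLength = 20): boolean {
  return token.length >= minLength && shannonEntropy(token) >= minEntropy;
}

console.log(shannonEntropy("abcd")); // 2: four equally likely characters
console.log(looksLikeSecret("aaaaaaaaaaaaaaaaaaaaaaaa")); // false: zero entropy
console.log(looksLikeSecret("A1b2C3d4E5f6G7h8I9j0KlMnOpQrStUv")); // true: 32 distinct characters
```

Entropy alone produces false positives on hashes and minified code, which is why it is paired with regex patterns for known key formats.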

View sample audit report

Security Audit Report — 2026-03-26

CRITICAL (1)

SQL Injection in src/routes/search.ts:42

Pattern: db.query(`SELECT * FROM users WHERE name = '${input}'`)

Fix: Use parameterized query: db.query('SELECT * FROM users WHERE name = $1', [input])

HIGH (2)

1. XSS via dangerouslySetInnerHTML in Comment.tsx:18

2. CVE-2025-1234 in lodash@4.17.20 (prototype pollution)

MEDIUM (4)

Missing rate limiting on /api/login, CORS wildcard, 2 missing CSP headers

Total: 1 Critical, 2 High, 4 Medium, 8 Low | Trend: -3 vs last audit

DevOps Agent — Infrastructure as Code

Generates Terraform/Kubernetes manifests, validates with dry-run, monitors deployments, and auto-rollbacks on failure.

Deployment Pipeline

1. Describe: Natural language → IaC
2. Generate: Terraform + K8s YAML
3. Validate: terraform plan (dry-run)
4. Deploy: Apply + monitor rollout
5. Observe: Health checks + alerts

Agent Flow

DevOps | "Deploy Redis cluster for caching layer" | Agent
Agent | Read existing infra: VPC, subnets, security groups | Context
Agent | Generate: Terraform module + K8s StatefulSet | IaC
ToolExecutor | terraform plan (in NemoClawLite sandbox) | Validate
Agent | Report: 3 resources to create, estimated $12/mo | Review
Agent | After approval: terraform apply + kubectl apply | Deploy
Agent | Monitor: health checks every 30s for 10min | Observe

Capabilities

Multi-Cloud Support

Generates Terraform for AWS, GCP, Azure. Reads existing provider configs and follows team conventions.

Cost Estimation

Estimates monthly cost of proposed infrastructure changes before deployment. Compares with current spend.

Auto-Rollback

Monitors deployment health metrics. If error rate exceeds threshold, automatically rolls back to last known good state.
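
The auto-rollback rule reduces to comparing an observed error rate against a threshold. A minimal sketch follows; the `HealthSample` shape, the 5% threshold, and the minimum-traffic guard are assumptions, since the document does not state the actual threshold or how samples are collected.

```typescript
interface HealthSample {
  requests: number;
  errors: number;
}

// Decide whether to roll back based on the aggregate error rate.
function shouldRollback(samples: HealthSample[], maxErrorRate: number, minRequests = 100): boolean {
  let requests = 0;
  let errors = 0;
  for (const sample of samples) {
    requests += sample.requests;
    errors += sample.errors;
  }
  // Assumed guard: with too little traffic the rate is not meaningful.
  if (requests < minRequests) return false;
  return errors / requests > maxErrorRate;
}

const healthyWindow: HealthSample[] = [
  { requests: 200, errors: 2 },
  { requests: 200, errors: 3 },
];
const failingWindow: HealthSample[] = [
  { requests: 200, errors: 2 },
  { requests: 200, errors: 30 },
];

console.log(shouldRollback(healthyWindow, 0.05)); // false: 5 errors in 400 requests
console.log(shouldRollback(failingWindow, 0.05)); // true: 32 errors in 400 requests
```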

View generated Terraform output

terraform plan output:

+ aws_elasticache_cluster.redis
    engine: "redis"
    node_type: "cache.t3.medium"
    num_cache_nodes: 3
    subnet_group: "app-private"
+ aws_security_group.redis_sg
    ingress: port 6379 from app-sg only
+ kubernetes_stateful_set.redis
    replicas: 3, persistent_volume: 10Gi

Plan: 3 to add, 0 to change, 0 to destroy
Estimated cost: $38.40/month

API Design Agent — Contract-First Development

Generates OpenAPI specs from your code, validates backwards compatibility, creates SDK clients, and runs contract tests.

Design Flow

Developer | agentbox run api-agent --scan src/routes/ | CLI
Agent | Parse route handlers: params, body, response types | AST
Agent | Generate OpenAPI 3.1 spec with examples | Spec
Agent | Diff with previous spec: detect breaking changes | Compat
Agent | Generate TypeScript SDK client from spec | SDK
Agent | Create contract tests (request/response validation) | Tests
MemoryManager | Save: "v2.3.0 added /search, deprecated /find" | History

View generated OpenAPI spec

openapi: 3.1.0
paths:
  /api/agents:
    get:
      summary: List marketplace agents
      parameters:
        - name: category
          in: query
          schema: { type: string, enum: [dev, ops, qa] }
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema: { $ref: '#/components/schemas/AgentList' }
# Auto-generated from Fastify route + Zod schemas

Capabilities

Breaking Change Detection

Compares current spec with previous version. Flags: removed endpoints, changed response shapes, narrowed parameter types, required field additions.

SDK Generation

Generates typed clients (TypeScript, Python, Go) from OpenAPI spec. Includes request/response types, error handling, and auth configuration.

Contract Testing

Creates test suites that validate API responses match the spec. Catches schema drift before it reaches production.

Documentation

Generates interactive API docs (Swagger UI / Redoc). Includes live examples, auth flows, and error catalogs.

View breaking change report

API Compatibility Report (v2.2 → v2.3)

BREAKING (1):

DELETE /api/agents/find — endpoint removed (use /api/agents/search)

DEPRECATED (2):

GET /api/agents?sort=rating → use ?order_by=rating

Response field "agent.owner" → renamed to "agent.creator_id"

ADDITIONS (3):

POST /api/agents/search — new full-text search endpoint

GET /api/agents/:id/versions — version history

Response field "agent.trust_level" added

Recommendation: Bump major version (v3.0) due to breaking change
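
The version recommendation follows the usual semantic-versioning rule: removals are breaking, additions are not. The sketch below applies that rule to endpoint paths from this report. The function names are hypothetical, only paths are compared (HTTP methods and response fields are left out), and a real checker would diff the full OpenAPI documents.

```typescript
type VersionBump = "major" | "minor" | "patch";

interface EndpointDiff {
  removed: string[];
  added: string[];
}

function diffEndpoints(previous: string[], current: string[]): EndpointDiff {
  return {
    removed: previous.filter((path) => !current.includes(path)),
    added: current.filter((path) => !previous.includes(path)),
  };
}

// Removed endpoints break clients (major); new ones are additive (minor).
function recommendBump(diff: EndpointDiff): VersionBump {
  if (diff.removed.length > 0) return "major";
  if (diff.added.length > 0) return "minor";
  return "patch";
}

const v22Paths = ["/api/agents", "/api/agents/find"];
const v23Paths = ["/api/agents", "/api/agents/search", "/api/agents/:id/versions"];

const endpointDiff = diffEndpoints(v22Paths, v23Paths);
console.log(endpointDiff);
console.log(recommendBump(endpointDiff)); // "major", because /api/agents/find was removed
```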

Onboarding Agent — New Developer Guide

Walks new team members through the codebase, explains architecture, helps with environment setup, and guides their first contribution.

Onboarding Flow

New Dev | agentbox run onboarding-buddy | CLI
Agent | Load: architecture.md, README, team conventions | Knowledge
Agent | "Welcome! Let me explain our project structure..." | Terminal
Agent | Walk through: monorepo layout, key modules, data flow | Guide
ToolExecutor | Run setup: npm install, supabase start, seed DB | NemoClawLite
Agent | Verify: all services running, tests passing | Health
Agent | Suggest first task: good-first-issue from backlog | Task
Agent | Guide through first PR: branch, commit, push, review | Mentor

Onboarding Phases

Phase 1: Architecture Overview

Explains: project structure, key directories, tech stack, database schema, deployment model. Uses diagrams from knowledge/ and real code examples.

Phase 2: Environment Setup

Runs setup scripts in NemoClawLite sandbox. Installs dependencies, starts services, seeds test data. Troubleshoots common issues (port conflicts, missing env vars).

Phase 3: Codebase Walkthrough

Interactive tour of key code paths. Explains: how a request flows from frontend → API → database. Shows: authentication, error handling, state management.

Phase 4: First Contribution

Picks a good-first-issue. Guides: create branch, make changes, write tests, open PR. Reviews code before submission. Explains team review process.

View full onboarding session transcript

Agent: Welcome to the team! I'm your onboarding buddy. Let me walk you through our project.

Agent: We're a monorepo with 3 main packages:

apps/frontend/ — Next.js 16, Clean Architecture

apps/orchestrator/ — Fastify 5, Plugin Architecture

packages/shared/ — Zod schemas shared between both

New Dev: How does auth work?

Agent: Great question! Auth flows through Supabase. Let me show you:

1. User submits login form → app/login/page.tsx

2. Calls supabase.auth.signInWithPassword()

3. Middleware in middleware.ts checks JWT on protected routes

4. Server components use createServerClient() for user context

Agent: I've set up your local environment. All services are running. Here's a good first task: adding a tooltip to the sidebar icons. Want me to guide you through it?
