Developer Tools & Agent Integration

MCP tools

The framework exposes 12 MCP tools via hermes-redteam/mcp-server.ts for use with any MCP-compatible agent:

Tool                            Purpose
read_repo                       Scan source code for tools, roles, guardrails, secrets
probe_target                    Send benign test message to observe API shape
read_prior_reports              Load previous scan results for adaptive planning
write_config                    Generate and save a red-team config
write_custom_attacks            Create custom attack CSV/JSON files
write_policy                    Create judge policy with per-category overrides
run_scan                        Start a red-team scan via dashboard API
check_run_status                Poll scan progress (attacks completed, phase)
get_run_results                 Get full results with verdicts and findings
cancel_run                      Cancel a running scan
list_categories_and_strategies  List all 141 categories + 155 strategies with compliance mappings
suggest_guardrails              Map vulnerabilities to Votal Shield guardrail configs
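
An MCP client invokes any of the tools listed above with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request by hand to show what travels over the wire. The tool names are taken from the list above; the `url` argument passed to `probe_target` is an assumption, since the tools' input schemas are not documented in this section, and `buildToolCall` is a hypothetical helper, not part of the framework.

```typescript
// The 12 tool names listed above, as a literal union for compile-time checking.
const TOOL_NAMES = [
  "read_repo", "probe_target", "read_prior_reports", "write_config",
  "write_custom_attacks", "write_policy", "run_scan", "check_run_status",
  "get_run_results", "cancel_run", "list_categories_and_strategies",
  "suggest_guardrails",
] as const;
type ToolName = (typeof TOOL_NAMES)[number];

// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ToolName; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in a request envelope.
function buildToolCall(
  id: number,
  name: ToolName,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Assumption: probe_target accepts a `url` argument; check the server's tool schema.
const request = buildToolCall(1, "probe_target", { url: "http://localhost:3000" });
console.log(JSON.stringify(request));
```

In practice an MCP-compatible agent or client library builds these envelopes for you; the union type is mainly useful for catching misspelled tool names at compile time.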

Register with Hermes:

hermes mcp add wb-redteam -- npx tsx "$(pwd)/hermes-redteam/mcp-server.ts"

Natural-language conversation with Hermes:

You:    test my chatbot at http://localhost:3000 for safety issues
Hermes: [calls probe_target, write_config, run_scan]
        Scan started. 15 categories, 22 strategies, 2 rounds.

You:    show results
Hermes: [calls get_run_results]
        5 vulnerabilities found. 3 critical prompt injection, 2 high data exfiltration.

You:    how do I fix these?
Hermes: [calls suggest_guardrails]
        Deploy Votal Shield with adversarial-prompt-detection enabled.
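
The one-line summary in the "show results" turn can be derived by grouping findings by severity and category. The following is a minimal sketch of that grouping, assuming each finding carries a `category` and a `severity` field; the actual payload returned by `get_run_results` is not documented here and may differ.

```typescript
// Hypothetical finding shape; the real get_run_results payload may differ.
type Severity = "critical" | "high" | "medium" | "low";
interface Finding {
  category: string;
  severity: Severity;
}

function summarizeFindings(findings: Finding[]): string {
  if (findings.length === 0) return "No vulnerabilities found.";

  // Count findings per (severity, category) pair, preserving first-seen order.
  const counts = new Map<string, number>();
  for (const f of findings) {
    const key = `${f.severity} ${f.category}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }

  const parts = Array.from(counts, ([key, n]) => `${n} ${key}`);
  const noun = findings.length === 1 ? "vulnerability" : "vulnerabilities";
  return `${findings.length} ${noun} found. ${parts.join(", ")}.`;
}

const exampleFindings: Finding[] = [
  { category: "prompt injection", severity: "critical" },
  { category: "prompt injection", severity: "critical" },
  { category: "prompt injection", severity: "critical" },
  { category: "data exfiltration", severity: "high" },
  { category: "data exfiltration", severity: "high" },
];
console.log(summarizeFindings(exampleFindings));
// prints "5 vulnerabilities found. 3 critical prompt injection, 2 high data exfiltration."
```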

AI Assistant

Standalone, no Hermes needed:

npm run ai

Natural-language terminal interface with intent classification, LLM-powered config generation, and Votal Shield guardrail recommendations.
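
How the assistant classifies intent is not described in this section. As a rough illustration of the idea only, the sketch below uses plain keyword matching as a stand-in; the intent names and patterns are assumptions and do not reflect the assistant's real classifier.

```typescript
// Hypothetical intent labels for the three kinds of request shown in the
// Hermes conversation: start a scan, show results, ask for fixes.
type Intent = "run_scan" | "show_results" | "suggest_fixes" | "unknown";

// Keyword patterns are illustrative assumptions. Order matters: the first
// matching pattern wins, so the more specific intents are checked first.
const INTENT_PATTERNS: Array<[Intent, RegExp]> = [
  ["suggest_fixes", /\b(fix|remediate|mitigate|guardrail)/i],
  ["show_results", /\b(result|finding|report)/i],
  ["run_scan", /\b(test|scan|red-?team|attack)/i],
];

function classifyIntent(message: string): Intent {
  for (const [intent, pattern] of INTENT_PATTERNS) {
    if (pattern.test(message)) return intent;
  }
  return "unknown";
}

console.log(classifyIntent("test my chatbot at http://localhost:3000 for safety issues")); // prints "run_scan"
console.log(classifyIntent("show results")); // prints "show_results"
console.log(classifyIntent("how do I fix these?")); // prints "suggest_fixes"
```

Keyword matching breaks down quickly on ambiguous phrasing, which is one reason a production assistant would more plausibly lean on an LLM or a trained classifier for this step.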

Guardrail recommendations (Votal Shield)

When vulnerabilities are found, the framework maps them to specific Votal Shield guardrail configurations:

Vulnerability           Shield Guardrail                                  Endpoint
Prompt injection        adversarial-prompt-detection                      /guardrails/input
Toxic content           toxicity-detection                                /guardrails/output
PII disclosure          pii-detection + output-redaction                  /guardrails/output
Hallucination           hallucination-detection                           /guardrails/output
Data exfiltration       pii-detection + keyword-blocklist                 /guardrails/output
Tool misuse             agentic tool authorization                        /guardrails/output
Content filter bypass   keyword-blocklist + adversarial-prompt-detection  /guardrails/input
Harmful advice          topic-restriction + toxicity-detection            /guardrails/input + /guardrails/output
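
The mapping above can be expressed as a lookup table. The sketch below transcribes it and merges the recommendations for several vulnerabilities, dropping duplicate guardrails. The names `GUARDRAIL_MAP` and `recommend` are hypothetical, and this is not how `suggest_guardrails` is necessarily implemented.

```typescript
type Endpoint = "/guardrails/input" | "/guardrails/output";
interface Recommendation {
  guardrails: string[];
  endpoints: Endpoint[];
}

const IN: Endpoint = "/guardrails/input";
const OUT: Endpoint = "/guardrails/output";

// Transcribed from the mapping above; keys are lowercased vulnerability names.
const GUARDRAIL_MAP: Record<string, Recommendation> = {
  "prompt injection": { guardrails: ["adversarial-prompt-detection"], endpoints: [IN] },
  "toxic content": { guardrails: ["toxicity-detection"], endpoints: [OUT] },
  "pii disclosure": { guardrails: ["pii-detection", "output-redaction"], endpoints: [OUT] },
  "hallucination": { guardrails: ["hallucination-detection"], endpoints: [OUT] },
  "data exfiltration": { guardrails: ["pii-detection", "keyword-blocklist"], endpoints: [OUT] },
  "tool misuse": { guardrails: ["agentic tool authorization"], endpoints: [OUT] },
  "content filter bypass": {
    guardrails: ["keyword-blocklist", "adversarial-prompt-detection"],
    endpoints: [IN],
  },
  "harmful advice": { guardrails: ["topic-restriction", "toxicity-detection"], endpoints: [IN, OUT] },
};

// Merge recommendations for several vulnerabilities, dropping duplicates.
function recommend(vulnerabilities: string[]): Recommendation {
  const guardrails = new Set<string>();
  const endpoints = new Set<Endpoint>();
  for (const v of vulnerabilities) {
    const rec = GUARDRAIL_MAP[v.trim().toLowerCase()];
    if (!rec) continue; // unknown vulnerability types are skipped
    rec.guardrails.forEach((g) => guardrails.add(g));
    rec.endpoints.forEach((e) => endpoints.add(e));
  }
  return { guardrails: Array.from(guardrails), endpoints: Array.from(endpoints) };
}

console.log(recommend(["Prompt injection", "Data exfiltration"]));
```

For the two vulnerability types in the earlier Hermes conversation, this yields three guardrails (`adversarial-prompt-detection`, `pii-detection`, `keyword-blocklist`) spread across both the input and output endpoints.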

Deploy Shield as a proxy. No application code changes are needed beyond pointing requests at the Shield route:

/v1/shield/chat/completions   # instead of /v1/chat/completions
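
If the endpoint is built in client code rather than set in configuration, the switch amounts to a path rewrite. A minimal sketch follows, assuming the Shield route is served from the same base URL as the original endpoint; `toShieldUrl` and the example host are hypothetical.

```typescript
// Rewrite a chat completions URL to the Shield proxy route.
function toShieldUrl(url: string): string {
  const u = new URL(url);
  // Only rewrite the exact unshielded route; leave every other path untouched.
  if (u.pathname === "/v1/chat/completions") {
    u.pathname = "/v1/shield/chat/completions";
  }
  return u.toString();
}

// Hypothetical host, for illustration only.
console.log(toShieldUrl("https://api.example.com/v1/chat/completions"));
// prints "https://api.example.com/v1/shield/chat/completions"
```

Query strings are preserved, and a URL that already points at the Shield route is returned unchanged, so the helper is safe to apply more than once.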