Runtime security is the only approach that scales with agentic development

Kritarth Lohomi, February 3, 2026

Benchmarks like BaxBench exist for a reason: getting LLMs to generate backends that are both correct and secure is still hard, even in “realistic” settings built specifically to measure this gap. BaxBench alone contains hundreds of security-critical backend tasks, and it evaluates security exposure with end-to-end exploits, not just unit tests. That is the shape of the problem we are walking into as agents move from writing snippets to producing systems.

Prompt-based security (prompt engineering, skills, MCP templates) helps, but it is not enforceable. It is guidance, and guidance can be bypassed, degraded by context, or simply ignored when the agent is under pressure to complete a task. This is not theoretical. Researchers keep demonstrating bypasses of guardrail-style defenses, and major security organizations have warned that prompt injection is rooted in how LLMs treat instructions and data, so you should assume residual risk will remain. MCP and “skills” also expand the attack surface because they influence tool choice and execution at runtime, and tool ecosystems bring their own integrity and context-poisoning risks.

This is why runtime policy enforcement is essential. Long-running agents are increasingly expected to work for hours or days across many steps and context windows. In that world, “security as a next step” does not scale. You cannot rely on developers to notice every unsafe intermediate file, every accidental credential leak, every insecure dependency choice, or every risky code path after it has already landed. Security has to be built into the agent loop itself: every tool call, every file write, every diff, every outbound request.

merci is built for that agentic workflow. We enforce deterministic policies at runtime so unsafe code and unsafe actions get blocked or corrected before they ever reach your repo or production. Instead of showing developers a vulnerability report after the fact, merci catches the issue at the moment the agent is about to introduce it, then fixes it or constrains the agent’s next action. The goal is simple: developers should never have to see bad code to stay safe, and security should keep up as development speed accelerates.
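
To make the idea of a deterministic runtime policy concrete, here is a minimal sketch of a gate that sits between an agent and its tools. It is an illustration under stated assumptions, not merci's actual API or rule set: the names `Action`, `Decision`, and `check_action`, the secret pattern, and the host allowlist are all hypothetical. The point is only that the verdict is a pure function of the proposed action, so the same input always produces the same outcome regardless of what is in the model's context.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical names throughout: Action, Decision, check_action, and the
# rules below are illustrative assumptions, not merci's API or policies.


@dataclass(frozen=True)
class Action:
    kind: str          # "file_write" or "http_request"
    target: str        # file path or URL
    content: str = ""  # file body for writes


@dataclass(frozen=True)
class Decision:
    verdict: str       # "allow", "block", or "rewrite"
    reason: str = ""
    content: str = ""  # replacement content when verdict == "rewrite"


# Assumed example rules: one secret pattern and one outbound allowlist.
AWS_KEY = re.compile(r"AKIA[0-9A-Z]{16}")
ALLOWED_HOSTS = frozenset({"pypi.org", "github.com"})


def check_action(action: Action) -> Decision:
    """Return a verdict that depends only on the proposed action."""
    if action.kind == "file_write":
        if AWS_KEY.search(action.content):
            # Correct rather than just report: redact before the write lands.
            redacted = AWS_KEY.sub("<REDACTED_AWS_KEY>", action.content)
            return Decision("rewrite", "hard-coded AWS access key ID", redacted)
        return Decision("allow")
    if action.kind == "http_request":
        host = urlparse(action.target).hostname or ""
        if host not in ALLOWED_HOSTS:
            return Decision("block", f"host {host!r} is not on the allowlist")
        return Decision("allow")
    # Fail closed on action kinds the policy does not know about.
    return Decision("block", f"unknown action kind {action.kind!r}")
```

In an agent loop, a check like this would run before each tool call is executed: an `allow` passes through, a `rewrite` substitutes the corrected content, and a `block` is returned to the agent as an error so it can choose a different next step. Failing closed on unknown action kinds is a design choice in this sketch; it trades some convenience for the guarantee that new tools are not silently exempt.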