CHATBOT
One codebase, any number of clients across 4 chat platforms + embedded widget — with RAG, escalation, and a self-healing knowledge pipeline.
- Context — Customer support automation
- Surface area — 4 platforms · 9 intents · 5 tools
- Cost of alternatives — Per-client subscription + infra
Restaurants needed chat support automation across messaging channels. Off-the-shelf tools had three blockers — every new client meant a new install, a new subscription, and a new knowledge base to babysit.
The alternatives offered predefined FAQs or raw GPT hallucinations, plus binary handoff — the bot either answers everything or escalates instantly. What was missing was a platform that learned from its own failures.
“A chatbot that forgets its mistakes is a template. One that learns from them is a platform.”
A platform has to behave like infrastructure, not a chatbot.
New client = 1 DB record. New chat platform = 1 adapter. New use case = 1 tool class. Zero duplication, zero code changes per tenant.
Single-tenant SaaS collapses under 10 clients. Template-based FAQs can't handle 6 languages and 9 intents. Binary handoff destroys trust. And everything has to be multi-tenant-safe — one leaked RLS policy = compliance breach across every customer.
Three architectural patterns hold the system together — Adapter, LEGO Orchestrator, Self-Healing.
Platforms are isolated in adapters. Tools route through a priority-based orchestrator keyed on detected intent. Every low-confidence turn feeds a closed feedback loop that proposes KB edits — gated by human-in-the-loop approval.
Result: adding a new platform (Smartsupp, Discord) is one adapter file. Adding a capability (Ordermage API, news feed) is one tool class + a feature flag in clients.features JSONB. The bot that answered yesterday's vague question gets a proposed KB entry tomorrow morning.
Three patterns, one closed feedback loop.
Platform adapters
5 adapters — 4 external chat platforms (HelpCrunch, Telegram, WhatsApp, Chatwoot) plus an embedded web widget. Each implements the PlatformAdapter interface in 150–400 LOC; Edge Function wrappers stay thin (40–72 LOC for chat platforms; the widget carries session + auth at ~240) — all meaningful logic lives in the shared handler factory.
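The PlatformAdapter interface itself isn't shown in the write-up; a minimal sketch of what such a contract could look like (method and type names here are assumptions, not the project's actual API):

```typescript
// Hypothetical shape of the shared message every adapter normalizes to.
interface IncomingMessage {
  tenantId: string;
  conversationId: string;
  text: string;
  language?: string;
}

// Illustrative adapter contract: all platform quirks live behind it,
// so the shared handler factory never sees a raw webhook payload.
interface PlatformAdapter {
  platform: string;
  // Normalize a raw webhook event into the shared message shape.
  parseIncoming(rawEvent: unknown): IncomingMessage;
  // Render the orchestrator's reply in the platform's wire format.
  formatOutgoing(reply: string): unknown;
}

// A trivial demo adapter showing how little surface each one exposes.
const echoAdapter: PlatformAdapter = {
  platform: "demo",
  parseIncoming: (raw) => raw as IncomingMessage,
  formatOutgoing: (reply) => ({ type: "text", body: reply }),
};
```

The payoff of this shape is exactly the claim above: a new platform is one object implementing two methods, and the 1,073-line orchestrator never changes.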
Priority-based tool routing
5 backend tools (AdminBot, Handoff, Ordermage, News, RAG) — ranging 89–635 LOC depending on domain depth, each declared with capability + required feature. Tool router picks highest-priority match. Turn on a feature flag → behavior changes, no code change per tenant.
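The routing rule described above — highest-priority capability match, gated by the tenant's feature flags — can be sketched as follows (the `ToolSpec` shape and field names are assumptions; only the `clients.features` JSONB idea comes from the text):

```typescript
// Hypothetical tool declaration: capability check + required feature + priority.
interface ToolSpec {
  name: string;
  priority: number;           // higher wins
  requiredFeature?: string;   // key expected in the tenant's features JSONB
  canHandle: (intent: string) => boolean;
}

// Pick the highest-priority tool that handles the intent AND is enabled
// for this tenant. Turning on a flag changes behavior with zero code change.
function routeTool(
  tools: ToolSpec[],
  intent: string,
  features: Record<string, boolean>,
): ToolSpec | undefined {
  return tools
    .filter((t) => t.canHandle(intent))
    .filter((t) => !t.requiredFeature || features[t.requiredFeature])
    .sort((a, b) => b.priority - a.priority)[0];
}

// Two illustrative tools: RAG as the catch-all fallback, Ordermage feature-gated.
const tools: ToolSpec[] = [
  { name: "rag", priority: 1, canHandle: () => true },
  {
    name: "ordermage",
    priority: 10,
    requiredFeature: "ordermage",
    canHandle: (intent) => intent === "order_status",
  },
];
```

With this shape, a tenant without the `ordermage` flag silently falls through to RAG for the same intent — the per-tenant behavior difference lives entirely in data.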
KB learns from its own failures
Low confidence (<0.5) → gap-detector writes knowledge_gaps → suggestion-generator calls GPT → dashboard admin approves → pattern-analyzer updates intent patterns → weekly-digest emits a health score. Human-in-the-loop by default.
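The first stage of that loop — the gap detector — can be sketched like this (the 0.5 threshold and the `knowledge_gaps` table come from the text; the row shape and status values are assumptions):

```typescript
// Illustrative turn record: what the gap-detector inspects after each answer.
interface Turn {
  tenantId: string;
  question: string;
  confidence: number; // 0..1 score attached to the RAG answer
}

// Hypothetical knowledge_gaps row; "open" rows await the suggestion-generator
// and then human approval in the dashboard.
interface KnowledgeGap {
  tenantId: string;
  question: string;
  status: "open" | "suggested" | "approved" | "rejected";
}

const CONFIDENCE_FLOOR = 0.5; // threshold from the pipeline description

// Below the floor, record a gap; in the real system this would be an insert
// into knowledge_gaps rather than a returned object.
function detectGap(turn: Turn): KnowledgeGap | null {
  if (turn.confidence >= CONFIDENCE_FLOOR) return null;
  return { tenantId: turn.tenantId, question: turn.question, status: "open" };
}
```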
```typescript
// One-line Edge Function per platform
export const handler = createHandler({
  platform: "chatwoot",
  adapter: chatwootAdapter,
  verify: verifyHmacSha256WithTimestamp,
});

// 1,073-line orchestrator behind a thin platform wrapper
export async function createHandler<TEvent>(config: PlatformConfig<TEvent>) {
  /* intent → tool router → response composer */
}

// AI Assistant: multi-step agent over 42 tools
const result = streamText({
  model: openai(resolveModel("assistant")),
  tools: scopedTools,
  stopWhen: stepCountIs(3),
  onStepFinish: (s) => logToolCall(s),
  messages: [smartContext, ...turn.messages],
});
```
Under 45 active days and 257 commits, from foundation to post-incident evals.
Honest framing — not a 5-day sprint, not a full-time three-month run either. Part-time development from 23 Jan → 15 Apr 2026 — under 45 active dev days across 12 weeks. March alone held 173 of the 257 commits — the core sprint phase.
Set up Supabase, pgvector, the first adapter interface, and the initial HelpCrunch handler. Drafted the multi-tenant contract on paper: new client = 1 DB record, not a deploy.
V2 migrations, Dashboard wiring, per-client intent patterns (9 types, LRU-cached), OrdermageBot (two-step HMAC-SHA256 flow), Analytics V2 with cost + confidence tracking, Design System, 6-language detection (CS/EN/DE/SK/PL/HU).
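The OrdermageBot's two-step HMAC-SHA256 flow isn't detailed in the text; a minimal sketch of timestamped HMAC signing with constant-time verification, assuming Node-style crypto (the `timestamp.body` payload format and function names are illustrative):

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sign a request body together with a timestamp so replayed payloads
// can be rejected when the timestamp is stale.
function sign(secret: string, timestamp: string, body: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${body}`).digest("hex");
}

// Verify with a constant-time comparison to avoid timing side channels.
function verify(
  secret: string,
  timestamp: string,
  body: string,
  signature: string,
): boolean {
  const expected = Buffer.from(sign(secret, timestamp, body), "hex");
  const given = Buffer.from(signature, "hex");
  return expected.length === given.length && timingSafeEqual(expected, given);
}
```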
API Layer (68 routes, 3-step auth pipeline) · MCP Server (auto-generated tools from OpenAPI) · AI Assistant (42 tools, 3-tier approval, SSE streaming) · Production Hardening · Notification Center · Tenant isolation · WhatsApp + Chatwoot adapters · Self-Healing 1–3 · Cost Protection.
Multilingual hardening (DE/PL/SK/HU patterns). A security_invoker=true views migration fixed an RLS bypass. Semantic handoff detector. Amici 'podsednik' incident (KB hallucination) → Promptfoo LLM eval suite (40 cases, 40/40 PASS against live).
Watch the AI Assistant triage a morning dashboard.
Results
- R/01 · Two production clients onboarded in parallel — Amici food delivery (HelpCrunch) and Objedname (Chatwoot). Adding client #2 cost 1 DB record + 0 LOC — the multi-tenant architecture delivered on its promise.
- R/02 · 42 AI Assistant tools with 3-tier approval (23 read auto-execute, 18 write amber banner, 1 destructive red + confirm) and role scoping per user type.
- R/03 · 5,400+ tests + 40/40 LLM eval cases PASS against live production — Vitest unit + integration + E2E + Promptfoo LLM eval suite (4 platform datasets + 1 Amici regression).
- R/04 · 89 migrations · 12 CI workflows · zero rollbacks across 257 commits. Multi-tenant by construction — 87 RPC functions, 11 views (security_invoker=true), 121 indexes.
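The 3-tier approval policy from the results above can be expressed as a tiny classifier (the tier names and the `writes`/`destructive` flags are illustrative, not the project's actual schema):

```typescript
// Hypothetical approval tiers: read tools auto-execute, write tools get an
// amber confirmation banner, destructive tools get a red banner + confirm.
type Tier = "auto" | "confirm" | "confirm-destructive";

function approvalTier(tool: { writes: boolean; destructive: boolean }): Tier {
  if (tool.destructive) return "confirm-destructive"; // red banner + explicit confirm
  if (tool.writes) return "confirm";                  // amber banner
  return "auto";                                      // read-only, auto-execute
}
```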
Learnings
- L/01 · Rule #1 has more leverage than any framework.
“Before shipping: how does this repeat for the next client?” filtered every PR across 257 commits. Sprint 15 review caught 7 CRITICAL bugs — IDOR (insecure direct object reference), missing auth, wrong RPC — before they touched prod.
- L/02 · security_invoker = true is not optional.
Postgres default `security_invoker = false` means views run as the table owner, bypassing RLS. Migration 20260310000002 flipped 8 views. In a multi-tenant setup, set it explicitly on every view.
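The fix amounts to recreating each view with the invoker option; a minimal sketch (view and table names here are invented, not the project's actual migration):

```sql
-- Recreate the view so RLS policies bind to the querying role,
-- not the table owner. Names are illustrative.
CREATE OR REPLACE VIEW client_conversations
WITH (security_invoker = true) AS
SELECT c.id, c.client_id, c.started_at
FROM conversations c;
```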
- L/03 · Hallucinations forced eval-driven dev.
The Amici 'podsednik' incident (2026-04-12): the scraper wrote nonsense into the KB and the bot repeated it. The response: a Promptfoo suite with platform and regression layers. Every future bug becomes one YAML case.
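One such case could look like the following sketch in Promptfoo's YAML test format (the prompt/provider wiring is omitted, and the rubric wording is invented, not the project's actual file):

```yaml
# Hedged sketch of a single regression case; only the 'podsednik' incident
# itself comes from the write-up.
tests:
  - description: "Amici regression — KB nonsense must not be repeated"
    vars:
      question: "Co je podsednik?"
    assert:
      - type: llm-rubric
        value: >-
          Does not present 'podsednik' as a real menu item;
          admits uncertainty or offers a handoff instead.
```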
- L/04 · Split god modules on schedule, not when they burn.
handler-orchestrator.ts hit 1,146 LOC before Sprint 23 extracted analytics-merger, auto-handoff, ab-manager — down to 932 LOC. Subsequent features pushed it back to 1,073, but with clean responsibilities. Refactor cadence > refactor heroics.