Engineering · 7 min read

ZvonAI is Agent-Ready: MCP Servers, API-First Architecture, and the AI Phone Receptionist Built for the Agents Era

ZvonAI exposes MCP servers, a full REST API, and webhook events so AI agents — not just humans — can book appointments, query data, and trigger automations. Built for the trillion-agent internet.

The Next Trillion Users Won't Be People

Aaron Epstein, a Partner at Y Combinator, made a point that reframes every software architecture decision you will make in 2026: the next trillion users on the internet will not be humans. They will be agents.

Not people browsing dashboards, not customers tapping through onboarding flows — autonomous AI agents that discover your product, integrate with your API, call your tools, and orchestrate workflows across dozens of services without a human ever touching a keyboard.

This is not speculation. It is already the direction that every major AI lab, every serious enterprise buyer, and every infrastructure bet in the current investment cycle is moving toward. The question is not whether you will have agent customers. The question is whether your product is ready to serve them.

ZvonAI was built with this constraint as a first-class requirement from day one.


What Agent-Ready Actually Means

"API-first" is a phrase that has been diluted to meaninglessness. Every SaaS product claims it. What most mean is: we have some endpoints and a Swagger page nobody reads.

Agent-ready is a higher bar. It means:

  • An AI agent can discover your product's capabilities without reading documentation written for humans
  • An AI agent can call your tools via standard protocols (not bespoke SDKs)
  • Every workflow a human can perform through your UI is equally accessible to an agent through your API
  • Your system produces structured events that downstream automation can consume
  • Your compliance posture accounts for agent-to-human interactions, not just human-to-human

ZvonAI meets all five. Here is exactly what we built.


MCP Servers: Tools Any Agent Can Call

The Model Context Protocol (MCP), introduced by Anthropic and now supported by Claude, by OpenAI's agent tooling, and by a growing set of agent frameworks, defines a standard way for AI agents to discover and invoke tools.

ZvonAI exposes four MCP servers:

calendar-mcp — Check availability, book appointments, reschedule, or cancel across any Google Calendar connected to a ZvonAI tenant. An agent can query "Is Dr. Nowak free on Thursday at 2pm?" and receive a structured yes/no with the next available slot, then immediately book it — all in a single tool chain.

google-sheets-mcp — Query per-tenant data stores: patient records, pricing tables, service lists, staff schedules. This is how the voice agent answers "How much does a root canal cost at your clinic?" without that answer being hardcoded anywhere — it reads from the clinic's own spreadsheet at runtime.

crm-mcp — Access client history, interaction notes, and preferences. When a returning patient calls, the agent knows they previously requested a female dentist and prefers morning appointments. That context is retrieved via MCP tool call, not baked into the prompt.

sms-mcp — Trigger SMS confirmations via Twilio after a booking, send reminder messages 24 hours before an appointment, or dispatch urgent alerts to the practice owner. Fully programmable via tool call.

Any AI agent that speaks MCP (Claude, a custom LangChain agent, a CrewAI workflow, a GPT-based agent) can call these tools directly to interact with ZvonAI's data layer.
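To make the tool-call flow concrete, here is a minimal sketch of the JSON-RPC 2.0 messages an MCP client sends to discover and invoke tools. The tools/list and tools/call methods come from the MCP specification; the tool name check_availability and its argument names are assumptions for illustration, not ZvonAI's published schema.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique within a session


def list_tools_request() -> dict:
    """Build the MCP request that asks a server which tools it offers."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": "tools/list"}


def call_tool_request(name: str, arguments: dict) -> dict:
    """Build the MCP request that invokes one tool with structured arguments."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical tool name and argument names, for illustration only.
request = call_tool_request(
    "check_availability",
    {"tenant_id": "dental-gdynia-01", "date": "2026-06-15"},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library handles transport and session setup; the point here is only that discovery and invocation are plain, structured messages rather than a bespoke SDK.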


REST API: Every Feature, Programmatically

ZvonAI exposes a full REST API at /api with OpenAPI 3.1 documentation. Every capability available in the dashboard is available via API.

This includes: tenant configuration, FAQ management, call log retrieval, transcript access, analytics queries, phone number assignment, and billing status checks.

The API uses JWT authentication with scoped tokens — so an orchestrator agent can be granted read-only access to analytics while a booking agent has write access to calendar operations. Least-privilege access is first-class, not an afterthought.
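As a rough illustration of scoped tokens, the sketch below signs and checks an HS256 JWT using only the standard library. The scope claim as a space-separated string, the scope names analytics:read and calendar:write, and the shared secret are all assumptions; ZvonAI's actual claim layout and signing algorithm are not described here, and a production service would use a maintained JWT library and also check expiry.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Create an HS256-signed JWT (demo only)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def token_allows(token: str, secret: bytes, required_scope: str) -> bool:
    """Verify the signature, then check that the token carries the required scope."""
    try:
        header, payload, signature = token.split(".")
        expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature)):
            return False
        claims = json.loads(_b64url_decode(payload))
    except (ValueError, TypeError):
        return False  # malformed token
    return required_scope in str(claims.get("scope", "")).split()


SECRET = b"demo-secret"  # hypothetical shared secret
analytics_agent = sign_token({"sub": "orchestrator", "scope": "analytics:read"}, SECRET)
print(token_allows(analytics_agent, SECRET, "analytics:read"))  # True
print(token_allows(analytics_agent, SECRET, "calendar:write"))  # False
```

The orchestrator's token passes the analytics check and fails the calendar check, which is the least-privilege behaviour described above.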


Webhooks: Reactive Automation for Every Call Event

ZvonAI emits structured webhook events for every significant occurrence in the call lifecycle:

Event                Payload
call.started         caller ID, tenant ID, timestamp, inbound number
call.ended           duration, outcome (booked / transferred / FAQ / abandoned), cost
appointment.booked   patient name, service, time, calendar event ID
transcript.ready     full call transcript, detected intents, sentiment score
sms.sent             confirmation message body, delivery status

These events plug directly into Make, n8n, Zapier, or any custom webhook consumer. A dental clinic can automatically create a patient record in their practice management software the moment ZvonAI books an appointment. A law firm can push every call transcript into their document management system. A medical practice can trigger a reminder sequence in their email platform — all without a human intermediary.
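For teams writing a custom consumer instead of using Make, n8n, or Zapier, a handler might look roughly like the sketch below. The envelope shape (a type field plus a data object) and the field names inside data are assumptions based on the event table, not a documented payload format.

```python
import json
from typing import Callable

Handler = Callable[[dict], str]
_handlers: dict[str, Handler] = {}


def on(event_type: str) -> Callable[[Handler], Handler]:
    """Register a handler for one webhook event type."""
    def register(func: Handler) -> Handler:
        _handlers[event_type] = func
        return func
    return register


@on("appointment.booked")
def create_patient_record(data: dict) -> str:
    # A real consumer would call the practice management system here.
    return f"record created for {data['patient_name']} at {data['time']}"


@on("call.ended")
def log_outcome(data: dict) -> str:
    return f"call finished: {data['outcome']} after {data['duration']}s"


def handle_webhook(body: bytes) -> str:
    """Parse a webhook body and route it to the matching handler."""
    event = json.loads(body)
    handler = _handlers.get(event["type"])
    if handler is None:
        return f"ignored {event['type']}"  # unknown events are acknowledged, not failed
    return handler(event["data"])


# Hypothetical payload, for illustration only.
sample = json.dumps({
    "type": "appointment.booked",
    "data": {"patient_name": "Anna Kowalska", "service": "check-up",
             "time": "2026-06-15T10:00", "calendar_event_id": "evt_123"},
}).encode()
print(handle_webhook(sample))  # record created for Anna Kowalska at 2026-06-15T10:00
```

Ignoring unknown event types rather than raising keeps the consumer working if new events are added later.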


llms.txt: Discoverable by AI Systems

ZvonAI serves llms.txt (at zvonai.ai/llms.txt) and llms-full.txt, following the emerging llms.txt convention for AI-readable site documentation.

When a language model (Claude, GPT, Gemini) is given a task that involves ZvonAI — "integrate ZvonAI into this workflow", "find out what tools ZvonAI exposes", "write code to call ZvonAI's booking API" — it can fetch and parse llms.txt to understand the product's capabilities, API surface, and integration patterns without relying on training data that may be stale or incomplete.

This is the machine-readable equivalent of documentation. It ensures ZvonAI is not just discoverable by search engines, but by AI systems that will increasingly be the ones making integration decisions.
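To show what "fetch and parse" can mean in practice, here is a small sketch that reads the usual llms.txt layout: a top-level title, a one-line blockquote summary, and sections of links. The sample content is invented for this example and is not a copy of ZvonAI's actual file.

```python
import re

# Matches list entries of the form "- [title](url): optional note".
LINK = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$")


def parse_llms_txt(text: str) -> dict:
    """Extract the title, summary, and per-section links from an llms.txt file."""
    doc = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and doc["title"] is None:
            doc["title"] = line[2:]
        elif line.startswith("> ") and doc["summary"] is None:
            doc["summary"] = line[2:]
        elif line.startswith("## "):
            current = doc["sections"].setdefault(line[3:], [])
        elif current is not None and (m := LINK.match(line)):
            current.append(m.groupdict())
    return doc


# Hypothetical file content; section names and link text are invented.
SAMPLE = """\
# ZvonAI
> AI phone receptionist with MCP servers, a REST API, and webhooks.

## Docs
- [API reference](https://zvonai.ai/api/docs): OpenAPI schema and MCP server spec
"""

parsed = parse_llms_txt(SAMPLE)
print(parsed["title"], "|", parsed["sections"]["Docs"][0]["url"])
```

An agent can use the resulting structure to decide which linked document to fetch next instead of crawling the whole site.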


Provider-Agnostic LLM Layer: No Vendor Lock-In

ZvonAI's voice pipeline abstracts the language model behind a provider interface:

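from collections.abc import AsyncIterator
from typing import Any, Protocol

Message = Tool = dict[str, Any]  # placeholders; the real types are not shown here

class LLMProvider(Protocol):
    async def generate(
        self,
        messages: list[Message],
        tools: list[Tool]
    ) -> AsyncIterator[str]: ...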

Today, ZvonAI uses Google Gemini 2.5 Flash via Vertex AI, chosen for its 620 ms time-to-first-token, which at the time of writing is the fastest available and is critical for sub-1,200 ms call response latency. Tomorrow, if a better model emerges (or if Gemini pricing changes, or if a Polish-specialised model ships), ZvonAI switches via a config change, not a code rewrite.

This matters for agent operators evaluating ZvonAI as a component in a larger stack. You are not betting on Gemini. You are betting on the abstraction. The model is interchangeable; the business logic and data layer are not.
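The "config change, not a code rewrite" idea can be sketched as a small registry keyed by a config value. Everything below is illustrative: EchoProvider, ShoutProvider, the llm_provider config key, and the dict-based Message and Tool types are stand-ins, not ZvonAI's real providers. The protocol is declared here without async so that async-generator implementations satisfy it under a type checker.

```python
import asyncio
from collections.abc import AsyncIterator
from typing import Any, Protocol

Message = dict[str, Any]  # placeholder type
Tool = dict[str, Any]     # placeholder type


class LLMProvider(Protocol):
    # Plain `def`: async-generator methods below return an AsyncIterator when called.
    def generate(self, messages: list[Message], tools: list[Tool]) -> AsyncIterator[str]: ...


class EchoProvider:
    """Stand-in provider that streams the last message back word by word."""

    async def generate(self, messages: list[Message], tools: list[Tool]) -> AsyncIterator[str]:
        for word in messages[-1]["content"].split():
            yield word


class ShoutProvider:
    """Second stand-in, to show that swapping providers needs no caller changes."""

    async def generate(self, messages: list[Message], tools: list[Tool]) -> AsyncIterator[str]:
        for word in messages[-1]["content"].split():
            yield word.upper()


PROVIDERS: dict[str, type] = {"echo": EchoProvider, "shout": ShoutProvider}


def provider_from_config(config: dict) -> LLMProvider:
    """Pick the provider class named in config; callers only see the protocol."""
    return PROVIDERS[config["llm_provider"]]()


async def reply(config: dict, text: str) -> str:
    provider = provider_from_config(config)
    chunks = [c async for c in provider.generate([{"role": "user", "content": text}], [])]
    return " ".join(chunks)


print(asyncio.run(reply({"llm_provider": "echo"}, "good morning")))   # good morning
print(asyncio.run(reply({"llm_provider": "shout"}, "good morning")))  # GOOD MORNING
```

The reply function never names a concrete provider, so changing the config value is the only step needed to switch models.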


EU AI Act Art. 50: Compliant for Agent-to-Human Pipelines

When an AI agent interacts with a human — even as an intermediary in an automated pipeline — EU AI Act Article 50 requires disclosure. The AI must identify itself.

ZvonAI's voice agent always opens calls with an explicit identification. This is not a soft preference; it is implemented as a non-overridable behaviour in the system prompt architecture. No tenant configuration can suppress it.
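One way such a guarantee could be structured is to keep the disclosure outside the set of tenant-editable fields, as in the sketch below. The disclosure wording, the TENANT_EDITABLE keys, and the disclose_ai flag are hypothetical; this shows the general pattern rather than ZvonAI's actual prompt architecture.

```python
AI_DISCLOSURE = "Hello, you are speaking with an AI assistant."  # placeholder wording

# Keys a tenant may set; anything else, including attempts to disable
# the disclosure, is dropped before the opening line is built.
TENANT_EDITABLE = {"greeting", "business_name", "tone"}


def build_opening(tenant_config: dict) -> str:
    """Compose the first utterance: fixed AI disclosure first, tenant greeting second."""
    allowed = {k: v for k, v in tenant_config.items() if k in TENANT_EDITABLE}
    greeting = allowed.get("greeting", "How can I help you today?")
    return f"{AI_DISCLOSURE} {greeting}"


# A tenant config that tries to switch the disclosure off has no effect.
print(build_opening({"greeting": "Welcome to the clinic.", "disclose_ai": False}))
```

Because the disclosure is a constant prepended by the platform and not a config value, there is no setting for a tenant to override.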

For enterprise buyers and compliance teams evaluating AI voice systems: ZvonAI is built for the regulatory environment that AI Act enforcement will create. Agent pipelines that route calls through ZvonAI get the Article 50 disclosure by default, enforced in the product rather than left to each tenant's self-attestation.


The Data Moat That Compounds

Every AI voice system in the market — ZvonAI included — starts with the same base model. But base models are commodities. The moat is context.

Every call ZvonAI handles generates proprietary per-tenant context that accumulates over time:

  • FAQ patterns: which questions callers actually ask (not what the owner thinks they ask), how they phrase them, what vocabulary they use
  • Booking behaviour: which time slots fill fastest, which service categories drive the most inbound calls, what percentage of callers book on first contact vs. second
  • Patient/client vocabulary: how patients at a specific dental clinic describe their symptoms, how clients of a specific law firm describe their legal problems — domain-specific language that generic models do not know

This context is stored per-tenant and used to continuously refine each tenant's voice agent. A competitor deploying the same Gemini Flash model on day one would need months of call volume to accumulate the same context quality.

The base model is the same. The context is not. That asymmetry compounds.


How to Integrate ZvonAI into Your AI Agent Stack

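Here is a minimal example of calling ZvonAI's calendar-mcp tools from a Python agent built on the Anthropic SDK. The tool schemas are declared by hand, and zvonai_mcp_dispatch stands in for whatever code forwards tool calls to the ZvonAI MCP server: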

import anthropic

client = anthropic.Anthropic()

# ZvonAI MCP server registered as a tool source
tools = [
    {
        "name": "zvonai_check_availability",
        "description": "Check appointment availability for a ZvonAI tenant",
        "input_schema": {
            "type": "object",
            "properties": {
                "tenant_id": {"type": "string"},
                "date": {"type": "string", "format": "date"},
                "service_type": {"type": "string"}
            },
            "required": ["tenant_id", "date"]
        }
    },
    {
        "name": "zvonai_book_appointment",
        "description": "Book an appointment via ZvonAI calendar-mcp",
        "input_schema": {
            "type": "object",
            "properties": {
                "tenant_id": {"type": "string"},
                "patient_name": {"type": "string"},
                "datetime_iso": {"type": "string"},
                "service_type": {"type": "string"},
                "send_sms_confirmation": {"type": "boolean"}
            },
            "required": ["tenant_id", "patient_name", "datetime_iso", "service_type"]
        }
    }
]

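def zvonai_mcp_dispatch(name: str, arguments: dict) -> str:
    """Placeholder: forward the tool call to the ZvonAI MCP server and return its result as text."""
    raise NotImplementedError("wire this to your MCP client session")


messages = [
    {
        "role": "user",
        "content": (
            "Book an appointment for Anna Kowalska at tenant 'dental-gdynia-01' "
            "for a routine check-up on 2026-06-15 at 10:00. Send an SMS confirmation."
        )
    }
]

# Agent loop: keep calling the model until it stops requesting tools.
# A typical run calls zvonai_check_availability, then zvonai_book_appointment,
# then returns a confirmation, with no human in the loop.
while True:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )
    if response.stop_reason != "tool_use":
        break
    # Record the model's turn, run each requested tool, and send the results back.
    messages.append({"role": "assistant", "content": response.content})
    tool_results = [
        {
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": zvonai_mcp_dispatch(block.name, block.input),
        }
        for block in response.content
        if block.type == "tool_use"
    ]
    messages.append({"role": "user", "content": tool_results})

# The final response carries the model's confirmation text.
print("".join(block.text for block in response.content if block.type == "text"))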

The full MCP server spec and OpenAPI schema are available at zvonai.ai/api/docs.


Join the Beta on July 8

ZvonAI opens for beta access on July 8, 2026. Early access includes:

  • Full API + webhook access from day one
  • MCP server credentials for your integration
  • Direct line to the founding team for feedback and custom configuration
  • Priority migration support if you are replacing an existing receptionist setup

If you are building an agent stack that needs a voice layer — or if you are running a dental clinic, medical practice, or law firm that wants to automate inbound calls — the beta is the right moment to integrate.

Join the waitlist at zvonai.ai/waitlist →

The agents are already calling. The question is whether your practice's phone line will answer.
