Buyer's Guide · Plain-English

How Does an AI Receptionist Work?

Last updated:

ANSWER

An AI receptionist is a four-component voice stack: speech-to-text transcribes the caller (Deepgram, Whisper), a large language model generates the response (Claude Sonnet 4.6, GPT-4o), text-to-speech speaks it back (Cartesia, ElevenLabs), and an orchestration layer (Vapi) ties them together and handles webhooks for booking and CRM hand-off. The premium stack costs more per minute to run than budget alternatives but produces conversations that callers consistently do not realize are AI.

The six-step technical flow.

  1. 1. Inbound call hits a forwarded number
    You forward your existing business line (or a dedicated Twilio number) to the AI receptionist platform. The customer dials your normal number; the call routes to the agent within one to two rings.
  2. 2. Speech-to-text (Deepgram, Whisper, etc.) transcribes
    The caller's audio is streamed to a speech-to-text service like Deepgram Nova-3. Crisp uses Deepgram for sub-300ms transcription latency, which is critical to making the conversation feel real-time.
  3. 3. LLM generates the response (Claude Sonnet 4.6 on Crisp)
    The transcribed text plus a configured system prompt (your hours, services, pricing tiers, FAQs, escalation rules) is sent to a large language model. Crisp uses Claude Sonnet 4.6 via OpenRouter. The model returns the receptionist's next line.
  4. 4. Text-to-speech (Cartesia, ElevenLabs) speaks back
    The LLM response is sent to a TTS engine that renders it as natural-sounding audio. Crisp uses Cartesia Sonic-3, one of the cheapest premium voices that consistently wins blind tests against pricier alternatives.
  5. 5. Orchestration (Vapi) ties it all together
    Vapi orchestrates the round-trip, manages call state, handles interruptions and barge-in, and exposes webhooks for booking, escalation, and CRM hand-off. Without orchestration, the call sounds choppy and the agent can't book appointments.
  6. 6. Webhook fires to your CRM or calendar
    When the agent books an appointment, a webhook fires to Jobber, Buildertrend, Housecall Pro, Google Calendar, or whatever CRM you use. The lead arrives qualified, scoped, and on your calendar without a human touching it.

Premium stack vs budget stack: where the gap shows up.

All AI receptionists use roughly the same architecture (the four components above). The gap between a $49/mo tool and a $425/mo agent is in the components chosen. A budget tool runs an older LLM with a synthetic-sounding TTS voice. A premium agent runs Claude Sonnet 4.6 with Cartesia Sonic-3. The first sounds robotic and gives wooden answers. The second handles unexpected questions, captures nuance, and books bookings the first tool would drop.

Where it shows up most:

  • Emotional callers. A burst pipe at 2am, an anxious dental patient, a homeowner with a flood. Budget LLMs respond robotically; premium LLMs read the emotion and respond calmly.
  • Multi-step bookings. "I need a furnace inspection but also want a quote on a new water heater while you're here." Budget LLMs handle one or the other; premium LLMs handle both.
  • Edge questions. "Do you do permit work?" "What's your hold-harmless policy?" "Are you HCRA-registered?" Budget LLMs guess; premium LLMs check the configured knowledge base and answer correctly or escalate.

What the conversation actually sounds like.

Listen to Demo 2 on the Crisp homepage. The agent picks up, runs the qualifying conversation (scope, budget tier, timeline, address), books a site visit on the configured calendar, and texts you a summary the moment the call ends. The whole call takes 90 seconds to three minutes. The caller does not have to repeat themselves to a human later; the agent captured everything.

Common questions.

What AI models do the best receptionists use?

Premium agents use frontier models like Claude Sonnet 4.6 (Anthropic) or GPT-4o (OpenAI). Budget tools ($49 to $99/mo) typically use older or cheaper LLMs (GPT-3.5, Claude Haiku, open-source models). The difference shows up in how well the agent handles unexpected questions, multi-step bookings, and emotional callers.

How does the AI sound? Is it obviously robotic?

Premium TTS (Cartesia Sonic-3, ElevenLabs Turbo) sounds essentially human in blind tests. Budget TTS (older Google or Amazon Polly voices) sounds noticeably synthetic. Crisp uses Cartesia. You can hear the demo on the Crisp homepage and judge for yourself.

Can it book appointments directly?

Yes, via webhook integration with calendar tools (Cal.com, Google Calendar) or CRM/job-management systems (Jobber, Buildertrend, Housecall Pro, ServiceTitan). Crisp configures this during the 5-business-day onboarding.

What happens if the caller asks something the AI doesn't know?

The agent should escalate gracefully. Common patterns: take a message and notify you by SMS, offer to text a callback link, or transfer to a human if you have one available. The configured fallback behaviour is part of the setup work.

Is it CASL-compliant?

Voice calls are not subject to CASL (which governs commercial electronic messages like SMS and email). However, the SMS follow-ups that an AI receptionist often triggers (booking confirmations, summary texts, missed-info follow-ups) absolutely are CASL-regulated. Crisp builds CASL compliance into the SMS layer by default; US tools typically do not.

Will it replace my human receptionist?

No. It supplements them. Most clients use AI for after-hours, overflow during business hours, and weekend coverage. Human receptionists handle complex calls, regulars they recognize, and anything that needs judgement the AI can't make.

Simple, transparent pricing.

Start with one system or stack all three.

01

AI Voice Agent

Your 24/7 AI receptionist on the premium stack.

$499 CADsetup

$425 CAD/mo

  • Unlimited inbound calls on one business line. No per-call fees, no overages.
  • Books appointments, qualifies leads, handles FAQs 24/7
  • Powered by Claude Sonnet 4.6 and Cartesia voice (premium AI stack)
  • Full call transcripts logged to your dashboard
  • 8-minute call cap with auto-callback offer
WHY THIS PRICE

Setup. A custom voice agent built around your business: your hours, services, pricing, FAQs, escalation rules, and tone. Includes Twilio number setup, CRM/calendar integration (Jobber, Housecall Pro, Google Calendar, etc.), 2 to 3 rounds of voice tuning, and full conversation testing before go-live. Live in 5 business days.

Monthly. Unlimited inbound calls on the premium AI stack: Claude Sonnet 4.6, Cartesia voice, Deepgram speech, Vapi orchestration. 24/7 coverage including after-hours, weekends, and stat holidays. Full transcripts, ongoing agent tuning based on real call data, spam filtering, and CASL-compliant SMS handling for any follow-ups. Most $79/month tools cut corners on the LLM and voice model. We don't.

02

Text Concierge Agent

Most offer text-back. Crisp offers a text concierge.

$299 CADsetup

$275 CAD/mo

  • Fires within 30 seconds of every missed call. 24/7, weekends, holidays.
  • Full conversational range from your FAQ, services, pricing, and hours. Not a script.
  • Forwards any active thread to a primary number of your choosing at any time
  • CASL-compliant: opt-out, sender ID, STOP honoured automatically
  • Hands off to you the moment the lead is booked or qualified
WHY THIS PRICE

Setup. A text concierge trained on your FAQ, pricing, services, and the questions your customers actually ask. Includes Twilio SMS setup (or integration with your existing line), CASL-compliant template design, conversation flow logic, forwarding rules to your primary number, and end-to-end testing before go-live. Live in 5 business days.

Monthly. This isn't a $29 auto-responder. It's a multi-turn AI text concierge that handles customer questions, gives quotes, books appointments by SMS, and forwards the live thread to your primary number the moment you want to take it over. CASL-compliant message infrastructure included: no fines, no manual config. Full SMS threads logged to your dashboard.

03

Review Workflow

Turn every completed job into a 5-star Google review, automatically.

$275 CADsetup

$175 CAD/mo

  • Auto-sends after every completed job, within 30 seconds
  • CASL-compliant: opt-out, sender ID, STOP honoured automatically
  • Google rating and review trend tracked in your dashboard
  • One dashboard with your voice and text systems. No second login.
  • Smart de-duplication: never asks the same customer twice
WHY THIS PRICE

Setup. Custom message templates written in your voice, CASL-compliant SMS infrastructure, integration with your job-completion trigger (CRM, Jobber, Housecall Pro, or manual mark-complete), and Google Business Profile connection. Live in 5 business days.

Monthly. Every completed job becomes a review request within 30 seconds: that timing matters. Smart de-duplication so happy clients aren't asked twice. Google rating tracked in real time. Monthly analytics on response rate and trend. Integrated with the rest of your Crisp stack: one dashboard, one vendor, one bill. NiceJob Pro costs the same and only does this. We're the whole system.

Cancel anytime. No contract. Setup in 5 business days. Not sure which fits? Book 15 minutes and we will figure it out together.