How to Let Your AI Agent Place Phone Calls

Yanis Mellata

Your Agent Can't Call Anyone

Your agent browses the web. It sends emails. It queries databases, writes code, manages calendars, and searches the internet. Then it hits a workflow step that requires a phone call.

Book a dental appointment at a clinic that only takes calls. Verify insurance coverage with a provider's phone line. Follow up on a purchase order with a supplier. Check if a restaurant can seat six on Friday.

The agent stops. It doesn't have a phone tool. It either punts back to the user — "here's the number, you'll need to call" — or the workflow stalls entirely.

Building a voice pipeline from scratch — Twilio, Deepgram, ElevenLabs, custom conversation logic — is weeks of work for one phone call. AgentPhone reduces it to one API call.

What AgentPhone Does

AgentPhone is a phone call API for AI agents. The interface is simple:

You send:

  • A phone number to call
  • An objective — what the call should accomplish
  • Optional context — a business name, a website URL, background info

You get back:

  • An outcome — achieved, not_achieved, or partial
  • A summary — 2-3 sentences about what happened
  • The full transcript — every word spoken
  • A recording URL — the raw audio
  • Outcome details — why the outcome was what it was

Your agent doesn't manage telephony. It doesn't configure speech models. It doesn't handle WebSockets or streaming audio. It sends a POST request and polls for results. Same pattern as any other API-based tool.
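To make the request and response shapes concrete, here is a minimal Python sketch of both sides of the interface. The field names come from the examples in this post. The `CallRequest` and `CallResult` types and the `build_call_request` helper are illustrative names, not part of any AgentPhone SDK.

```python
from typing import Literal, Optional, TypedDict


class CallRequest(TypedDict, total=False):
    to_phone_number: str  # number to dial
    objective: str        # what the call should accomplish
    business_name: str    # optional context
    website: str          # optional context


class CallResult(TypedDict, total=False):
    call_id: str
    status: str
    outcome: Literal["achieved", "not_achieved", "partial"]
    outcome_details: str
    summary: str
    transcript: str
    recording_url: str


def build_call_request(to_phone_number: str, objective: str,
                       business_name: Optional[str] = None,
                       website: Optional[str] = None) -> CallRequest:
    """Build a request body, leaving out optional fields that are empty."""
    payload: CallRequest = {
        "to_phone_number": to_phone_number,
        "objective": objective,
    }
    if business_name:
        payload["business_name"] = business_name
    if website:
        payload["website"] = website
    return payload
```

The returned dict can be passed straight to the `json=` argument of an HTTP client.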

Making Your First Call

cURL

# 1. Create the call
curl -X POST https://agentphone.app/api/v1/calls \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "to_phone_number": "+14155551234",
    "objective": "Ask about dinner reservations for 2 tonight at 7pm",
    "business_name": "Nopa Restaurant"
  }'

# Response: { "data": { "call_id": "cl_abc123", "status": "queued" }, "credits_remaining": 4 }

# 2. Poll for results
curl https://agentphone.app/api/v1/calls/cl_abc123 \
  -H "x-api-key: YOUR_API_KEY"

# When status is "completed", you get outcome, summary, transcript, recording_url

Python

import httpx
import time

API_KEY = "your_api_key"
BASE = "https://agentphone.app/api/v1"
HEADERS = {"x-api-key": API_KEY, "Content-Type": "application/json"}

# Create the call
response = httpx.post(f"{BASE}/calls", headers=HEADERS, json={
    "to_phone_number": "+14155551234",
    "objective": "Ask about dinner reservations for 2 tonight at 7pm",
    "business_name": "Nopa Restaurant",
})
call_id = response.json()["data"]["call_id"]

# Poll until done
while True:
    result = httpx.get(f"{BASE}/calls/{call_id}", headers=HEADERS).json()["data"]
    if result["status"] in ("completed", "failed", "canceled"):
        break
    time.sleep(3)

print(result["outcome"])       # "achieved"
print(result["summary"])       # "Successfully booked a table for 2 at 7pm..."
print(result["transcript"])    # Full conversation text

Node.js

const API_KEY = "your_api_key";
const BASE = "https://agentphone.app/api/v1";
const headers = { "x-api-key": API_KEY, "Content-Type": "application/json" };

// Create the call
const { data } = await fetch(`${BASE}/calls`, {
  method: "POST",
  headers,
  body: JSON.stringify({
    to_phone_number: "+14155551234",
    objective: "Ask about dinner reservations for 2 tonight at 7pm",
    business_name: "Nopa Restaurant",
  }),
}).then(r => r.json());

const callId = data.call_id;

// Poll until done
let result;
do {
  await new Promise(r => setTimeout(r, 3000));
  result = await fetch(`${BASE}/calls/${callId}`, { headers }).then(r => r.json());
} while (!["completed", "failed", "canceled"].includes(result.data.status));

console.log(result.data.outcome);    // "achieved"
console.log(result.data.summary);    // "Successfully booked a table..."

That's it. No Twilio account. No speech model configuration. No webhook server. Create the call, poll for results, use the data.

How the Voice AI Handles the Call

When you create a call, AgentPhone dials the target from a real phone number, and a voice AI handles the conversation. Here's what it can deal with:

IVR menus — "Press 1 for scheduling, press 2 for billing." The AI listens to all options, selects the right one using DTMF tones, and navigates to the correct department based on the objective.

Hold queues — The AI waits through hold music, queue announcements, and estimated wait times. Your agent doesn't need to stay connected — the call runs independently and returns results when it's done.

Transfers — If the call gets transferred between departments, the AI keeps context and re-explains the objective as needed.

Natural conversation — The AI doesn't follow a rigid script. It responds to questions, provides information from the context you passed in, spells names when asked, and adapts to the conversation flow. If the receptionist says "we're fully booked at 7, how about 7:30?" the AI makes a judgment call based on the objective.

Edge cases — Voicemail, busy signals, wrong numbers, disconnects. These come back as a failed status with an error_code explaining what happened.
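On the agent side, that means branching on `status` before reading `outcome`. A minimal sketch, assuming the result dict has the fields shown in this post. The `describe_result` helper is a hypothetical name, and since the set of `error_code` values isn't listed here, the code only passes the value through.

```python
def describe_result(result: dict) -> str:
    """Turn a call result into a one-line message for the agent."""
    status = result.get("status")
    if status == "completed":
        # The call connected and ran; outcome says whether the objective was met.
        return f"{result.get('outcome')}: {result.get('summary')}"
    if status == "failed":
        # Voicemail, busy signal, wrong number, disconnect, and similar cases.
        return f"call failed ({result.get('error_code', 'unknown')})"
    return f"call ended with status {status}"
```

Returning a short string like this gives the LLM something it can reason about, for example deciding whether to retry later or hand the number back to the user.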

Wiring It Into Your Agent

The integration pattern is the same regardless of what framework you use. You define a tool/function that:

  1. Takes a phone number, an objective, and optional context
  2. Calls the AgentPhone API
  3. Polls until the call completes
  4. Returns the structured result to the agent

Here's the generic pattern in Python:

def place_phone_call(to_phone_number: str, objective: str,
                     business_name: str = "", website: str = "") -> dict:
    """Place a phone call to accomplish a specific objective.

    Use this when a workflow step requires calling a business or person
    by phone — scheduling, verifying information, following up on orders,
    or gathering details not available online.

    Returns outcome (achieved/not_achieved/partial), summary, and transcript.
    """
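    # Create the call (httpx, time, BASE, and HEADERS come from the earlier Python example)
    response = httpx.post(f"{BASE}/calls", headers=HEADERS, json={
        "to_phone_number": to_phone_number,
        "objective": objective,
        "business_name": business_name,
        "website": website,
    })
    response.raise_for_status()  # fail loudly on HTTP errors instead of a KeyError below
    call_id = response.json()["data"]["call_id"]

    # Poll until the call reaches a terminal status
    while True:
        poll = httpx.get(f"{BASE}/calls/{call_id}", headers=HEADERS)
        poll.raise_for_status()
        result = poll.json()["data"]
        if result["status"] in ("completed", "failed", "canceled"):
            return result
        time.sleep(3)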

The docstring matters — it's what the LLM reads to decide when to use the tool. Be specific: "when a workflow step requires calling a business or person by phone."

This function works as:

  • An @function_tool in the OpenAI Agents SDK (full guide)
  • A BaseTool subclass in LangChain (full guide)
  • A BaseTool subclass in CrewAI (full guide)
  • An MCP server tool for Claude Code / Cursor (full guide)

The framework-specific posts have complete, copy-paste implementations.
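Some frameworks take an explicit tool description instead of reading a docstring. As a rough sketch, the same tool can be described with a JSON Schema for its parameters. The `PLACE_PHONE_CALL_SCHEMA` name and the wording of the descriptions are illustrative, and the exact wrapper around this dict differs per framework, so check the one you use.

```python
# Parameter schema for the phone call tool, written as plain JSON Schema.
PLACE_PHONE_CALL_SCHEMA = {
    "name": "place_phone_call",
    "description": (
        "Place a phone call to accomplish a specific objective. Use this when "
        "a workflow step requires calling a business or person by phone."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "to_phone_number": {
                "type": "string",
                "description": "Number to call, e.g. +14155551234",
            },
            "objective": {
                "type": "string",
                "description": "What the call should accomplish, with specifics",
            },
            "business_name": {
                "type": "string",
                "description": "Optional name of the business being called",
            },
            "website": {
                "type": "string",
                "description": "Optional website URL for extra context",
            },
        },
        "required": ["to_phone_number", "objective"],
    },
}
```

The description plays the same role as the docstring: it is the text the LLM uses to decide when the tool applies.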

The Polling Pattern

queued → dialing → in_progress → completed | failed | canceled

Poll every 3-5 seconds. Alternatively, configure a webhook URL in your org settings to receive results on completion.
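Insurance lines and hold queues can keep a call running for a long time, so a bare `while True` loop is worth bounding. Here is a minimal sketch of a polling helper with a deadline. `poll_until_done` and its parameters are hypothetical names, the 10-minute default is an arbitrary choice rather than a documented limit, and `fetch` stands in for whatever function performs the GET and returns the `data` object.

```python
import time
from typing import Callable

# Terminal statuses from the lifecycle above.
TERMINAL_STATUSES = {"completed", "failed", "canceled"}


def poll_until_done(fetch: Callable[[], dict], interval: float = 3.0,
                    timeout: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch() every `interval` seconds until the call reaches a terminal status."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"call still {result.get('status')!r} after {timeout}s")
        sleep(interval)
```

Passing `fetch` and `sleep` in as arguments keeps the loop independent of the HTTP client and makes it easy to exercise with a fake in tests.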

GET /api/v1/calls/cl_abc123

{
  "data": {
    "call_id": "cl_abc123",
    "status": "completed",
    "outcome": "achieved",
    "outcome_details": "Table confirmed for 2 at 7:00 PM tonight under caller's name.",
    "summary": "Called Nopa Restaurant and booked a table for 2 at 7pm tonight.",
    "transcript": "Agent: Hi, I'd like to make a dinner reservation for tonight...\nHost: Sure, for how many guests?\nAgent: Two people, at 7pm if possible.\nHost: Let me check... yes, we have a table at 7. Can I get a name?\nAgent: Sure, it's for Sarah.\nHost: Great, Sarah, party of two at 7pm tonight. You're all set.\nAgent: Thank you!",
    "duration_seconds": 42,
    "recording_url": "https://storage.agentphone.app/recordings/cl_abc123.wav",
    "started_at": "2026-03-11T19:00:05Z",
    "ended_at": "2026-03-11T19:00:47Z"
  }
}
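If your agent needs the conversation turn by turn, the transcript string can be split into speaker and text pairs. This sketch assumes the `Speaker: text` line format shown in the sample response above, which may not hold for every call. `parse_transcript` is an illustrative helper, not something the API provides.

```python
def parse_transcript(transcript: str) -> list[tuple[str, str]]:
    """Split a 'Speaker: text' transcript into (speaker, text) turns."""
    turns: list[tuple[str, str]] = []
    for line in transcript.splitlines():
        speaker, sep, text = line.partition(": ")
        if sep:
            turns.append((speaker.strip(), text.strip()))
        elif turns and line.strip():
            # No speaker prefix: treat the line as a continuation of the previous turn.
            prev_speaker, prev_text = turns[-1]
            turns[-1] = (prev_speaker, f"{prev_text} {line.strip()}")
    return turns
```

One limitation: a continuation line that itself contains a colon followed by a space would be read as a new speaker, so treat the output as a convenience view and keep the raw transcript as the source of truth.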

Use Cases With Real Payloads

1. Appointment Scheduling

Your agent manages a user's calendar. The user asks to book a dental cleaning. The dentist doesn't have online booking.

{
  "to_phone_number": "+14155550789",
  "objective": "Book a dental cleaning appointment. Preferred dates: March 20 or 21, morning preferred. Patient name: Sarah Chen. Insurance: Delta Dental PPO.",
  "business_name": "Pacific Heights Dental"
}

The objective field drives the conversation. Be specific — preferred dates, patient name, insurance. The more context, the better the AI handles the call.
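When the details live in structured data (a calendar entry, a patient record), it helps to assemble the objective string in one place. A minimal sketch follows. `build_objective` is a hypothetical helper, and the sentence layout simply mirrors the payload above rather than any format the API requires.

```python
def build_objective(task: str, **details: str) -> str:
    """Combine a task sentence and labeled details into one objective string."""
    parts = [task.rstrip(".") + "."]
    for label, value in details.items():
        if value:
            # patient_name -> "Patient name: Sarah Chen."
            readable = label.replace("_", " ").capitalize()
            parts.append(f"{readable}: {value.rstrip('.')}.")
    return " ".join(parts)
```

Called with the details from the dental example (`preferred_dates`, `patient_name`, `insurance`), this reproduces the objective string in the payload above, and empty values are skipped so missing data doesn't leave dangling labels.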

2. Insurance Verification

Before a patient visit, the agent needs to verify coverage.

{
  "to_phone_number": "+18005550456",
  "objective": "Verify dental coverage for patient Sarah Chen, DOB 03/15/1990, member ID DC-442891. Check: is the plan active, what's the copay for a cleaning (code D1110), and is pre-authorization required.",
  "business_name": "Delta Dental"
}

Insurance calls are long — IVR menus, hold times, transfers between departments. AgentPhone handles all of it. The outcome includes the specific data points requested.

3. Vendor Follow-Up

The agent tracks purchase orders. One is overdue.

{
  "to_phone_number": "+12125550321",
  "objective": "Follow up on purchase order PO-2847, placed February 28. Ask for current delivery ETA and whether the full quantity of 500 units is shipping together or split.",
  "business_name": "Acme Supply Co",
  "website": "https://acmesupply.com"
}

Passing the website field lets the voice AI scrape the site for context — product details, company info, department structure. This makes the conversation more natural.

4. Restaurant Research

The agent is planning an event and needs info from multiple restaurants.

{
  "to_phone_number": "+14155550567",
  "objective": "Ask about private dining options for a group of 20-25 people on April 5th, Saturday evening. Need: availability, minimum spend requirement, menu options (set menu vs a la carte), and whether they accommodate dietary restrictions.",
  "business_name": "Foreign Cinema"
}

The AI gathers multiple data points in one call. The transcript captures everything, and the summary distills it into the key facts.
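For research across several restaurants, the same objective can go out to a list of numbers. The sketch below fans calls out over a thread pool. It assumes your plan allows several calls in flight at once, which this post doesn't confirm. `call_many` is a hypothetical helper, and `place_call` is any function with the same keyword arguments as `place_phone_call`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def call_many(place_call: Callable[..., dict], targets: list[dict],
              objective: str, max_workers: int = 4) -> dict[str, dict]:
    """Place the same objective with every target; results keyed by phone number."""
    def call_one(target: dict) -> dict:
        return place_call(
            to_phone_number=target["to_phone_number"],
            objective=objective,
            business_name=target.get("business_name", ""),
        )

    # Each call mostly waits on the network, so threads are enough here.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(call_one, targets))
    return {t["to_phone_number"]: r for t, r in zip(targets, results)}
```

The agent can then compare the summaries side by side. If concurrent calls aren't allowed on your account, setting `max_workers=1` makes the same helper run them one after another.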

What AgentPhone Is Not

AgentPhone is a tool, not a platform. Here's the distinction:

If you're building a voice AI product — an AI receptionist, a call center bot, an interactive voice response system — use Vapi, Retell, or Bland. These are platforms for building voice AI. They give you control over voices, conversation flows, real-time streaming, and telephony infrastructure.

If your agent needs to make a call as one step in a larger workflow — book an appointment, verify information, follow up on an order — use AgentPhone. You don't manage telephony. You don't configure voices. You send an objective and get results.

The analogy: Vapi/Retell/Bland are like building your own email server. AgentPhone is like calling sendEmail().

Getting Started

Grab an API key at agentphone.app (5 free calls, no credit card).

API reference: agentphone.app/docs

Written by Yanis Mellata, Founder & CEO