Dunning Emails vs. AI Calls: What the Data Actually Says About Getting Paid
Most AR teams default to email for collections because it's easy and scalable. The data tells a different story. Here's what actually moves the needle when you need to get paid.

Pratheek Adi
Co-Founder & CTO

Ask most AR teams why they lead with email for collections and the answer is usually the same: it's easy, it's scalable, and it creates a paper trail. These are valid operational reasons. They are not, however, evidence that email is the most effective channel for recovering overdue payments.
The collections industry has defaulted to email-first outreach for years, largely out of habit and infrastructure convenience. But the data on email performance has been deteriorating steadily, while the case for voice-based outreach, particularly AI-driven voice, has become harder to ignore. This piece examines what the numbers actually say, where email still works, and where it structurally falls short.
The State of Email: Declining Returns
Cold and outreach email performance across B2B contexts has declined measurably over the past two years. The average B2B cold email reply rate dropped from approximately 7% in 2023 to 5.1% in 2024, according to a study of 16.5 million cold emails analyzed by Belkins. Instantly's 2026 Benchmark Report, covering billions of email interactions, puts the platform-wide average lower still, at 3.43%. The most conservative estimates from large-scale studies place average reply rates between 1% and 4%.
Dunning emails, which carry the additional context of requesting payment rather than initiating a relationship, face compounding challenges. Gmail and Yahoo's February 2024 enforcement of mandatory bulk sender requirements, followed by Microsoft's equivalent rules in May 2025, has tightened inbox placement significantly. Under Google's sender guidelines, bulk senders are expected to keep spam complaint rates below 0.1% and to avoid ever reaching 0.3%, meaning even marginal deliverability issues can now result in suppression. Campaigns that haven't updated their technical infrastructure are underperforming not because of bad messaging, but because a meaningful percentage of their sends never arrive.
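To make the threshold arithmetic concrete, here is a minimal sketch that checks a dunning send against the 0.1% and 0.3% figures mentioned above. The function names and the status labels are hypothetical, and the calculation assumes complaints are measured against delivered messages; mailbox providers compute their own rates and may define them differently.

```python
# Thresholds taken from the figures cited in the text; treat them as illustrative.
TARGET_RATE = 0.001   # 0.1%
CEILING_RATE = 0.003  # 0.3%


def complaint_rate(complaints: int, delivered: int) -> float:
    """Return spam complaints as a fraction of delivered messages."""
    if delivered <= 0:
        raise ValueError("delivered must be positive")
    return complaints / delivered


def complaint_status(complaints: int, delivered: int) -> str:
    """Classify a send as 'ok', 'at_risk', or 'over_ceiling' (hypothetical labels)."""
    rate = complaint_rate(complaints, delivered)
    if rate >= CEILING_RATE:
        return "over_ceiling"
    if rate >= TARGET_RATE:
        return "at_risk"
    return "ok"
```

On a 5,000-message send, 6 complaints works out to 0.12%, which is already past the lower figure. That is how little headroom a multi-touch dunning sequence has.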
There is also a follow-up fatigue problem specific to collections sequences. Research analyzing 16.5 million emails found that sending four or more emails in a sequence more than triples unsubscribe and spam complaint rates. The first email in a sequence achieves the highest reply rate at 8.4%. Performance declines with every subsequent touch. For AR teams relying on three, four, or five-email dunning sequences, the later-stage messages are generating negative outcomes as often as positive ones.
Where Email Still Performs

Before dismissing email outright, it is worth being precise about where it works. Email is genuinely effective for:
• Early-stage reminders on invoices that are not yet overdue. Pre-due-date nudges sent to established clients with clean payment histories see significantly higher engagement than post-due collection attempts.
• Clients who have explicitly indicated email as their preferred contact channel. Preference-matched outreach consistently outperforms channel-default outreach across all communication contexts.
• Documentation and audit trails. Regardless of whether email drives payment, it creates a timestamped record of contact attempts that matters in dispute resolution and, in some cases, legal escalation.
• Low-value invoice follow-up where the cost of a higher-touch channel exceeds the margin on the invoice itself.
Email is a reasonable first touch and a useful documentation layer. It is not, by itself, a collections strategy for invoices that have crossed into delinquency.
The Case for Voice in Collections
The behavioral case for voice contact in collections is straightforward: payment decisions are financial decisions, and financial decisions, particularly uncomfortable ones involving overdue amounts, are more effectively handled through conversation than asynchronous text.
A conversation allows for real-time objection handling. An email cannot respond when a client says their AP process requires a revised invoice format, or that the payment is pending internal approval, or that there is a dispute on one line item. A voice agent can gather that information, adjust the approach, and either resolve the issue or flag it for human follow-up, in a single interaction.
The channel preference data supports this for decision-maker-level contacts. Research consistently shows that C-suite and senior finance contacts prefer phone for conversations involving financial commitment. For collections on invoices above a meaningful threshold, the channel that matches decision-maker preference will outperform the channel that is operationally convenient.
The practical objection to voice has historically been cost and scale: you cannot put a human caller on every overdue invoice. This is where AI voice agents change the calculus entirely.
What AI Voice Actually Changes

AI voice agents remove the cost and scale constraints that made voice impractical for high-volume collections. A well-built AI voice agent can handle simultaneous outbound calls, adjust tone based on conversational cues, gather payment commitments, log outcomes, and escalate to human staff only when a situation requires judgment that the system cannot provide.
The key distinction, and it matters more than most vendor marketing acknowledges, is between AI that reads a script and AI that conducts a conversation. Script-reading agents produce outcomes barely distinguishable from a pre-recorded message. Conversational agents that can detect hesitation, respond to partial objections, and calibrate urgency without aggression produce materially better results.
The performance differential shows up in commitment rates, not just contact rates. Getting a client on the phone is the first step. Getting a specific payment commitment (a date, an amount, a method) is the outcome that reduces DSO. Email rarely produces a specific commitment. A well-handled voice interaction almost always does, or it surfaces the specific reason a commitment cannot be made, which is equally valuable information for the AR team.
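One way to picture the difference between a contact and a commitment is as a record the AR team could log per call. The sketch below is an assumption about what such a record might look like, not any vendor's schema; `PaymentCommitment`, `CallOutcome`, and `commitment_rate` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PaymentCommitment:
    pay_by: date        # the date the client committed to
    amount_cents: int   # the amount, in cents to avoid float rounding
    method: str         # the method, e.g. "ACH", "card", "check"


@dataclass
class CallOutcome:
    invoice_id: str
    reached_contact: bool
    commitment: Optional[PaymentCommitment] = None
    blocker: Optional[str] = None  # e.g. "pending internal approval"

    def needs_human_follow_up(self) -> bool:
        """True when a contact was reached and gave a blocker but no commitment."""
        return (
            self.reached_contact
            and self.commitment is None
            and self.blocker is not None
        )


def commitment_rate(outcomes: list[CallOutcome]) -> float:
    """Share of reached contacts that produced a specific commitment."""
    reached = [o for o in outcomes if o.reached_contact]
    if not reached:
        return 0.0
    return sum(o.commitment is not None for o in reached) / len(reached)
```

Measuring `commitment_rate` separately from the contact rate keeps the two outcomes from being conflated, and the `blocker` field preserves the reason a commitment was not made so it can be routed to a person.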
A Framework for Channel Decision-Making
The practical question for AR teams is not 'email or calls'; it is 'which channel, at which stage, for which client.' A rational collections workflow uses both, sequenced by invoice age, client history, and invoice value:
• Pre-due date (days -7 to 0): Automated email reminder. Low friction, establishes paper trail, catches the easy wins where a client simply forgot.
• Early delinquency (days 1 to 14): Email follow-up plus AI voice outreach for invoices above a value threshold. The combination covers channel preference variation and increases contact probability.
• Mid delinquency (days 15 to 45): AI voice primary, email secondary. At this stage, non-response to email is established data. Continuing to lead with a channel that has already failed is not persistence; it is repetition of a failed approach.
• Late delinquency (days 46+): Human-in-the-loop escalation for high-value accounts, supported by AI-gathered context from earlier touchpoints. Dispute resolution, payment plan negotiation, and relationship preservation at this stage require judgment that automation should support but not replace.
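The staged framework above can be sketched as a small routing function. This is a minimal illustration under stated assumptions, not a production rule set: the dollar thresholds are placeholders, day 45 is counted as mid delinquency, invoices more than seven days from their due date get no outreach, and late-stage invoices below the high-value threshold stay on AI voice plus email. The text does not specify any of those choices. Client history is left out for brevity.

```python
def channels_for(
    days_past_due: int,
    invoice_value: float,
    voice_threshold: float = 5_000.0,        # hypothetical value threshold
    high_value_threshold: float = 25_000.0,  # hypothetical "high-value" cutoff
) -> list[str]:
    """Return outreach channels in priority order for one invoice.

    days_past_due is negative before the due date (e.g. -3 means due in 3 days).
    """
    if days_past_due < -7:
        return []                      # assumption: too early for any reminder
    if days_past_due <= 0:
        return ["email"]               # pre-due: automated reminder only
    if days_past_due <= 14:
        # early delinquency: add AI voice only above the value threshold
        if invoice_value >= voice_threshold:
            return ["email", "ai_voice"]
        return ["email"]
    if days_past_due <= 45:
        return ["ai_voice", "email"]   # mid delinquency: voice leads
    # late delinquency
    if invoice_value >= high_value_threshold:
        return ["human_escalation"]    # staff take over, using AI-gathered context
    return ["ai_voice", "email"]       # assumption: lower-value invoices stay automated
```

For example, `channels_for(10, 12_000)` returns `["email", "ai_voice"]`, while the same invoice at day 30 returns `["ai_voice", "email"]`. Keeping the thresholds as parameters lets each AR team tune them against its own recovery data.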
The Bottom Line
The data does not say email is useless for collections. It says email alone is insufficient, and that defaulting to email-first for all stages of delinquency is a strategic error dressed up as operational efficiency.
Collections performance is a function of channel fit, timing, and conversational quality, not volume of emails sent. AR teams that treat dunning email sequences as a complete collections strategy are leaving recovery rate on the table, particularly in the 15-to-60-day window where the right voice interaction can recover an invoice that a fourth follow-up email would only push further toward write-down.
The shift from email-default to channel-intelligent collections is not a technology upgrade. It is a strategic one. The tools to execute it, AI voice agents that can operate at scale without sacrificing conversational quality, exist today. The question is whether AR teams are willing to measure their channel strategy against actual recovery outcomes, rather than the operational convenience that built the email-first habit in the first place.





