By Oliver · AI Architect, BuildAClaw · May 28, 2026 · 9 min read
How to Build an AI Email Triage Agent That Clears Your Inbox Every Morning
The average knowledge worker spends 2.6 hours per day on email — 13 hours a week gone before a single line of real work happens. Here's how to build a local AI agent that reads, sorts, archives, and drafts replies every morning before you wake up, with no cloud dependency and no email data leaving your machine.
Why Email Is the Highest-ROI First Agent Project
When I talk to solo founders and small operators who want to start with AI agents, I tell them the same thing: don't start with something flashy — start with email. Not because it's easy, but because the ROI math is impossible to argue with.
A McKinsey study put email time at 28% of the workweek for the average knowledge worker. For founders running lean, that number climbs closer to 35%. That's not just frustrating — it's a compounding cost. Every hour in your inbox is an hour not in product, sales, or strategy.
THE EMAIL TAX — ANNUAL COST FOR A SOLO FOUNDER
- 2.6 hrs/day × 250 working days = 650 hours/year in inbox
- ~65% of that inbox is sortable by a rule or pattern the agent can learn
- ~20% of emails need a human reply — but 80% of those follow repeatable templates
- Effective reclaim after AI triage: 8–12 hours/week for the median founder
- Mac Mini M4 hardware cost: ~$599 · Break-even vs. hiring a VA: under 3 months
The other reason email is a great first agent is that it's a bounded, well-scoped problem. You have a defined input (incoming messages), a defined output (categorized and actioned messages), and immediate feedback — you can see exactly what the agent did the moment you open your inbox each morning.
One of the 138 users we've tracked in the OpenClaw community described their setup plainly: "Connected to my 365 account. Deletes, moves, archives, auto-drafts replies. Flags action items." That took one weekend to configure. It now runs every morning at 5:45 AM without any input from them.
What You Need Before You Start
This guide is built around the stack we deploy at BuildAClaw: OpenClaw running on a Mac Mini M4 with a local LLM via Ollama. This gives you full email privacy — no content hits a cloud API unless you explicitly route it there for specific use cases.
Here's the full ingredient list:
- Mac Mini M4 — base model works fine; 16 GB unified RAM handles Llama 4 Scout at 8B without breaking a sweat
- OpenClaw installed and running (defaults to
localhost:3000) - Ollama installed with a model pulled:
ollama pull llama4:scout - Email account with IMAP access enabled — Gmail, Microsoft 365, Fastmail all work
- App password or OAuth token for your email provider (not your main account password)
chmod 600 permissions. Never paste raw credentials into your OpenClaw config file. We'll show the exact pattern in Step 1.
If you're on Microsoft 365, you have a second and more powerful option: the Microsoft Graph API. It gives you structured read/write/flag access — you can query folders, update message metadata, and send drafts through a single authenticated token. We'll cover both paths below.
Step 1: Connect Your Email Account to OpenClaw
Gmail via IMAP
In your Google Account settings, go to Security → 2-Step Verification → App passwords. Generate a new app password for "Mail" and store it on disk:
echo "your_app_password" > ~/.openclaw/gmail_token && chmod 600 ~/.openclaw/gmail_token
In OpenClaw, create a new tool called gmail_imap pointing to imap.gmail.com:993 with SSL enabled. Reference the token file path — not the raw value — in your credentials field. This keeps the secret out of your agent config and out of any config syncs.
Microsoft 365 via Graph API
The Graph API path is more work upfront but gives you substantially more control. Register an app in Azure Active Directory (free), request Mail.ReadWrite and Mail.Send delegated permissions, and generate a client secret. Store it:
echo "your_client_secret" > ~/.openclaw/m365_secret && chmod 600 ~/.openclaw/m365_secret
OpenClaw includes a first-class Microsoft Graph connector in its integration library. Once authenticated, you get access to message categories, folder hierarchies, meeting requests, and the full conversation thread graph — not just raw IMAP message bodies.
Step 2: Define Your Triage Logic
This is where most people over-engineer it on day one. Start with five categories, not fifty. You can refine after you see the agent in action — the goal in week one is a working loop with real data, not a perfectly tuned system.
| Category | Action | Typical Trigger Signal |
|---|---|---|
| Urgent / Action Required | Flag + move to Priority, generate draft reply | "ASAP", direct question to you, deadline language |
| Meeting / Scheduling | Extract date + time, draft confirmation reply | Calendar invites, "are you free Thursday?", meet requests |
| Newsletter / Marketing | Archive immediately, never surface | Unsubscribe link present, bulk sender headers, Promotions-type senders |
| Invoices / Receipts | Move to Finance folder, no reply needed | "Invoice #", "Receipt", "Your order", "Payment confirmation" |
| FYI / CC'd Only | Archive, mark read, no reply generated | You're in CC, no direct question addressed to you |
In OpenClaw, you define this logic as a system prompt paired with a tool schema. The agent reads each message, calls a classify_email function that returns a structured JSON object, then routes to the appropriate action tool: archive_message, flag_message, draft_reply, or move_to_folder.
Here's the core classification prompt I've been iterating on for six months:
You are an email triage assistant. For each email, return a JSON object with: category (one of: urgent, scheduling, newsletter, invoice, fyi), confidence (0.0–1.0), requires_reply (bool), and draft_reply (string or null). Only generate a draft_reply if requires_reply is true AND confidence is above 0.75. If confidence is below 0.75, set category to "needs_review" regardless of other signals.
The confidence threshold is the most important line in that prompt. When the model is uncertain, the message moves to a "Needs Review" folder instead of being auto-actioned. This is your safety valve. You're not delegating blindly — you're delegating with a catch net that surfaces edge cases for human review.
Step 3: Schedule It to Run at 6 AM Every Morning
This is where local-first infrastructure earns its keep. Your Mac Mini M4 is always on, which means you don't need a cloud scheduler, a Lambda function, or a paid cron service. You have macOS, and macOS has launchd.
Create a plist at ~/Library/LaunchAgents/com.openclaw.emailtriage.plist:
<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
<key>Label</key><string>com.openclaw.emailtriage</string>
<key>ProgramArguments</key>
<array><string>/usr/local/bin/openclaw</string>
<string>run</string><string>email-triage</string></array>
<key>StartCalendarInterval</key>
<dict><key>Hour</key><integer>6</integer>
<key>Minute</key><integer>0</integer></dict>
</dict></plist>
Load it with launchctl load ~/Library/LaunchAgents/com.openclaw.emailtriage.plist and it fires every morning at 6:00 AM without any further input.
If you'd rather skip the terminal: OpenClaw has a native Schedule tab in its UI where you can configure recurring agent runs with a cron expression (0 6 * * *) and a toggle. This is the path I recommend for most clients because it's easier to pause, adjust, or hand off without editing XML.
Real Results: What 30 Days of Automated Triage Looks Like
I ran this exact setup for 30 days on a real business inbox — 80 to 120 emails per day across multiple domains and threads. Here's the breakdown of what the agent handled:
30-DAY TRIAGE RESULTS — 80–120 EMAILS/DAY INBOX
- Newsletter / Marketing: 61% of total volume — auto-archived, never surfaced
- Invoices / Receipts: 11% — moved to Finance folder with zero manual sorting
- FYI / CC'd: 9% — archived, marked read, no reply generated
- Scheduling: 8% — drafts generated, 94% accepted and sent unchanged
- Urgent / Action Required: 11% — correctly flagged, drafts reviewed manually before send
- Needs Review (low confidence): 3% — ~3 messages/day requiring human judgment
- Daily time saved vs. manual triage: ~1.8 hours → 54 hours reclaimed over 30 days
The number that surprised me most: 94% of scheduling drafts were accepted and sent without editing. That's a local Llama 4 Scout model writing "Thanks, Thursday at 2 PM works great — I'll send a calendar invite" better than I'd manage in a tired morning fog. Once I saw that number stabilize over two weeks, I enabled auto-send for the scheduling category and never looked back.
The 3% "Needs Review" bucket averaged about 3 messages per day. That's my effective inbox now: 3 messages I need to consciously engage with, instead of 100+ I need to scan, categorize, and decide about. The cognitive load difference isn't linear — it's closer to exponential. Scanning 100 messages activates a kind of context-switching overhead that 3 messages simply doesn't trigger.
Scaling Up: What to Add in Month Two
Once the core loop is running reliably, these three upgrades add significant value without requiring a rebuild:
- Sender reputation memory. Add a lightweight SQLite table that tracks which senders you reply to, how fast, and how often. The agent uses this signal to up-weight importance for frequent contacts and deprioritize cold outreach — without you having to define every sender manually.
- Thread summarization for CC'd threads. Instead of just archiving long threads you're CC'd on, have the agent generate a 2-sentence summary and store it in a Notes field. Useful when a thread resurfaces and you need context fast.
- Hybrid routing for complex drafts. When a message requires a nuanced reply — a client complaint, a pricing negotiation, a sensitive HR thread — route it to Claude Sonnet 4.6 via API for a higher-quality draft, then keep everything routine fully local. This typically adds $8–$18/month for a busy inbox. Much cheaper than a VA and with no latency overhead on the 90% of mail that stays local.
This hybrid architecture — local LLM for triage volume, cloud model for high-stakes drafts — is the pattern we deploy for most BuildAClaw clients. You get the privacy and cost efficiency of local inference at scale, with cloud quality reserved for the moments that actually matter. For a deeper look at the cost structure, see How We Cut API Costs by Running AI Agents Locally on Mac Mini M4.
For teams with shared inboxes — support@, sales@, billing@ — the same architecture scales without re-architecting. Each inbox gets its own agent config with role-specific triage rules and escalation paths. We've deployed this for small teams where the agent handles 300–500 emails per day across three inboxes before anyone has had coffee. If you're coordinating multiple agents on one machine, Running Multiple AI Agents on One Mac Mini M4 covers the resource allocation side in detail.
Frequently Asked Questions
Want This Running in Your Inbox Next Week?
BuildAClaw deploys custom OpenClaw email triage agents on your own hardware — or your team's shared infrastructure. We handle the setup, prompt engineering, scheduling, and edge-case tuning so you don't have to spend a weekend figuring out IMAP auth and launchd syntax.
Most clients go from zero to a working triage loop within 5 business days. The first call is free — we'll map your inbox categories, define your triage rules, and tell you exactly what the agent will and won't handle before you commit to anything.
Schedule a Free Strategy Call →