2018: How Smart Compose, Smart Reply, and LLMs Transformed Email

By The EmailCloud Team | 2018 Innovation

For forty-five years, the fundamental interaction model of email did not change. You opened your inbox. You read messages. You decided what to do with each one. You typed replies. You clicked send. The tools improved — better interfaces, better search, better filtering — but the cognitive work remained the same. Every email required a human brain to read it, understand it, decide on a response, and compose that response. At 120+ emails per day for the average knowledge worker, this was a significant daily burden.

Then, starting around 2017, that model began to shift. Not all at once, and not for everyone, but unmistakably. Machine learning systems that could understand natural language began inserting themselves into the email workflow — suggesting replies, completing sentences, prioritizing messages, summarizing threads, and eventually drafting entire responses. The technology progressed from party trick to productivity tool to, by 2026, something approaching an autonomous agent that could manage significant portions of an inbox without human involvement.

The integration of intelligence into email is the most significant change to the email experience since the invention of webmail. And its implications — for how we write, how we read, and how we relate to the medium itself — are still unfolding.

The Simple Version: Starting around 2017, email programs started using machine learning to help people deal with their inboxes. First came suggestions for short replies like “Thanks!” and “Sounds good.” Then came autocomplete that could finish your sentences. Then came tools that could summarize long email threads, sort your inbox by importance, and write full draft replies. By 2026, these tools are sophisticated enough to manage large chunks of your inbox without you reading every message.

Priority Inbox: The First Step (2010)

Google’s Priority Inbox, launched in Gmail in August 2010, was the first widely deployed machine learning feature in email. Rather than showing all emails in chronological order, Priority Inbox used signals — who sent the email, how often the user interacted with that sender, the email’s content, and whether similar emails had been read or ignored — to classify incoming messages as “Important and unread,” “Starred,” or “Everything else.”

The feature was modest by later standards, but the underlying principle was revolutionary: instead of the user manually sorting every message, the system would learn the user’s priorities and sort automatically. Gmail was treating email triage as a classification problem — exactly the kind of problem machine learning excels at.
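To make the classification framing concrete, here is a minimal Python sketch. It is not Gmail's model: the signal names, weights, bias, and threshold are all illustrative assumptions, chosen only to show how signals like those described above could be combined into one importance score.

```python
import math
from dataclasses import dataclass


@dataclass
class EmailSignals:
    """Hypothetical per-message signals, loosely mirroring those named above."""
    sender_reply_rate: float     # share of this sender's past emails the user replied to (0..1)
    sender_open_rate: float      # share of this sender's past emails the user opened (0..1)
    addressed_directly: bool     # user is a direct recipient rather than on a list
    similar_ignored_rate: float  # share of similar past emails left unread (0..1)


# Illustrative hand-picked weights; a real system would learn them from behavior.
WEIGHTS = {
    "sender_reply_rate": 3.0,
    "sender_open_rate": 1.5,
    "addressed_directly": 1.0,
    "similar_ignored_rate": -2.5,
}
BIAS = -1.5
THRESHOLD = 0.5


def importance_probability(s: EmailSignals) -> float:
    """Weighted sum of signals squashed into (0, 1) with a logistic function."""
    z = (
        BIAS
        + WEIGHTS["sender_reply_rate"] * s.sender_reply_rate
        + WEIGHTS["sender_open_rate"] * s.sender_open_rate
        + WEIGHTS["addressed_directly"] * float(s.addressed_directly)
        + WEIGHTS["similar_ignored_rate"] * s.similar_ignored_rate
    )
    return 1.0 / (1.0 + math.exp(-z))


def classify(s: EmailSignals) -> str:
    """Map the score onto two of the buckets described above."""
    return "important" if importance_probability(s) >= THRESHOLD else "everything else"


# Made-up examples: a close colleague versus a rarely opened newsletter.
colleague = EmailSignals(0.8, 0.95, True, 0.05)
newsletter = EmailSignals(0.0, 0.1, False, 0.9)
print(classify(colleague), "|", classify(newsletter))
```

The point of the sketch is the shape of the problem: once triage is expressed as signals in and a label out, the weights can be fitted to each user's behavior instead of being set by hand.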

Priority Inbox’s accuracy was imperfect, and many users turned it off. But it established the concept that email clients could and should use intelligence to reduce the cognitive burden on the user. Every subsequent innovation in email intelligence built on this foundation.

Smart Reply: Three Taps to Done (2017)

In May 2017, Google brought Smart Reply to Gmail on Android and iOS, following its 2015 debut in the Inbox app. The feature used neural networks to analyze incoming emails and generate three short, contextually appropriate reply suggestions. An email saying “Can you meet at 3pm?” might produce suggestions like “Sure, that works,” “Sorry, I can’t make it,” and “Can we do 3:30 instead?”

Smart Reply was powered by a sequence-to-sequence neural network trained on anonymized email data. The model learned the patterns of email conversation — what kinds of replies follow what kinds of messages — and generated responses that were grammatically correct, contextually appropriate, and tonally neutral.

The impact on user behavior was immediate and measurable. Google reported that by 2018, Smart Reply was being used to compose roughly 10% of all Gmail replies on mobile devices. The feature was especially popular on phones, where the reduced keyboard made typing full responses more effortful. Tapping a three-word suggestion was faster than typing a ten-word reply, and for the significant percentage of emails that could be adequately answered with a short confirmation or acknowledgment, Smart Reply was a genuine time-saver.

Smart Reply also raised an interesting question that would become increasingly relevant as email intelligence advanced: if a machine generates the words, is it still your reply? The responses were not authored by the user in any meaningful sense — they were statistically generated completions selected from a set of predicted responses. The user’s contribution was selection, not composition. For short, functional replies (“Sounds good,” “Thanks for letting me know”), this distinction felt trivial. But as the generated text grew longer and more nuanced, the question of authorship would become more significant.

Smart Compose: The Autocomplete Revolution (2018)

At Google I/O in May 2018, Google unveiled Smart Compose — a feature that would prove more transformative than Smart Reply. While Smart Reply suggested complete short responses, Smart Compose worked inline, suggesting the next words or phrases as the user typed. Gray text would appear ahead of the cursor, and pressing Tab would accept the suggestion.

Smart Compose was trained to predict text continuations based on the email’s context — the subject line, the message being replied to, the words already typed, and patterns from the user’s own email history. It learned individual writing patterns: if a user consistently signed off with “Best regards,” Smart Compose would suggest that phrase as soon as the user typed “Best.” If a user frequently scheduled meetings on Thursdays, Smart Compose would suggest Thursday when the email context involved scheduling.
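The “Best” to “Best regards,” behavior can be illustrated with a toy frequency-based completer. This is a deliberately simplified sketch, not how Smart Compose works (the feature is described above as a learned model that also uses the subject line and the message being replied to). The class name PhraseCompleter and its methods are hypothetical.

```python
from collections import Counter
from typing import Optional


class PhraseCompleter:
    """Toy completer: suggests the user's most frequent past phrase that
    starts with the text typed so far. Hypothetical illustration only."""

    def __init__(self, min_count: int = 2) -> None:
        self.counts = Counter()
        self.min_count = min_count  # require some repetition before suggesting

    def observe(self, phrase: str) -> None:
        """Record a phrase the user actually wrote, such as a sign-off."""
        self.counts[phrase.strip()] += 1

    def suggest(self, typed: str) -> Optional[str]:
        """Return the remaining text to show in gray, or None if nothing fits."""
        if not typed:
            return None
        candidates = [
            (count, phrase)
            for phrase, count in self.counts.items()
            if phrase.lower().startswith(typed.lower())
            and len(phrase) > len(typed)
            and count >= self.min_count
        ]
        if not candidates:
            return None
        _, best = max(candidates)  # highest count wins; ties fall back to string order
        return best[len(typed):]


completer = PhraseCompleter()
for _ in range(5):
    completer.observe("Best regards,")
completer.observe("Best wishes,")  # seen only once, so below min_count

print(repr(completer.suggest("Best")))  # ' regards,'
print(repr(completer.suggest("Kind")))  # None
```

Even this crude version shows the shift from recall to recognition: the user types a prefix and only has to accept or ignore what appears.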

The effect was a subtle but meaningful acceleration of email composition. Studies estimated that Smart Compose reduced the time to compose an email by 10-15% for users who adopted it. More importantly, it reduced the cognitive effort of composition — the user did not have to retrieve the right phrase from memory; they only had to recognize it when the system offered it.

The Subject Line Optimization Era (2018-2022)

While Google was embedding intelligence into email composition, the email marketing industry was pursuing its own applications. Subject line optimization — using machine learning to predict which subject lines would generate higher open rates — became a standard feature of enterprise email marketing platforms.

Tools from Phrasee (now Jacquard), Persado, and others used natural language generation to produce subject line variations and predict their performance based on historical engagement data. An email marketer could input a brief description of the email’s content and receive dozens of generated subject line options, each with a predicted open rate based on the platform’s models.

The technology worked, to a degree. Machine-generated subject lines consistently outperformed human-written ones in A/B tests, typically improving open rates by 5-15%. The machines were better at identifying the linguistic patterns that drove opens: optimal length, effective use of urgency words, strategic deployment of personalization, and avoidance of spam triggers.

But the technology also revealed a limitation: optimizing for opens is not the same as optimizing for revenue, engagement, or brand perception. A subject line that maximizes opens through curiosity gaps or urgency language might attract clicks but damage brand trust over time. The machine could optimize the metric it was trained on; it could not optimize for the nuanced, long-term brand considerations that human judgment provides.
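A small worked example shows how the two objectives can diverge. Every number below is invented for illustration, and the uplift is computed as a relative change, which is an assumption: the 5-15% figure above is not specified as relative or absolute.

```python
from dataclasses import dataclass


@dataclass
class VariantResult:
    """Outcome of one arm of a hypothetical subject line A/B test."""
    name: str
    sent: int
    opens: int
    revenue: float  # total revenue attributed to this variant

    @property
    def open_rate(self) -> float:
        return self.opens / self.sent

    @property
    def revenue_per_send(self) -> float:
        return self.revenue / self.sent


def relative_lift(baseline: float, variant: float) -> float:
    """Relative change of variant over baseline, e.g. 0.10 means +10%."""
    return (variant - baseline) / baseline


# Invented figures: the machine-written line wins on opens but loses on revenue.
human = VariantResult("human-written", sent=10_000, opens=2_000, revenue=5_000.0)
machine = VariantResult("machine-generated", sent=10_000, opens=2_200, revenue=4_600.0)

open_lift = relative_lift(human.open_rate, machine.open_rate)
revenue_lift = relative_lift(human.revenue_per_send, machine.revenue_per_send)

print(f"open rate lift:        {open_lift:+.1%}")     # +10.0%
print(f"revenue per send lift: {revenue_lift:+.1%}")  # -8.0%
```

A model trained only on the first metric would pick the machine-generated line here, even though the second metric says it cost money.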

Neural Network Spam Detection (2019-2023)

The evolution of spam detection from Bayesian filtering to neural network-based classification represents one of the most impactful applications of machine learning in email. By 2019, Gmail’s deep learning models were blocking over 100 million additional spam messages per day beyond what its existing filters already caught, with a false positive rate of less than 0.05%, meaning fewer than 1 in 2,000 legitimate messages were incorrectly classified as spam.

The neural network approach differed fundamentally from Bayesian filtering. Where Bayesian filters calculated spam probability based on individual word frequencies, neural networks analyzed entire messages in context. They could detect spam that contained no traditional trigger words by recognizing patterns in formatting, structure, sending behavior, and recipient engagement that were invisible to word-level analysis.
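For contrast, the word-frequency approach can be sketched in a few lines. This is a minimal naive Bayes illustration with add-one smoothing and equal class priors, trained on a made-up four-message corpus. It is not any production filter, and the function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def train(messages: list[tuple[str, bool]]) -> tuple[Counter, Counter]:
    """Count word occurrences separately for spam (True) and ham (False)."""
    spam_counts, ham_counts = Counter(), Counter()
    for text, is_spam in messages:
        (spam_counts if is_spam else ham_counts).update(tokenize(text))
    return spam_counts, ham_counts


def spam_probability(text: str, spam_counts: Counter, ham_counts: Counter) -> float:
    """P(spam | words), treating each word as independent evidence."""
    vocab = set(spam_counts) | set(ham_counts)
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())
    log_odds = 0.0  # equal priors assumed, so the prior term is zero
    for word in tokenize(text):
        if word not in vocab:
            continue  # unseen words carry no evidence in this sketch
        # Add-one (Laplace) smoothing avoids zero probabilities.
        p_word_spam = (spam_counts[word] + 1) / (spam_total + len(vocab))
        p_word_ham = (ham_counts[word] + 1) / (ham_total + len(vocab))
        log_odds += math.log(p_word_spam / p_word_ham)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Made-up training data for illustration.
corpus = [
    ("win free money now", True),
    ("free prize claim now", True),
    ("meeting moved to thursday", False),
    ("lunch on thursday works", False),
]
spam_counts, ham_counts = train(corpus)

for text in ("free money", "thursday meeting", "fr3e m0ney"):
    print(text, round(spam_probability(text, spam_counts, ham_counts), 3))
```

In this toy setup “free money” scores well above 0.5 and “thursday meeting” well below it. The obfuscated “fr3e m0ney” contains no known words, so its score stays at the neutral 0.5, which is the kind of blind spot that word-level analysis has.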

Google’s TensorFlow-based spam models could identify new spam campaigns within minutes of their first appearance, adapting to novel techniques without human intervention. This was a critical advantage: spammers who could evade rule-based and Bayesian filters by obfuscating trigger words found that neural network models detected the obfuscation patterns themselves. The arms race between spammers and filters had shifted decisively in the filters’ favor.

Apple Intelligence and Email Summaries (2024-2025)

Apple’s entry into email intelligence, announced at WWDC 2024 and rolled out with iOS 18 and macOS Sequoia, took a different approach than Google’s. Apple Intelligence introduced email summaries — using on-device large language models to generate concise summaries of long emails and email threads, displayed in the inbox list view so users could grasp the content of a message without opening it.

The feature reflected Apple’s privacy-first philosophy: the language models ran on-device (or in Apple’s Private Cloud Compute environment), meaning email content was never sent to third-party servers for processing. This addressed the privacy concern that had shadowed Google’s cloud-based email intelligence from the beginning: the question of who else could see your email when machine learning models processed it.

Apple Intelligence also introduced smart reply suggestions, notification prioritization, and the ability to adjust the tone of a drafted reply (make it more professional, more friendly, more concise). These features positioned the email client not just as a display layer but as an active collaborator in the communication process.

The LLM Agent Era (2025-2026)

The release of increasingly capable large language models in 2024 and 2025 (from Google, OpenAI, Anthropic, and others) catalyzed the most significant shift in email interaction since graphical interfaces replaced command lines. LLMs could do more than suggest words or summarize messages: they could interpret an email in context, reason about appropriate responses, and generate replies that were often hard to distinguish from human-written text.

Microsoft integrated its Copilot capabilities into Outlook, offering features that could summarize email threads, draft contextual replies, extract action items, schedule follow-ups, and prepare meeting agendas from email discussions. Google’s Gemini integration in Gmail offered similar capabilities, with the added advantage of cross-referencing email content with Google Calendar, Drive, and other Workspace applications.

By 2026, the concept of an “email agent” — a system that could manage an inbox with minimal human oversight — had moved from speculative to practical. These agents could triage incoming mail, draft responses for human approval, flag messages requiring personal attention, extract and track commitments, and handle routine correspondence autonomously. The human role shifted from reading and responding to every email to reviewing and approving the agent’s work.

Implications for Email Marketing

The rise of email intelligence has profound implications for email marketing — implications that the industry is still processing. When a significant percentage of emails are read by machines rather than humans — summarized, triaged, and sometimes responded to without the recipient ever seeing the full message — the traditional metrics and strategies of email marketing require fundamental reconsideration.

Subject lines optimized for human curiosity may not perform the same way when a machine summarizes the email’s content regardless of the subject line. Carefully designed visual layouts lose their impact when a language model extracts the text and presents it as a summary. Personalization tokens that feel warm to a human reader are irrelevant to a machine that processes the semantic content without regard for the sender’s attempt at connection.

The marketers who will thrive in the age of email intelligence are those who optimize for value rather than attention. When machines can filter, summarize, and prioritize with increasing accuracy, the emails that reach human attention will be those that genuinely deserve it — messages that provide useful information, relevant offers, and real value. The death of reliable open rates, accelerated by both Apple Mail Privacy Protection and machine-processed email, is pushing the industry toward metrics that measure actual engagement and value delivery.

The age of getting attention through clever subject lines and FOMO-driven urgency is giving way to an age where the content itself must justify its presence in the inbox. In that sense, email intelligence may be the best thing that ever happened to email marketing — by forcing the industry to be genuinely useful, or be filtered away.

Frequently Asked Questions

What is Gmail Smart Compose and when did it launch?

Gmail Smart Compose is a feature that suggests sentence completions as you type an email, allowing you to accept suggestions by pressing Tab. It launched at Google I/O in May 2018 and rolled out to all Gmail users over the following months. Smart Compose uses machine learning models trained on anonymized email data to predict the next words or phrases a user is likely to type, taking into account the email's context and the user's writing patterns.

How does AI spam detection differ from traditional Bayesian filtering?

Traditional Bayesian spam filtering, popularized by Paul Graham in 2002, calculates spam probability based on individual word frequencies. Modern AI spam detection uses deep neural networks that analyze entire messages in context — understanding sentence structure, semantic meaning, sender behavior patterns, and network-level signals simultaneously. AI models can detect spam that uses no traditional trigger words by recognizing patterns in message structure, formatting, sending patterns, and recipient reactions that rule-based and Bayesian systems would miss.

What are AI email agents and how do they work in 2026?

AI email agents are large language model-powered tools that can read, summarize, categorize, draft replies to, and take action on emails with minimal human input. By 2026, major platforms including Apple Intelligence, Google Gemini in Gmail, and Microsoft Copilot in Outlook offer varying degrees of agentic email management — from summarizing long threads and drafting contextual replies to scheduling meetings and extracting action items. These agents represent a fundamental shift from email as a manual read-and-respond activity to email as a managed workflow with human oversight.
