Microsoft acknowledged a software bug that let Copilot Chat pull in and summarize emails labeled as confidential, even when company data loss prevention rules should have blocked access. The glitch affected the Work tab in Copilot Chat for Microsoft 365, raising fresh questions about how AI features intersect with enterprise compliance controls.
Microsoft explains the bug that exposed confidential emails
According to Microsoft, a coding issue caused items in users’ Sent and Drafts folders to be ingested by Copilot despite being tagged with sensitivity labels. Those labels—managed through Microsoft Purview—are designed to enforce information protection policies across files and emails, including preventing content from being used by AI assistants when rules forbid it.
- Microsoft explains the bug that exposed confidential emails
- Why this matters for DLP controls and compliance risk
- Scope of the issue and organizations that were affected
- What likely went wrong under the hood of Copilot
- Immediate mitigations and checks for Microsoft 365 admins
- Microsoft’s response and what it means for trust
- The bottom line on the Copilot email labeling bug

The company said it began rolling out a fix earlier this month and is monitoring remediation while contacting impacted customers. Reporting by BleepingComputer highlighted the flaw, which surfaced in Copilot Chat’s Work context for Microsoft 365 apps such as Outlook, Word, Excel, PowerPoint, and OneNote.
Why this matters for DLP controls and compliance risk
Enterprises rely on sensitivity labels and DLP policies to restrict how information flows—especially when it contains personal data, financial records, or protected health information. In theory, if an email is tagged “Confidential” or “Highly Confidential,” Copilot should either refuse to summarize it or exclude it from retrieval altogether. The bug effectively short-circuited that promise.
From a governance standpoint, this is an “exfiltration-by-prompt” risk: a user might ask Copilot to summarize their communications and unknowingly surface protected content that should remain off-limits to AI systems. For regulated sectors subject to frameworks like GDPR or HIPAA, even inadvertent exposure via an assistant can carry investigative and reporting obligations.
Scope of the issue and organizations that were affected
Microsoft has not disclosed how many organizations were impacted. Public reports indicate the UK’s National Health Service was among those affected, underscoring the stakes for institutions handling sensitive data at scale. While there’s no indication of an external breach, the incident shows how AI features can widen an internal exposure surface if guardrails fail.
The issue appears tied specifically to how Copilot evaluated sensitivity labels on items in Sent and Drafts during retrieval and summarization. That distinction matters because those folders often contain in-progress negotiations, patient updates, or executive correspondence—exactly the sort of material companies try to fence off with strict labeling policies.
What likely went wrong under the hood of Copilot
Copilot for Microsoft 365 relies on retrieval-augmented generation, pulling context from a user’s data via Microsoft Graph and then generating a response. If the enforcement step that checks sensitivity labels and DLP scopes misfired—particularly for certain folders—Copilot could have included content it should have filtered. These checks must happen consistently at query time; any gap, even a nuanced folder exception, produces outsized risk.

In practice, that means a single logic error in authorization or label evaluation can lead to improper summaries, even if the underlying storage and access permissions were set correctly. It is a reminder that AI-layer policies need the same rigor as identity and access management.
Immediate mitigations and checks for Microsoft 365 admins
Security teams should review audit logs in Microsoft Purview and Outlook to identify Copilot-assisted actions on labeled content, with special attention to Sent and Drafts. Consider temporarily tightening label policies to explicitly block AI-driven summarization of certain scopes until the fix is verified in your tenant.
Admins can also validate Copilot behavior using test mailboxes with dummy confidential content and ensure prompts that should be denied are consistently rejected. Where feasible, enforce Conditional Access for Copilot experiences, review Graph permission assignments for connected apps, and re-educate users on handling sensitive material in draft workflows.
Microsoft’s response and what it means for trust
Microsoft says the fix is rolling out and it is engaging organizations to confirm results. The company has faced scrutiny over other AI-related privacy concerns, including Windows Recall and Copilot Vision, which critics argued collected or exposed more than users expected. This incident adds to calls from CISOs and regulators for stronger “privacy by default” in AI assistants.
Enterprises don’t just need assurances that data is secure at rest—they need verifiable, testable controls on how AI features interpret and transform that data. That includes robust label enforcement at query time, transparent denial messages when content is blocked, and clear tenant-level switches that let admins reduce risk when issues arise.
The bottom line on the Copilot email labeling bug
The Copilot email labeling bug is a narrow technical fault with broad implications. AI assistants amplify productivity, but they also amplify the consequences of small policy gaps. Organizations should treat AI-layer controls as first-class security components, continuously test them, and assume that sensitive drafts and sent mail demand the strictest enforcement.
Until the rollout is fully validated, prudent teams will dial up monitoring, retest denial scenarios, and reinforce user training. Trust in workplace AI will depend less on what these tools can do and more on what they refuse to do—reliably, every time, especially when the data is confidential.
