Apple has developed an in-house ChatGPT-style app called Veritas to test a wide-ranging overhaul of Siri expected in the near future, Bloomberg’s Mark Gurman reports. The tool gives Apple’s AI teams a secure, private sandbox in which to stress-test conversational features, reliability and guardrails before they land in the voice assistant used by hundreds of millions of people.
What Veritas Shows About the Overhaul of Siri
Veritas is not for the public. It serves as a test bench for a new system, internally referred to as Linwood, that merges Apple’s own large language models with an external third-party model for tasks where the latter excels. The hybrid approach also reflects Apple’s broader “best model for the job” strategy: inference runs on-device where possible, with workloads forwarded to Private Cloud Compute when local hardware limits are hit, something already reported as planned for Apple Intelligence.
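Apple has not described how that routing decision is made, so what follows is a minimal sketch of a rule-based “best model for the job” router under stated assumptions; every type and function name here (ModelRoute, RequestProfile, route) is hypothetical, not an Apple API:

enum ModelRoute {
    case onDevice             // local model running on the Neural Engine
    case privateCloudCompute  // Apple-operated server-side model
    case thirdParty           // external model for tasks where it excels
}

struct RequestProfile {
    let estimatedTokens: Int
    let touchesPersonalData: Bool
    let needsBroadWorldKnowledge: Bool
}

func route(_ request: RequestProfile, deviceTokenBudget: Int = 4_096) -> ModelRoute {
    // Keep requests over personal data local whenever the hardware budget allows.
    if request.touchesPersonalData && request.estimatedTokens <= deviceTokenBudget {
        return .onDevice
    }
    // Hand broad world-knowledge queries to the external model.
    if request.needsBroadWorldKnowledge {
        return .thirdParty
    }
    // Everything else overflows to Private Cloud Compute when local limits are hit.
    return .privateCloudCompute
}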

An internal chatbot format gives teams a fast way to iterate on conversational flows, context handling and grounding in personal data, and then to translate those findings into Siri experiences that resonate with iPhone, iPad and Mac users.
Look for a greater emphasis on determinism and privacy than is typical with open web chatbots, and expect to find features designed to be explainable, interruptible, and deeply connected with system frameworks.
Key AI features Apple is testing internally with Veritas
One is on-device search across personal information. As part of that testing, Veritas is being used to gauge how well models can pull in and summarize information from apps like Mail, Photos, Music, Calendar and Notes without leaking data or exhibiting biases. Imagine commands like “Show the playlist Sam shared after our last road trip” or “Summarize the unread emails from finance and send a response.”
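Requests like those would most plausibly be fulfilled through actions apps expose to the system. Apple’s public App Intents framework already supports that pattern; the sketch below is illustrative only, not a description of Veritas itself, and SummarizeUnreadMailIntent, its parameter and the canned summary are made up, with the real summarization work stubbed out:

import AppIntents

// Hypothetical intent a mail-style app could expose so Siri can answer
// "Summarize the unread emails from finance."
struct SummarizeUnreadMailIntent: AppIntent {
    static var title: LocalizedStringResource = "Summarize Unread Mail"

    @Parameter(title: "Sender or team")
    var sender: String

    func perform() async throws -> some IntentResult & ProvidesDialog {
        // A real implementation would fetch unread messages and summarize them on-device.
        return .result(dialog: "You have three unread messages from \(sender) about the Q3 budget.")
    }
}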
Another frontier is generative assistance embedded within apps. Photo edits like object removal or relighting, natural-language document cleanup and context-aware suggestions within Messages or Pages are being tested through the chatbot before they’re wired into Siri’s voice and system UI. Apple’s internal reviewers are said to be emphasizing low hallucination rates, consistent style control and clear user consent for any action that modifies content.
These capabilities depend critically on strong grounding and intent understanding. Apple has highlighted research on resolving ambiguous references (“this photo,” “that email thread”) and coordinating actions across separate apps. Veritas offers a high-throughput way to test those capabilities with synthetic and real-world scenarios before they ship as Siri intents and APIs for developers.
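Reference resolution of that kind is easy to picture even in toy form. The sketch below assumes a simple “most recently touched item of the requested kind” heuristic and is purely illustrative; ContextItem and resolveReference are hypothetical names, and a production system would rely on learned models rather than keyword matching:

import Foundation

// Hypothetical record of items the user has recently interacted with.
struct ContextItem {
    enum Kind { case photo, emailThread, note }
    let id: String
    let kind: Kind
    let lastInteracted: Date
}

// Resolve a phrase like "this photo" to the most recently touched item of that kind.
func resolveReference(_ phrase: String, against context: [ContextItem]) -> ContextItem? {
    let kind: ContextItem.Kind
    if phrase.localizedCaseInsensitiveContains("photo") {
        kind = .photo
    } else if phrase.localizedCaseInsensitiveContains("email") {
        kind = .emailThread
    } else if phrase.localizedCaseInsensitiveContains("note") {
        kind = .note
    } else {
        return nil
    }
    return context
        .filter { $0.kind == kind }
        .max(by: { $0.lastInteracted < $1.lastInteracted })
}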

Why Build a ChatGPT-Style Tool Internally?
Keeping the evaluation environment to itself allows Apple to tune for three pillars it frequently touts: privacy, latency and reliability. Internal dogfooding establishes a baseline for end-to-end response time, measures the energy impact on battery life and hardens content filters without shipping unpolished code. It also opens the door to targeted red-teaming against sensitive domains such as health, finance or children’s content.
It’s also practical for hybrid model sourcing. Apple’s models have progressed rapidly (earlier reporting referred to an “Ajax” family trained on in-house infrastructure), but there could still be use cases where a third-party model is the better fit. Apple can selectively route those requests while still adhering to its strict principles of data minimization and transparency. The success measures that matter are grounded accuracy, refusal appropriateness, latency and cost per token, with on-device performance goals effectively shaping both Neural Engine usage and memory footprint.
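Those measures are straightforward to express. As a minimal sketch, assuming an internal harness scores each response individually, the record might look like this; the field names and thresholds are illustrative assumptions, not reported figures:

// Hypothetical per-response evaluation record mirroring the measures above;
// names and thresholds are assumptions, not Apple's internal criteria.
struct ResponseEvaluation {
    let groundedAccuracy: Double      // fraction of claims traceable to source data (0...1)
    let refusalAppropriate: Bool      // refused when, and only when, it should have
    let latencySeconds: Double        // end-to-end response time
    let costPerThousandTokens: Double // tracked for routing economics

    // Toy acceptance gate over three of the measures; cost is usually judged in aggregate.
    func passes(latencyBudget: Double = 2.0, accuracyFloor: Double = 0.95) -> Bool {
        groundedAccuracy >= accuracyFloor
            && refusalAppropriate
            && latencySeconds <= latencyBudget
    }
}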
Competitive stakes and the roadmap for Siri’s overhaul
Big Tech competitors have already incorporated generative assistants into flagship products. Google is pushing Gemini further into Android and Workspace, while Microsoft keeps adding Copilot to Windows and Office. Apple’s advantage is distribution and trust: the company says its installed base includes more than two billion active devices, and its privacy architecture, which keeps as much processing on-device as possible and relies on Private Cloud Compute for the rest, could distinguish how its assistant manages personal data.
Gurman’s reporting indicates the revised Siri is still a work in progress: earlier internal target dates have slipped as Apple prioritizes stability and tighter control over the scope of the features it will offer. The company’s pattern is to introduce foundational capabilities first, then expand into deeper app actions and developer hooks once reliability clears a high bar.
What to watch next as Apple retools Siri with Veritas
Signals to watch include new Siri frameworks in developer betas, expanded App Intents support for third-party apps and tighter integration between Siri and Apple Intelligence features such as writing tools or image creation. There are hardware considerations, too: newer chips with bigger Neural Engines and higher memory bandwidth will expand what can run locally without being offloaded to the cloud.
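If those developer hooks arrive as extensions of today’s public App Intents framework, which is an assumption rather than something the report confirms, surfacing an app action to Siri already looks roughly like this; OpenRoadTripPlaylist, the phrases and the shortcut metadata are made-up examples:

import AppIntents

// Hypothetical action a third-party music app could expose to Siri.
struct OpenRoadTripPlaylist: AppIntent {
    static var title: LocalizedStringResource = "Open Road Trip Playlist"

    func perform() async throws -> some IntentResult {
        // A real app would deep-link into its playlist screen here.
        return .result()
    }
}

// Registers spoken phrases so Siri can invoke the intent by voice.
struct PlaylistShortcuts: AppShortcutsProvider {
    static var appShortcuts: [AppShortcut] {
        AppShortcut(
            intent: OpenRoadTripPlaylist(),
            phrases: ["Open my road trip playlist in \(.applicationName)"],
            shortTitle: "Road Trip Playlist",
            systemImageName: "music.note.list"
        )
    }
}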
As Veritas continues to validate features at scale, I anticipate a Siri that is less a command parser and more a contextually aware teammate: one that can search personal information responsibly, carry out multi-step tasks and explain its reasoning. Apple’s playbook is a familiar one: move slowly, ship tightly integrated experiences and lean on privacy as a product feature. The significant difference this time around is the assistant itself, and Veritas is its proving ground.
