Qualcomm’s chief executive is mapping out a world where artificial intelligence is not just ambient but able to see and act. The company’s pitch is simple: the next wave of “agentic” AI, machines that carry out tasks a human would otherwise do without being explicitly directed at each step, works best when our devices can see, hear and comprehend the world around them, with cameras acting as the sensors that make this possible. Phones remain the anchor, but smart glasses and connected cars will serve as companions that expand AI’s field of view, turning routine moments into cues for a helping hand.
It’s a vision built on on-device intelligence. Instead of sending every request to the cloud, Qualcomm is banking on locally run models that sift out what matters, act on context and minimize latency. It’s as much a pitch about pragmatics (battery life, bandwidth, responsiveness) as it is about reimagining how we interact with computing.

Agentic AI needs vision to perceive, think and act
Agentic systems work in a loop: perceive, think, act. A microphone alone doesn’t capture much context; cameras fill in the missing dimension. Qualcomm’s prototypes envision glasses that discreetly surface reminders as you go about your day, and cars that parse the same street scene you do and offer options without a search box. Gesture input through rings or touch, along with eye tracking and voice, rounds out a more natural interface.
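To make that loop concrete, here is a minimal Python sketch of the perceive-think-act pattern. The class, function names and trigger logic are illustrative, not any vendor’s actual API.

```python
from dataclasses import dataclass

@dataclass
class Percept:
    """One fused snapshot of what the device senses."""
    scene_labels: list[str]  # e.g. labels from an on-device vision model
    transcript: str          # e.g. text from on-device speech recognition
    location: str

def perceive(camera_frame, audio_chunk) -> Percept:
    # Placeholder: a real system would run local vision/ASR models here.
    return Percept(["bus_stop", "timetable"], "when is the next bus", "Main St")

def think(p: Percept) -> str | None:
    # Fuse modalities: act only when vision and voice agree on intent.
    if "timetable" in p.scene_labels and "bus" in p.transcript:
        return f"check_transit_schedule:{p.location}"
    return None  # nothing worth interrupting the user for

def act(action: str) -> None:
    print(f"assistant action -> {action}")

# One tick of the loop: sense, decide, and act only when confident.
action = think(perceive(camera_frame=None, audio_chunk=None))
if action:
    act(action)
```

The point of the sketch is the gating in think(): an agent that only interrupts when multiple senses agree is what separates a helpful cue from a distraction.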
This is bigger than one company. Meta has already demonstrated on-device multimodal AI in its Ray-Ban smart glasses, which can identify objects and describe scenes aloud. In phones, multiple rear cameras and depth sensors are standard, a trove of data for AI to parse. In cars, S&P Global Mobility figures mainstream autos now ship with six to eight cameras, and premium trims with more than a dozen, there to monitor the driver, park automatically and power advanced safety features.
Counterpoint Research predicts generative AI smartphone shipments will exceed 100 million this year and cross the billion-device threshold by mid-decade. The through-line: perception-rich hardware paired with models that can fuse what the device sees and hears with the questions you ask.
Why on-device AI is important for privacy and speed
Running models locally solves three problems at once: it keeps sensitive data on your device, removes the round trip to a data center and cuts cloud costs. One example is Google’s Gemini Nano, which screens scam calls and proofreads text on Android phones without shipping audio or keystrokes off for processing. When it works as intended, “AI as the interface” feels immediate and private.
The hardware is catching up. Microsoft’s Copilot+ PC effort established a 40+ TOPS NPU baseline for Windows laptops, and Qualcomm’s Arm-based PC silicon was among the first to clear that bar. In phones, mobile NPUs have progressed to dozens of TOPS, making low-bit quantized versions of 7B-parameter models tractable to run interactively. Qualcomm has already shown large language and vision models producing images and summaries right on handsets, no network connection required.
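The arithmetic behind that claim is simple. Here is a back-of-the-envelope sketch (illustrative figures only, ignoring activation memory and KV-cache):

```python
# Back-of-the-envelope memory footprint for 7B-parameter model weights.
# Illustrative only; real runtimes add KV-cache and runtime overhead.
PARAMS = 7e9

def weights_gb(bits_per_param: int) -> float:
    return PARAMS * bits_per_param / 8 / 1e9  # bits -> bytes -> GB

print(f"FP16: {weights_gb(16):.1f} GB")  # ~14.0 GB, beyond phone RAM
print(f"INT8: {weights_gb(8):.1f} GB")   # ~7.0 GB, borderline
print(f"INT4: {weights_gb(4):.1f} GB")   # ~3.5 GB, fits flagship RAM
```

Dropping from 16-bit to 4-bit weights cuts the footprint fourfold, which is what moves a 7B model from data-center territory into the RAM budget of a high-end phone.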
This shift doesn’t make the cloud disappear; it sizes the cloud correctly. Local models will perform routine tasks and deliver immediate feedback, while larger cloud-based models will engage in more complex, episodic work. The result: faster responses and fewer privacy trade-offs.
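One plausible shape for that split is a lightweight router that keeps routine requests local and escalates the rest. The heuristics below are hypothetical, not a description of any shipping system:

```python
import re

# Hypothetical router: keep routine work local, escalate the rest.
LOCAL_TASKS = re.compile(r"\b(summarize|translate|remind|caption)\b", re.I)

def route(prompt: str, needs_fresh_data: bool = False) -> str:
    """Decide where a request runs; the heuristics are illustrative."""
    if needs_fresh_data:
        return "cloud"       # live information can't be answered locally
    if LOCAL_TASKS.search(prompt) and len(prompt) < 500:
        return "on-device"   # short, routine task: keep it fast and private
    return "cloud"           # long or open-ended: send to a larger model

print(route("Summarize this voice note"))    # -> on-device
print(route("Plan a three-city itinerary"))  # -> cloud
```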
Cars become context engines with camera-first AI
Automotive is where Qualcomm’s camera-first AI vision becomes impossible to avoid. Driver monitoring, object detection and parking assistance all rely on arrays of sensors and real-time inference. Its Digital Chassis platform already powers infotainment and ADAS in cars from dozens of brands, and the demand for compute keeps rising as more features are layered onto vehicles.

Safety remains the counterweight. The National Highway Traffic Safety Administration recorded 3,308 U.S. deaths from distracted driving in 2022, a reminder that projecting advice onto a windshield cannot be allowed to add to the distraction. The job for AI is to reduce cognitive load with summaries, filters and timely suggestions, not to increase it.
Networks, storage and 6G must scale for agentic AI
To keep agentic AI in sync across phones, glasses and cars, both connectivity and memory have to scale up.
Faster Wi‑Fi, 5G Advanced and, eventually, 6G will matter; so will speedier storage and RAM to keep models resident and responsive. Expect a sustained push toward LPDDR6‑class memory and next-generation UFS as companies chase lower latency and higher bandwidth within battery budgets.
On timelines, the ITU has positioned IMT‑2030 as the roadmap for 6G, with standards coalescing toward the end of this decade and commercial launches around 2030. Qualcomm has indicated it hopes to have test hardware in the field well ahead of that curve. Even so, the near-term gains will come from smarter use of today’s 5G and Wi‑Fi, plus local caching so your devices can handle tasks without needing a connection for each one.
Privacy policy and social license for everyday AI
Ambient AI scales only if the public trusts it. The European Union’s AI Act sets stringent requirements for high‑risk systems and limits how biometric categorization and remote identification can be used. GDPR and California’s privacy laws already mandate data minimization and transparency. Consumers know the cues, LED indicators and shutter sounds among them; expect smart glasses to need clearer recording signals and stricter defaults on what gets stored, where and for how long.
The way forward is policy combined with product design: more on-device processing, explicit consent flows and secure enclaves that wall off camera data from apps that don’t require it.
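As a sketch of what a consent-first camera interface could look like (entirely hypothetical; no platform ships this exact API):

```python
from contextlib import contextmanager

# Hypothetical consent registry: which app may use the camera, and for what.
GRANTS = {"navigation_app": {"scene_labels"}}

@contextmanager
def camera_scope(app: str, purpose: str):
    """Grant camera-derived data only to apps with consent for this purpose."""
    if purpose not in GRANTS.get(app, set()):
        raise PermissionError(f"{app} lacks consent for {purpose}")
    try:
        yield "frame_labels_handle"  # derived labels, never raw frames
    finally:
        pass  # a real secure enclave would zero its buffers here

with camera_scope("navigation_app", "scene_labels") as labels:
    print("processing", labels)
```

The design choice worth noting: apps receive derived signals such as scene labels rather than raw frames, so the sensitive imagery never leaves the protected boundary.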
The competitive picture in ambient AI and devices
Qualcomm’s gambit is that the winners in ambient AI will be those who can deliver power-efficient NPUs, fit tight thermal envelopes and offer developer tools that make building multimodal, on-device apps easy. Google is pushing its own silicon with Tensor and Gemini across Android; Samsung is shipping a growing roster of on-device Galaxy AI features; Apple is touting a private-by-design approach with Apple Intelligence. The connective tissue is convergence: phones, PCs, glasses and cars sharing context so the assistant can anticipate what you want before you ask.
If that sounds like a future where AI and cameras are ubiquitous, that’s the idea. The challenge now is making these helpers useful, respectful and invisible enough that they feel like intuition rather than technology.
