Choosing a dish without seeing it first can be tricky. Yelp hopes to eliminate that guesswork with a new artificial intelligence feature that scans a physical restaurant menu and instantly shows you what each dish looks like, along with snippets of what diners have said about it. The update turns a paper menu into an interactive, visual guide that can help make ordering faster, smarter, and more fun.
The feature arrives alongside wider updates to Yelp’s AI assistant, which can now field natural-language questions about restaurants, bars, attractions, and retailers. But the menu-scanning experience is the standout: a practical application of computer vision and retrieval AI that draws on the platform’s deep well of user photos and reviews.

How the Visual Menu Feature Works on Yelp’s App
Point your phone’s camera at a printed menu and the app identifies item names by reading the text with optical character recognition and pattern matching. On-screen bubbles appear next to dish titles. Tap a bubble and you will see a gallery of actual diner photos and brief review highlights for that item.
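To make the pattern-matching step concrete, here is a minimal Python sketch. It assumes the OCR stage has already produced plain text lines, and it is not Yelp’s actual method: the `ITEM_LINE` pattern and the `extract_items` helper are hypothetical, and they only recognize lines shaped like a dish name followed by a price.

```python
import re

# Hypothetical pattern (an assumption, not Yelp's): a dish name made of
# ASCII letters, then an optional dotted leader, then a price such as
# "18" or "$16.50" at the end of the line.
ITEM_LINE = re.compile(
    r"^\s*(?P<name>[A-Za-z][A-Za-z'&\- ]*?)\s*[.\s]*\$?(?P<price>\d+(?:\.\d{2})?)\s*$"
)


def extract_items(ocr_lines):
    """Return (name, price) pairs for lines that look like menu items."""
    items = []
    for line in ocr_lines:
        match = ITEM_LINE.match(line)
        if match:
            name = match.group("name").strip()
            items.append((name, float(match.group("price"))))
    return items


if __name__ == "__main__":
    sample = [
        "STARTERS",                 # section header, no price
        "Cacio e Pepe ..... 18",    # dotted leader before the price
        "Pork Katsu $16.50",        # dollar sign and cents
        "Served with rice",         # description line, no price
    ]
    for name, price in extract_items(sample):
        print(f"{name}: {price:.2f}")
```

A real menu would need far more than this. Accented names, prices printed on a separate line, and multi-column layouts all fall outside this simple pattern.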
You can also follow up with the assistant, asking “Is the cacio e pepe really peppery?” or “How big is this pork katsu?”, and it will pull details from reviews, business pages, and photos. Because the system uses context already on the platform, it can resolve ambiguous items, which is handy if a menu offers multiple tacos or seasonal options.
The feature draws on millions of community photos to fuel its visual matches. That long tail matters: it helps the assistant surface real images of local specialties or chef signatures rather than highly polished marketing shots, so that diners get a better sense of what portion size, plating, and spice to expect.
Why Visual Search Is Significant For Dining
Photos reduce uncertainty. Industry groups like the National Restaurant Association and reservation platforms have long reported that photos, along with clear descriptions, greatly impact ordering decisions. It’s for the same reason that on social media platforms like Instagram and TikTok, where photos are digital currency, restaurant trends can take off — what we see informs our expectations better than what we read.
For diners, seeing a mole poblano’s thickness or the kiss of char on a birria quesadilla answers practical questions that words often overlook. For restaurants, surfacing the most-photographed or most-mentioned items could nudge discovery and upsells without actually changing the menu.
Under The Hood And The Challenge of Accuracy
At a technical level, the assistant uses optical character recognition, entity resolution (matching menu text to known dish names), and multimodal retrieval to match items with relevant images and quotes. It also draws on a database of media signals that includes Yelp’s “most mentioned” data, the words and phrases used most often across reviews, which helps surface what diners are most likely to care about (crispiness, level of heat, or portion size).
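Two of those steps can be illustrated with a short Python sketch that uses only the standard library. It is an illustration under stated assumptions, not a description of Yelp’s system: fuzzy string matching stands in for entity resolution, simple substring counting stands in for the “most mentioned” signal, and the `resolve_dish` and `most_mentioned` helpers, the cutoff value, and the phrase list are all hypothetical.

```python
from collections import Counter
from difflib import get_close_matches


def resolve_dish(ocr_name, known_dishes, cutoff=0.75):
    """Map a possibly misread OCR string to the closest known dish name.

    Returns None when nothing is similar enough. The cutoff is an
    assumed value chosen for illustration.
    """
    lookup = {dish.lower(): dish for dish in known_dishes}
    hits = get_close_matches(ocr_name.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None


def most_mentioned(reviews, phrases, top=3):
    """Rank candidate phrases by how many reviews mention them."""
    counts = Counter()
    for review in reviews:
        text = review.lower()
        for phrase in phrases:
            if phrase in text:
                counts[phrase] += 1
    return [phrase for phrase, _ in counts.most_common(top)]


if __name__ == "__main__":
    dishes = ["Cacio e Pepe", "Pork Katsu", "Birria Quesadilla"]
    # "Caclo" simulates an OCR misread of "Cacio".
    print(resolve_dish("Caclo e Pepe", dishes))

    reviews = [
        "Super crispy and a huge portion",
        "Crispy but too spicy",
        "So crispy, huge portion",
    ]
    print(most_mentioned(reviews, ["crispy", "huge portion", "spicy"], top=2))
```

A production system would more likely rely on learned text and image embeddings than on character-level similarity, but the shape of the problem is the same: normalize noisy menu text to a known item, then rank what reviewers say about it.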
Edge cases remain. Hand-lettered menus, ornate typography, and unusual dish names can cause recognition to stumble. Multilingual menus add complexity. And because users’ photos vary so much in quality and number, some items may have few usable images at first. The system should improve as more images and reviews accumulate and as the assistant learns typical menu layouts across cuisines.

Transparency will matter. Clear labeling that the photos come from the community, ways for business owners to propose corrections, and easy ways for users to report bad matches can help ensure a trusted experience.
More Conversational Search Throughout The App
In addition to menu scanning, Yelp’s assistant now interprets conversational and voice-based queries directly within search. You might say, “Where is a quiet wine bar with outdoor seating?” or “Which salons do the best curly cuts?” The app highlights “most mentioned” services across more than one hundred categories, making it possible to compare a mechanic’s most common repairs or a spa’s most requested treatments.
Yelp is also promoting AI voice agents for businesses. Restaurants can divert calls to an AI host that answers questions, books tables, takes special requests, and changes reservations. For service providers, an AI receptionist collects project information and qualifies leads, then delivers summaries to the business’s inbox. Similar phone agents have emerged from companies like Square and Kea, suggesting a broader shift toward AI handling some routine front-of-house tasks.
What It Means For Diners And Restaurants
The payoff for diners is transparency. Visual menus reduce ordering time, help groups settle on an order quickly, and offer accessibility benefits when paired with a voice interface, especially for people with low vision or those unfamiliar with a cuisine.
For operators, the assistant can help surface hero items, cut down on repetitive phone calls, and reduce orders getting lost in translation. It also introduces a new feedback loop: if photos and reviews are repeatedly highlighting certain elements — like, say, a burger with buns that fall apart — owners can tweak things and monitor how sentiment shifts over time.
The Competitive Context Among Major Tech Platforms
Big tech companies have been heading in the same direction. In Google Maps, popular menu items are already highlighted and, with features like Google Lens, you can have menus translated or annotated. Apple’s Visual Look Up and the camera-first tools of social platforms all suggest that discovery is increasingly visual and conversational. Yelp has an advantage in its huge reservoir of food photos attributed to specific dishes at specific places — a data set that is well-suited for this use case.
If the execution is there, with speedy recognition, correct matches, and respectful privacy practices, it could become the feature that people show their friends at the table. It makes the quiet moment after menus land feel less like indecision and more like a guided tour of what’s worth eating, fueled by the pictures and opinions of those who came before you.