FindArticles © 2025. All Rights Reserved.

Anthropic Releases Claude Sonnet 4.5 with Safer Coding

Bill Thompson
Last updated: October 28, 2025 6:06 pm

Anthropic’s newest model, Claude Sonnet 4.5, arrives with a clear message: push coding further while dialing back behaviors that users don’t want, from sycophancy to outright deception. The release focuses on practical developer workflows, more robust safety defenses, and new agent-building tools intended to take Claude from a helpful assistant to a reliable teammate for real engineering work.

What’s New for Builders in Claude Sonnet 4.5

Sonnet 4.5 introduces checkpoints, a versioning-style feature that lets developers save, branch, and revert their work directly within the chat. It is the kind of lightweight safety net that iterative coding calls for: test a refactor, revert if necessary, and keep as much context as possible. The release also offers a built-in terminal view with support for live runs and rapid diagnostics, plus context controls that help direct the model’s attention to the most important files or instructions.
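To make the save, branch, and revert idea concrete, here is a minimal Python sketch of a checkpoint store. It illustrates the general concept only and says nothing about how Anthropic implements the feature; the `CheckpointStore` class, its methods, and the sample file contents are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Checkpoint:
    """A snapshot of the workspace plus a link to the checkpoint it grew from."""
    id: int
    files: Dict[str, str]
    parent: Optional[int]
    label: str


@dataclass
class CheckpointStore:
    """Hypothetical in-memory store; all names here are illustrative."""
    files: Dict[str, str] = field(default_factory=dict)
    checkpoints: Dict[int, Checkpoint] = field(default_factory=dict)
    head: Optional[int] = None

    def save(self, label: str) -> int:
        """Snapshot the current files as a child of the current head."""
        cp_id = len(self.checkpoints) + 1
        self.checkpoints[cp_id] = Checkpoint(cp_id, dict(self.files), self.head, label)
        self.head = cp_id
        return cp_id

    def revert(self, cp_id: int) -> None:
        """Restore files from a checkpoint; later saves branch from it."""
        cp = self.checkpoints[cp_id]
        self.files = dict(cp.files)
        self.head = cp_id


store = CheckpointStore()
store.files["app.py"] = "def total(xs): return sum(xs)"
baseline = store.save("baseline")

# Try a risky refactor and checkpoint it.
store.files["app.py"] = "def total(xs): return reduce(add, xs)"
refactor = store.save("refactor")

# The refactor did not work out: go back, then branch in a new direction.
store.revert(baseline)
store.files["app.py"] = "def total(xs): return sum(xs, 0)"
branch = store.save("alternative")
```

Because every checkpoint records its parent, both the abandoned refactor and the alternative remain reachable from the baseline, which is what makes an experiment cheap to unwind.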

Anthropic also added document creation to the chat interface (presentations, spreadsheets, and text docs), so engineering notes, sprint summaries, or runbooks can be created without switching to other tools. The company says the features are available on all paid plans, suggesting an effort to normalize a more powerful workspace rather than hide power features behind a niche tier.

Agentic Coding Benchmarks and Early Scores Reported

According to internal evaluations shared by the company, Sonnet 4.5 scored 77.2% on an agentic coding benchmark, ahead of the earlier Claude Opus 4.1 at 74.5% and of the GPT-5 version of OpenAI’s Codex on the same test. Benchmarks point the way rather than reveal destiny: real-world repos are messy, and performance often depends on stack, tooling, and testing discipline. Still, the gain suggests Sonnet 4.5 is better at multi-step tasks such as scaffolding services, wiring dependencies, and seeing fix-and-verify loops through.

For teams, this means faster iteration. For example, you can migrate a codebase to another runtime, updating config files and patching breaking changes one test at a time. With checkpoints and a terminal view, those cycles become faster to run and, when an experiment fails, faster to unwind.

Custom Agents and Early Computer-Use Capabilities

With the Claude Agent SDK, developers can build custom task-specific agents that combine Sonnet 4.5, tools, retrieval, and bespoke policies. That might be a release-engineering bot that reads CI logs and opens narrowly targeted pull requests, or a support-triage agent that reproduces reported errors and summarizes root causes for human staff.
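As a rough illustration of the "tools" half of such an agent, the sketch below shows two plain Python functions a release-engineering bot might expose, plus a name-based dispatcher. It deliberately does not use the Claude Agent SDK or call any model; the log format, function names, and `TOOLS` registry are assumptions made for the example.

```python
import re
from typing import Callable, Dict, List

# Hypothetical pytest-style CI log; a real agent would fetch this from the CI system.
CI_LOG = """\
PASSED tests/test_auth.py::test_login
FAILED tests/test_billing.py::test_invoice_total - AssertionError
FAILED tests/test_billing.py::test_refund - KeyError: 'amount'
PASSED tests/test_api.py::test_health
"""


def find_failures(log: str) -> List[str]:
    """Extract the ids of failing tests from the log."""
    return re.findall(r"^FAILED (\S+)", log, flags=re.MULTILINE)


def summarize(failures: List[str]) -> str:
    """Group failures by test file for a short human-readable report."""
    counts: Dict[str, int] = {}
    for test_id in failures:
        path = test_id.split("::")[0]
        counts[path] = counts.get(path, 0) + 1
    return "; ".join(f"{path}: {n} failing" for path, n in sorted(counts.items()))


# Registry mapping tool names to functions, so a model could request them by name.
TOOLS: Dict[str, Callable] = {"find_failures": find_failures, "summarize": summarize}


def call_tool(name: str, *args):
    """Dispatch a tool call by name; unknown names raise KeyError."""
    return TOOLS[name](*args)


failures = call_tool("find_failures", CI_LOG)
report = call_tool("summarize", failures)
print(report)
```

In a real agent the model, not hard-coded calls, would decide which tool to invoke and in what order; keeping each tool small and deterministic makes those decisions easier to audit.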

Anthropic is also “applying the model to more general use of computers” with a Chrome extension that lets the model browse sites and carry out simple tasks, like filling in spreadsheets. Access is being offered to waitlisted customers on the more expensive Max plan. These capabilities are early-stage, and the company is touting strengthened defenses against prompt injection and other adversarial instructions, risks that security researchers and the standards body NIST have flagged more broadly.

Less Sycophancy and Clearer Answers, Says Anthropic

Anthropic’s Sonnet 4.5 system card, in turn, itemizes the trust-undermining behaviors the company set out to reduce: sycophancy, deception, power-seeking, and encouraging delusional beliefs. In plain terms, that means fewer flattering-but-wrong responses, less unearned confidence, and more explicit refusals when the model lacks sufficient evidence. The company says Sonnet 4.5 is the least prone of its models to these behaviors, and that it should also be more interpretable than other widely used large language models.

Cross-comparisons between top labs appear to reproduce the trend. When OpenAI and Anthropic tested each other’s systems, Claude was less likely to produce sycophantic or harmful outputs, according to their assessments. No model is a silver bullet, but safety work that constrains these behaviors can meaningfully improve day-to-day reliability for teams shipping code.

Pricing, Positioning, and Availability for Sonnet 4.5

Notably, Anthropic kept Sonnet 4.5’s API pricing the same as Sonnet 4’s, at $3 per million input tokens and $15 per million output tokens, which will appeal to engineering leaders tracking cost-performance ratios on multi-agent pipelines. The company is also pitching Claude as a general workplace assistant, not just a tool for engineers, emphasizing use cases in finance, cybersecurity, and law.
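For teams budgeting a pipeline, the per-token rates translate into cost with simple arithmetic. The sketch below uses only the two list prices quoted above; the workload figures (200 runs at 40,000 input and 5,000 output tokens each) are made up for illustration, and the estimate ignores any discounts or other pricing tiers, which this article does not cover.

```python
# List prices quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate API cost from token counts at the quoted list prices."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical workload: 200 agent runs, each with 40k input and 5k output tokens.
runs = 200
cost = estimate_cost_usd(runs * 40_000, runs * 5_000)
print(f"${cost:.2f}")  # prints "$39.00"
```

Output tokens cost five times as much as input tokens at these rates, so in this example the 1 million output tokens account for $15 of the $39 total even though they are only a ninth of the tokens processed.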

That positioning contrasts with usage patterns elsewhere: an OpenAI study found that the overwhelming majority of ChatGPT conversations are not work-related. Anthropic, meanwhile, is leaning into its enterprise credentials. Industry reports say that major tech companies are using Claude internally and that the company is ramping up brand marketing across streaming and live events.

Bottom Line for Teams Evaluating Claude Sonnet 4.5

I’m a fan of the overall Claude Sonnet 4.5 package: improved agentic coding, reversible workflows, a more coherent workspace, and a firmer stance against deceptive or obsequious answers.

Ultimately, the best test for developers is to use it in anger: point it at your test harnesses, try it against your ugliest repositories, and see how it performs under constraints and failure. If the gains hold, Sonnet 4.5 will seem less like a conversationalist and more like an honest-to-goodness junior engineer, one that plainly says what it doesn’t know.
