Artificial intelligence just pulled off something that a lot of humans wouldn’t be able to do. It passed a mock version of the CFA Level III exam, the last in a series of notoriously tough tests that gauge whether you are fit to manage other people’s money and help them get rich, or at least retire as comfortably as possible. That milestone poses an uncomfortable question for Wall Street and the buy side: if models can clear that bar, what does that mean for the analyst job?
AI Models Beat A Benchmark That Humans Struggle To Pass
At New York University’s Stern School of Business, researchers working with GoodFin assessed 23 top models from labs including OpenAI, Google, Anthropic, Meta, xAI and DeepSeek. According to their arXiv preprint, several models cleared the passing threshold on a realistic mock of the CFA Level III exam, the stage generally considered the profession’s toughest barrier.

OpenAI’s o4-mini led the pack with a score of 79.1%, with Google’s Gemini 2.5 Flash coming in second at 77.3%. The paper also notes that the score needed to pass is generally around 63%. For perspective, the CFA Institute reported a pass rate of roughly 49% among human candidates at a recent Level III sitting, a reminder of how much human brain power the exam still demands.
Interestingly, most models scored in the low to mid 70s on the multiple-choice questions, but performance diverged sharply on the essay section. Only reasoning-oriented systems, models built to decompose complex problems and plan through them step by step, produced consistently strong constructed responses.
Essays Expose The Limits Of Automated Judgment
Level III focuses on applying higher-order decision-making: constructing and explaining an Investment Policy Statement, adapting asset allocations to client constraints, assessing risk exposures, and defending trade-offs. That is much closer to the day-to-day work of an analyst or a wealth manager than memorizing formulas. The essay results suggest the frontier now lies in applied, explanatory reasoning rather than rote recall.
Still, caveats matter. A mock exam is not a proctored environment, and essay grading can be subjective. Scores may also be inflated by training-data contamination, with models having seen similar materials before. Real candidates sit under time pressure with no outside tools; a fair comparison would have to reflect those conditions.
What This Means For Financial Analysts Today
Exposure is real. Analysts and advisers are routinely named among the most exposed occupations, consistent with research from major tech companies and academic partners that has repeatedly flagged finance roles. But “exposed” is not the same as “obsolete.” The data suggest that tasks are being automated, not the role.

Anticipate fast offloading of first-draft work: summarizing earnings calls, combing filings, drafting portfolio rationales, running scenario analyses and policy-constraint checks. What humans can continue to own is client context and credibility: interpreting intent, resolving ambiguity, negotiating risk tolerance and exercising fiduciary judgment. Even GoodFin’s founder has stressed that machines struggle with nuance and interpersonal cues, the soft data behind where a portfolio manager decides to place bets.
How AI Is Already Remaking The Investment Desk
Big institutions are moving from pilots to production. Morgan Stanley Wealth Management built a GPT-based assistant to surface firm research to advisers in seconds. Bloomberg has developed domain-specific models for parsing financial text, and risk platforms like Aladdin have folded machine learning into stress-testing and attribution. The throughline is clear: the time from data to insight is compressing.
But constraints are real. Compliance demands evidence, audit trails and explainability, especially under regimes like Reg BI or MiFID II. These are not hypothetical problems when client assets and regulatory liability are on the line. The firms gaining ground are the ones combining models with strong data governance, human review and clear escalation paths.
Bottom Line For Careers In Finance and Investing
Passing a CFA Level III mock demonstrates that frontier models can reason through high-level portfolio-management concepts and formulate coherent explanations. That is a milestone. It is not proof, by itself, that AI can shoulder client responsibility, calibrate judgment under uncertainty, or navigate the behavioral side of investing through drawdowns and regime shifts.
Analysts should be neither complacent nor panicked. The winning stance is augmentation: learn to supervise model outputs, build repeatable prompts and checklists, stress-test AI-generated theses, and document rationales as if a regulator were going to read them, because one might. In this new division of labor, machines bring scale to the grunt work; humans provide context, ethics and accountability.
