Microsoft’s latest DirectX Agility SDK update introduces Shader Execution Reordering, a developer-facing tool that can significantly raise frame rates in ray-traced workloads on supported GPUs. Early numbers shared by Microsoft and third-party testers point to sizable gains on new Intel and Nvidia hardware, hinting at a fresh path for studios to lift performance in path-traced scenes without sacrificing visual fidelity.

What Shader Execution Reordering Actually Does

Ray tracing is notoriously hard on GPUs because rays bounce unpredictably and branch into many different code paths. That divergence kills parallel efficiency: GPU threads execute in lockstep groups, so a group whose rays hit several different materials has to run each material's shader in turn while the other threads sit idle. Shader Execution Reordering (SER) attacks the problem by dynamically regrouping similar ray workloads so the GPU executes them more coherently, keeping more cores busy and cutting stalls. The result is less time wasted on divergent threads and more time doing useful shading, especially in path tracing, where secondary and tertiary bounces explode in complexity.
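
The effect of regrouping can be illustrated with a toy model. The Python sketch below is not how a GPU or the DirectX runtime implements SER; it assumes a hypothetical four-thread group (real hardware uses wider groups) and made-up material names, and simply counts how many serialized shader passes each group needs before and after rays are sorted by the shader they will run.

```python
def shading_passes(shader_ids, warp_size):
    """Count serialized passes: each group runs once per distinct shader it contains."""
    passes = 0
    for start in range(0, len(shader_ids), warp_size):
        warp = shader_ids[start:start + warp_size]
        passes += len(set(warp))
    return passes


# Hypothetical hit results: the material shader each ray needs, in arrival order.
rays = ["glass", "metal", "skin"] * 4

# Arrival order mixes three shaders into every group of four threads.
incoherent = shading_passes(rays, warp_size=4)

# Sorting by shader stands in for reordering: each group now runs one shader.
coherent = shading_passes(sorted(rays), warp_size=4)

print(f"passes without reordering: {incoherent}")
print(f"passes with reordering:    {coherent}")
```

In this contrived case the twelve rays need nine passes in arrival order and three once grouped. Real workloads are messier, and the regrouping itself costs time, which the model ignores.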

Early Performance Numbers Hint at Big Gains With SER

In internal tests described by Microsoft’s DirectX team, enabling SER on an Intel Arc B‑Series GPU delivered up to 90% higher frames per second in a demanding path-tracing demo. The company also reported a 40% FPS bump on an Nvidia GeForce RTX 4090 in the same scenario when compared with standard ray sorting. Separately, an independent run by X user Osvaldo Pinali on a GeForce RTX 5080 showed an 80% uplift over the baseline implementation.
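
FPS percentages can overstate how much rendering time is actually saved, so it helps to convert them to frame time. The short Python sketch below does that arithmetic for the three figures quoted above; the function name is illustrative, and the only assumption is that the uplift is measured against the same baseline frame.

```python
def frame_time_saving(uplift_percent):
    """Fraction of per-frame time saved for a given FPS uplift."""
    speedup = 1.0 + uplift_percent / 100.0
    return 1.0 - 1.0 / speedup


reported = [
    ("Intel Arc B-Series", 90),
    ("GeForce RTX 5080", 80),
    ("GeForce RTX 4090", 40),
]

for label, uplift in reported:
    saving = frame_time_saving(uplift)
    print(f"{label}: +{uplift}% FPS is {saving:.1%} less time per frame")
```

Under this conversion, a 90% FPS uplift corresponds to roughly 47% less time per frame, 80% to about 44%, and 40% to about 29%.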

Important caveat: those gains come from a Microsoft sample called D3D12RaytracingHelloShaderExecutionReordering—useful for stress testing but not a 1:1 proxy for shipping games. Still, the scale of improvement suggests that in scenes with heavy ray traffic—think path-traced global illumination, reflections, and complex materials—SER could be a potent lever.

Why Nvidia and Intel Benefit First From SER

Nvidia has offered hardware-level SER since its GeForce RTX 40-series GPUs, and the company has worked with studios to wring more coherence out of ray workloads in flagship titles. Intel’s Xe architecture also leans on robust thread scheduling and hardware-assisted sorting, positioning Arc discrete cards and upcoming Xe3 integrated GPUs to capitalize on SER through driver and API support. Reports from TechPowerUp, along with vendor developer briefings, corroborate that these vendors can tap into SER’s advantages right away.

The underlying theme is hardware and driver maturity. SER depends on quickly re-batching rays—if your GPU and driver can shuffle and synchronize threads efficiently, you reap larger dividends. That’s why the big wins so far skew toward newer architectures with sophisticated schedulers and ample RT acceleration.
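
That trade-off can be written down as a simple break-even model. The Python sketch below is an illustration only: the function name and every timing are made up, and it assumes a pass costs the coherent shading time plus a fixed reordering overhead.

```python
def net_speedup(divergent_ms, coherent_ms, reorder_ms):
    """Speedup of one pass with reordering on; values below 1.0 mean a slowdown."""
    return divergent_ms / (coherent_ms + reorder_ms)


# Made-up timings for a single shading pass, in milliseconds.
fast_scheduler = net_speedup(divergent_ms=10.0, coherent_ms=5.0, reorder_ms=0.5)
slow_scheduler = net_speedup(divergent_ms=10.0, coherent_ms=5.0, reorder_ms=4.0)
already_coherent = net_speedup(divergent_ms=5.2, coherent_ms=5.0, reorder_ms=0.5)

print(f"cheap reordering:       {fast_scheduler:.2f}x")
print(f"expensive reordering:   {slow_scheduler:.2f}x")
print(f"little divergence left: {already_coherent:.2f}x")
```

With these invented numbers, cheap reordering yields about 1.82x, expensive reordering about 1.11x, and a pass that was already mostly coherent drops to about 0.95x because the overhead outweighs the saving.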

What About AMD’s Support and Current SER Limitations

AMD has implemented the API, but its current Radeon drivers note that the RX 9000 series “supports API but doesn’t reorder,” signaling that today’s RDNA hardware won’t exploit SER’s dynamic reshuffling. Translation: SER-enabled code should run correctly on these cards, but without the big FPS spikes showcased on rival silicon. If AMD designs future GPUs around reordering, the door is open for similar gains once the hardware path exists.

This split isn’t new in graphics. Vendor-agnostic features often arrive before every architecture can accelerate them equally. The upside is that standardizing SER through DirectX makes it easier for engine teams to target one API rather than juggling vendor-specific extensions.

How Developers Can Use SER in DirectX and Games

Because it ships via the Agility SDK, SER can reach players through game updates without requiring a full OS upgrade. Microsoft’s DirectX sample demonstrates how to tag and sort rays within a DXR pipeline, giving teams a reference path to prototype inside their own engines. Expect the biggest wins in passes where incoherent rays dominate—multi-bounce GI, glossy reflections, and stochastic sampling techniques used by path tracing.
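
The tag-and-sort idea can be modeled outside a shader. The real sample is written in HLSL against the DXR API, so the Python sketch below is only a conceptual stand-in: `Hit`, `sort_key`, `reorder`, `shade`, and `HINT_BITS` are hypothetical names, and giving the hit group priority over the application hint is an assumption of this sketch. Each hit is tagged with a key built from the shader it will run plus a few application-supplied hint bits, the hits are sorted by that key, and shading runs afterwards.

```python
from dataclasses import dataclass

HINT_BITS = 2  # hypothetical: low bits of the key reserved for the app hint


@dataclass
class Hit:
    ray_id: int
    hit_group: int     # which material shader this hit would run
    bounces_left: int  # app-specific hint, here the remaining path depth


def sort_key(hit):
    """Pack hit group (high bits) and a clamped hint (low bits) into one key."""
    hint = min(hit.bounces_left, (1 << HINT_BITS) - 1)
    return (hit.hit_group << HINT_BITS) | hint


def reorder(hits):
    """Stand-in for the reorder step: group hits that will run similar work."""
    return sorted(hits, key=sort_key)


def shade(hit):
    """Placeholder shading: report which material shader handled the ray."""
    return (hit.ray_id, f"material_{hit.hit_group}")


hits = [Hit(0, 2, 1), Hit(1, 0, 3), Hit(2, 2, 0), Hit(3, 1, 1), Hit(4, 0, 1)]

ordered = reorder(hits)
order = [h.ray_id for h in ordered]
results = dict(shade(h) for h in ordered)

print("execution order:", order)
print("per-ray results:", results)
```

Reordering changes only the order in which hits are shaded, not the result for any ray, which is why the same code path can still run correctly on hardware that ignores the reorder request.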

Studios will still need to profile aggressively. SER won’t help every scene equally, and its benefits stack best alongside modern upscalers and frame generation. For example, combining SER with temporal upscaling and ray-tracing denoisers could let developers push higher-quality lighting within the same frame budget, or hold image quality steady while raising FPS.

What It Means for Players on Nvidia and Intel GPUs

If adopted, SER could make path-traced modes feel less like tech demos and more like viable presets—especially on high-end Nvidia and new Intel GPUs. Don’t expect a blanket 90% jump across your library; think selective boosts in the most ray-heavy scenes. As engines integrate the feature and vendors refine drivers, those improvements should stabilize and, in some cases, compound with AI-based reconstruction and frame generation.

The bottom line: Microsoft just handed developers a pragmatic tool to squeeze more performance out of ray tracing through smarter scheduling, not image shortcuts. That’s good news for anyone who wants better lighting and higher FPS to coexist—and a clear signal that the next wave of rendering gains will come as much from rethinking GPU work as from raw teraflops.