Speed Up Unreal Engine NNE Inference with NVIDIA TensorRT for RTX Runtime

2026-04-30 · Source: NVIDIA Technical Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Gaming & Interactive Media · Depth: Advanced, medium

Summary

Unreal Engine 5 (UE5) integrates neural network capabilities through its Neural Network Engine (NNE), an abstraction layer that unifies inference workloads across backends and supports both GPU and CPU runtimes. NVIDIA has released a new plugin, NNERuntimeTRT, which adds NVIDIA TensorRT for RTX as an NNE runtime option designed for efficient inference on NVIDIA RTX GPUs. TensorRT for RTX uses a Just-In-Time (JIT) compiler to optimize AI models for the specific GPU they run on, which yields higher throughput than default execution providers such as DirectML. In a style transfer post-processing sample project, TensorRT for RTX completed the enqueue task in 3.8 ms on an NVIDIA GeForce RTX 5090 GPU at 1080p, compared with 5.7 ms for DirectML, a 1.5x speedup. The plugin supports both the synchronous CPU-to-GPU method and the asynchronous Render Dependency Graph (RDG) method, making it suitable for AI applications in rendering, animation, language, and speech.

Key takeaway

AI engineers building real-time graphics applications in Unreal Engine 5 on NVIDIA RTX GPUs should consider adding the NNERuntimeTRT plugin. Enabling TensorRT for RTX as a runtime option requires updating the engine source, but the sample project shows the payoff: the style transfer post-processing enqueue ran 1.5x faster than with DirectML (3.8 ms versus 5.7 ms). The time saved per frame leaves room for more complex neural network features without lowering frame rates.
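The speedup figure and its effect on frame rate can be checked with a little arithmetic. The sketch below uses only the two timings reported above; the 60 fps target and the helper names `speedup` and `frame_budget_share` are assumptions introduced for illustration, not values or functions from the original article.

```python
# Reported timings from the summary (GeForce RTX 5090, 1080p style transfer sample).
DIRECTML_MS = 5.7
TENSORRT_RTX_MS = 3.8

# Assumption for illustration only: a 60 fps target, which the source does not state.
TARGET_FPS = 60


def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """Ratio of baseline time to optimized time (values above 1 mean faster)."""
    return baseline_ms / optimized_ms


def frame_budget_share(task_ms: float, fps: float) -> float:
    """Fraction of one frame's time budget consumed by a task at the given fps."""
    frame_ms = 1000.0 / fps
    return task_ms / frame_ms


if __name__ == "__main__":
    print(f"speedup: {speedup(DIRECTML_MS, TENSORRT_RTX_MS):.2f}x")
    print(f"time saved per frame: {DIRECTML_MS - TENSORRT_RTX_MS:.1f} ms")
    for name, ms in (("DirectML", DIRECTML_MS), ("TensorRT for RTX", TENSORRT_RTX_MS)):
        share = frame_budget_share(ms, TARGET_FPS)
        print(f"{name}: {share:.1%} of a {TARGET_FPS} fps frame")
```

Under the assumed 60 fps target, the enqueue task drops from roughly a third of the frame budget to under a quarter, which is the headroom the takeaway refers to.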

Key insights

NVIDIA TensorRT for RTX speeds up neural network inference in Unreal Engine 5 on RTX GPUs; the style transfer sample ran 1.5x faster than with DirectML.

Principles

JIT compilation optimizes AI models for the specific GPU hardware they run on.
The RDG method aligns AI inference with frame rendering for real-time graphics.

Method

Integrate the NNERuntimeTRT plugin into Unreal Engine 5 by modifying `neuralprofile.h` and `neuralprofile.cpp` to include `NNERuntimeTRT` in runtime lists, then build the engine and deploy the plugin.

In practice

Use NNERuntimeTRT for AI post-processing in UE5 on RTX GPUs; the sample project measured a 1.5x speedup over DirectML.
Resize ONNX style transfer models to 1x3x720x720 to avoid tiling overhead.
Profile performance with Unreal Insights to compare runtimes.

Topics

Unreal Engine NNE
NVIDIA TensorRT for RTX
GPU Inference Optimization
Neural Post-Processing
DirectML Performance

Best for: AI Engineer, Computer Vision Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by NVIDIA Technical Blog.