The UL Procyon® AI Inference Benchmark for Windows features several AI inference engines from different vendors.

ONNX Runtime with DirectML

The ONNX Runtime is an open-source inference engine that runs models in the ONNX format. ONNX (Open Neural Network Exchange) is an open standard for representing machine learning models, backed by many of the major organizations developing AI software and hardware.

The implementation of ONNX Runtime in Procyon uses DirectML, a low-level DirectX 12 library suited to high-performance, low-latency applications such as frameworks, games, and machine learning inference workloads.
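
As an illustration only, the sketch below loads an ONNX model with ONNX Runtime's Python API and requests the DirectML execution provider. The model path, input name, and tensor shape are placeholders; this is not the benchmark's own code and assumes the onnxruntime-directml package is installed.

```python
# Minimal sketch: running an ONNX model through ONNX Runtime's
# DirectML execution provider, with a CPU fallback.
import numpy as np
import onnxruntime as ort

# Request DirectML first; ONNX Runtime falls back to the CPU provider
# if DirectML is not available on this machine.
session = ort.InferenceSession(
    "model.onnx",  # placeholder model path
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)

input_name = session.get_inputs()[0].name
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)  # example NCHW tensor

# run() returns a list with one array per model output.
outputs = session.run(None, {input_name: dummy_input})
print(outputs[0].shape)
```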

Intel® OpenVINO™

Intel’s distribution of the OpenVINO toolkit (OpenVINO) is an open-source toolkit for optimizing and deploying AI inference on Intel hardware. OpenVINO enables developers to take neural networks trained in popular deep learning frameworks, use them through a standard API, and deploy them across various Intel hardware such as CPUs, GPUs, and VPUs.

Intel provides tools for optimizing models to improve inference performance on supported hardware, along with features such as automatic device discovery, load balancing, and dynamic inference parallelism across different processors.
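
For illustration, the sketch below shows a typical OpenVINO Python workflow: discovering devices, compiling a model, and running inference with the "AUTO" device selection. It assumes a recent OpenVINO release and uses placeholder file paths and input shapes; it is not taken from the benchmark itself.

```python
# Minimal sketch: compiling and running a model with OpenVINO's Python API.
import numpy as np
import openvino as ov

core = ov.Core()
print(core.available_devices)         # devices discovered on this machine, e.g. ['CPU', 'GPU']

model = core.read_model("model.xml")  # placeholder IR file; ONNX files can also be read directly
compiled = core.compile_model(model, "AUTO")  # let OpenVINO pick a suitable device

dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)  # example NCHW tensor
result = compiled([dummy_input])[compiled.output(0)]
print(result.shape)
```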

NVIDIA® TensorRT™

NVIDIA TensorRT is an SDK designed for high-performance inference on NVIDIA hardware. TensorRT takes a trained network and produces an optimized runtime engine from it. The SDK combines an inference optimizer with an execution runtime, applying NVIDIA’s optimization techniques and taking advantage of hardware features such as Tensor Cores.
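
As a rough illustration of that workflow, the sketch below uses the TensorRT Python API to parse a trained ONNX model and build a serialized, optimized engine. File paths are placeholders and exact flags vary between TensorRT versions; this is not the benchmark's own build code.

```python
# Minimal sketch: building an optimized TensorRT engine from an ONNX model.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
# Explicit-batch networks are required by the ONNX parser on older
# TensorRT releases; newer releases use explicit batch by default.
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)

# Parse the trained network from its ONNX representation.
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:  # placeholder model path
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

# Build an optimized, serialized engine; FP16 lets TensorRT use
# Tensor Cores on GPUs that support them.
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)
engine_bytes = builder.build_serialized_network(network, config)

with open("model.engine", "wb") as f:  # placeholder output path
    f.write(engine_bytes)
```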