The UL Procyon® AI Inference Benchmark for Windows measures the machine learning inference performance of Windows devices using common machine-vision tasks such as image classification, image segmentation, object detection, and super-resolution. These tasks are executed using a range of popular, state-of-the-art neural networks and can run on the device’s CPU, GPU, or a dedicated AI accelerator, allowing performance comparisons across hardware.
MobileNet V3 is a compact visual recognition model created specifically for mobile devices. The benchmark uses MobileNet V3 to identify the subject of an image, taking an image as input and outputting a list of probabilities for the content in the image. The benchmark uses the large minimalistic variant of MobileNet V3.
Inception V4 is a state-of-the-art model for image classification tasks. Designed for accuracy, it is a much wider and deeper model than MobileNet. The benchmark uses Inception V4 to identify the subject of an image, taking an image as input and outputting a list of probabilities for the content identified in the image.
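Both classification workloads above return a list of per-class probabilities. As a minimal sketch of how that list is produced, the raw network outputs (logits) are typically passed through a softmax; the labels and values here are hypothetical, not taken from the benchmark itself:

```python
import math

def softmax(logits):
    """Convert raw model outputs (logits) to probabilities summing to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three example classes.
labels = ["cat", "dog", "car"]
probs = softmax([2.0, 1.0, 0.1])
best = max(zip(labels, probs), key=lambda p: p[1])
print(best)  # the most likely class and its probability
```

The highest-probability entry is what a classification benchmark reports as the predicted subject of the image.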
YOLO, which stands for You Only Look Once, is an object detection model that aims to identify the location of objects in an image. The benchmark uses YOLO V3 to produce bounding boxes around objects, each with a confidence score for the detection.
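Detectors in the YOLO family emit many candidate boxes, which are then filtered by confidence and de-duplicated with non-maximum suppression (NMS). A minimal sketch of that post-processing step, with hypothetical detections and thresholds:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(detections, conf_thresh=0.5, iou_thresh=0.45):
    """Keep high-confidence boxes, dropping near-duplicates of stronger ones."""
    boxes = sorted((d for d in detections if d["conf"] >= conf_thresh),
                   key=lambda d: d["conf"], reverse=True)
    kept = []
    for d in boxes:
        if all(iou(d["box"], k["box"]) < iou_thresh for k in kept):
            kept.append(d)
    return kept

# Hypothetical raw detections: two overlapping "dog" boxes and a weak "cat".
detections = [
    {"box": (10, 10, 100, 100), "conf": 0.9, "label": "dog"},
    {"box": (12, 12, 98, 98),   "conf": 0.8, "label": "dog"},
    {"box": (200, 200, 260, 260), "conf": 0.3, "label": "cat"},
]
kept = nms(detections)
print([d["label"] for d in kept])  # duplicates and weak boxes removed
```

The surviving boxes, with their labels and confidence scores, are the final detections a workload like this reports.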
DeepLab is an image segmentation model that aims to cluster the pixels of an image that belong to the same object class. Semantic image segmentation labels each region of the image with a class of object. The benchmark's DeepLab model uses MobileNet V2 for feature extraction, enabling fast inference with little loss of quality compared with larger backbones.
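The output of semantic segmentation is a label map: for each pixel, the class with the highest score. A minimal sketch with a hypothetical 2x2 image and two classes:

```python
def label_map(scores):
    """scores[y][x] is a list of per-class scores for one pixel;
    return the index of the highest-scoring class at each pixel."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row]
            for row in scores]

# Hypothetical per-pixel scores, two classes: 0 = background, 1 = person.
scores = [[[0.9, 0.1], [0.2, 0.8]],
          [[0.7, 0.3], [0.1, 0.9]]]
print(label_map(scores))  # [[0, 1], [0, 1]]
```

In practice the class scores come from the network itself; this per-pixel argmax is only the final labeling step.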
ResNet 50 is an image classification model that introduced residual blocks, a novel way of adding more convolutional layers while keeping the network trainable. Its release made it practical to train far deeper networks than was previously possible. The benchmark uses ResNet 50 to identify image subjects, outputting a list of probabilities for the content identified in the image.
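The core idea of a residual block is a skip connection: the block's output is the transformed input added back to the input itself, so each layer only has to learn a correction to what it receives. A toy sketch (plain Python in place of real convolutions):

```python
def relu(v):
    return [max(0.0, x) for x in v]

def residual_block(x, transform):
    """y = relu(f(x) + x): the skip connection adds the input back to the
    transformed output, so the layer learns only a residual correction."""
    return relu([t + xi for t, xi in zip(transform(x), x)])

# Hypothetical transform whose weights are near zero: the block still
# passes the input through almost unchanged, which is what makes very
# deep stacks of such blocks trainable.
out = residual_block([1.0, -2.0, 3.0], lambda v: [0.0] * len(v))
print(out)  # [1.0, 0.0, 3.0]
```

Because an all-zero transform leaves the input intact (up to the ReLU), gradients can flow through many stacked blocks without vanishing.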
Real-ESRGAN is a super-resolution model trained on synthetic data for increasing the resolution of an image, reconstructing a higher-resolution image from a lower-resolution counterpart. The model used in the benchmark is the general image variant of Real-ESRGAN, and upscales a 250x250-pixel image to 1000x1000, a 4x increase in each dimension.
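The 250x250 to 1000x1000 upscale is a 4x increase per dimension. A naive nearest-neighbour upscaler illustrates just that geometry; Real-ESRGAN itself instead reconstructs plausible high-frequency detail, which this sketch does not attempt:

```python
def upscale_nearest(image, factor=4):
    """Nearest-neighbour upscaling: each source pixel becomes a
    factor x factor block in the output. This shows only the size
    relationship (e.g. 250x250 -> 1000x1000 at 4x), not the learned
    detail reconstruction a super-resolution model performs."""
    return [[image[y // factor][x // factor]
             for x in range(len(image[0]) * factor)]
            for y in range(len(image) * factor)]

src = [[1, 2], [3, 4]]          # a tiny 2x2 stand-in for the real input
dst = upscale_nearest(src, 4)   # 8x8, as 1000x1000 relates to 250x250
print(len(dst), len(dst[0]))    # 8 8
```

The difference in output quality between this blocky baseline and a learned model is exactly what a super-resolution workload exercises.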