In some regions, UL Procyon cannot automatically download the required AI models. In these cases, users will have to manually add the models themselves.
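One way to fetch the repositories listed below is with the huggingface_hub Python package (pip install huggingface_hub) on a machine that can reach Hugging Face, then copying the files over. The snippet below is only a rough sketch of that idea, not part of UL Procyon; the repository ID and download directory are examples and should be adjusted to the model you need.

from huggingface_hub import snapshot_download

# Example: fetch one of the repositories listed in the tables below.
# Run this on a machine with Hugging Face access, then copy the files
# into the Procyon folders described under "Installing the models".
path = snapshot_download(
    repo_id="stabilityai/stable-diffusion-xl-base-1.0",  # any HFID from this page
    local_dir="downloads/stabilityai/stable-diffusion-xl-base-1.0",
)
print("Downloaded to:", path)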
Non-converted Pytorch models
Stable Diffusion 1.5
HFID | nmkd/stable-diffusion-1.5-fp16 |
Link | https://huggingface.co/nmkd/stable-diffusion-1.5-fp16/tree/main |
Variant | Pytorch fp16 (safetensors) |
Use | Used in all engines. Conversion is run locally. |
Stable Diffusion XL
HFID | stabilityai/stable-diffusion-xl-base-1.0 |
Link | https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0 |
Variant | Pytorch fp16 (safetensors) |
Use | Used for TensorRT and OpenVINO. Conversion is run locally. Olive UNET conversion for SDXL is very heavy, so we have opted to use an already converted model (see Converted Olive-optimized ONNX models below). |
HFID | madebyollin/sdxl-vae-fp16-fix |
Link | https://huggingface.co/madebyollin/sdxl-vae-fp16-fix |
Variant | fp16 (safetensors) |
Use | Used for all engines. Replaces the Olive Optimized model for ONNX Runtime with DirectML. Conversion is run locally. |
Converted Olive-optimized ONNX models
Stable Diffusion XL
HFID | greentree/SDXL-olive-optimized |
Link | https://huggingface.co/greentree/SDXL-olive-optimized/tree/main |
Variant | ONNX Olive Optimized (ONNX) |
Use | Used for ONNX Runtime with DirectML. No conversion is run. |
Converted AMD-optimized ONNX models
Stable Diffusion 1.5
HFID | amd/stable-diffusion-1.5_io16_amdgpu |
Link | https://huggingface.co/amd/stable-diffusion-1.5_io16_amdgpu |
Variant | AMD-optimized (ONNX) |
Use | Used for ONNX Runtime with DirectML. No conversion is run. |
Stable Diffusion XL
HFID | amd/stable-diffusion-xl-1.0_io16_amdgpu |
Link | https://huggingface.co/amd/stable-diffusion-xl-1.0_io16_amdgpu |
Variant | AMD-optimized (ONNX) |
Use | Used for ONNX Runtime with DirectML. No conversion is run. |
Quantized OpenVINO models
Stable Diffusion 1.5
HFID | intel/sd-1.5-square-quantized |
Link | https://huggingface.co/Intel/sd-1.5-square-quantized/tree/main/INT8 |
Variant | Int8 Quantized OVIR |
Use | Used for OpenVINO Runtime with int8 precision. No conversion is run for these models. Requires the full SD15 fp16 pytorch models for converting the Text Encoder and VAE. |
Files | INT8/time_proj_constants.npy, INT8/time_proj_constants.raw, INT8/unet_int8.bin, INT8/unet_int8.xml, INT8/unet_time_proj.bin, INT8/unet_time_proj.xml |
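To confirm that these files ended up where the benchmark expects them (see Installing the models below), a quick check along the following lines can be used. The base path assumes the default installation directory and is only an illustration; adjust it if Procyon is installed elsewhere.

import os

# Default install location; adjust if Procyon is installed elsewhere.
unet_dir = os.path.expandvars(
    r"%ProgramData%\UL\Procyon\chops\dlc\ai-imagegeneration-benchmark"
    r"\models\ovir\intel\sd-1.5-square-quantized\unet"
)
required = [
    "time_proj_constants.npy",
    "time_proj_constants.raw",
    "unet_int8.bin",
    "unet_int8.xml",
    "unet_time_proj.bin",
    "unet_time_proj.xml",
]
missing = [f for f in required if not os.path.isfile(os.path.join(unet_dir, f))]
print("All OVIR UNet files present." if not missing else f"Missing files: {missing}")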
Quantized Qualcomm QNN models
Stable Diffusion 1.5
HFID | qualcomm/Stable-Diffusion-v1.5 |
Link | https://huggingface.co/qualcomm/Stable-Diffusion-v1.5 |
Variant | w8a16 Quantized QNN |
Use | Used for QNN Runtime with int8 precision. No conversion is run for these models. Requires the UNET, tokenizer and scheduler config of the original SD15 fp16 pytorch model to be placed on disk as well. |
Files | TextEncoder_Quantized.bin, UNet_Quantized.bin, VAEDecoder_Quantized.bin |
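The extra Pytorch pieces mentioned above (the UNET, tokenizer and scheduler config of the original SD 1.5 fp16 model) can be fetched selectively rather than downloading the whole repository. The sketch below uses huggingface_hub's allow_patterns for this; the chosen patterns and the destination (the non-converted Pytorch location from Installing the models) are assumptions and should be adapted to your setup.

import os
from huggingface_hub import snapshot_download

# Destination mirrors the non-converted Pytorch placement described below;
# adjust the base path and patterns to match what your run actually needs.
target = os.path.expandvars(
    r"%ProgramData%\UL\Procyon\chops\dlc\ai-imagegeneration-benchmark"
    r"\models\pytorch\nmkd\stable-diffusion-1.5-fp16"
)
snapshot_download(
    repo_id="nmkd/stable-diffusion-1.5-fp16",
    local_dir=target,
    # Assumed subfolders for the UNET, tokenizer and scheduler config:
    allow_patterns=["unet/*", "tokenizer/*", "scheduler/*", "model_index.json"],
)
print("Selected SD 1.5 files placed under", target)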
Installing the models
By default, the benchmark is installed in
%ProgramData%\UL\Procyon\chops\dlc\ai-imagegeneration-benchmark\
- If it does not already exist, create a subfolder named ‘models’ at:
%ProgramData%\UL\Procyon\chops\dlc\ai-imagegeneration-benchmark\
- In this ‘models’ folder, create the following subfolders based on the tests you are looking to run (a scripted sketch of this layout follows the list):
- For non-converted Pytorch models:
Create a subfolder 'pytorch' and place each full Pytorch model in it, using the model's HF ID in the folder structure, e.g.
...\ai-imagegeneration-benchmark\models\pytorch\nmkd\stable-diffusion-1.5-fp16\<each subfolder of the model>
Please note:
The first run of benchmarks using these models can take significantly longer, as the models need to be converted.
- For converted Olive Optimized ONNX models for ONNX Runtime with DirectML:
Create a subfolder ‘onnx_olive_optimized’ and place each full model in it, using the model’s HF ID in the folder structure, e.g.
...\ai-imagegeneration-benchmark\models\onnx_olive_optimized\greentree\SDXL-olive-optimized\<each subfolder of the model>
- For converted AMD Optimized ONNX models for ONNX Runtime with DirectML:
Create a subfolder ‘onnx_amd_optimized’ and place each full model in it, using the model’s HF ID in the folder structure, e.g.
...\ai-imagegeneration-benchmark\models\onnx_amd_optimized\amd\stable-diffusion-1.5_io16_amdgpu\<each subfolder of the model>
- For quantized OVIR models for OpenVINO Runtime:
Create a directory ‘ovir\<HF ID>\unet’ and place each part of the int8 model in it, e.g.
...\ai-imagegeneration-benchmark\models\ovir\intel\sd-1.5-square-quantized\unet\<each required unet part>
- For quantized QNN models for QNN Runtime:
Create a directory ‘qnn\<HF ID>\<submodel>’ for each submodel and place the corresponding model file in it, e.g.
...\ai-imagegeneration-benchmark\models\qnn\qualcomm\Stable-Diffusion-v1.5\<submodel>\<submodel>.bin
keeping the original names of the files:
...\text_encoder\TextEncoder_Quantized.bin
...\unet\UNet_Quantized.bin
...\vae_decoder\VAEDecoder_Quantized.bin
Follow the instructions in step (2.1), ‘For non-converted Pytorch models’, for the required Pytorch model files.
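For the runtimes that take a full model repository (the Pytorch, Olive-optimized and AMD-optimized folders above), the layout can also be prepared with a short script such as the sketch below. It is only an illustration using the HF IDs listed earlier; it assumes the default installation path and a working Hugging Face connection, and it does not cover the OVIR and QNN steps, which need specific files placed as described above.

import os
from huggingface_hub import snapshot_download

models_root = os.path.expandvars(
    r"%ProgramData%\UL\Procyon\chops\dlc\ai-imagegeneration-benchmark\models"
)

# Runtime subfolder -> Hugging Face repositories listed in this document.
layout = {
    "pytorch": [
        "nmkd/stable-diffusion-1.5-fp16",
        "stabilityai/stable-diffusion-xl-base-1.0",
        "madebyollin/sdxl-vae-fp16-fix",
    ],
    "onnx_olive_optimized": ["greentree/SDXL-olive-optimized"],
    "onnx_amd_optimized": [
        "amd/stable-diffusion-1.5_io16_amdgpu",
        "amd/stable-diffusion-xl-1.0_io16_amdgpu",
    ],
}

for subfolder, repo_ids in layout.items():
    for repo_id in repo_ids:
        # The HF ID becomes part of the folder structure, as described above.
        target = os.path.join(models_root, subfolder, *repo_id.split("/"))
        os.makedirs(target, exist_ok=True)
        snapshot_download(repo_id=repo_id, local_dir=target)
        print(f"Placed {repo_id} under {target}")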
Note:
Not all models for all engines need to be present in the installation directory at all times.
- For OpenVINO, only the OVIR models must exist.
- For ONNX Runtime-DirectML, only the Olive-optimized ONNX models must exist.
- For TensorRT, only the Engine created for the current settings (batch size, resolution) and hardware must exist. If these change, the Engine is regenerated from the CUDA-optimized ONNX models.