The Procyon AI Text Generation Benchmark includes several AI models with different LLM implementations. The benchmark attempts to run the test on every AI model, starting with the lightest model, Phi-3.5-mini, and moving on to progressively heavier models with higher system requirements. This allows a single test to compare the performance of a wide range of AI accelerators.
For each AI model, the benchmark runs seven prompts, a mixture of pre-generated RAG and non-RAG queries. After the benchmark has run, a result is generated for each AI model. AI models that could not run because the system did not meet their requirements receive a score of 0.
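Because a score of 0 marks an AI model that could not run rather than a measured result, it is usually worth separating those entries before comparing systems. The following is a minimal sketch of that idea. The `split_results` helper, the dictionary layout, the score value, and the second model name are all hypothetical placeholders for illustration, not the format of the Procyon results file.

```python
def split_results(scores):
    """Separate models that produced a score from models reported with 0.

    `scores` is a hypothetical mapping of model name -> overall score.
    """
    ran = {model: score for model, score in scores.items() if score > 0}
    not_run = [model for model, score in scores.items() if score == 0]
    return ran, not_run


# Illustrative values only; these are not real benchmark results.
example_scores = {
    "Phi-3.5-mini": 1000,
    "hypothetical-heavier-model": 0,
}

ran, not_run = split_results(example_scores)
print("Models with a score:", ran)
print("Models that could not run:", not_run)
```

Keeping the zero-score models in a separate list avoids averaging them in with real measurements when summarizing a system.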
Benchmark AI Prompts
The complete set of prompts is available for inspection in the benchmark's default installation directory:
C:\ProgramData\UL\Procyon\chops\dlc\ai-llm-benchmark\dataset\queries.xml
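One way to inspect the prompt file is to print an outline of its XML structure. The sketch below makes no assumption about the element names used in queries.xml. It walks whatever elements are present and shows each tag with a short preview of its text. The `outline` helper and the `max_text` parameter are illustrative names, and the path is the default location given above.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Default location of the prompt file, as given in the documentation.
QUERIES_PATH = Path(
    r"C:\ProgramData\UL\Procyon\chops\dlc\ai-llm-benchmark\dataset\queries.xml"
)


def outline(root, max_text=80):
    """Return one line per element: indentation, tag name, and a text preview."""
    lines = []

    def walk(elem, depth):
        # Collapse whitespace so multi-line prompt text fits on one line.
        text = " ".join((elem.text or "").split())
        preview = text[:max_text] + ("..." if len(text) > max_text else "")
        lines.append("  " * depth + elem.tag + (": " + preview if preview else ""))
        for child in elem:
            walk(child, depth + 1)

    walk(root, 0)
    return lines


if __name__ == "__main__":
    if QUERIES_PATH.exists():
        for line in outline(ET.parse(QUERIES_PATH).getroot()):
            print(line)
    else:
        print(f"Not found: {QUERIES_PATH}")
```

If the benchmark was installed to a non-default location, adjust `QUERIES_PATH` accordingly.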
Sub-scores for each prompt for each AI model are available in the results file.
The benchmark uses the same seven pre-generated prompts, covering different subject matters, for all AI models. Below are summarized versions of the prompts:
Prompt 1: Creative writing
Write a long poem in 200 lines about the capitals of all countries in Europe.
Prompt 2: Code review and optimization
Identify the given algorithm and give four suggestions to improve the following code’s performance.
Prompt 3: Identifying sentiment
Perform sentiment analysis on the provided text and assign one of the labels = {positive, negative, neutral}. Explain in detail and step-by-step why you assigned the specific label.
Prompt 4 (RAG Query): Document summarization
Give a detailed summary of Procyon AI benchmark suite in 500 words based on the provided context.
Prompt 5 (RAG Query): Document analysis
How can UL Benchmarks help retailers? Answer based on the context provided.
Prompt 6 (RAG Query): Document analysis and search
Give me an example computer performance score with Office Productivity Benchmark MP score in Level 2 system based on the provided context.
Prompt 7 (RAG Query): Document analysis and creating informative text
How can benchmarking save time and money for my organization? How to choose a reference benchmark score for RFPs? Summarize how to efficiently test the performance of PCs for Enterprise IT. Answer based on the context provided.