Ready to take your agents to the next level? This training is designed to help you evaluate the performance of your UiPath Studio Web agents using structured testing methods and Gen AI-enabled scoring. You’ll learn how to build evaluation sets, configure evaluators, and interpret performance metrics like pass rate and agent health score.
With a focus on reliability and continuous improvement, this training covers how to use the LLM-as-a-Judge, JSON Similarity, and Exact Match evaluator types to score your agent's responses, especially when Generative AI (Gen AI) is used to produce dynamic, human-like replies. Whether you're validating structured data, comparing outputs semantically, or setting performance thresholds, this course equips you to make your agents production-ready.
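To make the three evaluator types concrete, here is a minimal conceptual sketch in Python of how each one might score an agent response against an expected output. This is an illustration of the general techniques only, not UiPath's implementation; the function names, the 0-1 scoring scale, and the call_llm helper are assumptions.

```python
def exact_match_score(expected: str, actual: str) -> float:
    """Pass/fail: 1.0 only when the actual output matches the expected output exactly."""
    return 1.0 if expected.strip() == actual.strip() else 0.0


def json_similarity_score(expected: dict, actual: dict) -> float:
    """Fraction of expected key/value pairs that appear unchanged in the actual JSON output."""
    if not expected:
        return 1.0
    matched = sum(1 for key, value in expected.items() if actual.get(key) == value)
    return matched / len(expected)


def llm_as_judge_score(expected: str, actual: str) -> float:
    """Ask an LLM to rate semantic equivalence; the prompt and 0-100 scale are assumptions."""
    prompt = (
        "On a scale of 0 to 100, how well does the actual response convey the same "
        f"meaning as the expected response?\nExpected: {expected}\nActual: {actual}"
    )
    score = call_llm(prompt)  # hypothetical helper wrapping your Gen AI model of choice
    return float(score) / 100.0


# Example: a structured output scored with JSON Similarity
expected = {"ticket_id": "INC-1042", "priority": "High", "category": "Access"}
actual = {"ticket_id": "INC-1042", "priority": "High", "category": "Hardware"}
print(json_similarity_score(expected, actual))  # two of three fields match -> ~0.67
```

Exact Match suits deterministic outputs, JSON Similarity tolerates partial matches in structured data, and LLM-as-a-Judge handles free-form, Gen AI-generated text where wording varies but meaning should not.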
The training uses the Community Cloud version of UiPath Studio Web. Features and workflows demonstrated will be available in the Enterprise version by September 2025.
To benefit from this course, you should:
Understand agent development and prompt engineering
Be familiar with automation concepts (variables, control flow, arrays, etc.)
Have completed the Build your first agent with UiPath Studio Web and Agentic prompt engineering courses
This training is designed for developers and technical automation professionals building agents in UiPath Studio Web. It is ideal for those seeking to implement structured, AI-assisted evaluation processes to validate agent performance at scale.
This training covers the following key areas:
Defining and organizing evaluation sets for Studio Web agents (illustrated after this list)
Understanding evaluator types: LLM-as-a-Judge, Exact Match, and JSON Similarity
Setting up reusable evaluators and scoring logic
Evaluating agents built with Generative AI activities
Running and interpreting evaluations to improve accuracy and reliability
Applying best practices for test coverage and feedback loops
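As a rough picture of what an evaluation set can contain, the sketch below defines two test cases, each pairing an agent input with the expected output and the evaluator to apply. The field names and structure are assumptions for illustration; the actual layout of evaluation sets is configured inside Studio Web itself.

```python
# Hypothetical evaluation set: each case pairs an agent input with an expected
# output and the evaluator type to apply. Field names are illustrative only.
evaluation_set = [
    {
        "name": "Billing complaint is categorized correctly",
        "input": "I was charged twice for my subscription last month.",
        "expected_output": {"category": "Billing", "priority": "High"},
        "evaluator": "JSON Similarity",
    },
    {
        "name": "Out-of-scope request is politely declined",
        "input": "Can you book me a flight to Paris?",
        "expected_output": "A polite refusal explaining the agent only handles support tickets.",
        "evaluator": "LLM-as-a-Judge",
    },
]
```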
At the end of this training, you should be able to:
Design and configure evaluation sets to test agent performance
Select evaluator types based on the agent’s use case and output format
Apply LLM-as-a-Judge to assess natural language outputs using Gen AI
Compare structured outputs with JSON Similarity or Exact Match
Track evaluation metrics and iterate to improve agent accuracy
Practice evaluation with real agents using Studio Web