Agentic unit-test Generator
This skill leverages deep context analysis to generate comprehensive test suites automatically. It identifies edge cases...
This skill focuses on building robust evaluation frameworks for agent systems. Unlike traditional software, agents are dynamic and non-deterministic, and their tasks often have no single correct answer. The skill provides methods to evaluate agent performance, validate context engineering choices, measure improvements, and catch regressions before deployment. It supports building quality gates for agent pipelines, comparing agent configurations, and continuously evaluating production systems. The core idea is to judge agents on whether they reach the right outcome through a reasonable process, allowing for multiple valid paths to that outcome.
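As a rough illustration of the quality-gate idea, here is a minimal sketch in Python. It is not taken from the skill itself: the names `Trial`, `trial_passes`, `pass_rate`, and `quality_gate`, the step budget, and the allowed drop are all hypothetical. Because agents are non-deterministic, each configuration is run several times and judged on its pass rate, not on a single run, and a trial passes on its outcome regardless of which path the agent took, as long as the path stays within a step budget.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trial:
    """One run of an agent on a task (hypothetical record shape)."""
    outcome_ok: bool  # did the run reach an acceptable outcome?
    steps: int        # how many steps or tool calls the run took


def trial_passes(trial: Trial, max_steps: int = 20) -> bool:
    # Judge the outcome, not the exact path: any path within the
    # step budget counts as a reasonable process (assumed threshold).
    return trial.outcome_ok and trial.steps <= max_steps


def pass_rate(trials: list[Trial], max_steps: int = 20) -> float:
    # Fraction of repeated runs that passed.
    return mean(1.0 if trial_passes(t, max_steps) else 0.0 for t in trials)


def quality_gate(baseline: list[Trial], candidate: list[Trial],
                 max_drop: float = 0.05) -> bool:
    # The candidate clears the gate if its pass rate is no more than
    # max_drop below the baseline's pass rate.
    return pass_rate(candidate) >= pass_rate(baseline) - max_drop
```

A gate like this could run in a deployment pipeline, with the baseline trials coming from the currently deployed configuration. The step budget and the allowed drop are placeholders that would need tuning per task.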
Builds evaluation frameworks for agent systems, incorporating multi-dimensional rubrics, LLM-as-judge methodologies, and human evaluation to ensure quality and continuous improvement.
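A multi-dimensional rubric can be sketched as a weighted average over per-dimension scores. The example below assumes its own details: the dimension names, the weights, and the `rubric_score` helper are hypothetical and are not defined by the skill. Each score is assumed to lie in the range 0 to 1 and could come from an LLM judge or a human reviewer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    """One rubric dimension with its relative weight (hypothetical)."""
    name: str
    weight: float


# Illustrative rubric; real dimensions and weights depend on the agent's task.
RUBRIC = [
    Dimension("correctness", 0.5),
    Dimension("process", 0.3),
    Dimension("efficiency", 0.2),
]


def rubric_score(scores: dict[str, float],
                 rubric: list[Dimension] = RUBRIC) -> float:
    """Combine per-dimension scores (each in [0, 1]) into one weighted score."""
    missing = [d.name for d in rubric if d.name not in scores]
    if missing:
        raise ValueError(f"missing scores for dimensions: {missing}")
    total_weight = sum(d.weight for d in rubric)
    # Normalize by the total weight so the weights need not sum to 1.
    return sum(d.weight * scores[d.name] for d in rubric) / total_weight
```

Keeping the per-dimension scores alongside the combined number makes it easier to see which dimension caused a regression.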
Use this skill when testing agent performance, validating context engineering, measuring improvements, catching regressions, comparing configurations, and evaluating production systems.
Copy SKILL.md to your skills directory