DevSite Eval
Elevate the quality of your AI features by comprehensively evaluating their underlying LLM outputs
How DevSite Eval Works
Understand the inputs to get the best out of your evaluations.

DevSite Eval automates the assessment of LLM-generated content by comparing it against your specific criteria.

To get started, you'll need:

  • Evaluation Instructions: Clearly define the standards for a "Good", "OK", or "Bad" response. Be specific about what to look for (e.g., helpfulness, conciseness, accuracy).
  • CSV File: A file with at least two columns; column names are matched case-insensitively (see the sketch after this list):
    • prompt: The input query or task given to the LLM.
    • output: The LLM's generated response to that prompt.
    (Additional columns in your CSV will be preserved but are not used in the core evaluation process.)
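
The sketch below shows one way to assemble such a file in Python. The file name, sample rows, and the instruction text are illustrative assumptions; only the `prompt` and `output` column names come from the requirements above.

```python
# Minimal sketch: preparing an input CSV for DevSite Eval.
# The "prompt" and "output" column names are required by the tool;
# the file path, rows, and instruction text are illustrative only.
import csv

evaluation_instructions = (
    "Good: the response answers the prompt accurately and concisely. "
    "OK: the response is accurate but verbose or partially incomplete. "
    "Bad: the response is inaccurate, off-topic, or unhelpful."
)

rows = [
    {"prompt": "How do I reset my API key?",
     "output": "Go to Settings > API Keys, click Regenerate, and confirm."},
    {"prompt": "Summarize our refund policy in one sentence.",
     "output": "Refunds are available within 30 days of purchase."},
]

with open("eval_batch.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["prompt", "output"])
    writer.writeheader()
    writer.writerows(rows)
```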

The tool then evaluates each row, assigning a rating and a reason, and produces an overall summary of the evaluation batch.
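
DevSite Eval's exact output schema is not specified here; assuming one rating and reason per row plus a batch-level summary, the results could be represented roughly as follows. All field names below are assumptions for illustration.

```python
# Hypothetical shape of per-item results and the batch summary; these field
# names are assumptions for illustration, not DevSite Eval's actual schema.
from dataclasses import dataclass

@dataclass
class ItemResult:
    prompt: str   # the original input query
    output: str   # the LLM response that was evaluated
    rating: str   # e.g. "Good", "OK", or "Bad"
    reason: str   # explanation of why the rating was assigned

@dataclass
class BatchSummary:
    total: int              # number of rows evaluated
    counts: dict[str, int]  # rating -> number of items with that rating
    summary: str            # overall narrative summary of the batch
```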