Run Evals while developing, in your CI/CD pipeline, and in production
A library of 50+ evaluation metrics for your entire pipeline
All evaluators can be run programmatically using our SDK, or automatically using our SaaS platform.
Which spaceship was first to land on the moon
Neil Armstrong was the first astronaut to land on the moon in 1969
You are an expert...
The first spaceship to land on the moon in 1969 carried astronauts Neil Armstrong and Buzz...
Bring your own eval
Create your own evaluator on Athina in a matter of seconds, using either an LLM prompt or a custom function.
Describe the prompt you would like to use for the evaluator:
Response must directly address the user's query
Response: {response}
User Query: {query}
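The two options above (an LLM prompt or a custom function) can be sketched in plain Python. This is a minimal illustration and does not use the Athina SDK: `llm_eval`, `function_eval`, and the `judge` callable are hypothetical names, and the colons and the closing PASS/FAIL instruction in the template are assumptions added so the verdict can be parsed.

```python
from typing import Callable, Dict

# Grading prompt built from the form fields shown above.
# The final "Answer with PASS or FAIL." line is an assumption, not part of the page.
EVAL_PROMPT = (
    "Response must directly address the user's query\n"
    "Response: {response}\n"
    "User Query: {query}\n"
    "Answer with PASS or FAIL."
)


def llm_eval(query: str, response: str, judge: Callable[[str], str]) -> Dict[str, object]:
    """LLM-based eval: fill the template and ask a judge model for a verdict.

    `judge` is a hypothetical callable (prompt text in, model reply out);
    plug in whichever model client you use.
    """
    prompt = EVAL_PROMPT.format(response=response, query=query)
    verdict = judge(prompt).strip().upper()
    return {"passed": verdict.startswith("PASS"), "prompt": prompt}


def function_eval(query: str, response: str) -> Dict[str, object]:
    """Custom-function eval: a crude keyword-overlap heuristic, for illustration only."""
    query_terms = {w.lower().strip("?.,") for w in query.split() if len(w) > 3}
    response_terms = {w.lower().strip("?.,") for w in response.split()}
    overlap = query_terms & response_terms
    return {"passed": len(overlap) >= 2, "overlap": sorted(overlap)}


def fake_judge(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return "FAIL"
```

On the sample pair from this page, `function_eval` passes because the two strings share "first", "land", and "moon", even though the response names an astronaut rather than a spaceship. A judge model given the filled-in prompt is better placed to catch that kind of mismatch, which is one reason to offer both evaluator types.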
Everything you need to know about Evals.
Can't find the answer you're looking for? Feel free to contact us.
Why do LLM evals work?
Do all evals require a labeled dataset?
Can I create custom evaluators?
Which models can I use as evaluators?
How can I use evals in my CI/CD pipeline?
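One common pattern for the CI/CD question is to run an evaluator over a fixed dataset and fail the build when the pass rate drops below a threshold. The sketch below is only an illustration of that pattern and does not use the Athina SDK or platform. `pass_rate`, `gate`, and `non_empty` are hypothetical helpers, and the `query`/`response` row keys are assumed.

```python
from typing import Callable, Dict, Iterable


def pass_rate(
    rows: Iterable[Dict[str, str]],
    evaluator: Callable[[str, str], bool],
) -> float:
    """Return the fraction of rows for which the evaluator returns True."""
    rows = list(rows)
    if not rows:
        raise ValueError("no rows to evaluate")
    passed = sum(1 for row in rows if evaluator(row["query"], row["response"]))
    return passed / len(rows)


def gate(rate: float, threshold: float) -> int:
    """Map a pass rate to a process exit code: 0 passes the build, 1 fails it."""
    return 0 if rate >= threshold else 1


def non_empty(query: str, response: str) -> bool:
    """Toy evaluator (assumption): a response passes if it is not blank."""
    return bool(response.strip())
```

In a pipeline step, a script would typically end with `sys.exit(gate(pass_rate(rows, evaluator), threshold))` so that the CI runner treats a non-zero exit code as a failed check.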