Show HN: E2E Testing for Chatbots


2 points by TarekOraby a day ago

Hi HN,

Tired of shipping chatbot features based on gut feelings and "it seems to work" manual testing? We built SigmaEval, an open-source Python library that brings statistical rigor to testing conversational AI.

Instead of simple pass/fail checks, SigmaEval uses an AI User Simulator and an AI Judge to let you make data-driven statements like: "We're confident that at least 90% of user issues will be resolved with a quality score of 8/10 or higher." This allows you to set and enforce objective quality bars for your AI's behavior, response latency, and more, directly within your existing Pytest/Unittest suites.
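To make the statistical claim concrete, here is a minimal sketch of the kind of check involved. It is not SigmaEval's actual API: the helper and test names are hypothetical, and it uses scipy to compute a one-sided Clopper-Pearson lower confidence bound on the proportion of simulated conversations that meet a judge-score threshold, asserted inside an ordinary pytest test.

    from scipy.stats import beta

    def proportion_lower_bound(successes: int, trials: int, confidence: float = 0.95) -> float:
        """One-sided Clopper-Pearson lower bound on a binomial proportion."""
        if successes == 0:
            return 0.0
        return float(beta.ppf(1 - confidence, successes, trials - successes + 1))

    def test_resolution_quality_bar():
        # In a real run these counts would come from the AI User Simulator and
        # AI Judge; they are hard-coded here purely for illustration: 49 of 50
        # simulated conversations received a judge score of 8/10 or higher.
        passing, total = 49, 50
        bound = proportion_lower_bound(passing, total, confidence=0.95)
        # The 95% lower bound works out to just above 0.90, so the "at least 90%
        # of issues resolved at 8/10 or better" quality bar passes.
        assert bound >= 0.90, f"lower bound {bound:.3f} is below the 0.90 target"

SigmaEval handles the simulation, judging, and statistics for you; see the docs linked below for the real interface.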

It's built on top of LiteLLM to support 100+ LLM providers and is licensed under Apache 2.0. We just launched and would love to get your feedback.

GitHub: https://github.com/Itura-AI/SigmaEval

Docs: https://docs.sigmaeval.com/