Memory layer for your LLM evaluations.
TestRelic's DeepEval pytest plugin captures every eval run — metrics, scores, reasons, prompts — into shared application memory. So your AI coding agent doesn't just have a harness to run evals; it has the team's eval history as context. Drop-in for any DeepEval test suite.
An AI agent harness for testing is the runtime, memory, and tool layer that turns a general-purpose LLM into a senior testing engineer for your specific app. TestRelic provides the memory leg. DeepEval provides the eval runner. Cursor, Claude Code, Copilot, and Codex are the AI coding agents that read from that memory over MCP.
Quickstart
Three lines from pip to first eval upload.
No conftest changes. No SDK init in your test files. Drop the plugin in, log in, and run your existing DeepEval suite.
pip install testrelic-deepeval
testrelic login
deepeval test run tests/
pip install testrelic-deepeval: Pulls the pytest plugin from PyPI. No conftest plumbing required: the entry point auto-attaches when both DeepEval and testrelic-deepeval are present.
testrelic login: Browser-based login flow that writes your API key to ~/.testrelic/credentials. One-time setup; CI uses TESTRELIC_API_KEY instead.
deepeval test run tests/: Your existing DeepEval suite runs as-is. The TestRelic plugin auto-captures the TestRun and uploads metrics, cases, scores, and reasons to TestRelic.
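To confirm that the plugin has attached after installation, one option is to list the pytest plugins registered in the current environment. The sketch below assumes testrelic-deepeval registers under pytest's standard `pytest11` entry-point group, which this page does not state explicitly. `registered_pytest_plugins` is a hypothetical helper name, not part of either package.

```python
import sys
from importlib.metadata import entry_points


def registered_pytest_plugins():
    """Return the sorted names of pytest plugins installed via entry points.

    Assumption: the TestRelic plugin uses the conventional "pytest11" group.
    """
    if sys.version_info >= (3, 10):
        # Python 3.10+ supports selecting a group by keyword.
        eps = entry_points(group="pytest11")
    else:
        # Older versions return a dict keyed by group name.
        eps = entry_points().get("pytest11", [])
    return sorted(ep.name for ep in eps)


if __name__ == "__main__":
    for name in registered_pytest_plugins():
        print(name)
```

If the assumption holds, an entry for the TestRelic plugin should appear in the printed list once `pip install testrelic-deepeval` has run; the exact entry name is not given here.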
What the memory looks like
Every eval run, captured and queryable.
Metrics, scores, thresholds, reasons, and evaluation models — uploaded automatically, served back to your AI agent on demand.
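As a rough mental model of what gets captured, the fields named above can be pictured as a small record per metric. This is a minimal sketch, not TestRelic's actual schema: the class names, field names, and sample values are all hypothetical, and it assumes a metric passes when its score meets its threshold.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional, Tuple


@dataclass
class MetricResult:
    # Hypothetical field names mirroring the captured data listed above.
    name: str
    score: float
    threshold: float
    reason: str
    evaluation_model: str

    @property
    def passed(self) -> bool:
        # Assumption: a metric passes when score >= threshold.
        return self.score >= self.threshold


@dataclass
class EvalCase:
    test_name: str
    metrics: List[MetricResult] = field(default_factory=list)


def average_score(cases: List[EvalCase]) -> Optional[float]:
    """Mean score across every metric of every case, rounded to 2 places."""
    scores = [m.score for c in cases for m in c.metrics]
    return round(mean(scores), 2) if scores else None


def failing_metrics(cases: List[EvalCase]) -> List[Tuple[str, str, str]]:
    """(test name, metric name, reason) for each metric below its threshold."""
    return [
        (c.test_name, m.name, m.reason)
        for c in cases
        for m in c.metrics
        if not m.passed
    ]


# Illustrative data only; not taken from a real eval run.
cases = [
    EvalCase("test_refund_policy", [
        MetricResult("Answer Relevancy", 0.9, 0.7, "On topic.", "example-model"),
        MetricResult("Faithfulness", 0.6, 0.7, "One unsupported claim.", "example-model"),
    ]),
]
```

With records shaped like this, questions such as "which metrics failed and why" reduce to a filter over stored reasons, which is the kind of history an agent can use as context.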
Eval run dashboard for deepeval test run tests/test_chat_assistant.py: 248 cases evaluated, 6 metrics tracked, 0.81 average score.
Eval-run memory queryable from Cursor and Claude Code over MCP.
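For readers unfamiliar with MCP, an agent's query arrives as a JSON-RPC 2.0 request whose method is `tools/call`. The sketch below builds such a request. The tool name `get_eval_runs` and its arguments are hypothetical placeholders; the tools TestRelic actually exposes are not listed on this page.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool_name: str,
                    arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    1,
    "get_eval_runs",
    {"test_file": "tests/test_chat_assistant.py", "limit": 5},
)

if __name__ == "__main__":
    print(json.dumps(request, indent=2))
```

In practice Cursor or Claude Code constructs these messages itself; the example only shows the shape of the traffic.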
Comparisons
Where TestRelic fits in your eval stack.
Memory layer, agent harness, DeepEval, Confident AI — what does what.
Give your agent the team's eval history.
Drop-in for any DeepEval test suite — every metric, score, and reason lands in shared application memory your AI coding agent reads over MCP.
Disabled by default without an API key — if TESTRELIC_API_KEY is unset the plugin is a no-op, so your DeepEval suite never fails because of a missing TestRelic credential.
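The no-op behavior can be pictured as a guard in front of the upload step. This is a sketch of the idea under stated assumptions, not the plugin's actual code: `upload_if_configured` and the `upload` callback are hypothetical names, and only the TESTRELIC_API_KEY environment variable is considered.

```python
import os
from typing import Any, Callable, Mapping, Optional


def upload_if_configured(
    run: Any,
    upload: Callable[[Any, str], None],
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Upload `run` only when TESTRELIC_API_KEY is set; otherwise do nothing.

    Returns True if the upload callback was invoked, False for the no-op path.
    """
    env = os.environ if env is None else env
    api_key = env.get("TESTRELIC_API_KEY", "").strip()
    if not api_key:
        # No credential: skip silently so the test suite is unaffected.
        return False
    upload(run, api_key)
    return True


if __name__ == "__main__":
    sent = []
    # Hypothetical uploader that just records what it was given.
    record = lambda run, key: sent.append((run, key))
    print(upload_if_configured({"cases": 3}, record, env={}))
    print(upload_if_configured({"cases": 3}, record,
                               env={"TESTRELIC_API_KEY": "example-key"}))
```

Passing `env` explicitly keeps the guard easy to exercise in tests without touching the real environment.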