
Set Up DeepEval

Installing DeepEval

DeepEval is an LLM evaluation framework. This section walks through installing it and running your first evaluation.

Start by installing DeepEval using pip:

pip install -U deepeval

Write your first test

Let's evaluate the correctness of an LLM output using GEval, a metric that uses an LLM as a judge to score the output against criteria you define.

test_app.py
from deepeval import evaluate
from deepeval.test_case import LLMTestCase, SingleTurnParams
from deepeval.metrics import GEval

correctness_metric = GEval(
    name="Correctness",
    criteria="Determine if the 'actual output' is correct based on the 'expected output'.",
    evaluation_params=[SingleTurnParams.ACTUAL_OUTPUT, SingleTurnParams.EXPECTED_OUTPUT],
    threshold=0.5
)

test_case = LLMTestCase(
    input="I have a persistent cough and fever. Should I be worried?",
    # Replace this with the actual output from your LLM application
    actual_output="A persistent cough and fever could signal various illnesses, from minor infections to more serious conditions like pneumonia or COVID-19. It's advisable to seek medical attention if symptoms worsen, persist beyond a few days, or if you experience difficulty breathing, chest pain, or other concerning signs.",
    expected_output="A persistent cough and fever could indicate a range of illnesses, from a mild viral infection to more serious conditions like pneumonia or COVID-19. You should seek medical attention if your symptoms worsen, persist for more than a few days, or are accompanied by difficulty breathing, chest pain, or other concerning signs."
)

evaluate([test_case], [correctness_metric])

To run your first evaluation, enter the following command in your terminal:

deepeval test run test_app.py

Congratulations! You've successfully run your first LLM evaluation with DeepEval.
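
The threshold=0.5 argument in the metric above sets the minimum score a test case needs in order to pass. As a rough illustration, assuming scores fall between 0 and 1 and that a score equal to the threshold counts as passing, the decision reduces to a single comparison. MetricResult below is a hypothetical stand-in written for this example, not a DeepEval class.

```python
from dataclasses import dataclass


@dataclass
class MetricResult:
    """Hypothetical stand-in for a metric outcome; not part of DeepEval."""

    name: str
    score: float      # assumed to lie in the range [0, 1]
    threshold: float  # minimum score required to pass

    @property
    def passed(self) -> bool:
        # Assumption: a score exactly at the threshold still passes.
        return self.score >= self.threshold


# Illustrative scores only; a real score comes from the judge model.
strong = MetricResult(name="Correctness", score=0.8, threshold=0.5)
weak = MetricResult(name="Correctness", score=0.3, threshold=0.5)

for result in (strong, weak):
    status = "passed" if result.passed else "failed"
    print(f"{result.name}: score={result.score} -> {status}")
```

Raising the threshold makes the metric stricter, so the same output can pass at 0.5 and fail at 0.9.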

Setting Up Confident AI

While DeepEval works great standalone, you can connect it to Confident AI, an AI quality platform with observability, evals, and monitoring that DeepEval integrates with natively for dashboards, logging, collaboration, and more. It's free to get started.

You can sign up here, or run:

deepeval login

Navigate to your Settings page and copy your Confident AI API Key from the Project API Key box. If you used the deepeval login command to log in, you'll be prompted to paste your Confident AI API Key after creating an account.

Alternatively, if you already have an account, you can log in directly using Python:

main.py
import deepeval
deepeval.login("your-confident-api-key")

Or through the CLI:

deepeval login --confident-api-key "your-confident-api-key"
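
Hardcoding the key in a script makes it easy to leak through version control. One possible alternative is to read it from an environment variable and pass it to deepeval.login. The sketch below assumes the key is stored in a variable named CONFIDENT_API_KEY; that name, and the helpers read_api_key and login_from_env, are choices made for this example rather than something this page specifies.

```python
import os
from typing import Mapping, Optional


def read_api_key(
    env: Mapping[str, str] = os.environ,
    var: str = "CONFIDENT_API_KEY",  # assumed variable name
) -> Optional[str]:
    """Return the key stored in `var`, or None if it is unset or blank."""
    value = env.get(var, "").strip()
    return value or None


def login_from_env() -> bool:
    """Log in to Confident AI if a key is available; report whether it ran."""
    key = read_api_key()
    if key is None:
        return False
    # Imported lazily so the helper above works without DeepEval installed.
    import deepeval

    deepeval.login(key)
    return True


if __name__ == "__main__":
    if login_from_env():
        print("Logged in to Confident AI.")
    else:
        print("No API key found; running locally only.")
```

Keeping the lookup separate from the login call means the lookup can be checked on its own, for example by passing a plain dictionary in place of the real environment.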

You're all set! You can now evaluate LLMs locally and monitor them in Confident AI.
