Evaluation Models

OpenAI

By default, DeepEval uses gpt-4.1 to power all of its evaluation metrics. To enable this, you’ll need to set up your OpenAI API key. DeepEval also supports all other OpenAI models, which can be configured directly in Python.

Setting Up Your API Key

DeepEval automatically loads environment variables from .env.local and then .env at import time. Values already set in the process environment take precedence, followed by .env.local, then .env.
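The precedence order can be illustrated with a small sketch (the file contents and key values here are hypothetical, and ChainMap stands in for DeepEval's internal loading logic):

```python
from collections import ChainMap

# Simulated sources, highest precedence first: the process environment,
# then .env.local, then .env -- mirroring DeepEval's load order.
process_env = {"OPENAI_API_KEY": "sk-from-shell"}
dotenv_local = {"OPENAI_API_KEY": "sk-from-env-local", "TEMPERATURE": "0"}
dotenv = {"OPENAI_API_KEY": "sk-from-env", "OPENAI_MODEL_NAME": "gpt-4.1"}

resolved = ChainMap(process_env, dotenv_local, dotenv)
print(resolved["OPENAI_API_KEY"])    # the process environment wins
print(resolved["TEMPERATURE"])       # falls through to .env.local
print(resolved["OPENAI_MODEL_NAME"]) # falls through to .env
```

In practice this means an exported shell variable always overrides anything in your dotenv files.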

Recommended (local dev):

# .env.local
OPENAI_API_KEY=<your-openai-api-key>

Alternative (Shell/CI)

export OPENAI_API_KEY=<your-openai-api-key>

Alternative (Notebook)

If you're working in a notebook environment (Jupyter or Colab), set your OPENAI_API_KEY in a cell:

%env OPENAI_API_KEY=<your-openai-api-key>

Command Line

Run the following command in your CLI to specify an OpenAI model to power all metrics.

deepeval set-openai \
    --model=gpt-4.1 \
    --cost-per-input-token=0.000002 \
    --cost-per-output-token=0.000008
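The per-token rates above let DeepEval report a dollar cost for each evaluation run. As a quick sanity check of what these rates imply (the token counts below are hypothetical):

```python
cost_per_input_token = 0.000002   # $2.00 per 1M input tokens
cost_per_output_token = 0.000008  # $8.00 per 1M output tokens

# Hypothetical evaluation run: 50k input tokens, 10k output tokens.
input_tokens, output_tokens = 50_000, 10_000
cost = input_tokens * cost_per_input_token + output_tokens * cost_per_output_token
print(f"${cost:.2f}")  # → $0.18
```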

Python

You can use OpenAI models other than gpt-4.1 by configuring them directly in Python through DeepEval's GPTModel.

from deepeval.models import GPTModel
from deepeval.metrics import AnswerRelevancyMetric

model = GPTModel(
    model="gpt-4.1",
    temperature=0,
    cost_per_input_token=0.000002,
    cost_per_output_token=0.000008
)
answer_relevancy = AnswerRelevancyMetric(model=model)

Since deepeval uses OpenAI models for evaluations by default, you can also simply pass the name of your desired model at metric initialization and set OPENAI_API_KEY:

from deepeval.metrics import AnswerRelevancyMetric

answer_relevancy = AnswerRelevancyMetric(
    model="gpt-4.1",
)

There are ZERO mandatory and SEVEN optional parameters when creating a GPTModel:

  • [Optional] model: A string specifying the name of the GPT model to use. Defaults to OPENAI_MODEL_NAME if set; falls back to gpt-4.1.
  • [Optional] api_key: A string specifying the OpenAI API key for authentication. Defaults to OPENAI_API_KEY if not passed; raises an error at runtime if unset.
  • [Optional] base_url: A string specifying a custom OpenAI base URL.
  • [Optional] temperature: A float specifying the model temperature. Defaults to TEMPERATURE if not passed; falls back to 0.0 if unset.
  • [Optional] cost_per_input_token: A float specifying the cost for each input token for the provided model. Defaults to OPENAI_COST_PER_INPUT_TOKEN if available in deepeval's model cost registry, else None.
  • [Optional] cost_per_output_token: A float specifying the cost for each output token for the provided model. Defaults to OPENAI_COST_PER_OUTPUT_TOKEN if available in deepeval's model cost registry, else None.
  • [Optional] generation_kwargs: A dictionary of additional generation parameters forwarded to the OpenAI chat.completions.create(...) and beta.chat.completions.parse(...) calls.
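How generation_kwargs reaches the underlying API call can be sketched as follows (build_request is a hypothetical stand-in for DeepEval's internals; seed and top_p are standard OpenAI chat-completion parameters):

```python
generation_kwargs = {"seed": 42, "top_p": 0.9}

# Sketch: extra kwargs are splatted into the chat-completion request
# alongside the model and temperature that GPTModel already manages.
def build_request(model, temperature, **extra):
    return {"model": model, "temperature": temperature, **extra}

request = build_request("gpt-4.1", 0, **generation_kwargs)
print(request)  # → {'model': 'gpt-4.1', 'temperature': 0, 'seed': 42, 'top_p': 0.9}
```

Anything you place in generation_kwargs is forwarded as-is, so only include parameters the OpenAI API actually accepts.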

Available OpenAI Models

Below is a list of commonly used OpenAI models:

  • gpt-5
  • gpt-5-mini
  • gpt-5-nano
  • gpt-4.1
  • gpt-4.5-preview
  • gpt-4o
  • gpt-4o-mini
  • o1
  • o1-pro
  • o1-mini
  • o3-mini
  • gpt-4-turbo
  • gpt-4
  • gpt-4-32k
  • gpt-3.5-turbo
  • gpt-3.5-turbo-instruct
  • gpt-3.5-turbo-16k-0613
  • davinci-002
  • babbage-002
