OpenAI
By default, DeepEval uses gpt-4.1
to power all of its evaluation metrics. To enable this, you’ll need to set up your OpenAI API key. DeepEval also supports all other OpenAI models, which can be configured directly in Python.
Setting Up Your API Key
To use OpenAI for deepeval
's LLM-Evals (metrics evaluated using an LLM), supply your OPENAI_API_KEY
in the CLI:
export OPENAI_API_KEY=<your-openai-api-key>
Alternatively, if you're working in a notebook environment (Jupyter or Colab), set your OPENAI_API_KEY
in a cell:
%env OPENAI_API_KEY=<your-openai-api-key>
Command Line
Run the following command in your CLI to specify an OpenAI model to power all metrics.
deepeval set-openai \
--model=gpt-4.1
--cost_per_input_token=0.000002
--cost_per_output_token=0.000008
The CLI command above sets gpt-4.1
as the default model for all metrics, unless overridden in Python code. To use a different default model provider, you must first unset the current settings:
deepeval unset-open-ai
Python
You may use OpenAI models other than gpt-4.1
, which can be configured directly in python code through DeepEval's GPTModel
.
You may want to use stronger reasoning models like gpt-4.1
for metrics that require a high level of reasoning — for example, a custom GEval for mathematical correctness.
from deepeval.models import GPTModel
from deepeval.metrics import AnswerRelevancyMetric
model = GPTModel(
model="gpt-4.1",
temperature=0,
cost_per_input_token=0.000002,
cost_per_output_token=0.000008
)
answer_relevancy = AnswerRelevancyMetric(model=model)
There are ONE mandatory and ONE optional parameters when creating a GPTModel
:
model
: A string specifying the name of the GPT model to use. Defaulted togpt-4o
.- [Optional]
temperature
: A float specifying the model temperature. Defaulted to 0. - [Optional]
cost_per_input_token
: A float specifying the cost for each input token for the provided model. - [Optional]
cost_per_output_token
: A float specifying the cost for each output token for the provided model.
Available OpenAI Models
This list only displays some of the available models. For a comprehensive list, refer to the OpenAI's official documentation.
Below is a list of commonly used OpenAI models:
gpt-4.1
gpt-4.5-preview
gpt-4o
gpt-4o-mini
o1
o1-pro
o1-mini
o3-mini
gpt-4-turbo
gpt-4
gpt-4-32k
gpt-3.5-turbo
gpt-3.5-turbo-instruct
gpt-3.5-turbo-16k-0613
davinci-002
babbage-002