Anthropic

DeepEval supports using any Anthropic model for all evaluation metrics. To get started, you'll need to set up your Anthropic API key.

Setting Up Your API Key

To use Anthropic for DeepEval's LLM-based evaluations (metrics evaluated using an LLM), set ANTHROPIC_API_KEY in your shell:

export ANTHROPIC_API_KEY=<your-anthropic-api-key>

Alternatively, if you're working in a notebook environment (e.g., Jupyter or Colab), set your ANTHROPIC_API_KEY in a cell:

%env ANTHROPIC_API_KEY=<your-anthropic-api-key>
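In a plain Python script, it can help to confirm the key is present before any metric runs, so a missing key fails early with a clear message. The following is a minimal sketch; `require_anthropic_key` is a hypothetical helper written for illustration and is not part of DeepEval.

```python
import os


def require_anthropic_key(env=os.environ):
    """Return the Anthropic API key from `env`, or raise a clear error.

    Hypothetical helper for illustration; not part of DeepEval.
    """
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it before running evaluations."
        )
    return key
```

Calling `require_anthropic_key()` at the top of an evaluation script surfaces a missing key before any metric is constructed.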

Python

To use Anthropic models for DeepEval metrics, define an AnthropicModel and specify the model you want to use. By default, the model is set to claude-3-7-sonnet-latest.

from deepeval.models import AnthropicModel
from deepeval.metrics import AnswerRelevancyMetric

model = AnthropicModel(
    model="claude-3-7-sonnet-latest",
    temperature=0
)
answer_relevancy = AnswerRelevancyMetric(model=model)

AnthropicModel has no mandatory constructor parameters and six optional ones. Each parameter can be passed explicitly at initialization time or configured through DeepEval settings or environment variables; constructor arguments take precedence. Some values, such as the API key, are still required at runtime, so they must be supplied through one of these routes. See Environment variables and settings for the Anthropic-related environment variables. The parameters are:

  • [Optional] model: A string specifying which Claude model to use. Defaults to ANTHROPIC_MODEL_NAME if not passed; falls back to claude-3-7-sonnet-latest if unset.
  • [Optional] api_key: A string specifying your Anthropic API key. Defaults to ANTHROPIC_API_KEY if not passed; raises an error at runtime if unset.
  • [Optional] temperature: A float specifying the model temperature. Defaults to TEMPERATURE if not passed; falls back to 0.0 if unset, and raises an error if the value is less than 0.
  • [Optional] cost_per_input_token: A float specifying the cost for each input token for the provided model. Defaults to ANTHROPIC_COST_PER_INPUT_TOKEN if not passed; raises an error at runtime if DeepEval has no pricing metadata for the model and the parameter is unset.
  • [Optional] cost_per_output_token: A float specifying the cost for each output token for the provided model. Defaults to ANTHROPIC_COST_PER_OUTPUT_TOKEN if not passed; raises an error at runtime if DeepEval has no pricing metadata for the model and the parameter is unset.
  • [Optional] generation_kwargs: A dictionary of additional generation parameters forwarded to the Anthropic messages.create(...) call.
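The lookup order described above (constructor argument first, then environment variable, then default) can be sketched as follows. This only mirrors the documented behavior and is not DeepEval's actual implementation; `resolve_setting` and `resolve_temperature` are hypothetical names.

```python
import os


def resolve_setting(explicit, env_var, default=None, env=os.environ):
    """Pick a value: explicit argument, then environment variable, then default."""
    if explicit is not None:
        return explicit
    value = env.get(env_var)
    if value is not None:
        return value
    return default


def resolve_temperature(explicit=None, env=os.environ):
    """Resolve the temperature as documented: default 0.0, negatives rejected."""
    temperature = float(resolve_setting(explicit, "TEMPERATURE", 0.0, env))
    if temperature < 0:
        raise ValueError("temperature must be >= 0")
    return temperature
```

For example, `resolve_setting(None, "ANTHROPIC_MODEL_NAME", "claude-3-7-sonnet-latest")` returns the environment value when one is set and the default otherwise.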
tip

Pass generation parameters, such as max_tokens, via generation_kwargs (they are forwarded to messages.create(...)).

Extra **kwargs passed to AnthropicModel(...) are forwarded to the underlying Anthropic client and are not treated as generation parameters.
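As an illustration of the tip above, the sketch below passes max_tokens through generation_kwargs when building a metric. The value 1024 and the names `GENERATION_KWARGS` and `build_metric` are assumptions chosen for the example; the constructor parameters are the ones listed on this page.

```python
# Generation parameters forwarded to messages.create(...).
# The value 1024 is an arbitrary example, not a recommended setting.
GENERATION_KWARGS = {"max_tokens": 1024}


def build_metric():
    """Build an AnswerRelevancyMetric backed by an Anthropic model."""
    # Imported inside the function so this sketch loads even where
    # DeepEval is not installed.
    from deepeval.metrics import AnswerRelevancyMetric
    from deepeval.models import AnthropicModel

    model = AnthropicModel(
        model="claude-3-7-sonnet-latest",
        temperature=0,
        generation_kwargs=GENERATION_KWARGS,
    )
    return AnswerRelevancyMetric(model=model)
```

Keeping generation parameters in one dictionary makes it easier to reuse the same settings across several metrics.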

Available Anthropic Models

note

This list shows only some of the available models. For a comprehensive list, refer to Anthropic's official documentation.

Below is a list of commonly used Anthropic models:

  • claude-3-7-sonnet-latest
  • claude-3-5-haiku-latest
  • claude-3-5-sonnet-latest
  • claude-3-opus-latest
  • claude-3-sonnet-20240229
  • claude-3-haiku-20240307
  • claude-instant-1.2