Grok
DeepEval allows you to run evals with Grok models via xAI’s SDK, either through the CLI or directly in Python. DeepEval currently validates model names against a supported list—see Available Grok Models.
To use Grok, you must first install the xAI SDK:
pip install xai-sdk
Command Line
To configure Grok through the CLI, run the following command:
deepeval set-grok --model grok-4.1 \
--temperature=0
The CLI command above sets the specified Grok model as the default LLM judge for all metrics, unless overridden in Python code. To switch to a different default model provider, you must first unset Grok:
deepeval unset-grok
You can persist CLI settings with the optional --save flag.
See Flags and Configs -> Persisting CLI settings.
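Once Grok is set through the CLI, metrics pick it up as the judge automatically. As a minimal sketch (assuming deepeval set-grok has already been run):

from deepeval.metrics import AnswerRelevancyMetric

# No model argument needed here: the metric falls back to the
# Grok model configured via the CLI.
answer_relevancy = AnswerRelevancyMetric()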
Python
Alternatively, you can specify your model directly in code using GrokModel from DeepEval's model collection.
from deepeval.models import GrokModel
from deepeval.metrics import AnswerRelevancyMetric
model = GrokModel(
    model="grok-4.1",
    api_key="your-api-key",
    temperature=0
)
answer_relevancy = AnswerRelevancyMetric(model=model)
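You can then evaluate a test case with the Grok judge as usual. Below is a minimal sketch with illustrative input and output values:

from deepeval.test_case import LLMTestCase

test_case = LLMTestCase(
    input="What does DeepEval do?",
    actual_output="DeepEval is an open-source framework for evaluating LLM applications."
)

# The Grok model defined above acts as the judge for this metric
answer_relevancy.measure(test_case)
print(answer_relevancy.score, answer_relevancy.reason)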
There are ZERO mandatory and SIX optional parameters when creating a GrokModel:
- [Optional] model: A string specifying the name of the Grok model to use. Defaults to GROK_MODEL_NAME if not passed; raises an error at runtime if unset.
- [Optional] api_key: A string specifying your Grok API key for authentication. Defaults to GROK_API_KEY if not passed; raises an error at runtime if unset.
- [Optional] temperature: A float specifying the model temperature. Defaults to TEMPERATURE if not passed; falls back to 0.0 if unset.
- [Optional] cost_per_input_token: A float specifying the cost per input token for the provided model. Defaults to GROK_COST_PER_INPUT_TOKEN if available in deepeval's model cost registry, else None.
- [Optional] cost_per_output_token: A float specifying the cost per output token for the provided model. Defaults to GROK_COST_PER_OUTPUT_TOKEN if available in deepeval's model cost registry, else None.
- [Optional] generation_kwargs: A dictionary of additional generation parameters forwarded to the xAI SDK client.chat.create(...) call.
Any **kwargs you would like to use for your model can be passed through the generation_kwargs parameter. However, double-check which parameters your model and model provider support in their official docs.
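For example, a sampling parameter such as max_tokens could be forwarded like this (max_tokens is illustrative; confirm it is supported in xAI's documentation):

from deepeval.models import GrokModel

model = GrokModel(
    model="grok-4.1",
    temperature=0,
    # Forwarded verbatim to the xAI SDK's client.chat.create(...) call;
    # max_tokens is an illustrative parameter, check xAI's docs before use.
    generation_kwargs={"max_tokens": 1024}
)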
Available Grok Models
Below is the comprehensive list of available Grok models in DeepEval:
- grok-4.1
- grok-4
- grok-4-heavy
- grok-4-fast
- grok-beta
- grok-3
- grok-2
- grok-2-mini
- grok-code-fast-1