
Generate Goldens From Contexts

If you already have prepared contexts, you can skip document processing. Simply provide these contexts to deepeval's Synthesizer, and it will generate goldens directly without processing documents.

Generate Your Goldens

To generate synthetic single-turn or multi-turn goldens from contexts, provide a list of contexts:

from deepeval.synthesizer import Synthesizer

synthesizer = Synthesizer()
goldens = synthesizer.generate_goldens_from_contexts(
    # Provide a list of contexts for synthetic data generation
    contexts=[
        ["The Earth revolves around the Sun.", "Planets are celestial bodies."],
        ["Water freezes at 0 degrees Celsius.", "The chemical formula for water is H2O."],
    ]
)

There are ONE mandatory and THREE optional parameters when using the generate_goldens_from_contexts method:

  • contexts: a list of contexts, where each context is itself a list of strings, ideally sharing a common theme or subject area.
  • [Optional] include_expected_output: a boolean which, when set to True, will additionally generate an expected_output for each synthetic Golden. Defaults to True.
  • [Optional] max_goldens_per_context: the maximum number of goldens to be generated per context. Defaults to 2.
  • [Optional] source_files: a list of strings specifying the source of each context. The length of source_files MUST be the same as the length of contexts.
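
The shape rules in the parameter list above (each context is a list of strings, and source_files must match contexts in length) can be checked before any generation call is made. The following is a minimal sketch, not part of deepeval: check_generation_inputs and generate_single_turn are hypothetical helper names, and the file names are made up for illustration. The keyword arguments passed to generate_goldens_from_contexts are the ones listed above.

```python
from typing import Optional, Sequence


def check_generation_inputs(
    contexts: Sequence[Sequence[str]],
    source_files: Optional[Sequence[str]] = None,
) -> None:
    """Hypothetical pre-flight check (not a deepeval function)."""
    if not contexts:
        raise ValueError("contexts must contain at least one context")
    for i, context in enumerate(contexts):
        # A bare string here usually means a flat list was passed by mistake.
        if isinstance(context, str) or not all(isinstance(s, str) for s in context):
            raise TypeError(f"contexts[{i}] must be a list of strings")
    if source_files is not None and len(source_files) != len(contexts):
        raise ValueError(
            f"source_files has {len(source_files)} entries "
            f"but contexts has {len(contexts)}"
        )


def generate_single_turn(contexts, source_files):
    """Validate the inputs, then call the Synthesizer with the documented parameters."""
    check_generation_inputs(contexts, source_files)
    # Imported lazily so the check above can be used without deepeval installed.
    from deepeval.synthesizer import Synthesizer

    return Synthesizer().generate_goldens_from_contexts(
        contexts=contexts,
        include_expected_output=True,
        max_goldens_per_context=2,
        source_files=source_files,
    )


contexts = [
    ["The Earth revolves around the Sun.", "Planets are celestial bodies."],
    ["Water freezes at 0 degrees Celsius.", "The chemical formula for water is H2O."],
]
# Hypothetical file names, one per context.
source_files = ["astronomy.txt", "chemistry.txt"]

check_generation_inputs(contexts, source_files)  # passes silently
```

Calling generate_single_turn(contexts, source_files) would then run the actual generation, which requires deepeval and a configured model.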

For multi-turn goldens, use the generate_conversational_goldens_from_contexts method instead:

from deepeval.synthesizer import Synthesizer

synthesizer = Synthesizer()
conversational_goldens = synthesizer.generate_conversational_goldens_from_contexts(
    # Provide a list of contexts for synthetic data generation
    contexts=[
        ["The Earth revolves around the Sun.", "Planets are celestial bodies."],
        ["Water freezes at 0 degrees Celsius.", "The chemical formula for water is H2O."],
    ]
)

There are ONE mandatory and THREE optional parameters when using the generate_conversational_goldens_from_contexts method:

  • contexts: a list of contexts, where each context is itself a list of strings, ideally sharing a common theme or subject area.
  • [Optional] include_expected_outcome: a boolean which, when set to True, will additionally generate an expected_outcome for each synthetic ConversationalGolden. Defaults to True.
  • [Optional] max_goldens_per_context: the maximum number of goldens to be generated per context. Defaults to 2.
  • [Optional] source_files: a list of strings specifying the source of each context. The length of source_files MUST be the same as the length of contexts.
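
Because max_goldens_per_context is a per-context maximum, the total number of goldens from either method is bounded by the number of contexts times that value. The sketch below works out that bound; max_total_goldens is a hypothetical helper, not a deepeval function, and the actual count returned may be lower than the bound.

```python
from typing import Sequence


def max_total_goldens(
    contexts: Sequence[Sequence[str]],
    max_goldens_per_context: int = 2,  # mirrors the documented default
) -> int:
    """Upper bound on generated goldens: one cap per context."""
    return len(contexts) * max_goldens_per_context


contexts = [
    ["The Earth revolves around the Sun.", "Planets are celestial bodies."],
    ["Water freezes at 0 degrees Celsius.", "The chemical formula for water is H2O."],
]

print(max_total_goldens(contexts))                             # → 4
print(max_total_goldens(contexts, max_goldens_per_context=5))  # → 10
```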

Remember, single-turn generation produces single-turn Goldens, while multi-turn generation produces multi-turn ConversationalGoldens. To learn more about goldens, see the documentation on goldens.

FAQs

What format should my contexts be in?
contexts is a list of contexts, where each context is itself a list of strings that ideally share a common theme. Each context produces up to max_goldens_per_context goldens.

When should I use this instead of generate_goldens_from_docs?
Use generate_goldens_from_contexts when you already have prepared contexts (e.g. an embedded knowledge base) and want to control chunking yourself. generate_goldens_from_docs calls this method under the hood after an extra context construction step.

Are the generated goldens grounded in my contexts?
Yes. The provided contexts are used directly to generate and ground each golden, and to produce expected_outputs when include_expected_output is enabled.

Can my team generate goldens from our contexts without code?
Yes. On Confident AI you can connect your knowledge bases and run the generation pipeline no-code: tweak filtration, evolutions, and styling, experiment with variations, and collaborate on the resulting dataset as a team.
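
For readers who do write code and control chunking themselves, as mentioned in the FAQ above, one possible way to turn pre-chunked text into the contexts and source_files arguments is to group chunks by their source. This is only a sketch under assumptions: group_chunks_by_source is a hypothetical helper, the (source, text) pair format is an illustrative choice, and the file names are invented.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_chunks_by_source(
    chunks: List[Tuple[str, str]],
) -> Tuple[List[List[str]], List[str]]:
    """Group (source, text) pairs into one context per source.

    Hypothetical helper, not part of deepeval. Returns (contexts, source_files)
    as two parallel lists of equal length.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for source, text in chunks:
        grouped[source].append(text)
    # Dicts preserve insertion order, so both lists follow first appearance.
    source_files = list(grouped)
    contexts = [grouped[source] for source in source_files]
    return contexts, source_files


# Hypothetical pre-chunked input: (source file, chunk text).
chunks = [
    ("astronomy.txt", "The Earth revolves around the Sun."),
    ("chemistry.txt", "Water freezes at 0 degrees Celsius."),
    ("astronomy.txt", "Planets are celestial bodies."),
    ("chemistry.txt", "The chemical formula for water is H2O."),
]

contexts, source_files = group_chunks_by_source(chunks)
print(source_files)  # → ['astronomy.txt', 'chemistry.txt']
print(len(contexts))  # → 2
```

The two returned lists have the same length by construction, which satisfies the source_files length requirement described earlier.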
