unit

langsmith._testing.unit(*args: Any, **kwargs: Any) → Callable

Create a test case in LangSmith.

This decorator is used to mark a function as a test case for LangSmith. It ensures that the necessary example data is created and associated with the test function. The decorated function will be executed as a test case, and the results will be recorded and reported by LangSmith.

Parameters:
  • id – A unique identifier for the test case. If not provided, an ID will be generated based on the test function’s module and name.

  • output_keys – A list of keys to be considered as the output keys for the test case. These keys will be extracted from the test function’s inputs and stored as the expected outputs.

  • client – An instance of the LangSmith client to be used for communication with the LangSmith service. If not provided, a default client will be used.

  • test_suite_name – The name of the test suite to which the test case belongs. If not provided, the test suite name will be determined based on the environment or the package name.

  • args (Any) –

  • kwargs (Any) –

Returns:

The decorated test function.

Return type:

Callable
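
For example, the optional parameters above can be passed directly to the decorator. A minimal sketch (the Client() construction and the suite name "my-test-suite" are illustrative; by default both are inferred from the environment):

>>> from langsmith import Client, test
>>> @test(client=Client(), test_suite_name="my-test-suite")
... def test_subtraction():
...     # Illustrative only: runs like any other decorated test case
...     assert 7 - 4 == 3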

Environment:
  • LANGSMITH_TEST_CACHE: If set, API calls will be cached to disk to save time and costs during testing. Recommended to commit the cache files to your repository for faster CI/CD runs. Requires the ‘langsmith[vcr]’ package to be installed.

  • LANGSMITH_TEST_TRACKING: Set this variable to the path of a directory to enable caching of test results. This is useful for re-running tests without re-executing the code. Requires the ‘langsmith[vcr]’ package.
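
For example, to cache API calls for a single pytest run, the cache directory can be set on the command line (a sketch; the tests/cassettes path is illustrative):

LANGSMITH_TEST_CACHE=tests/cassettes pytest tests/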

Example

For basic usage, simply decorate a test function with @test:

>>> from langsmith import test
>>> @test
... def test_addition():
...     assert 3 + 4 == 7

Any traced code (such as code decorated with @traceable or wrapped with the wrap_* functions) will be traced within the test case for improved visibility and debugging.

>>> from langsmith import traceable
>>> @traceable
... def generate_numbers():
...     return 3, 4
>>> @test
... def test_nested():
...     # Traced code will be included in the test case
...     a, b = generate_numbers()
...     assert a + b == 7

LLM calls are expensive! Cache requests by setting LANGSMITH_TEST_CACHE=path/to/cache. Check in these files to speed up CI/CD pipelines, so your results only change when your prompt or requested model changes.

Note that this will require that you install langsmith with the vcr extra:

pip install -U "langsmith[vcr]"

Caching is faster if you install libyaml. See https://vcrpy.readthedocs.io/en/latest/installation.html#speed for more details.

>>> # os.environ["LANGSMITH_TEST_CACHE"] = "tests/cassettes"
>>> import openai
>>> from langsmith.wrappers import wrap_openai
>>> oai_client = wrap_openai(openai.Client())
>>> @test
... def test_openai_says_hello():
...     # Traced code will be included in the test case
...     response = oai_client.chat.completions.create(
...         model="gpt-3.5-turbo",
...         messages=[
...             {"role": "system", "content": "You are a helpful assistant."},
...             {"role": "user", "content": "Say hello!"},
...         ],
...     )
...     assert "hello" in response.choices[0].message.content.lower()

LLMs are stochastic. Naive assertions are flaky. You can use langsmith’s expect to score and make approximate assertions on your results.

>>> from langsmith import expect
>>> @test
... def test_output_semantically_close():
...     response = oai_client.chat.completions.create(
...         model="gpt-3.5-turbo",
...         messages=[
...             {"role": "system", "content": "You are a helpful assistant."},
...             {"role": "user", "content": "Say hello!"},
...         ],
...     )
...     # The embedding_distance call logs the embedding distance to LangSmith
...     expect.embedding_distance(
...         prediction=response.choices[0].message.content,
...         reference="Hello!",
...         # The following optional assertion logs a
...         # pass/fail score to LangSmith
...         # and raises an AssertionError if the assertion fails.
...     ).to_be_less_than(1.0)
...     # Compute damerau_levenshtein distance
...     expect.edit_distance(
...         prediction=response.choices[0].message.content,
...         reference="Hello!",
...         # And then log a pass/fail score to LangSmith
...     ).to_be_less_than(1.0)

The @test decorator works natively with pytest fixtures. The fixture values will populate the “inputs” of the corresponding example in LangSmith.

>>> import pytest
>>> @pytest.fixture
... def some_input():
...     return "Some input"
>>>
>>> @test
... def test_with_fixture(some_input: str):
...     assert "input" in some_input

You can still use pytest.mark.parametrize as usual to run multiple test cases using the same test function.

>>> @test(output_keys=["expected"])
... @pytest.mark.parametrize(
...     "a, b, expected",
...     [
...         (1, 2, 3),
...         (3, 4, 7),
...     ],
... )
... def test_addition_with_multiple_inputs(a: int, b: int, expected: int):
...     assert a + b == expected

By default, each test case will be assigned a consistent, unique identifier based on the function name and module. You can also provide a custom identifier using the id argument:

>>> @test(id="1a77e4b5-1d38-4081-b829-b0442cf3f145")
... def test_multiplication():
...     assert 3 * 4 == 12

By default, all test inputs are saved as “inputs” to a dataset. You can specify the output_keys argument to persist those keys within the dataset’s “outputs” fields.

>>> @pytest.fixture
... def expected_output():
...     return "input"
>>> @test(output_keys=["expected_output"])
... def test_with_expected_output(some_input: str, expected_output: str):
...     assert expected_output in some_input

To run these tests, use the pytest CLI. Or directly run the test functions:

>>> test_output_semantically_close()
>>> test_addition()
>>> test_nested()
>>> test_with_fixture("Some input")
>>> test_with_expected_output("Some input", "Some")
>>> test_multiplication()
>>> test_openai_says_hello()
>>> test_addition_with_multiple_inputs(1, 2, 3)