How to add standard tests to an integration
When creating a custom class, either for your own use or to publish in a LangChain integration, it is important to add standard tests to ensure it works as expected. This guide will show you how to add standard tests to a custom chat model, and you can skip ahead to the test templates for implementing tests for each integration type.
Setup
If you're coming from the previous guide, you have already installed these dependencies, and you can skip this section.
First, let's install 2 dependencies:
- langchain-core will define the interfaces we want to import to define our custom chat model.
- langchain-tests will provide the standard tests we want to use. We recommend pinning it to the latest version.
Because tests added in new versions of langchain-tests can break your CI/CD pipelines, we recommend pinning the version of langchain-tests to avoid unexpected changes.
- Poetry:
  poetry add langchain-core
  poetry add --group test pytest pytest-socket pytest-asyncio langchain-tests==<latest_version>
  poetry install --with test
- Pip:
  pip install -U langchain-core pytest pytest-socket pytest-asyncio langchain-tests
  # install current package in editable mode
  pip install --editable .
Let's say we're publishing a package, langchain_parrot_link, that exposes the chat model from the guide on implementing the package. We can add the standard tests to the package by following the steps below.
And we'll assume you've structured your package the same way as the main LangChain packages:
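langchain-parrot-link/
├── langchain_parrot_link/
│   ├── __init__.py
│   └── chat_models.py
├── tests/
│   ├── __init__.py
│   ├── unit_tests/
│   │   ├── __init__.py
│   │   └── test_chat_models.py
│   └── integration_tests/
│       ├── __init__.py
│       └── test_chat_models.py
├── pyproject.toml
└── README.md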
Add and configure standard tests
There are 2 namespaces in the langchain-tests package:
- unit tests (langchain_tests.unit_tests): designed to be used to test the component in isolation and without access to external services
- integration tests (langchain_tests.integration_tests): designed to be used to test the component with access to external services (in particular, the external service that the component is designed to interact with).
Both types of tests are implemented as pytest class-based test suites.
By subclassing the base classes for each type of standard test (see below), you get all of the standard tests for that type, and you can override the properties that the test suite uses to configure the tests.
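The subclass-and-override pattern described above can be sketched without any LangChain imports. This is a simplified illustration, not the langchain-tests implementation: BaseModelSuite, FakeModel, model_class, and model_params are hypothetical names chosen for this sketch.

```python
class FakeModel:
    """Hypothetical stand-in for a real chat model (illustration only)."""

    def __init__(self, model: str, temperature: float = 0.0) -> None:
        self.model = model
        self.temperature = temperature


class BaseModelSuite:
    """Simplified stand-in for a standard test base class."""

    @property
    def model_class(self) -> type:
        # Subclasses must say which class is under test.
        raise NotImplementedError

    @property
    def model_params(self) -> dict:
        # Subclasses may override this to supply constructor arguments.
        return {}

    def test_init(self) -> None:
        # An inherited test: build the model from the configured class and params.
        model = self.model_class(**self.model_params)
        assert model is not None


class TestFakeModel(BaseModelSuite):
    """Inherits test_init and only supplies configuration."""

    @property
    def model_class(self) -> type:
        return FakeModel

    @property
    def model_params(self) -> dict:
        return {"model": "bird-brain-001", "temperature": 0}
```

The subclass defines no tests of its own. Everything it runs comes from the base class, which is why adding a standard suite usually amounts to overriding a few properties.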
Standard chat model tests
Here's how you would configure the standard unit tests for the custom chat model (e.g. in tests/unit_tests/test_chat_models.py):
from typing import Type

from langchain_parrot_link.chat_models import ChatParrotLink
from langchain_tests.unit_tests import ChatModelUnitTests


class TestChatParrotLinkUnit(ChatModelUnitTests):
    @property
    def chat_model_class(self) -> Type[ChatParrotLink]:
        return ChatParrotLink

    @property
    def chat_model_params(self) -> dict:
        return {
            "model": "bird-brain-001",
            "temperature": 0,
            "parrot_buffer_length": 50,
        }
And here's how you would configure the standard integration tests (e.g. in tests/integration_tests/test_chat_models.py):
from typing import Type

from langchain_parrot_link.chat_models import ChatParrotLink
from langchain_tests.integration_tests import ChatModelIntegrationTests


class TestChatParrotLinkIntegration(ChatModelIntegrationTests):
    @property
    def chat_model_class(self) -> Type[ChatParrotLink]:
        return ChatParrotLink

    @property
    def chat_model_params(self) -> dict:
        return {
            "model": "bird-brain-001",
            "temperature": 0,
            "parrot_buffer_length": 50,
        }
You would run these with the following commands from your project root:
- Poetry:
  # run unit tests without network access
  poetry run pytest --disable-socket --allow-unix-socket --asyncio-mode=auto tests/unit_tests
  # run integration tests
  poetry run pytest --asyncio-mode=auto tests/integration_tests
- Pip:
  # run unit tests without network access
  pytest --disable-socket --allow-unix-socket --asyncio-mode=auto tests/unit_tests
  # run integration tests
  pytest --asyncio-mode=auto tests/integration_tests
Test suite information and troubleshooting
For a full list of the standard test suites that are available, as well as information on which tests are included and how to troubleshoot common issues, see the Standard Tests API Reference.
An increasing number of troubleshooting guides are being added to this documentation. If you're interested in contributing, feel free to add docstrings to tests on GitHub!
Standard test templates per component:
Above, we implement the unit and integration standard tests for a chat model. Below are the templates for implementing the standard tests for each component:
Chat Models
Note: The standard tests for chat models are implemented in the example in the main body of this guide too.
Chat model standard tests test a range of behaviors, from the most basic requirements (generating a response to a query) to optional capabilities like multi-modal support and tool-calling. For a test run to be successful:
- If a feature is intended to be supported by the model, it should pass;
- If a feature is not intended to be supported by the model, it should be skipped.
Tests for "optional" capabilities are controlled via a set of properties that can be overridden on the test model subclass.
You can see the entire list of properties in the API reference here. These properties are shared by both unit and integration tests.
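To illustrate how a boolean property can gate an optional test, here is a minimal sketch. It is not the langchain-tests source: BaseCapabilitySuite and its subclasses are hypothetical, and the standard library's unittest.SkipTest stands in for whatever skip mechanism the real suites use.

```python
import unittest


class BaseCapabilitySuite:
    """Simplified stand-in for a standard test base class with an optional feature."""

    @property
    def supports_image_inputs(self) -> bool:
        # In this sketch, optional capabilities default to "not supported".
        return False

    def test_image_inputs(self) -> None:
        if not self.supports_image_inputs:
            # Unsupported features are skipped rather than failed.
            raise unittest.SkipTest("Model does not support image inputs.")
        # A real suite would send an image to the model and check the response.
        assert self.supports_image_inputs


class TextOnlySuite(BaseCapabilitySuite):
    """Keeps the default, so the image test is skipped."""


class MultimodalSuite(BaseCapabilitySuite):
    """Opts in, so the image test runs."""

    @property
    def supports_image_inputs(self) -> bool:
        return True
```

With this design a test run stays green for models that legitimately lack a feature, while a model that claims the feature must actually pass the test.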
For example, to enable integration tests for image inputs, we can implement

    @property
    def supports_image_inputs(self) -> bool:
        return True

on the integration test class.
Details on what tests are run, how each test can be skipped, and troubleshooting tips for each test can be found in the API references.
Unit test example:
from typing import Type

from langchain_parrot_link.chat_models import ChatParrotLink
from langchain_tests.unit_tests import ChatModelUnitTests


class TestChatParrotLinkUnit(ChatModelUnitTests):
    @property
    def chat_model_class(self) -> Type[ChatParrotLink]:
        return ChatParrotLink

    @property
    def chat_model_params(self) -> dict:
        return {
            "model": "bird-brain-001",
            "temperature": 0,
            "parrot_buffer_length": 50,
        }
Integration test example:
from typing import Type

from langchain_parrot_link.chat_models import ChatParrotLink
from langchain_tests.integration_tests import ChatModelIntegrationTests


class TestChatParrotLinkIntegration(ChatModelIntegrationTests):
    @property
    def chat_model_class(self) -> Type[ChatParrotLink]:
        return ChatParrotLink

    @property
    def chat_model_params(self) -> dict:
        return {
            "model": "bird-brain-001",
            "temperature": 0,
            "parrot_buffer_length": 50,
        }
Embedding Models
Unit test example:
from typing import Type

from langchain_parrot_link.embeddings import ParrotLinkEmbeddings
from langchain_tests.unit_tests import EmbeddingsUnitTests


class TestParrotLinkEmbeddingsUnit(EmbeddingsUnitTests):
    @property
    def embeddings_class(self) -> Type[ParrotLinkEmbeddings]:
        return ParrotLinkEmbeddings

    @property
    def embedding_model_params(self) -> dict:
        return {"model": "nest-embed-001", "temperature": 0}
Integration test example:
from typing import Type

from langchain_parrot_link.embeddings import ParrotLinkEmbeddings
from langchain_tests.integration_tests import EmbeddingsIntegrationTests


class TestParrotLinkEmbeddingsIntegration(EmbeddingsIntegrationTests):
    @property
    def embeddings_class(self) -> Type[ParrotLinkEmbeddings]:
        return ParrotLinkEmbeddings

    @property
    def embedding_model_params(self) -> dict:
        return {"model": "nest-embed-001"}
Tools/Toolkits
Unit test example:
from typing import Type

from langchain_parrot_link.tools import ParrotMultiplyTool
from langchain_tests.unit_tests import ToolsUnitTests


class TestParrotMultiplyToolUnit(ToolsUnitTests):
    @property
    def tool_constructor(self) -> Type[ParrotMultiplyTool]:
        return ParrotMultiplyTool

    @property
    def tool_constructor_params(self) -> dict:
        # if your tool constructor instead required initialization arguments like
        # `def __init__(self, some_arg: int):`, you would return those here
        # as a dictionary, e.g.: `return {'some_arg': 42}`
        return {}

    @property
    def tool_invoke_params_example(self) -> dict:
        """
        Returns a dictionary representing the "args" of an example tool call.

        This should NOT be a ToolCall dict - i.e. it should not
        have {"name", "id", "args"} keys.
        """
        return {"a": 2, "b": 3}
Integration test example:
from typing import Type

from langchain_parrot_link.tools import ParrotMultiplyTool
from langchain_tests.integration_tests import ToolsIntegrationTests


class TestParrotMultiplyToolIntegration(ToolsIntegrationTests):
    @property
    def tool_constructor(self) -> Type[ParrotMultiplyTool]:
        return ParrotMultiplyTool

    @property
    def tool_constructor_params(self) -> dict:
        # if your tool constructor instead required initialization arguments like
        # `def __init__(self, some_arg: int):`, you would return those here
        # as a dictionary, e.g.: `return {'some_arg': 42}`
        return {}

    @property
    def tool_invoke_params_example(self) -> dict:
        """
        Returns a dictionary representing the "args" of an example tool call.

        This should NOT be a ToolCall dict - i.e. it should not
        have {"name", "id", "args"} keys.
        """
        return {"a": 2, "b": 3}
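The docstring's distinction between bare arguments and a ToolCall-shaped dict can be made concrete with a small check. This is only a sketch: looks_like_tool_call is a hypothetical helper written for this example, and the tool name and call id in the second dict are made up.

```python
TOOL_CALL_KEYS = {"name", "id", "args"}


def looks_like_tool_call(payload: dict) -> bool:
    """Return True if the dict has the shape of a ToolCall rather than bare args."""
    return TOOL_CALL_KEYS.issubset(payload.keys())


# Bare arguments: the shape tool_invoke_params_example should return.
args_example = {"a": 2, "b": 3}

# A ToolCall-shaped dict wraps those arguments; do NOT return this shape.
# The name and id values here are hypothetical.
tool_call_example = {"name": "multiply", "id": "call_1", "args": {"a": 2, "b": 3}}
```

Here looks_like_tool_call(args_example) is False and looks_like_tool_call(tool_call_example) is True, so only the first dict is suitable as a return value for tool_invoke_params_example.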
Vector Stores
Here's how you would configure the standard tests for a typical vector store (using ParrotVectorStore as a placeholder):
from typing import AsyncGenerator, Generator

import pytest
from langchain_core.vectorstores import VectorStore
from langchain_parrot_link.vectorstores import ParrotVectorStore
from langchain_tests.integration_tests.vectorstores import (
    AsyncReadWriteTestSuite,
    ReadWriteTestSuite,
)


class TestSync(ReadWriteTestSuite):
    @pytest.fixture()
    def vectorstore(self) -> Generator[VectorStore, None, None]:  # type: ignore
        """Get an empty vectorstore for unit tests."""
        store = ParrotVectorStore()
        # note: store should be EMPTY at this point
        # if you need to delete data, you may do so here
        try:
            yield store
        finally:
            # cleanup operations, or deleting data
            pass


class TestAsync(AsyncReadWriteTestSuite):
    @pytest.fixture()
    async def vectorstore(self) -> AsyncGenerator[VectorStore, None]:  # type: ignore
        """Get an empty vectorstore for unit tests."""
        store = ParrotVectorStore()
        # note: store should be EMPTY at this point
        # if you need to delete data, you may do so here
        try:
            yield store
        finally:
            # cleanup operations, or deleting data
            pass
There are separate suites for testing synchronous and asynchronous methods. Configuring the tests consists of implementing pytest fixtures for setting up an empty vector store and tearing down the vector store after the test run ends.
For example, below is the ReadWriteTestSuite for the Chroma integration:
from typing import Generator

import pytest
from langchain_core.vectorstores import VectorStore
from langchain_tests.integration_tests.vectorstores import ReadWriteTestSuite

from langchain_chroma import Chroma


class TestSync(ReadWriteTestSuite):
    @pytest.fixture()
    def vectorstore(self) -> Generator[VectorStore, None, None]:  # type: ignore
        """Get an empty vectorstore."""
        store = Chroma(embedding_function=self.get_embeddings())
        try:
            yield store
        finally:
            # remove the collection so the next test starts from an empty store
            store.delete_collection()
Note that before the initial yield, we instantiate the vector store with an embeddings object. This is a pre-defined "fake" embeddings model that will generate short, arbitrary vectors for documents. You can use a different embeddings object if desired.
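To give a feel for what a "fake" embeddings model does, here is a minimal sketch of one. It is not the object returned by get_embeddings(): TinyFakeEmbeddings is a hypothetical class that only mirrors the embed_query and embed_documents method names of the embeddings interface, and the hashing scheme is an assumption made for this example.

```python
import hashlib
from typing import List


class TinyFakeEmbeddings:
    """Hypothetical stand-in: maps text to a short, deterministic vector."""

    def __init__(self, size: int = 6) -> None:
        # A SHA-256 digest has 32 bytes, so size should not exceed 32.
        self.size = size

    def embed_query(self, text: str) -> List[float]:
        # Hash the text so the same input always yields the same vector.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Scale the first `size` bytes into the range [0, 1].
        return [byte / 255 for byte in digest[: self.size]]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Embed each document independently with the same scheme.
        return [self.embed_query(text) for text in texts]
```

Vectors like these carry no semantic meaning, which is fine for tests that only check that documents can be written, read back, and deleted.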
In the finally block, we call whatever integration-specific logic is needed to bring the vector store to a clean state. This logic is executed after each test, even if the test fails.
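The setup and teardown flow of a yield-style fixture can be imitated with the standard library alone. In this sketch contextlib.contextmanager plays the role of pytest's fixture machinery, and InMemoryStore, delete_all, and the events list are hypothetical names introduced only to make the order of operations visible.

```python
from contextlib import contextmanager
from typing import Dict, Generator, List


class InMemoryStore:
    """Hypothetical stand-in for a vector store backed by a dict."""

    def __init__(self) -> None:
        self.docs: Dict[str, str] = {}

    def delete_all(self) -> None:
        self.docs.clear()


events: List[str] = []
shared_store = InMemoryStore()


@contextmanager
def vectorstore() -> Generator[InMemoryStore, None, None]:
    """Mimics a yield-style fixture: set up, hand over the store, then clean up."""
    events.append("setup")
    try:
        yield shared_store
    finally:
        # Runs whether the body succeeded or raised.
        shared_store.delete_all()
        events.append("teardown")


# A simulated test that fails after writing to the store.
try:
    with vectorstore() as store:
        store.docs["1"] = "polly wants a cracker"
        raise AssertionError("simulated test failure")
except AssertionError:
    events.append("test failed")
```

After this runs, the store is empty again and the teardown step was recorded before the failure was reported, which is the guarantee the finally block provides for the next test.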
Details on what tests are run, how each test can be skipped, and troubleshooting tips for each test can be found in the API references.