How can I use LangChain Callbacks to log the model calls and answers into a variable

I'm using LangChain to build a natural-language application. I want the interactions with the LLM to be recorded in a variable that I can use for logging and debugging. I have created a very simple chain:

from typing import Any, Dict

from langchain import PromptTemplate
from langchain.callbacks.base import BaseCallbackHandler
from langchain.chains import LLMChain
from langchain.llms import OpenAI

llm = OpenAI()
prompt = PromptTemplate.from_template("1 + {number} = ")
handler = MyCustomHandler()  # MyCustomHandler is defined below

chain = LLMChain(llm=llm, prompt=prompt, callbacks=[handler])
chain.run(number=2)

To record what's going on, I have created a custom CallbackHandler:

class MyCustomHandler(BaseCallbackHandler):
    def on_text(self, text: str, **kwargs: Any) -> Any:
        print(f"Text: {text}")
        self.log = text
  
    def on_chain_start(
        self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any
    ) -> Any:
        """Run when chain starts running."""
        print("Chain started running")

This works more or less as expected, but it has some side effects whose origin I can't figure out. The output is:

(screenshot of the console output)

And the handler.log variable contains:

'Prompt after formatting:\n\x1b[32;1m\x1b[1;3m1 + 2 = \x1b[0m'

Where do the "Prompt after formatting" text and the ANSI codes that color the text green come from? Can I get rid of them?

Overall, is there a better way to use the callback system for logging that I'm missing? It seems to be poorly documented.

Beck answered 8/6, 2023 at 14:36 Comment(2)
Maybe use Python's StringIO buffer and otherwise treat the buffer like any other stream? – Gans
LangChain recently announced some improvements to their callbacks. Perhaps this document is useful for you? – Infatuate

I don't know if you can get rid of them, but I can tell you where they come from, having run across it myself today. A block like this occurs multiple times in LangChain's llm.py (the module that defines LLMChain):

            prompt = self.prompt.format_prompt(**selected_inputs)
            _colored_text = get_colored_text(prompt.to_string(), "green")
            _text = "Prompt after formatting:\n" + _colored_text
            if run_manager:
                run_manager.on_text(_text, end="\n", verbose=self.verbose)
            if "stop" in inputs and inputs["stop"] != stop:
                raise ValueError(
                    "If `stop` is present in any inputs, should be present in all."
                )
            prompts.append(prompt)

That's a very selective snippet, but it shows that the colored _text is only sent to the callbacks on the run_manager. So your handler sees it, but the actual LLM, which gets the prompts list, does not.
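
This also suggests a workaround: instead of listening to on_text, your handler can hook on_llm_start and on_llm_end, which receive the undecorated prompts and the model's responses. A minimal sketch (the handler name and attributes are mine, not from the question):

from typing import Any, Dict, List

from langchain.callbacks.base import BaseCallbackHandler
from langchain.schema import LLMResult


class RawPromptHandler(BaseCallbackHandler):
    """Collects raw prompts and answers, with no color decoration."""

    def __init__(self) -> None:
        self.prompts: List[str] = []
        self.answers: List[str] = []

    def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
    ) -> Any:
        # `prompts` is the same undecorated list that is passed to the LLM
        self.prompts.extend(prompts)

    def on_llm_end(self, response: LLMResult, **kwargs: Any) -> Any:
        # response.generations is a list of lists of Generation objects;
        # each Generation carries the raw answer text
        for generations in response.generations:
            self.answers.extend(g.text for g in generations)

You would pass it exactly like the handler in the question: LLMChain(llm=llm, prompt=prompt, callbacks=[RawPromptHandler()]).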

I see no way to disable it either, since get_colored_text() always adds the ANSI codes:

def get_colored_text(text: str, color: str) -> str:
    """Get colored text."""
    color_str = _TEXT_COLOR_MAPPING[color]
    return f"\u001b[{color_str}m\033[1;3m{text}\u001b[0m"

I think it's reasonable to expect that callbacks could be given the raw values, so I'd file an issue.
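
In the meantime, you can strip the decoration yourself inside on_text. A minimal sketch, assuming the prefix and escape sequences shown above (the regex and handler name are mine):

import re
from typing import Any

from langchain.callbacks.base import BaseCallbackHandler

# Matches ANSI escape sequences such as \x1b[32;1m and \x1b[0m
ANSI_ESCAPES = re.compile(r"\x1b\[[0-9;]*m")
PREFIX = "Prompt after formatting:\n"


class PlainTextHandler(BaseCallbackHandler):
    """Stores the text passed to on_text, with colors and prefix removed."""

    def __init__(self) -> None:
        self.log: str = ""

    def on_text(self, text: str, **kwargs: Any) -> Any:
        plain = ANSI_ESCAPES.sub("", text)
        if plain.startswith(PREFIX):
            plain = plain[len(PREFIX):]
        self.log = plain

For the question's example, handler.log would then contain just '1 + 2 = '.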

Ramrod answered 14/8, 2023 at 18:4 Comment(0)
