
【LangChain】Understanding how the Agent works and using an arbitrary LLM

☕ 3 min read
  • OpenAI's GPT-3 comes in several variants
    • text-davinci-003 / text-curie-001 / text-babbage-001 / text-ada-001
  • LangChain in particular uses text-davinci-003, the most capable of them.
    • But the API costs money, so ideally we'd use a free LLM instead
    • By standing up a fake server with something like transformers-openai-api and swapping the API origin, any LLM can be used with LangChain (a minimal sketch of the idea follows this list)
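
Below is a minimal sketch of the fake-server idea (it is not transformers-openai-api itself): a small Flask app exposing an OpenAI-compatible /v1/completions endpoint backed by a local Hugging Face model. The model name (gpt2), the port, and the subset of the response schema shown here are assumptions for illustration; pointing openai.api_base at this server is what lets LangChain talk to the local model.

from flask import Flask, jsonify, request
from transformers import pipeline

app = Flask(__name__)

# Any local causal LM works here; "gpt2" is just a small placeholder.
generator = pipeline("text-generation", model="gpt2")

@app.route("/v1/completions", methods=["POST"])
def completions():
    body = request.get_json()
    prompts = body.get("prompt", "")
    if isinstance(prompts, str):   # the OpenAI API accepts a string or a batch
        prompts = [prompts]
    max_tokens = body.get("max_tokens", 256)

    choices = []
    for i, p in enumerate(prompts):
        out = generator(p, max_new_tokens=max_tokens, return_full_text=False)
        choices.append({
            "text": out[0]["generated_text"],
            "index": i,
            "finish_reason": "stop",
            "logprobs": None,
        })

    # Mimic a subset of the OpenAI completions response schema
    return jsonify({
        "id": "cmpl-local",
        "object": "text_completion",
        "model": "local-model",
        "choices": choices,
        "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
    })

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000)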

Example (LangChain + GPT-index)

import os
os.environ['OPENAI_API_KEY'] = 'changeme'
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "changeme"

import urllib.request
import json
import openai

# Point the OpenAI client at the local fake server instead of api.openai.com
openai.api_base = 'http://127.0.0.1:5000/v1'

from langchain.embeddings import HuggingFaceEmbeddings
from gpt_index import LangchainEmbedding
from gpt_index import LLMPredictor, GPTSimpleVectorIndex, SimpleDirectoryReader
from gpt_index.indices.prompt_helper import PromptHelper
from langchain.agents import Tool
from langchain.chains.conversation.memory import ConversationBufferMemory
from langchain import OpenAI
from langchain.agents import initialize_agent

llm = OpenAI()  # requests go through openai.api_base, i.e. the local fake server
embed_model = LangchainEmbedding(HuggingFaceEmbeddings(
    model_name="sentence-transformers/gtr-t5-xl"
))

# Fetch the Wikipedia article on Keio University as plain text (CirrusSearch dump)
with urllib.request.urlopen("https://en.wikipedia.org/wiki/Keio_University?action=cirrusdump") as f:
    data = f.read()
    text = json.loads(data)[0]["_source"]["text"]

os.makedirs("data", exist_ok=True)
with open("data/test.txt", "w") as f:
    f.write(text)

# Build a GPT-index vector index over the downloaded article
documents = SimpleDirectoryReader('data').load_data()
llm_predictor = LLMPredictor(llm=llm)
index = GPTSimpleVectorIndex(
    documents=documents,
    prompt_helper=PromptHelper(
        max_input_size=4000, 
        num_output=256,
        chunk_size_limit=2000,
        max_chunk_overlap=0,
        separator="." 
    ),
    llm_predictor=llm_predictor,
    embed_model=embed_model,
    verbose=True
)

# Expose the index to the agent as a LangChain Tool
tools = [
    Tool(
        name = "Keio Index",
        func=lambda q: str(index.query(q)),
        description="About Keio",
        return_direct=True
    ),    
]

chain = initialize_agent(
    tools=tools, 
    llm=llm, 
    agent="conversational-react-description", 
    verbose=True, 
    memory=ConversationBufferMemory(memory_key="chat_history")
)

print(chain.run(input="What is Keio?"))

How the Agent works in LangChain

  • How the Agent works in LangChain
    1. Some prompt is sent to GPT-3 (text-davinci-003)
    2. A response comes back (tokens up to the stop sequences are returned)
    3. The relevant parts are cut out with a regex (regex = r"Action: (.*?)\nAction Input: (.*)"); a small extraction sketch follows the prompt below

  • What kind of prompt is actually sent?
    • The tools and their how-to appear to be written straight into the prompt
    • In this example, GPT-index is used as the tool
Assistant is a large language model trained by OpenAI.

Assistant is designed to be able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics. As a language model, Assistant is able to generate human-like text based on the input it receives, allowing it to engage in natural-sounding conversations and provide responses that are coherent and relevant to the topic at hand.

Assistant is constantly learning and improving, and its capabilities are constantly evolving. It is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.

Overall, Assistant is a powerful tool that can help with a wide range of tasks and provide valuable insights and information on a wide range of topics. Whether you need help with a specific question or just want to have a conversation about a particular topic, Assistant is here to assist.

TOOLS:
------

Assistant has access to the following tools:

> Keio Index: About Keio

To use a tool, please use the following format:


Thought: Do I need to use a tool? Yes
Action: the action to take, should be one of [Keio Index]
Action Input: the input to the action
Observation: the result of the action


When you have a response to say to the Human, or if you do not need to use a tool, you MUST use the format:


Thought: Do I need to use a tool? No
AI: [your response here]


Begin!

Previous conversation history:


New input: What is Keio?
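
As a concrete illustration of step 3 above, here is a minimal sketch of what the extraction amounts to: the agent looks for the Action / Action Input lines in the raw completion using the regex quoted earlier (the llm_output string below is made up for illustration).

import re

# The regex the conversational agent uses to pull out the tool call
regex = r"Action: (.*?)\nAction Input: (.*)"

# A made-up, well-formed completion for illustration
llm_output = (
    "Thought: Do I need to use a tool? Yes\n"
    "Action: Keio Index\n"
    "Action Input: What is Keio?"
)

match = re.search(regex, llm_output)
if match is None:
    # This branch is what surfaces as "Could not parse LLM output" (next section)
    raise ValueError(f"Could not parse LLM output: `{llm_output}`")

action, action_input = match.group(1).strip(), match.group(2).strip()
print(action, "->", action_input)   # Keio Index -> What is Keio?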

When the LLM is weak

  • If you plug in a weak LLM via transformers-openai-api or the like
    • It cannot cope with the demands of the prompt above and does not produce output in the specified format
      • Concretely, this format (regex = r"Action: (.*?)\nAction Input: (.*)")
    • As a result, errors like the following are raised (a small workaround sketch follows the tracebacks)
      • ValueError: Could not parse LLM output: hogehoge
  File "/home/initial/.pyenv/versions/komei-gpt/lib/python3.8/site-packages/langchain/agents/conversational/base.py", line 83, in _extract_tool_and_input
    raise ValueError(f"Could not parse LLM output: `{llm_output}`")
ValueError: Could not parse LLM output: hogehoge
    raise ValueError(f"Could not parse LLM output: `{llm_output}`")
ValueError: Could not parse LLM output: ` Thought: Do I need to use a tool? Yes Action: [Keio Index] Index: About Keio`
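
One possible workaround, not from the original post but a minimal sketch assuming the chain and index objects from the example above: catch the parse error around chain.run and fall back to querying the index directly.

try:
    answer = chain.run(input="What is Keio?")
except ValueError as e:
    if "Could not parse LLM output" not in str(e):
        raise
    # Fall back: skip the agent loop and query the tool (the index) directly
    answer = str(index.query("What is Keio?"))

print(answer)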

Summary

  • LangChain's Agent can technically be swapped over to a free LLM, but my impression is that it does not really work well.

Author: YuWd (Yuiga Wada)
Machine learning / competitive programming / iOS / Web