LlamaIndex in Practice: ChatEngine Context Mode

一铭

一铭 · Published 2024-11-28 20:00:02

Overview

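The ContextChatEngine class is a context-based chat engine. For each message it retrieves relevant context, places that context in the system prompt, and has the language model (LLM) generate the response, with the aim of a smooth chat experience.

It is a simple chat mode built on top of a retriever. For each chat interaction it:

First retrieves text from the index using the user message
Sets the retrieved text as context in the system prompt
Returns an answer to the user message

This approach is simple and works well for questions that relate directly to the knowledge base, as well as for general interactions.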
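The per-message flow described above can be sketched in plain Python. This is only a toy illustration, not the LlamaIndex implementation: `ToyContextChat`, its word-overlap retriever, and `echo_llm` are hypothetical stand-ins invented here, and a real engine would use an embedding-based retriever and an actual LLM.

```python
from dataclasses import dataclass, field


@dataclass
class ToyContextChat:
    """Toy stand-in for a context chat engine; not the LlamaIndex class."""

    documents: list[str]
    system_prompt: str
    history: list[tuple[str, str]] = field(default_factory=list)

    def retrieve(self, message: str, top_k: int = 1) -> list[str]:
        # Step 1: rank documents by how many words they share with the message.
        # A real retriever would compare embeddings instead.
        query_words = set(message.lower().split())
        ranked = sorted(
            self.documents,
            key=lambda doc: len(query_words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]

    def build_messages(self, message: str) -> list[tuple[str, str]]:
        # Step 2: put the retrieved text into the system prompt as context.
        context = "\n".join(self.retrieve(message))
        system = f"{self.system_prompt}\n\nContext:\n{context}"
        return [("system", system), *self.history, ("user", message)]

    def chat(self, message: str, llm) -> str:
        # Step 3: send the messages to the LLM and record the exchange.
        answer = llm(self.build_messages(message))
        self.history += [("user", message), ("assistant", answer)]
        return answer


def echo_llm(messages):
    # Hypothetical placeholder for a real LLM call: reports what it received.
    return f"received {len(messages)} messages"


engine = ToyContextChat(
    documents=["Paul Graham wrote essays.", "Bananas are yellow fruit."],
    system_prompt="You are a helpful chatbot.",
)
first = engine.chat("Who wrote essays?", echo_llm)
second = engine.chat("Tell me more about bananas", echo_llm)
```

Note that retrieval runs again on every turn with the latest user message, so the context placed in the system prompt changes from one message to the next while the chat history accumulates.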

Implementation logic

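Set up a local LLM. This example uses the gemma2 model; other models can be configured instead.
Build an index from the documents
Define a memory buffer to hold the chat history
Convert the index into a chat engine with index.as_chat_engine, setting chat_mode and passing in the memory buffer for the chat history.

Note: because the retrieved context can take up a large share of the LLM's available context window, make sure to configure a small token limit for the chat history: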

memory = ChatMemoryBuffer.from_defaults(token_limit=1500)
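To make the effect of a token limit concrete, here is a rough sketch of how a token-limited history buffer can drop the oldest messages. It is an assumption-laden toy, not ChatMemoryBuffer's actual code: `count_tokens` and `trim_history` are hypothetical helpers, and counting one token per whitespace-separated word is a simplification that a real tokenizer would not match.

```python
def count_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word.
    # Real tokenizers usually produce more tokens than this.
    return len(text.split())


def trim_history(messages: list[str], token_limit: int) -> list[str]:
    """Keep the most recent messages whose total size fits in token_limit."""
    kept: list[str] = []
    used = 0
    # Walk backwards so the newest messages are considered first.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > token_limit:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = [
    "Hello!",
    "Hi, what would you like to know?",
    "What did Paul Graham do growing up?",
    "He wrote short stories and programmed.",
]
# With a budget of 14 "tokens", only the last two messages fit.
recent = trim_history(history, token_limit=14)
```

The smaller the limit, the more room is left for retrieved context in the prompt, at the cost of the model forgetting earlier turns sooner.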

Implementation code

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

local_model = "/opt/models/BAAI/bge-base-en-v1.5"
# bge-base embedding model
Settings.embed_model = HuggingFaceEmbedding(model_name=local_model)

# ollama
Settings.llm = Ollama(model="gemma2", request_timeout=360.0)

data = SimpleDirectoryReader(input_dir="./data/paul_graham/").load_data()
index = VectorStoreIndex.from_documents(data)

from llama_index.core.memory import ChatMemoryBuffer
memory = ChatMemoryBuffer.from_defaults(token_limit=1500)

# Build the chat engine
chat_engine = index.as_chat_engine(
    chat_mode="context",
    memory=memory,
    system_prompt=(
        "You are a chatbot, able to have normal interactions, as well as talk"
        " about an essay discussing Paul Graham's life."
    ),
)

# Try it out
response = chat_engine.chat("Hello!")
print(response)

response = chat_engine.chat("What did Paul Graham do growing up?")
print(response)

response = chat_engine.chat("Can you tell me more?")
print(response)


print("--------------reset chat-------------------------")
chat_engine.reset()
response = chat_engine.chat("Hello! What do you know?")
print(response)

Output

As the output below shows, responses differ noticeably from one LLM to another. This has to do with our ...

$ python chat_context.py 
Hello! 👋

I'm ready to chat.  Is there anything you'd like to know about Paul Graham or his essay? I have access to the text at `/root/work/do_llamaindex/data/paul_graham/paul_graham_essay.txt`, so I can answer questions about it directly. 

What are you curious about? 😊  

According to the essay, Paul Graham was a bit of a loner growing up and spent a lot of time reading science fiction and learning to program. 

He even started writing his own programs at a young age, showing early signs of his future success in the tech world.  He wasn't particularly interested in traditional school activities but found joy and fulfillment in exploring the world of computers.


Do you want to know more about his childhood or something else from the essay?
Sure! 

The essay describes Graham as being quite solitary as a child. He wasn't very interested in sports or typical social activities that other kids enjoyed. Instead, he preferred immersing himself in books, particularly science fiction, and teaching himself to program.  He even built his own computer from scratch at one point! 

His parents recognized his unique interests and supported him in pursuing them. They encouraged his love of learning and provided him with the space and resources to explore his passions. 

This early focus on self-directed learning and technical pursuits would undoubtedly shape Graham's future path as a successful programmer, entrepreneur, and influential figure in the tech world.


Is there anything specific about his childhood that you'd like to know more about?
--------------reset chat-------------------------
Hello! I know that I have access to an essay about Paul Graham's life located at the file path /root/work/do_llamaindex/data/paul_graham/paul_graham_essay.txt. 

I can tell you things about the essay's content, but I haven't actually read it yet. Would you like me to open the file and summarize it for you? Or perhaps you have some specific questions about Paul Graham that you'd like me to try and answer based on the information in the essay?

Summary

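Buffering the chat history gives the model conversational context, which can make its answers more accurate. That said, I don't think this buffer should be relied on entirely: the amount of data it can hold is limited, and the retrieval of relevant context can itself be inaccurate.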
