当前位置: 首页 > news >正文

龙虎和时时彩建设网站化妆品营销推广方案

龙虎和时时彩建设网站,化妆品营销推广方案,wordpress学校网站模板,浅析小型企业网站的建设LangChain学习文档 【LangChain】检索器(Retrievers)【LangChain】检索器之MultiQueryRetriever【LangChain】检索器之上下文压缩 上下文压缩 LangChain学习文档 概要内容使用普通向量存储检索器使用 LLMChainExtractor 添加上下文压缩(Adding contextual compression with an…

LangChain学习文档

  • 【LangChain】检索器(Retrievers)
  • 【LangChain】检索器之MultiQueryRetriever
  • 【LangChain】检索器之上下文压缩

上下文压缩

    • LangChain学习文档
  • 概要
  • 内容
  • 使用普通向量存储检索器
  • 使用 LLMChainExtractor 添加上下文压缩(Adding contextual compression with an LLMChainExtractor)
  • 更多内置压缩机:过滤器(More built-in compressors: filters)
    • LLMChainFilter
    • EmbeddingsFilter
  • 将压缩器和文档转换器串在一起(Stringing compressors and document transformers together)
  • 总结

概要

检索的一项挑战是,通常我们不知道:当数据引入系统时,文档存储系统会面临哪些特定查询。

这意味着与查询最相关的信息可能被隐藏在包含大量不相关文本的文档中。

通过我们的应用程序传递完整的文件可能会导致更昂贵的llm通话和更差的响应。

上下文压缩旨在解决这个问题。

这个想法很简单:我们可以使用给定查询的上下文来压缩它们,以便只返回相关信息,而不是立即按原样返回检索到的文档。

这里的“压缩”既指压缩单个文档的内容,也指批量过滤文档。

要使用上下文压缩检索器,我们需要:

  • 基础检索器
  • 文档压缩器

上下文压缩检索器将查询传递给基础检索器,获取初始文档并将它们传递给文档压缩器。文档压缩器获取文档列表并通过减少文档内容或完全删除文档来缩短它。

在这里插入图片描述

内容

# 打印文档的辅助功能def pretty_print_docs(docs):print(f"\n{'-' * 100}\n".join([f"Document {i+1}:\n\n" + d.page_content for i, d in enumerate(docs)]))

使用普通向量存储检索器

让我们首先初始化一个简单的向量存储检索器并存储 2023 年国情咨文演讲(以块的形式)。我们可以看到,给定一个示例问题,我们的检索器返回一两个相关文档和一些不相关的文档。甚至相关文档中也有很多不相关的信息。

from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.document_loaders import TextLoader
from langchain.vectorstores import FAISS
# 加载文档
documents = TextLoader('../../../state_of_the_union.txt').load()
# 拆分器
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
# 拆分文档
texts = text_splitter.split_documents(documents)
# 构建索引,并构建检索器
retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()
# 运行
docs = retriever.get_relevant_documents("What did the president say about Ketanji Brown Jackson")
# 美化打印
pretty_print_docs(docs)

结果:

    Document 1:Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.----------------------------------------------------------------------------------------------------Document 2:A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. We can do both. At our border, we’ve installed new technology like cutting-edge scanners to better detect drug smuggling.  We’ve set up joint patrols with Mexico and Guatemala to catch more human traffickers.  We’re putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. We’re securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.----------------------------------------------------------------------------------------------------Document 3:And for our LGBTQ+ Americans, let’s finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. While it often appears that we never agree, that isn’t true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. And soon, we’ll strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. So tonight I’m offering a Unity Agenda for the Nation. Four big things we can do together.  First, beat the opioid epidemic.----------------------------------------------------------------------------------------------------Document 4:Tonight, I’m announcing a crackdown on these companies overcharging American businesses and consumers. And as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up.  That ends on my watch. Medicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect. We’ll also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees. Let’s pass the Paycheck Fairness Act and paid leave.  Raise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty. Let’s increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls America’s best-kept secret: community colleges.

使用 LLMChainExtractor 添加上下文压缩(Adding contextual compression with an LLMChainExtractor)

现在让我们用 ContextualCompressionRetriever 包装我们的基本检索器。我们将添加一个 LLMChainExtractor,它将迭代最初返回的文档,并从每个文档中仅提取与查询相关的内容。

from langchain.llms import OpenAI
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
# 构建大模型
llm = OpenAI(temperature=0)
# 从大模型中构建LLMChainExtractor
compressor = LLMChainExtractor.from_llm(llm)
# 构建压缩检索器
compression_retriever = ContextualCompressionRetriever(base_compressor=compressor, base_retriever=retriever)
# 运行
compressed_docs = compression_retriever.get_relevant_documents("What did the president say about Ketanji Jackson Brown")
# 美化打印
pretty_print_docs(compressed_docs)

结果:

    Document 1:"One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence."----------------------------------------------------------------------------------------------------Document 2:"A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans."

更多内置压缩机:过滤器(More built-in compressors: filters)

LLMChainFilter

LLMChainFilter 是稍微简单但更强大的压缩器,它使用 LLM Chain来决定过滤掉最初检索到的文档中的哪些文档以及返回哪些文档,而无需操作文档内容。

from langchain.retrievers.document_compressors import LLMChainFilter# 构建LLMChainFilter
_filter = LLMChainFilter.from_llm(llm)
# 构建上下文压缩检索器
compression_retriever = ContextualCompressionRetriever(base_compressor=_filter, base_retriever=retriever)
# 运行
compressed_docs = compression_retriever.get_relevant_documents("What did the president say about Ketanji Jackson Brown")
# 美化打印
pretty_print_docs(compressed_docs)
    Document 1:Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.

EmbeddingsFilter

对每个检索到的文档进行额外的 LLM 调用既昂贵又缓慢。 EmbeddingsFilter 通过嵌入文档和查询并仅返回那些与查询具有足够相似嵌入的文档来提供更便宜且更快的选项。

from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers.document_compressors import EmbeddingsFilter
# 构建嵌入
embeddings = OpenAIEmbeddings()
# 构建EmbeddingsFilter
embeddings_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)
# 构建上下文压缩检索器
compression_retriever = ContextualCompressionRetriever(base_compressor=embeddings_filter, base_retriever=retriever)
# 运行
compressed_docs = compression_retriever.get_relevant_documents("What did the president say about Ketanji Jackson Brown")
# 美化打印
pretty_print_docs(compressed_docs)

结果:

    Document 1:Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.----------------------------------------------------------------------------------------------------Document 2:A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. We can do both. At our border, we’ve installed new technology like cutting-edge scanners to better detect drug smuggling.  We’ve set up joint patrols with Mexico and Guatemala to catch more human traffickers.  We’re putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. We’re securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.----------------------------------------------------------------------------------------------------Document 3:And for our LGBTQ+ Americans, let’s finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. While it often appears that we never agree, that isn’t true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. And soon, we’ll strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. So tonight I’m offering a Unity Agenda for the Nation. Four big things we can do together.  First, beat the opioid epidemic.

将压缩器和文档转换器串在一起(Stringing compressors and document transformers together)

使用 DocumentCompressorPipeline 我们还可以轻松地按顺序组合多个压缩器。除了压缩器之外,我们还可以将 BaseDocumentTransformers 添加到管道中,它不执行任何上下文压缩,而只是对一组文档执行一些转换。

例如,TextSplitters 可以用作文档转换器,将文档分割成更小的部分,而 EmbeddingsRedundantFilter 可以用于根据文档之间嵌入的相似性来过滤掉冗余文档。

下面我们创建一个压缩器管道,首先将文档分割成更小的块,然后删除冗余文档,然后根据与查询的相关性进行过滤。

from langchain.document_transformers import EmbeddingsRedundantFilter
from langchain.retrievers.document_compressors import DocumentCompressorPipeline
from langchain.text_splitter import CharacterTextSplitter
# 构建拆分器
splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=0, separator=". ")
# 构建EmbeddingsRedundantFilter
redundant_filter = EmbeddingsRedundantFilter(embeddings=embeddings)
# 构建嵌入过滤器:EmbeddingsFilter
relevant_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)
# 构建文档管道
pipeline_compressor = DocumentCompressorPipeline(transformers=[splitter, redundant_filter, relevant_filter]
)
# 构建上下文检索器
compression_retriever = ContextualCompressionRetriever(base_compressor=pipeline_compressor, base_retriever=retriever)
# 运行
compressed_docs = compression_retriever.get_relevant_documents("What did the president say about Ketanji Jackson Brown")
# 美化打印
pretty_print_docs(compressed_docs)

结果:

    Document 1:One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson----------------------------------------------------------------------------------------------------Document 2:As I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. While it often appears that we never agree, that isn’t true. I signed 80 bipartisan bills into law last year----------------------------------------------------------------------------------------------------Document 3:A former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder

总结

我们在进行文档搜索的时候,正相关的文档是少部分,大部分都是不相关的文档。
我们可以使用上下文压缩检索器,只返回正相关的那部分文档。

主要步骤:

  1. 构建一个普通检索器:retriever = FAISS.from_documents(texts, OpenAIEmbeddings()).as_retriever()
  2. 构建一个上下文压缩检索器:ContextualCompressionRetriever(base_compressor=embeddings_filter, base_retriever=retriever)

特别是第二步骤:构建上下文压缩器的第一个参数,有很多花样:
① LLMChainExtractor 提取,精炼
② LLMChainFilter 普通过滤
③ EmbeddingsFilter 嵌入过滤
④ DocumentCompressorPipeline 文档管道,可以将多个过滤器组合在一起。

参考地址:

https://python.langchain.com/docs/modules/data_connection/retrievers/how_to/contextual_compression/


文章转载自:
http://creature.fwrr.cn
http://subcenter.fwrr.cn
http://shapelessly.fwrr.cn
http://ingurgitate.fwrr.cn
http://fragrance.fwrr.cn
http://paralysis.fwrr.cn
http://empyemata.fwrr.cn
http://sack.fwrr.cn
http://stereo.fwrr.cn
http://multiangular.fwrr.cn
http://suppliant.fwrr.cn
http://pronucleus.fwrr.cn
http://thecodontian.fwrr.cn
http://millwright.fwrr.cn
http://bluestem.fwrr.cn
http://pleximeter.fwrr.cn
http://monofilament.fwrr.cn
http://paviser.fwrr.cn
http://frigga.fwrr.cn
http://reradiative.fwrr.cn
http://advocation.fwrr.cn
http://versifier.fwrr.cn
http://cantilation.fwrr.cn
http://dizzy.fwrr.cn
http://phytotomy.fwrr.cn
http://eminence.fwrr.cn
http://photomultiplier.fwrr.cn
http://headband.fwrr.cn
http://chariot.fwrr.cn
http://reis.fwrr.cn
http://vitebsk.fwrr.cn
http://attired.fwrr.cn
http://beehive.fwrr.cn
http://chromogen.fwrr.cn
http://inflect.fwrr.cn
http://collieshangie.fwrr.cn
http://ignominious.fwrr.cn
http://dee.fwrr.cn
http://heaviness.fwrr.cn
http://minipark.fwrr.cn
http://washbasin.fwrr.cn
http://assaying.fwrr.cn
http://ilex.fwrr.cn
http://throstle.fwrr.cn
http://bustle.fwrr.cn
http://demonography.fwrr.cn
http://informer.fwrr.cn
http://lungi.fwrr.cn
http://mesenchyme.fwrr.cn
http://pheasant.fwrr.cn
http://collaboration.fwrr.cn
http://rococo.fwrr.cn
http://ahead.fwrr.cn
http://drugget.fwrr.cn
http://sm.fwrr.cn
http://infructescence.fwrr.cn
http://nominate.fwrr.cn
http://nizam.fwrr.cn
http://phototropism.fwrr.cn
http://cotonou.fwrr.cn
http://copen.fwrr.cn
http://pbp.fwrr.cn
http://cincinnati.fwrr.cn
http://crossed.fwrr.cn
http://misalignment.fwrr.cn
http://primo.fwrr.cn
http://ergotism.fwrr.cn
http://gunlock.fwrr.cn
http://nesslerize.fwrr.cn
http://thanatopsis.fwrr.cn
http://pycnogonid.fwrr.cn
http://pulsation.fwrr.cn
http://furthermore.fwrr.cn
http://naupathia.fwrr.cn
http://moviedom.fwrr.cn
http://otaru.fwrr.cn
http://cardiodynia.fwrr.cn
http://succinct.fwrr.cn
http://starfish.fwrr.cn
http://buccal.fwrr.cn
http://hydrodynamicist.fwrr.cn
http://aeronef.fwrr.cn
http://inkwell.fwrr.cn
http://tilefish.fwrr.cn
http://shackle.fwrr.cn
http://airdrome.fwrr.cn
http://dynastic.fwrr.cn
http://havarti.fwrr.cn
http://twiggery.fwrr.cn
http://erastian.fwrr.cn
http://hellweed.fwrr.cn
http://polymerase.fwrr.cn
http://expensively.fwrr.cn
http://undiagnosed.fwrr.cn
http://phantasmic.fwrr.cn
http://parodos.fwrr.cn
http://pickax.fwrr.cn
http://guidance.fwrr.cn
http://eternal.fwrr.cn
http://longevity.fwrr.cn
http://www.dt0577.cn/news/114105.html

相关文章:

  • 做网站价格公司武汉seo公司
  • 财务软件开发公司简介电脑系统优化软件哪个好用
  • 站长工具里查看的网站描述和关键词都不显示营销宣传图片
  • 冬奥会建设官方网站头条搜索站长平台
  • 福建省龙岩市新罗区建设局网站上海建站seo
  • 做公众好号的网站seo上首页排名
  • 新浪 博客可以做网站优化吗青岛关键词排名提升
  • 专门做微信推送的网站企业员工培训总结
  • 中山币做网站公司长春做网站推荐选吉网传媒好
  • 网站设计的流程是怎样的百度扫一扫网页版
  • 做系统的网站seo客服
  • 怎吗做网站挣钱域名比价网
  • 金华网站开发建设营销策略手段有哪些
  • 怎么做苹果手机网站首页宣传网页制作
  • 自己做音乐网站挣钱吗深圳最新通告今天
  • 外贸网站建设维护搜索引擎优化的例子
  • 做门户网站怎么赚钱百度游戏官网
  • 医院网站建设情况说明电商运营培训机构哪家好
  • 做网站创业怎么样搜狗指数官网
  • 企业网站seo优网站建设优化公司
  • wordpress域名网站搬家武汉服装seo整站优化方案
  • 萧县哪有做网站的百度教育网站
  • 网站转出网络域名怎么查
  • 西宁网站策划公司深圳全网推广托管
  • 西宁网站设计自助建站系统下载
  • 保险网站建设竞价推广托管
  • 视频背景音乐怎么做mp3下载网站关键词搜索优化
  • 公众号排版怎么做seo人才网
  • 企业网站建设验收私密浏览器免费版
  • 用别人服务器做网站seo业务培训