LangChain + RAG cheat sheet

Paolo Morisot
Introduction to LangChain: Building Efficient LLM Applications

LangChain is a powerful ecosystem designed to simplify the development of applications based on Large Language Models (LLMs). Whether you're an experienced developer or new to artificial intelligence, LangChain offers tools and predefined chains to streamline your workflow. In this article, we'll explore the main features of LangChain, accompanied by code examples to help you get started.

1. Setting Up Hugging Face

Before diving in, make sure you have an account on Hugging Face. You'll need an API access token to interact with hosted models. Here's how to set it up:

1. Create an account on Hugging Face if you haven't already.
2. Generate an API access token from your settings page.
3. Keep your token secure; you'll need it for API calls.

Note: never share your access token publicly.
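Rather than hard-coding the token in your scripts, you can keep it in an environment variable and read it at runtime. A minimal sketch; `HUGGINGFACEHUB_API_TOKEN` is the variable name LangChain's Hugging Face integrations look for, and the helper function here is just an illustration:

```python
import os

def get_hf_token() -> str:
    """Return the Hugging Face token from the environment, or raise a clear error."""
    token = os.environ.get("HUGGINGFACEHUB_API_TOKEN", "")
    if not token:
        raise RuntimeError(
            "Set HUGGINGFACEHUB_API_TOKEN first, e.g. "
            "export HUGGINGFACEHUB_API_TOKEN=hf_your_access_token"
        )
    return token
```

This keeps the secret out of version control and lets the same code run unchanged across machines.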

2. Basic API Call to Hugging Face

Let's use an LLM hosted on Hugging Face for a simple prediction:

```python
from langchain.llms import HuggingFaceEndpoint

# Replace with your API access token
huggingfacehub_api_token = 'hf_your_access_token'

# Define the LLM
llm = HuggingFaceEndpoint(
    endpoint_url='https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct',
    huggingfacehub_api_token=huggingfacehub_api_token
)

# Ask the model a question
question = 'What can I do to improve my productivity?'
output = llm.invoke(question)

print(output)
```

3. Using Prompt Templates

Prompt templates allow you to structure your queries flexibly. Here's how to create a simple template:

```python
from langchain.prompts import PromptTemplate
from langchain.llms import HuggingFaceEndpoint

# Create a prompt template
template = "You are an artificial intelligence assistant. Answer the following question: {question}"
prompt = PromptTemplate(template=template, input_variables=["question"])

# Integrate the template with the LLM
huggingfacehub_api_token = 'hf_your_access_token'  # your token, as in section 2
llm = HuggingFaceEndpoint(
    endpoint_url='https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct',
    huggingfacehub_api_token=huggingfacehub_api_token
)
llm_chain = prompt | llm

question = "How does LangChain simplify LLM application development?"
print(llm_chain.invoke({"question": question}))
```

4. Managing Memory in Chat Models

LangChain offers several ways to manage conversation history:
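All of these memory classes ultimately manage an ordered list of role-tagged messages. A minimal plain-Python sketch of that idea (not LangChain's actual implementation; class and method names here are invented for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    role: str      # "ai" or "user"
    content: str

@dataclass
class SimpleChatHistory:
    messages: list = field(default_factory=list)

    def add_user_message(self, content: str) -> None:
        self.messages.append(Message("user", content))

    def add_ai_message(self, content: str) -> None:
        self.messages.append(Message("ai", content))

    def last(self, k: int) -> list:
        """Keep only the k most recent messages (what a windowed buffer does)."""
        return self.messages[-k:]

history = SimpleChatHistory()
history.add_ai_message("Hello! Ask me anything about Python.")
history.add_user_message("What is a list comprehension?")
print([m.role for m in history.last(1)])  # → ['user']
```

The full-history and windowed variants below differ only in whether the whole list or just the last `k` entries are replayed to the model.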

a. ChatMessageHistory

This class stores all messages exchanged during a conversation.

```python
from langchain.memory import ChatMessageHistory
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-4", temperature=0)

# Create the conversation history
history = ChatMessageHistory()
history.add_ai_message("Hello! Ask me any question about Python programming.")
history.add_user_message("What is a list comprehension in Python?")

# Pass the accumulated messages to the model
response = llm.invoke(history.messages)
print(response.content)
```
b. ConversationBufferWindowMemory

This class keeps only a defined number of recent exchanges. (The plain ConversationBufferMemory stores the full history; the `k` parameter belongs to the windowed variant.)

```python
from langchain.memory import ConversationBufferWindowMemory
from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-4", temperature=0)

# Define the windowed buffer memory (keeps the last k=4 exchanges)
memory = ConversationBufferWindowMemory(k=4)

# Create the conversation chain
buffer_chain = ConversationChain(llm=llm, memory=memory)

# Interact with the model
buffer_chain.predict(input="Explain decorators in Python.")
buffer_chain.predict(input="Can you give an example with @staticmethod?")
```

5. Sequential Chains

Sequential chains allow you to link multiple processing steps. For example, creating a learning plan:

```python
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SequentialChain
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-4", temperature=0)

# Template for the activity
learning_prompt = PromptTemplate(
    input_variables=["activity"],
    template="I want to learn how to {activity}. Can you suggest steps to achieve this?"
)

# Template for the time constraint
time_prompt = PromptTemplate(
    input_variables=["learning_plan"],
    template="I only have one week. Can you create a plan to reach this goal: {learning_plan}."
)

# Wrap each prompt in an LLMChain; the first chain's output feeds the second
learning_chain = LLMChain(llm=llm, prompt=learning_prompt, output_key="learning_plan")
time_chain = LLMChain(llm=llm, prompt=time_prompt, output_key="weekly_plan")

# Chain the steps
chain = SequentialChain(
    chains=[learning_chain, time_chain],
    input_variables=["activity"],
    output_variables=["weekly_plan"]
)

# Execute the chain
output = chain.invoke({"activity": "play the piano"})
print(output)
```

6. Agents

Agents make decisions based on the tools available to them. LangChain provides pre-built agents like the ReAct agent:

```python
from langchain.agents import load_tools, initialize_agent
from langchain.chat_models import ChatOpenAI

# Load tools
tools = load_tools(["wikipedia"])

# Define the LLM
llm = ChatOpenAI(model_name="gpt-4", temperature=0)

# Create the agent
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)

# Use the agent
response = agent.run("Summarize key facts about London, England.")
print(response)
```

7. Creating Custom Tools for Agents

You can create your own tools to extend the capabilities of agents.

```python
from langchain.agents import tool, initialize_agent
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-4", temperature=0)

# Example tool function
@tool
def retrieve_customer_info(name: str) -> str:
    """Retrieve customer information based on their name."""
    # Simulate a database
    customers = {
        "Peak Performance Co.": "Information about Peak Performance Co...",
        "Innovatech Ltd.": "Information about Innovatech Ltd..."
    }
    return customers.get(name, "Customer not found.")

# Create the agent with the custom tool
agent = initialize_agent([retrieve_customer_info], llm, agent="zero-shot-react-description", verbose=True)

# Use the agent
response = agent.run("Create a summary for our customer: Peak Performance Co.")
print(response)
```

8. Integrating Document Loaders

Document loaders allow you to import various types of data into your application.

a. Loading PDFs

```python
from langchain.document_loaders import PyPDFLoader

# Load the PDF document
loader = PyPDFLoader("rag_vs_fine_tuning.pdf")
data = loader.load()
print(data[0])
```

b. Loading CSVs

```python
from langchain.document_loaders import CSVLoader

# Load the CSV file
loader = CSVLoader("fifa_countries_audience.csv")
data = loader.load()
print(data[0])
```

c. Loading HTML

```python
from langchain.document_loaders import UnstructuredHTMLLoader

# Load the HTML file
loader = UnstructuredHTMLLoader("white_house_executive_order_nov_2023.html")
data = loader.load()
print(data[0])
```

9. Splitting Data for Retrieval

Splitting documents into smaller chunks facilitates data management and information retrieval.
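To build intuition for the `chunk_size` and `chunk_overlap` parameters used below, here is a minimal sliding-window splitter in plain Python. LangChain's splitters are smarter about separators, but the windowing idea is the same; this helper is not part of any library:

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Split text into fixed-size windows; consecutive chunks share `chunk_overlap` characters."""
    step = chunk_size - chunk_overlap
    if step <= 0:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # → ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

The overlap ensures that a sentence cut at a chunk boundary still appears whole in at least one chunk, which matters for retrieval quality.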

a. CharacterTextSplitter

```python
from langchain.text_splitter import CharacterTextSplitter

text = 'Words are flowing out like endless rain into a paper cup,\nthey slither while they pass,\nthey slip away across the universe.'
chunk_size = 24
chunk_overlap = 10

# Create an instance of the splitter
splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=chunk_size,
    chunk_overlap=chunk_overlap
)

# Split the text and print the chunks
chunks = splitter.split_text(text)
print(chunks)
print([len(chunk) for chunk in chunks])
```

b. RecursiveCharacterTextSplitter

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

text = 'Words are flowing out like endless rain into a paper cup,\nthey slither while they pass,\nthey slip away across the universe.'
chunk_size = 24
chunk_overlap = 10

# Create an instance of the splitter
splitter = RecursiveCharacterTextSplitter(
    separators=["\n", " ", ""],
    chunk_size=chunk_size,
    chunk_overlap=chunk_overlap
)

# Split the text and print the chunks
chunks = splitter.split_text(text)
print(chunks)
print([len(chunk) for chunk in chunks])
```

c. Splitting an HTML Document

```python
from langchain.document_loaders import UnstructuredHTMLLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load the HTML document
loader = UnstructuredHTMLLoader("white_house_executive_order_nov_2023.html")
data = loader.load()

chunk_size = 300
chunk_overlap = 100

# Split the HTML (separators must be a list of strings)
splitter = RecursiveCharacterTextSplitter(
    chunk_size=chunk_size,
    chunk_overlap=chunk_overlap,
    separators=["."]
)

docs = splitter.split_documents(data)
print(docs)
```

10. RAG Storage and Retrieval Using a Vector Database

Retrieval-Augmented Generation (RAG) improves accuracy by using an external knowledge base.
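Before wiring up a real vector database, the retrieval step itself is easy to sketch: embed each document (here with a trivial bag-of-words vector rather than a learned embedding), score candidates against the query by cosine similarity, and keep the top k. Everything below is a toy illustration, not LangChain or Chroma code:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; real systems use learned dense vectors."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list, k: int = 1) -> list:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "RAG augments prompts with retrieved context",
    "Fine-tuning updates the model weights",
]
print(retrieve("what does RAG retrieve", docs))  # → ['RAG augments prompts with retrieved context']
```

A vector store like Chroma does the same thing at scale: it indexes dense embeddings and answers top-k similarity queries efficiently.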

a. Using ChromaDB

```python
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnableMap, RunnablePassthrough
from langchain.chat_models import ChatOpenAI
import os

# Load and split your documents
loader = PyPDFLoader('rag_vs_fine_tuning.pdf')
data = loader.load()
splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
docs = splitter.split_documents(data)

# Create the vector database
embedding_function = OpenAIEmbeddings(openai_api_key='your_openai_api_key')
vectorstore = Chroma.from_documents(
    docs,
    embedding=embedding_function,
    persist_directory=os.getcwd()
)

# Configure the retriever
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3}
)

# Create the prompt template
message = """
Answer the following question using the context provided:

Context:
{context}

Question:
{question}

Answer:
"""

prompt_template = ChatPromptTemplate.from_messages([("human", message)])

# Define the LLM
llm = ChatOpenAI(model_name="gpt-4", temperature=0)

# Create the RAG chain: retrieve context, fill the prompt, call the LLM
rag_chain = RunnableMap({
    "context": retriever,
    "question": RunnablePassthrough()
}) | prompt_template | llm

# Execute the chain
response = rag_chain.invoke("Which popular LLMs were considered in the paper?")
print(response.content)
```

Conclusion

LangChain is a powerful tool for developing complex LLM applications with simplicity and efficiency. By leveraging features such as prompt templates, memory management, custom agents, and external data integration, you can build intelligent solutions tailored to your specific needs.

Start today and unlock the full potential of language models with LangChain!