Running GPT4All for your PostgreSQL with LangChain

May 21, 2023 - 4 mins read

Author: See Hiong

In this post, I will walk you through the process of setting up Python GPT4All on my Windows PC. Additionally, I will demonstrate how to utilize the power of GPT4All along with SQL Chain for querying a postgreSQL database.

Prerequisites

Before we proceed with the installation process, it is important to have the necessary prerequisites in place.

To follow along with this guide, make sure you have the following:

Jupyter Notebook installed using Anaconda
PostgreSQL setup and ready to use

Having these prerequisites fulfilled is crucial for the successful execution of the upcoming steps.

Furthermore, for this demonstration, I have downloaded the GPT4All-J v1.3-groovy model.

Preparations

To begin, open Jupyter Notebook and install the necessary dependencies by running the following command:

pip install langchain azure.core gpt4all psycopg2

Once the dependencies are installed, proceed with setting up LangChain and loading the model. Use the following code snippet:

from langchain import PromptTemplate, LLMChain
from langchain.llms import GPT4All
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

local_path='X:/ggml-gpt4all-j-v1.3-groovy.bin'
callbacks = [StreamingStdOutCallbackHandler()]
llm = GPT4All(model=local_path, callbacks=callbacks, verbose=True)

Here’s an image showcasing the Jupyter Notebook setup with LangChain and GPT4All model:

jupyter-langchain-gpt4all-model

For the purposes of this demonstration, I utilized the PostgreSQL version of Open Source Shakespeare site as the sample database. We will be performing queries against this Shakespearean data throughout the guide.

To visualize the structure of the database, an Entity-Relationship (ER) diagram can be immensely helpful. Here is the ER diagram as viewed in DBeaver:

dbeaver-open-source-shakespeare-er-diagram

The diagram provides a comprehensive overview of the entities and their relationships within the database, aiding in understanding the data model and facilitating the execution of queries.

Querying GPT4All

Now that we have set up the prompt template and LangChain, we can proceed with querying GPT4All:

template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])

llm_chain = LLMChain(prompt=prompt, llm=llm)
question = "How does Shakespeare present the love between Romeo and Juliet?"
llm_chain.run(question)

langchain-gpt4all-query

Querying GPT4All with PostgreSQL

To enable querying from PostgreSQL using SQLChain, we need to set up the SQLDatabase and SQLDatabaseChain. Here’s an example:

from langchain import SQLDatabase, SQLDatabaseChain

# Set up the SQLDatabase by providing the PostgreSQL connection URI and specifying the tables to include
db = SQLDatabase.from_uri("postgresql://postgres:postgres@192.168.68.132:5432/postgres", include_tables=['paragraph','chapter','character','work'])

# Define the prompt template
_DEFAULT_TEMPLATE = """Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
Use the following format:

Question: "Question here"
SQLQuery: "SQL Query to run"
SQLResult: "Result of the SQLQuery"
Answer: "Final answer here"

Only use the following tables:

{table_info}

If someone asks for the book written, they really mean the work table.

Question: {input}"""
PROMPT = PromptTemplate(
    input_variables=["input", "table_info", "dialect"], template=_DEFAULT_TEMPLATE
)

# Create an instance of SQLDatabaseChain
db_chain = SQLDatabaseChain.from_llm(llm, db, prompt=PROMPT, verbose=True)

With these setups in place, you can proceed to query PostgreSQL using SQLChain and prompt GPT4All with the obtained results:

response = db_chain("How many books are there in the work table")

langchain-gpt4all-db-chain

Please note that generating the response may take some time, especially on a Windows PC. The process involves interacting with the database, running queries, and generating a response using GPT4All, which can be computationally intensive.

Querying OpenAI with PostgreSQL

As a comparison, you can also utilize OpenAI for querying PostgreSQL using SQLChain. Here’s an example:

from langchain.llms import OpenAI
import os
OpenAI.api_key = os.getenv('OPENAI_API_KEY')
llm = OpenAI()

_DEFAULT_TEMPLATE = """Given an input question, first create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
Use the following format:

Question: "Question here"
SQLQuery: "SQL Query to run"
SQLResult: "Result of the SQLQuery"
Answer: "Final answer here"

Only use the following tables:

{table_info}

If someone asks for the book written, they really mean the work table.

Question: {input}"""

from langchain import PromptTemplate
PROMPT = PromptTemplate(
    input_variables=["input", "table_info", "dialect"], template=_DEFAULT_TEMPLATE
)

from langchain import SQLDatabase, SQLDatabaseChain
db = SQLDatabase.from_uri("postgresql://postgres:postgres@192.168.68.132:5432/postgres", include_tables=['paragraph','chapter','character','work'])
db_chain = SQLDatabaseChain.from_llm(llm, db, prompt=PROMPT, verbose=True)
db_chain("How many books are there in the work table")

With this setup, you can query PostgreSQL using SQLChain and OpenAI. The response from OpenAI is generally faster compared to GPT4All:

langchain-openai-db-chain

Please note that you need to provide your OpenAI API key in the OpenAI.api_key variable for authentication.