SQL agents with LangGraph πŸ¦œπŸ•Έ️

Creating accurate SQL queries with LLMs becomes challenging as query complexity increases. Simple prompts suffice for basic SQL, but complex joins and logic require detailed prompts, iterative feedback, and error handling. This post explores building an agentic SQL generation workflow using LangGraph, a framework in the LangChain ecosystem designed for creating stateful, multi-node graphs. It explains how to set up the graph with nodes, edges, and state management, integrate error propagation without breaking flow, and optimize prompt engineering to improve SQL generation accuracy. Experiments using the Sakila database show how richer prompts—adding schema details and few-shot examples—significantly improve query quality. For consistently correct SQL, especially with complex joins, introducing SQL views is recommended.

SQL query generation

Creating SQL commands with LLMs is greatly affected by the detail given in the prompt. Plain LLM calls are enough to generate simple SQL commands, but more complex queries need more informative prompts and a trial-and-error pipeline that propagates errors, telling the LLM step by step to correct its mistakes and generate SQL commands accurately. One way to do this is to build agentic workflows with LangGraph, which achieves the same thing by adopting a graph approach. LangGraph is part of the LangChain ecosystem and focuses on building agents as directed graphs rather than chains.

LangGraph is a library for building stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows. Compared to other LLM frameworks, it offers these core benefits: cycles, controllability, and persistence.

To build an SQL agent with this framework, you need a state, nodes, and edges. Nodes are the tools (i.e. functions) that do the work, and edges are the logical routes that define how the graph moves between nodes and decides when to stop. The state stores a snapshot of the graph at any point in the run. Graphs can be either state graphs or message graphs; we use a state graph to build the SQL agent, with a Python TypedDict or a Pydantic BaseModel as the data structure that holds the graph state.

One of the main components in this workflow is the SQL query generation node: we give it the question (what we want) and the LLM generates a SQL query. A minimal sketch of that node could look like this (the names generate_query and init_llm are my own; the state and init_llm are defined below),
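
def generate_query(state):
    """Ask the LLM to turn the conversation so far into a SQL query."""
    # state is the State dict defined below; a real implementation would
    # build the LLM once instead of on every call
    llm = init_llm()  # chat model with db_query_tool bound, defined below
    response = llm.invoke(state["messages"])
    # the add_messages reducer (see the State definition) appends this
    # response to the history instead of overwriting it
    return {"messages": [response]}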

Because we use a state graph approach, all our nodes receive the state as the first argument (you can accept a config as the second argument if you need it). Here is a sketch of what our state could look like (the final_answer field is my own addition, used by the result-preparation node shown later),
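
from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph.message import add_messages

class State(TypedDict):
    # conversation between the user, the LLM, and the tools; the
    # add_messages reducer appends new messages instead of overwriting
    messages: Annotated[list, add_messages]
    # parsed JSON result, filled in by the final node
    final_answer: dict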

The init_llm function uses a LangChain chat model to initiate an LLM and binds db_query_tool to it. db_query_tool is a Python function that executes the query generated by the LLM so we can check whether it is error-free. init_llm is something like this (the model choice below is an assumption; any tool-calling chat model works),
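
from langchain_openai import ChatOpenAI

def init_llm():
    """Initiate a chat model and bind the query-execution tool to it."""
    llm = ChatOpenAI(model="gpt-4o", temperature=0)
    return llm.bind_tools([db_query_tool])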

db_query_tool is a tool by definition. There are a few ways to define a tool in LangGraph; the easiest is a Python function with an informative docstring, like this (the connection URI is a placeholder for your own database),
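
from langchain_community.utilities import SQLDatabase

database = SQLDatabase.from_uri("mysql+pymysql://user:password@localhost/sakila")

def db_query_tool(query: str) -> str:
    """Execute a SQL query against the database and return the result.

    If the query is not correct, an error message is returned instead
    of an exception being raised.
    """
    result = database.run_no_throw(query, include_columns=True)
    if not result:
        return "Error: query failed. Please rewrite your query and try again."
    return result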

or use the @tool decorator to convert the Python function into a LangGraph tool; a class-based approach works as well. The decorator version of the same function is,
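
from langchain_core.tools import tool

@tool
def db_query_tool(query: str) -> str:
    """Execute a SQL query against the database and return the result."""
    return database.run_no_throw(query, include_columns=True)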

We also need a node to actually run the SQL query and get its output; that output is either an error message or, if the query runs successfully, the retrieved data. Even though we bound db_query_tool to the LLM earlier, the LLM cannot execute the tool itself, which is why we create a tool node: the LLM call generates the query and sends it to the tool node by invoking the bound tool. Before creating the tool node, we need a mechanism that propagates errors through the workflow without raising exceptions and breaking the flow when a query fails. This puts a feedback loop into the workflow, so SQL queries are generated by trial and error.

Error propagation

LangGraph also has ways to achieve this without much fuss, like the sketch below, which converts a raised tool exception into a message in the state,
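
from langchain_core.messages import ToolMessage

def handle_tool_error(state) -> dict:
    """Convert a raised tool exception into a ToolMessage in the state."""
    error = state.get("error")
    tool_calls = state["messages"][-1].tool_calls
    return {
        "messages": [
            ToolMessage(
                content=f"Error: {repr(error)} Please fix your mistakes.",
                tool_call_id=tc["id"],
            )
            for tc in tool_calls
        ]
    }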

This is how we let the LLM know it made a mistake in SQL generation that raised an error during execution. In db_query_tool we execute SQL queries through the SQLDatabase wrapper class with results = database.run_no_throw(query, include_columns=True), which does not raise exceptions that break execution; instead, we surface failures as messages in the state so they can be passed back to the LLM as feedback.

Then we create our tool node to execute SQL queries and hand errors back as feedback that improves the LLM's generations. A sketch, attaching the error handler above as a fallback,
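
from langchain_core.runnables import RunnableLambda
from langgraph.prebuilt import ToolNode

def create_tool_node_with_fallback(tools):
    """Build a ToolNode whose exceptions are reported back into the
    state (via handle_tool_error) instead of crashing the graph."""
    return ToolNode(tools).with_fallbacks(
        [RunnableLambda(handle_tool_error)], exception_key="error"
    )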

The last node we need prepares the results we got from executing the SQL generated by the LLM. Because I need the results from the database as JSON objects, I decided to use an output parser from LangChain. A sketch of this node (the prompt wording and node name are my own),
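
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate

def prepare_results(state: State):
    """Format the rows returned by the database as a JSON object."""
    prompt = ChatPromptTemplate.from_template(
        "Convert this SQL query result into a JSON object:\n\n{result}"
    )
    chain = prompt | ChatOpenAI(model="gpt-4o", temperature=0) | JsonOutputParser()
    # the last message in the state is the tool output from execute_query
    answer = chain.invoke({"result": state["messages"][-1].content})
    return {"final_answer": answer}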

Those are all the nodes I need to create for this SQL agent. If you need extra steps, you can add more nodes to the graph.

Feedback loop

In LangGraph there are a few types of edges: normal, conditional, entry, and conditional entry. Each has a different purpose; for this agent we only need normal and conditional edges. Conditional edges are routers that decide which node comes next based on the implemented logic. The logic we need for SQL generation is: if the LLM generates error-free SQL, execute it and fetch the data; if not, go back to the LLM with feedback that says 'the query you created is wrong, and this is what I got! Create a new one based on this feedback.' To do this we use add_conditional_edges.

workflow.add_conditional_edges("execute_query", should_continue)

To do the routing we need a Python function with the above logic implemented. It is something like this (a sketch; it relies on failed runs producing messages prefixed with 'Error:', as our tool and error handler do),
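
def should_continue(state: State):
    """Route based on the output of the last executed query."""
    last_message = state["messages"][-1]
    if "Error:" in last_message.content:
        return "generate_query"   # feed the failure back to the LLM
    return "prepare_results"      # query ran fine, format the result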

Complete graph

This is what it looks like when we put all those parts together to build our agent. START and END are special nodes that mark the entry point and the exit point of the graph. A sketch of the full assembly (it assumes the LLM always responds with a tool call),
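
from langgraph.graph import StateGraph, START, END

workflow = StateGraph(State)
workflow.add_node("generate_query", generate_query)
workflow.add_node("execute_query", create_tool_node_with_fallback([db_query_tool]))
workflow.add_node("prepare_results", prepare_results)

workflow.add_edge(START, "generate_query")
# assumes generate_query always emits a tool call for execute_query to run
workflow.add_edge("generate_query", "execute_query")
workflow.add_conditional_edges("execute_query", should_continue)
workflow.add_edge("prepare_results", END)

app = workflow.compile()

# usage:
# app.invoke({"messages": [("user", "what are the top 3 in terms of revenue, PG rating films?")]})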

Prompt engineering

As always, the prompt matters more than most people realize for getting better outputs from LLMs. Two things we must do: give the LLM the schemas of the tables we expect it to use, and annotate the columns as clearly as possible. Call out primary keys, foreign keys, de-duplication concerns and, most importantly, the possible values of categorical columns whenever categorical columns are used; otherwise the LLM will hallucinate when user queries refer to them.

Let's run some prompt engineering experiments to see how SQL generation improves with each iteration of prompt optimization.

  • database: Sakila
  • tables: payment, rental, inventory, film
  • user question: what are the top 3 in terms of revenue, PG rating films?

Exp 01: initial prompt
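
The initial prompt simply lists the table schemas and asks for a query, along these lines (a minimal reconstruction, not the exact original text),

You are a MySQL expert. Given the table schemas below, write a
syntactically correct MySQL query that answers the user's question.
Return only the SQL query.

Tables: payment, rental, inventory, film
(CREATE TABLE statements for the four tables go here)

Question: what are the top 3 in terms of revenue, PG rating films?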

Wrong query:

SELECT title, rental_rate FROM film WHERE rating = 'PG' ORDER BY rental_rate DESC LIMIT 3;

Exp 02: initial prompt + column descriptions

Let's give more details about the columns in our tables. A condensed illustration of the kind of descriptions to add (the wording is mine; the schema is Sakila's),
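
  • film.film_id: primary key of the film table
  • film.title: title of the film
  • film.rating: MPAA rating; categorical, one of G, PG, PG-13, R, NC-17
  • inventory.inventory_id: primary key; one row per physical copy of a film
  • inventory.film_id: foreign key referencing film.film_id
  • rental.rental_id: primary key of the rental table
  • rental.inventory_id: foreign key referencing inventory.inventory_id
  • payment.rental_id: foreign key referencing rental.rental_id
  • payment.amount: amount the customer paid for the rental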

Let's add these descriptions to the prompt and see how the result improves.

Wrong (but close) query: 

SELECT f.title, SUM(p.amount) as revenue FROM film f JOIN inventory i ON f.film_id = i.film_id JOIN rental r ON i.inventory_id = r.inventory_id JOIN payment p ON r.rental_id = p.rental_id WHERE f.rating='PG' GROUP BY f.title ORDER BY revenue DESC LIMIT 3;

Exp 03: initial prompt + column descriptions + few shot examples

Let's give some queries with joins and subqueries as examples, so the LLM gets an idea of how to combine these tables to answer questions. For instance, question–query pairs like the following (illustrative examples, not the exact ones used),
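
-- Q: how many times has each film been rented?
SELECT f.title, COUNT(r.rental_id) AS times_rented
FROM film f
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON i.inventory_id = r.inventory_id
GROUP BY f.title;

-- Q: what is the total revenue of each store?
SELECT i.store_id, SUM(p.amount) AS revenue
FROM payment p
JOIN rental r ON p.rental_id = r.rental_id
JOIN inventory i ON r.inventory_id = i.inventory_id
GROUP BY i.store_id;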

Correct query (though the agent is not consistent):

SELECT title, sum(amount * rental_period) as revenue FROM film JOIN (SELECT i.film_id, r.* FROM inventory AS i JOIN (SELECT rental.inventory_id, rental.rental_id, amount, DATEDIFF(return_date, rental_date) as rental_period FROM rental JOIN payment on rental.rental_id = payment.rental_id) AS r ON i.inventory_id = r.inventory_id) AS t ON film.film_id = t.film_id WHERE rating = 'PG' GROUP BY title ORDER BY revenue DESC LIMIT 3;

With a few examples the agent did eventually generate the correct SQL, but it does not consistently produce correct queries.

Normally LLMs do well on simple SQL queries, but here they struggle with queries that need complex joins to get the correct answer. One way to overcome this is to use a SQL view, which removes the need for complex joins. To do that, we have to decide what we actually want from the database; based on that we can create a view, hand it to the agent, and let the SQL querying run against the view. For our revenue question, an illustrative view (the name and definition are mine) could be,
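
-- precompute the joins once so the agent only needs a simple SELECT
CREATE VIEW film_revenue AS
SELECT f.film_id, f.title, f.rating, SUM(p.amount) AS revenue
FROM film f
JOIN inventory i ON f.film_id = i.film_id
JOIN rental r ON i.inventory_id = r.inventory_id
JOIN payment p ON r.rental_id = p.rental_id
GROUP BY f.film_id, f.title, f.rating;

-- the agent's query then reduces to:
-- SELECT title, revenue FROM film_revenue WHERE rating = 'PG' ORDER BY revenue DESC LIMIT 3;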

