LLM-Powered Agent Pathfinding (Part 1)

Given a 3x3 Matrix, the AI Agent Must Find the Single Valid Move

In this series, I use Llama3 prompts for agent pathfinding.

My long-term hobby is to build a multi-agent simulation framework, using LLMs as the “brain” for each AI agent.

Here’s a screenshot of version 0.0001 😅 


Pathfinding is crucial:

Agents must be able to navigate the simulation!

Here’s a YouTube version of this post, if you prefer watching me code and test prompts in real time:

What are Agents?

First, an introduction to agents:

AI agents are autonomous programs that interact with their environment, make decisions, process information and data, respond to feedback, and take actions to achieve specific goals

all without human intervention!

Traditional agents were limited by hand-coded heuristic rules and struggled to generalize across diverse tasks and situations.

But LLMs change everything.

LLMs handle complex tasks and generalize well.

Because of their emergent abilities, a single LLM can handle a multitude of tasks and collect diverse types of feedback to improve decision making.

LLMs are the “brain” powering next-generation AI agents.

Although there’s a rich history of non-LLM agents, I use the term “agent” to mean “LLM-powered agent”.

Google shared 100 real-world Gen AI use cases, many involving agents.

What is Pathfinding?

Whether an agent wants to visit a friend or flee from foes, pathfinding is key.

It’s required for realistic and intelligent movement.

For example:

It enables agents to compute efficient paths through the simulation so they can reach destinations while avoiding obstacles like walls, buildings, and cliffs.

In a multi-agent simulation, pathfinding prevents agents from colliding or getting stuck in crowds.

In exploration scenarios, pathfinding also helps agents search and navigate through unknown environments, discovering new areas and items.

If you have a computer science background, you may be familiar with algorithms like A* search.

Typically, such algorithms are integrated with systems for collision detection, animation, and behavior trees to create seamless, realistic agent movement and decision-making.
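
For illustration, here’s a minimal A* sketch on a small grid, using Manhattan distance as the heuristic and the same convention as later in this post (1 = walkable, 0 = blocked). This is a generic textbook version, not code from my simulation:

```python
import heapq

# Minimal A* on a 2D grid: 1 = walkable, 0 = blocked.
# Generic textbook version for illustration, not the simulation's actual code.
def astar(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    heuristic = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan
    frontier = [(heuristic(start), 0, start)]  # (f = g + h, g, cell)
    came_from = {start: None}
    best_g = {start: 0}
    while frontier:
        _, g, current = heapq.heappop(frontier)
        if current == goal:
            path = []  # walk back through came_from to rebuild the path
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # up, down, left, right
            nxt = (current[0] + dr, current[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and grid[nxt[0]][nxt[1]] == 1:
                if g + 1 < best_g.get(nxt, float("inf")):
                    best_g[nxt] = g + 1
                    came_from[nxt] = current
                    heapq.heappush(frontier, (g + 1 + heuristic(nxt), g + 1, nxt))
    return None  # goal unreachable

# Route around the wall in the middle row
print(astar([[1, 1, 1], [0, 0, 1], [1, 1, 1]], (0, 0), (2, 0)))
```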

But now, we have LLMs…

A single LLM handles diverse tasks well, thanks to emergent abilities.

Problem Statement

My goal in this series is to use open-source Llama3, running on my MacBook, to power efficient agent pathfinding through obstacles in my simulation.

Initial research in this area is promising but far from “solved”:

Here’s the problem statement, relatively simple to start with:

The LLM-powered agent is given a 3x3 matrix, where all values are 0 except for a single 1 entry:

  • 0 = agent not allowed to move there (e.g. obstacle, wall)

  • 1 = agent allowed to move there

To keep it simple:

  • agent starts in the center of the matrix (1, 1)

  • agent can only move up, down, left, or right

Just like a 2D video game!

So, there are only 4 options for the location of the 1:

  1. up (0, 1)

  2. down (2, 1)

  3. left (1, 0)

  4. right (1, 2)

Can the agent find the 1?


The matrix uses zero-based (row, column) indexing, starting from the upper-left corner.

In this example, the correct answer is (2, 1) because the 1 is located in:

  • row index = 2

  • column index = 1
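
To make the setup concrete, here’s a minimal sketch of a test-case generator for this task (make_sample is a hypothetical helper name, not my exact harness):

```python
import random

# The four cells reachable from the center (1, 1): up, down, left, right
VALID_MOVES = [(0, 1), (2, 1), (1, 0), (1, 2)]

def make_sample():
    """Build a random 3x3 test case: all zeros except a single 1 at a valid move."""
    target = random.choice(VALID_MOVES)
    grid = [[0] * 3 for _ in range(3)]
    grid[target[0]][target[1]] = 1
    return grid, target  # the matrix shown to the LLM, and the expected answer

grid, answer = make_sample()
print(grid, answer)  # e.g. [[0, 0, 0], [0, 0, 0], [0, 1, 0]] (2, 1)
```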

Prompt Experiments

Here is the prompt I start with:

Result: 65% accuracy on 20 samples in the test set.
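
For context, a scoring loop for these experiments might look roughly like this, assuming the ollama Python package with Llama3 pulled locally, plus the hypothetical make_sample helper from above (a sketch, not my exact harness):

```python
import re
import ollama  # assumes a local Ollama server with the llama3 model pulled

def evaluate(prompt_template, n_samples=20):
    """Ask Llama3 for each sample and score the last (row, col) pair it outputs."""
    correct = 0
    for _ in range(n_samples):
        grid, answer = make_sample()  # hypothetical helper from earlier
        # prompt_template is assumed to contain a {matrix} placeholder
        reply = ollama.chat(
            model="llama3",
            messages=[{"role": "user", "content": prompt_template.format(matrix=grid)}],
        )
        coords = re.findall(r"\((\d)\s*,\s*(\d)\)", reply["message"]["content"])
        if coords and tuple(map(int, coords[-1])) == answer:
            correct += 1
    return correct / n_samples
```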

Some of the LLM outputs are invalid positions like (2, 2).

Remember, the agent can only move up, down, left, or right from the center, just like in a 2D video game. But (2, 2) requires moving diagonally from the center to the bottom-right corner, which is not allowed.

Next, I try to constrain the LLM by adding this requirement to the prompt:

The coordinate must be one of the following: (0, 1), (1, 0), (1, 2), (2, 1).

If the coordinate is not in the list above, try again.
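
That “try again” instruction can also be enforced outside the prompt. A hypothetical retry wrapper might look like this:

```python
import re
import ollama

VALID_MOVES = {(0, 1), (1, 0), (1, 2), (2, 1)}

def ask_with_retry(prompt, max_attempts=3):
    """Re-ask Llama3 until it returns an allowed move (hypothetical wrapper)."""
    for _ in range(max_attempts):
        reply = ollama.chat(model="llama3",
                            messages=[{"role": "user", "content": prompt}])
        coords = re.findall(r"\((\d)\s*,\s*(\d)\)", reply["message"]["content"])
        if coords and tuple(map(int, coords[-1])) in VALID_MOVES:
            return tuple(map(int, coords[-1]))
    return None  # gave up after max_attempts invalid answers
```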


Result: 80% accuracy on 10 samples.

I’m curious what the LLM is “thinking” so I try chain-of-thought prompt engineering, making the LLM explain its output at each step.
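
Concretely, the chain-of-thought version breaks the problem into explicit tasks, roughly like this (a reconstruction of the structure, not the original prompt):

```python
# My reconstruction of the chain-of-thought structure, not the original prompt
COT_PROMPT = """You are an agent at position (1, 1) in this 3x3 matrix:

{matrix}

Task 1: Reconstruct the matrix row by row.
Task 2: Identify the (row, column) position of the 1.
Task 3: Check that this position is one of: (0, 1), (1, 0), (1, 2), (2, 1).

Explain your reasoning for each task, then give your final answer as (row, column)."""
```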

Result: 80% accuracy on 10 samples.

Weird — the LLM reconstructs the matrix correctly and identifies the location of 1 at (2, 1) in tasks 1 and 2. Awesome!

But in task 3, the LLM checks if (2, 1) is one of the allowed options…

… and concludes (2, 1) “doesn’t match any of these options”!

Yet, we can clearly see (2, 1) is an allowed option. It’s the last in the list.

Because of this issue, I remove task 3 altogether.

Then, I try another prompt engineering technique called few-shot prompting — giving Llama3 examples of what “good” or “correct” looks like.

Here’s my prompt, updated with examples of valid answers:
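
Something in this spirit (illustrative examples of the few-shot pattern, not my exact text):

```python
# Illustrative few-shot examples in the spirit of the prompt, not the exact text
FEW_SHOT_EXAMPLES = """Examples of valid answers:

Matrix: [[0, 1, 0], [0, 0, 0], [0, 0, 0]] -> Answer: (0, 1)
Matrix: [[0, 0, 0], [1, 0, 0], [0, 0, 0]] -> Answer: (1, 0)
Matrix: [[0, 0, 0], [0, 0, 1], [0, 0, 0]] -> Answer: (1, 2)
Matrix: [[0, 0, 0], [0, 0, 0], [0, 1, 0]] -> Answer: (2, 1)"""
```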

Result: 100% accuracy on 10 samples. Yay! 🥂 

Result: 89% accuracy on 200 samples.

Not bad!

Investigating some incorrect outputs, I notice a weird thing:

Llama3 thinks 1 is located in the 1st row and 1st column, yet returns (0, 0).

Again, (0, 0) should never be returned because it’s not an allowed move — the agent cannot move diagonally from the center of the matrix to the upper left corner.

I notice this type of error happening repeatedly:

Llama3 claims the answer is one coordinate, but then returns a different coordinate in its final answer.

I’m curious if negative prompting could help, i.e. explicitly providing examples of invalid answers.

I append this to my prompt:
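
Something in this spirit (my illustration of negative prompting for this task, not the exact addition):

```python
# My illustration of negative prompting for this task, not the exact addition
NEGATIVE_EXAMPLES = """Examples of INVALID answers you must NEVER return:
(0, 0), (0, 2), (2, 0), (2, 2) -- diagonal moves from the center are not allowed.
(1, 1) -- this is the agent's own starting position."""
```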

Result: 90% accuracy on 10 samples.

Unfortunately, negative prompting made things worse.

Not a big surprise.

Negative prompting is generally less effective than few-shot learning.

Best Performing Prompt

I decide to remove chain-of-thought from the prompt.

In the previous run with 200 samples, I notice the LLM sometimes recognizes the correct location of 1 but in its final answer returns a different coordinate!

Perhaps the multiple steps in chain-of-thought introduce additional sources of error.

Here is my updated prompt, after removing chain-of-thought:
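
Structurally, that leaves the allowed-moves constraint plus the few-shot examples, and nothing else (a sketch of the shape, not the exact wording):

```python
# A sketch of the final prompt's structure: constraint + few-shot examples,
# no chain-of-thought tasks. Reuses FEW_SHOT_EXAMPLES from the sketch above.
FINAL_PROMPT = f"""You are an agent at position (1, 1) in this 3x3 matrix:

{{matrix}}

Find the position of the 1.
The coordinate must be one of the following: (0, 1), (1, 0), (1, 2), (2, 1).

{FEW_SHOT_EXAMPLES}

Respond with only the coordinate (row, column)."""
```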

Result: 99% accuracy on 100 samples. 🙌

Not bad, Llama3, not bad at all!

Check out this video to see the agent bopping around my simulation, using this prompt to determine its movement.

I try a few other variations, like swapping out “(x, y)” for “(row, column)”, but they all perform worse.

Conclusion

I’m really enjoying playing with Llama3 to tackle pathfinding.

I love watching my little AI agent navigate without crashing into every single obstacle along the way.

Now, it’s only crashing a much smaller percentage of the time!

I am surprised the ultimate winner, with 99% accuracy on 100 test cases, is a deceptively simple prompt with few-shot learning. I’m skeptical this will hold up in more complex pathfinding scenarios.

Stay tuned, as I continue working on LLM-powered agent pathfinding!

P.S. If you’re interested in multi-agent simulations, DM me. I’d love to connect!