Building AI Agent System to Analyze AI Research Papers

Using Low-Code Agent Platform, StackAI, for arXiv

Sabrina Ramonov
June 11, 2024

In this post, I build an AI Agent System to analyze new arXiv research papers in artificial intelligence.

I use the low-code platform Stack AI and see how far I can get.

A YouTube version of this post is also available.

Scope
Layer 1
Layer 2
Layer 3
- Deep Learning Researcher
- Math Professor
Closing Comments

Scope

My goal is to create a multi-agent system that analyzes new AI research papers from several distinct perspectives:

Goal: understand new arXiv papers
Summarize agent: what's the point? what's interesting? what's unclear?
Deep learning researcher agent: extract interesting deep learning methods related to the paper
Theoretical mathematician agent: identify the theoretical math concepts that are important in the paper, plus additional theoretical references useful for understanding it
Skeptic agent: find unjustified assumptions
Output: a document with all this information presented neatly

I start with these 3 agents:

deep learning research agent
theoretical mathematician agent
skeptic agent

Conditioned on different contexts, each agent should provide a distinct analysis of the research paper.

Layer 1

Here’s how I build out the first layer of agents in StackAI:

Sabrina Ramonov @ sabrina.dev

I’m on StackAI’s free plan, so I can’t use GPT-4.

So, I use Perplexity.

Below are prompts for each agent.

Note the differences in identity and scope for each agent.

The identity you give an agent significantly impacts the type and quality of answers you get.

It’s important to ensure each agent has a narrow scope because excessive context and scope can reduce output quality and increase hallucination.

Instead of building one agent to do everything, I assemble a team of agents, each with a specific identity and narrow goal.

Deep Learning Researcher

You are a professor in Deep Learning with a PhD from Stanford. Your goal is to extract interesting deep learning methods that are related to the research paper: {in-0}

Math Professor

You are an expert in theoretical math. Your goal is to extract theoretical math concepts that are important to this paper: {in-0}

Steps:

- read and understand this research paper: {in-0}

- think silently about the most important theoretical math concepts that the research paper depends upon

- what are the top 5 theoretical math concepts the research paper depends upon? provide an example from the research paper of how the math concept is being used.

Take a deep breath and take it step-by-step.

Skeptic

You are an expert in artificial intelligence, deep learning, and mathematics.

You will be given a research paper.

Your goal is to find unjustified assumptions in the research paper.

Your writing style is spartan, factual, and scientific.

Steps:

- read and understand the research paper: {in-0}

- think about ideas in the paper that seem unjustified and lack supporting evidence

- summarize the top 3 ideas in the paper that seem unjustified

Take a deep breath and take it step-by-step.
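Outside StackAI's canvas, the same setup can be sketched in a few lines of Python: one prompt template per agent, with the paper text substituted for `{in-0}`. This is only a rough sketch under assumptions. The `Agent` class, its `render` method, and the `LAYER_1` list are hypothetical names, not part of StackAI; the prompts are abbreviated; and no LLM is actually called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Hypothetical stand-in for one StackAI LLM node."""
    name: str
    template: str  # prompt text containing placeholders such as {in-0}

    def render(self, values: dict) -> str:
        """Replace each {placeholder} in the template with its value."""
        prompt = self.template
        for key, value in values.items():
            prompt = prompt.replace("{" + key + "}", value)
        return prompt


# Prompts abbreviated from the ones above: each agent gets its own identity
# and a narrow goal, and all of them read the same paper via {in-0}.
LAYER_1 = [
    Agent("deep_learning",
          "You are a professor in Deep Learning with a PhD from Stanford. "
          "Your goal is to extract interesting deep learning methods that are "
          "related to the research paper: {in-0}"),
    Agent("math",
          "You are an expert in theoretical math. Your goal is to extract "
          "theoretical math concepts that are important to this paper: {in-0}"),
    Agent("skeptic",
          "You are an expert in artificial intelligence, deep learning, and "
          "mathematics. Your goal is to find unjustified assumptions in the "
          "research paper: {in-0}"),
]

# Placeholder paper text; in practice this would be the full paper.
paper_text = "Understanding the planning of LLM agents: A survey ..."
prompts = {agent.name: agent.render({"in-0": paper_text}) for agent in LAYER_1}
```

Each rendered prompt would then be sent to whichever model is available. Keeping identity and goal in the template, separate from the paper, makes it easy to narrow or swap one agent without touching the others.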

Here’s the research paper I’m using for testing:

Understanding the planning of LLM agents: A survey

Here’s a sample output from the theoretical math agent, listing math concepts that are important to the research paper:

But I’m skeptical that “Formal Language Theory” is an important concept related to the paper.

My next step is to build out layer 2 with QA agents to provide feedback on the layer 1 agents’ responses.

This is a common design pattern in AI Agent Systems:

one agent does the initial work
another agent reviews the work and provides feedback
repeat until answer satisfies quality or evaluation metrics
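In plain Python, that pattern is roughly a loop like the one below. It is a toy sketch, assuming the reviewer signals satisfaction by returning empty feedback. `review_loop`, `toy_worker`, and `toy_reviewer` are hypothetical names, and the two toy functions stand in for real LLM calls. Apart from "Formal Language Theory", the concept names are made up for illustration.

```python
from typing import Callable


def review_loop(paper: str,
                worker: Callable[[str, str], str],
                reviewer: Callable[[str, str], str],
                max_rounds: int = 3) -> str:
    """Alternate worker and reviewer until the reviewer has no feedback."""
    feedback = ""
    draft = ""
    for _ in range(max_rounds):
        draft = worker(paper, feedback)    # initial work, or a revision
        feedback = reviewer(paper, draft)  # review of the latest draft
        if not feedback:                   # assumption: empty means "good enough"
            break
    return draft


def toy_worker(paper: str, feedback: str) -> str:
    # Stand-in for an LLM call: lists concepts, dropping any the reviewer flagged.
    concepts = ["Graph Theory", "Formal Language Theory", "Probability Theory"]
    return ", ".join(c for c in concepts if c not in feedback)


def toy_reviewer(paper: str, draft: str) -> str:
    # Stand-in for an LLM call: flags one concept as not applicable.
    flagged = "Formal Language Theory"
    return flagged if flagged in draft else ""


final_answer = review_loop("paper text", toy_worker, toy_reviewer)
```

The `max_rounds` cap matters in practice: two LLM agents may never fully agree, so the loop needs a stopping rule besides reviewer satisfaction.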

Layer 2

Layer 2 mirrors layer 1, with a separate agent for each role:

deep learning research agent
theoretical mathematician agent
skeptic agent

Now, these agents will review and provide feedback on the agent outputs from layer 1.

Here’s how I built out layer 2 in Stack AI. This screenshot shows sample outputs from running the AI agent system.

Pretty cool!

Here are the prompts for each layer 2 agent:

Deep Learning Researcher

You are a professor in Deep Learning with a PhD from Stanford.

Your goal is to determine which deep learning methods from a list you will be given are "not applicable" to a research paper that you will be given.

Steps:

- silently read and understand this research paper: {in-0}

- silently review this proposed list of deep learning methods related to the research paper: {llm-0}

- which deep learning methods from the list are "not applicable" to the research paper?

Take a deep breath and take it step-by-step.

Math Professor

You are a professor in theoretical mathematics.

Your goal is to determine which math concepts from a list you will be given are "not applicable" to a research paper that you will be given.

Steps:

- silently read and understand this research paper: {in-0}

- silently review this proposed list of math concepts related to the research paper: {llm-1}

- which math concepts from the list are "not applicable" to the research paper?

Take a deep breath and take it step-by-step.

Skeptic

You are an expert in artificial intelligence, deep learning, and mathematics.

Your goal is to determine which assumptions from a list you will be given are most essential to a research paper that you will be given.

Steps:

- silently read and understand this research paper: {in-0}

- silently review this list of unjustified assumptions in the research paper: {llm-2}

- for each unjustified assumption, silently review the research paper's references to check if there is supporting evidence. If there are many references, remove the assumption from the list of unjustified assumptions.

- which unjustified assumptions have the biggest impact on the conclusion of the research paper? Provide an example of how the assumption, if false, would substantially affect the research paper's findings.

Take a deep breath and take it step-by-step.

Recall the math agent’s output from layer 1 — it claimed “Formal Language Theory” is an important concept related to the paper.

Let’s see what our layer 2 math agent thinks…

Below, you can see our layer 2 math agent reviews the list from our layer 1 math agent and ultimately concludes:

Formal Language Theory is not applicable to the paper.

This is a great example of our layer 2 “QA agent” reviewing a previous agent’s work. Now let’s take this feedback to further improve the answer.

Layer 3

In layer 3, I’ll create another deep learning agent and math agent.

These agents will:

use the same prompt as the layer 1 agents
incorporate feedback from the layer 2 agents
and thus generate a higher-quality final answer

Sadly, due to the limitations of low-code agent builder platforms, I can’t build a back-and-forth multi-agent communication paradigm. All low-code agent platforms I’ve tried are limited to directed acyclic graphs.

In an ideal tool, my layer 1 agent and layer 2 agent would talk back-and-forth, provide feedback, and iterate together to arrive at a final high-quality answer.

To mimic this interaction, I am “unrolling” the feedback chain:

layer 1 agent does initial work
layer 2 agent provides feedback on initial work
layer 3 agent uses the same prompt as layer 1 agent (“environment”) AND incorporates feedback from layer 2 agent
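The unrolled chain is just a small directed acyclic graph, so it can be sketched with Python's standard-library `graphlib` (Python 3.9+). The node ids `in-0`, `llm-1`, and `llm-4` are the ones that appear in the math agent prompts. `layer3-math` is a hypothetical id for the layer 3 math node, and `run_dag` and `TOY_NODES` are illustrative stand-ins in which plain functions replace the LLM calls. Apart from "Formal Language Theory", the concept names are made up.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes whose output it reads.
DEPENDS_ON = {
    "llm-1": {"in-0"},                 # layer 1 math agent reads the paper
    "llm-4": {"in-0", "llm-1"},        # layer 2 reviews layer 1's list
    "layer3-math": {"in-0", "llm-4"},  # layer 3 reads the paper plus feedback
}


def run_dag(depends_on: dict, inputs: dict, nodes: dict) -> dict:
    """Run every node once, in an order that respects its dependencies."""
    outputs = dict(inputs)
    for node in TopologicalSorter(depends_on).static_order():
        if node in outputs:  # input nodes already have a value
            continue
        upstream = {dep: outputs[dep] for dep in depends_on[node]}
        outputs[node] = nodes[node](upstream)
    return outputs


# Toy node functions standing in for LLM calls.
CANDIDATES = ["Graph Theory", "Formal Language Theory"]
TOY_NODES = {
    "llm-1": lambda up: list(CANDIDATES),
    "llm-4": lambda up: [c for c in up["llm-1"] if c == "Formal Language Theory"],
    "layer3-math": lambda up: [c for c in CANDIDATES if c not in up["llm-4"]],
}

results = run_dag(DEPENDS_ON, {"in-0": "paper text"}, TOY_NODES)
```

Every node runs exactly once and nothing ever flows backwards, which is why each extra round of feedback costs another layer of nodes.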

So, here’s layer 3 built out in Stack AI.

This screenshot shows sample outputs from running the full system.

Here are the prompts:

Deep Learning Researcher

You are a professor in Deep Learning with a PhD from Stanford.

Your goal is to extract interesting deep learning methods that are related to the research paper: {in-0}

Steps:

- read and understand this research paper: {in-0}

- think silently about the most important deep learning methods related to the research paper

- silently think about feedback from {llm-3}

- provide a final list of deep learning methods that are related to the research paper. Provide an example from the research paper of how each deep learning method is being used.

Your final answer should not include deep learning methods that aren't applicable to the research paper.

Take a deep breath and take it step-by-step.

Math Professor

You are an expert in theoretical math.

Your goal is to extract theoretical math concepts that are important to this paper: {in-0}

Steps:

- read and understand this research paper: {in-0}

- think silently about the most important theoretical math concepts related to the research paper

- silently think about feedback from {llm-4}

- provide a final list of theoretical math concepts that are related to the research paper. Provide an example from the research paper of how each math concept is being used.

Your final answer should not include math concepts that aren't applicable to the research paper.

Take a deep breath and take it step-by-step.

Along the same thread, note the final answer from the layer 3 math agent.

Even though it started from the same prompt as the layer 1 math agent, it incorporated the feedback from the layer 2 math agent!

Our layer 3 math agent removed “Formal Language Theory” from the final list of theoretical math concepts related to the research paper.

Closing Comments

To recap, in under an hour, I’ve built out a multi-agent system to help me quickly analyze AI research papers from the perspectives of a deep learning researcher, theoretical mathematician, and technical skeptic.

You can try it out here:

www.stack-ai.com/form/83883819-33d3-443c-88db-18106c9226da/ba81c6e6-b8af-4a97-b37a-174502daf8c4/6661deb730cbde865feba7f7

This link won’t last long because StackAI limits me to 1 project at a time 🫤 …eventually I’ll want to try building something else with it.

I would 100% use an AI Agent System like this on a regular basis to keep up with the firehose of AI research. I just need to add a daily fetch from the arXiv API.
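A daily fetch could look something like the sketch below, using only the standard library and arXiv's public Atom query endpoint. The `cs.AI` category, the result count, and the function names (`build_query_url`, `parse_feed`, `fetch_latest`) are assumptions for illustration. Scheduling, rate limiting, and error handling are left out.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = {"atom": "http://www.w3.org/2005/Atom"}  # namespace of the Atom feed


def build_query_url(category: str = "cs.AI", max_results: int = 10) -> str:
    """Build a query for the newest submissions in one arXiv category."""
    params = {
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def _text(entry, tag: str) -> str:
    # Collapse the line breaks and indentation that Atom fields often contain.
    return " ".join(entry.findtext(tag, default="", namespaces=ATOM).split())


def parse_feed(atom_xml) -> list:
    """Turn an Atom feed (str or bytes) into a list of small dicts."""
    root = ET.fromstring(atom_xml)
    return [
        {
            "id": _text(entry, "atom:id"),
            "title": _text(entry, "atom:title"),
            "summary": _text(entry, "atom:summary"),
        }
        for entry in root.findall("atom:entry", ATOM)
    ]


def fetch_latest(category: str = "cs.AI", max_results: int = 10) -> list:
    """Fetch and parse the newest papers (makes a network request)."""
    url = build_query_url(category, max_results)
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_feed(response.read())
```

Each returned summary, or the full paper fetched separately, could then be passed to the agents as the `{in-0}` input.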

So, I’m considering migrating it to a less restrictive and less expensive no-code agent platform, such as Flowise (basically a node-based UI atop LangChain).

I’m also tempted to ditch low-code and adopt CrewAI or AutoGen, especially to build out complex back-and-forth interaction and feedback cycles… TBD!

Have fun building!

Sabrina Ramonov

P.S. If you’re enjoying the free newsletter, it’d mean the world to me if you share it with others. My newsletter just launched, and every single referral helps. Thank you!

Share by copying and pasting the link: https://www.sabrina.dev