Aryan Jain
aryanj {at} mit {dot} edu

Multi-Agent Observability + IDE

8/13/2024

Note: This post is from YC's Launch BF forum. It was idea #3 during the batch.

Hey YC! Aryan and Ayush here.

We recently built a tool to help debug our multi-agent systems and wanted to share it for other founders to try.

The Problem

When developing multi-agent applications, where you chain multiple LLMs/functions together, making a small change (e.g., to step 5 of a 10-step process) typically requires rerunning the entire program from the beginning. That's not only time-consuming but also wastes tokens, and both costs grow as systems get more complex.
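
As a toy sketch of the pain (all names here are hypothetical, and each step stands in for a real LLM call):

def make_step(i):
    def step(state):
        # Imagine an LLM call here; every full rerun pays for it again.
        return state + [f"step_{i} output"]
    return step

pipeline = [make_step(i) for i in range(1, 11)]

def run(task):
    state = [task]
    for step in pipeline:
        state = step(state)
    return state

# Editing the prompt inside step 5 still re-runs steps 1-4 from scratch,
# just to reproduce the inputs that step 5 needs.
print(run("build a web scraper"))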

This problem inspired LangGraph Studio, but it only works with LangGraph's own abstractions. That's a big pain for developers who want complete control over their systems, especially when that control is a core feature of their product.

Our Solution

We've developed a specialized SDK + IDE for multi-agent development that intelligently caches outputs and maps relationships between steps. You still get granular control of your project, and we handle constructing a call graph between functions (the traditional advantage of using an abstraction like LangGraph).
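
Conceptually, it's like memoization keyed on each step's inputs, plus a recorded call graph. Here's a toy sketch of that idea (an illustration only, not our actual implementation):

import functools
import hashlib
import json

_cache = {}     # (function name, input hash) -> cached output
_edges = set()  # (caller, callee) pairs: a toy call graph
_stack = []     # which traced function is currently executing

def traced(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if _stack:
            _edges.add((_stack[-1], fn.__name__))  # record the dependency edge
        # NOTE: a real system would also key on the function's code/prompt,
        # so that editing a step invalidates that step's own cache entry.
        digest = hashlib.sha256(
            json.dumps([args, kwargs], default=str).encode()
        ).hexdigest()
        key = (fn.__name__, digest)
        if key not in _cache:  # re-run only when the inputs changed
            _stack.append(fn.__name__)
            try:
                _cache[key] = fn(*args, **kwargs)
            finally:
                _stack.pop()
        return _cache[key]
    return wrapper

With a cache keyed like this, an edited step produces new outputs, which change the input hashes of everything downstream of it: upstream steps hit the cache while downstream steps re-run.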

To get started, install our SDK:

pip install kisho

Replace one line of code and you’re ready to go:

from kisho import trace_oai  # import path assumed; also needs: import os
client = trace_oai(api_key=os.environ['OPENAI_API_KEY'])

To trace your custom functions, wrap them with our decorator:

@traced_function
def example(input_var):
    ...  # function body elided in the original post
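
For instance, a traced step in a chain might look like this (a sketch: the function name, prompt, and model are illustrative, and we're assuming trace_oai returns an OpenAI-compatible client):

import os
from kisho import trace_oai, traced_function  # import paths assumed

client = trace_oai(api_key=os.environ['OPENAI_API_KEY'])

@traced_function
def summarize(text):
    # We assume the traced client mirrors the standard OpenAI SDK interface.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": "Summarize:\n" + text}],
    )
    return resp.choices[0].message.content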

What We Do

  • Smart Caching: Our system caches the output of each step in your agent chain.
  • Relationship Mapping: We create a map of how each step relates to others in your process.
  • Remote Execution: Include kisho.start_server() when you run your file, and our debugger can call and execute your traced functions remotely (see the sketch after this list).
  • Instant Updates: When you make a change, our IDE only reruns the necessary steps, using cached data for the rest.
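
Putting it together, a minimal run script might look like this (the step functions are hypothetical stand-ins; traced_function and kisho.start_server() are the pieces described above, and the import paths are assumed):

import kisho
from kisho import traced_function  # import path assumed

@traced_function
def brainstorm(topic):
    return f"ideas about {topic}"  # stand-in for a real LLM call

@traced_function
def write(ideas):
    return f"draft based on: {ideas}"  # stand-in for a real LLM call

if __name__ == "__main__":
    print(write(brainstorm("our launch post")))
    # Keep the process running so the debugger can call these traced
    # functions remotely and re-run only the steps whose inputs changed.
    kisho.start_server()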

The result? You can iterate on and test prompt and input changes much faster, without wasting tokens re-running unchanged steps.

Here's a simple demo of editing a prompt in a coding agent and watching the changes cascade down the flow:

[Demo: editing a prompt of a coding agent]

What's Next

Our next focus is evaluations: both general-purpose functionality and custom evals for specific use cases. We'd love to learn how developers building multi-agent systems currently create and apply evaluations to improve their products.

We’re looking to personally work with design partners and build out extensive solutions for them. If you're working on agents, we’d love to hear from you: aryan@kisho.app