Microsoft Developer’s session with Ayan Gupta and Julien Dubois explores building advanced multi-agent AI systems in Java using LangChain4j, with hands-on agent orchestration, GPT-4o Mini, and speech synthesis. Learn how these technologies work together to automate creative and practical AI workflows.

Building Multi-Agent AI Systems with LangChain4j and Java

Presented by Microsoft Developer: Ayan Gupta (host) and Julien Dubois (guest)

This episode dives into modern AI development practices with a focus on multi-agent orchestration using LangChain4j. Learn how to construct a coordinated system where specialized AI agents collaborate to perform complex tasks—such as generating poetry and transforming it into speech automatically.

Session Overview

  • Introduction to AI agents and their real-world applications (e.g., automating a coffee order with agent coordination).
  • Step-by-step creation of a three-agent system:
    • Author Agent: Uses GPT-4o Mini for poetry generation.
    • Actor Agent: Leverages Mistral 3B and MaryTTS for text-to-speech (TTS), outputting audio files.
    • Supervisor Agent: Orchestrates workflow, ensuring seamless collaboration.

Orchestration Strategies

  • Pure AI Orchestration: A supervisor LLM determines which agent to call and when.
  • Workflow-Based Orchestration: LangChain4j’s SequenceBuilder and related APIs enable defining explicit sequences, parallel flows, and loops.
  • For this session, workflow orchestration is emphasized for clarity and accessibility.

Technical Walkthrough

  • Author Agent
    • Implements poetry generation using GPT-4o Mini.
    • Features annotations such as @UserMessage and @Agent for input handling.
    • Testing and validating stand-alone behavior.
  • Actor Agent
    • Integrates Mistral 3B for advanced text processing.
    • Connects to MaryTTS (run via Docker container) for TTS conversion.
    • Uses the @Tool annotation for tool integration.
  • Supervisor Agent
    • Maintains a shared context map for workflow state.
    • Employs SequenceBuilder to link Author and Actor agents.
  • Demo: Request a poem about the Java Virtual Machine (JVM), automatically have the Author agent generate it, the Actor agent render audio, and the Supervisor chain the process—resulting in a playable audio file (output.wav).

Implementation Steps

  1. Set up LangChain4j and Agent Interfaces
  2. Register Model APIs (GPT-4o Mini, Mistral 3B)
  3. Configure Docker for MaryTTS
  4. Annotate and deploy agents
  5. Use SequenceBuilder to define coordination logic
  6. Test end-to-end: from prompt input to audio output

Key Technologies

  • Java + LangChain4j
  • GPT-4o Mini (text generation)
  • Mistral 3B (text processing)
  • MaryTTS (text-to-speech, via Docker)
  • SequenceBuilder (workflow orchestration)

Resources


By following this walkthrough, developers can build and orchestrate their own intelligent multi-agent AI solutions in Java, combining generative text and speech, and unlocking new workflow automation possibilities. Presented by Julian Dubois with Ayan Gupta for Microsoft Developer.