pranav_pratik demonstrates how to deliver a ChatGPT-style streaming AI experience in your own apps by integrating Azure OpenAI Service with real-time SignalR communication and an Angular frontend, covering the end-to-end architecture and best practices.

Real-Time AI Streaming with Azure OpenAI and SignalR

This guide outlines how to deliver incremental, live-streamed AI responses in your own applications—similar to ChatGPT—by combining Azure OpenAI Service, ASP.NET Core SignalR, and an Angular frontend.

Why Stream AI Responses?

  • Reduces perceived latency: Users see the answer as it’s generated, improving experience and engagement.
  • Mimics ChatGPT: The typing effect keeps users interested, especially for long-form answers.
  • Scales for enterprise: Azure SignalR Service manages high connection loads and simplifies scale-out.

Solution Overview

  • A SignalR Hub in ASP.NET Core calls Azure OpenAI with streaming enabled, sending partial results to clients.
  • The Angular client receives and renders partial output as it’s produced, optionally showing typing indicators.
  • Azure SignalR Service can be used to manage scale, removing the need for sticky sessions on the backend.

Architecture

  • Backend: .NET 8 API using ASP.NET Core SignalR and Azure OpenAI.
  • Frontend: Angular 16+ using @microsoft/signalr client to receive real-time messages.
  • Optional: Azure SignalR Service for scalability.

Prerequisites

  • Azure OpenAI resource (e.g., with gpt-4o or gpt-4o-mini model)
  • .NET 8 and ASP.NET Core
  • Angular 16+

Backend Implementation

1. Install Required Packages

dotnet add package Microsoft.AspNetCore.SignalR
dotnet add package Azure.AI.OpenAI --prerelease
dotnet add package Azure.Identity
dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.Extensions.AI.OpenAI --prerelease
dotnet add package Microsoft.Azure.SignalR

2. Use DefaultAzureCredential for Authentication

Leveraging Entra ID with DefaultAzureCredential means you don’t store secrets in code.
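The streaming service below reads its endpoint and deployment name from configuration. A minimal appsettings.json sketch (the placeholder values are illustrative; keep real endpoints out of source control and prefer environment variables or Key Vault in production):

```json
{
  "AZURE_OPENAI_ENDPOINT": "https://<your-resource>.openai.azure.com/",
  "AZURE_OPENAI_DEPLOYMENT": "gpt-4o-mini"
}
```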

3. Program.cs Snippet

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddSignalR();
// builder.Services.AddSignalR().AddAzureSignalR(); // Uncomment for Azure SignalR scaling
builder.Services.AddSingleton<AiStreamingService>();
var app = builder.Build();
app.MapHub<ChatHub>("/chat");
app.Run();

4. AiStreamingService.cs (AI Streaming Logic)

Streams content from Azure OpenAI using modern .NET AI extensions:

// Requires: Azure.AI.OpenAI, Azure.Identity, Microsoft.Extensions.AI, Microsoft.Extensions.AI.OpenAI
public class AiStreamingService {
  private readonly IChatClient _chatClient;

  public AiStreamingService(IConfiguration config) {
    var endpoint = new Uri(config["AZURE_OPENAI_ENDPOINT"]!);
    var deployment = config["AZURE_OPENAI_DEPLOYMENT"]!;
    var azureClient = new AzureOpenAIClient(endpoint, new DefaultAzureCredential());
    // AsIChatClient adapts the Azure OpenAI chat client to Microsoft.Extensions.AI's IChatClient.
    _chatClient = azureClient.GetChatClient(deployment).AsIChatClient();
  }

  public async IAsyncEnumerable<string> StreamReplyAsync(string userMessage) {
    var messages = new List<ChatMessage> {
      new(ChatRole.System, "You are a helpful assistant."),
      new(ChatRole.User, userMessage)
    };
    // IChatClient yields ChatResponseUpdate instances; each carries the newly generated text, if any.
    await foreach (var update in _chatClient.GetStreamingResponseAsync(messages)) {
      if (!string.IsNullOrEmpty(update.Text))
        yield return update.Text;
    }
  }
}

5. ChatHub.cs (SignalR Hub)

Forwards incremental AI responses to the calling client:

public class ChatHub : Hub {
  private readonly AiStreamingService _ai;
  public ChatHub(AiStreamingService ai) => _ai = ai;
  public async Task AskAi(string prompt) {
    var messageId = Guid.NewGuid().ToString("N");
    await Clients.Caller.SendAsync("typing", messageId, true);
    await foreach (var partial in _ai.StreamReplyAsync(prompt)) {
      await Clients.Caller.SendAsync("partial", messageId, partial);
    }
    await Clients.Caller.SendAsync("typing", messageId, false);
    await Clients.Caller.SendAsync("completed", messageId);
  }
}

Frontend Implementation (Angular)

1. Install SignalR Package

npm i @microsoft/signalr

2. Create ai-stream.service.ts

import { Injectable } from '@angular/core';
import { BehaviorSubject, Observable } from 'rxjs';
import * as signalR from '@microsoft/signalr';

@Injectable({ providedIn: 'root' })
export class AiStreamService {
  private connection?: signalR.HubConnection;
  private typing$ = new BehaviorSubject<boolean>(false);
  private partial$ = new BehaviorSubject<string>('');
  private completed$ = new BehaviorSubject<boolean>(false);

  get typing(): Observable<boolean> { return this.typing$.asObservable(); }
  get partial(): Observable<string> { return this.partial$.asObservable(); }
  get completed(): Observable<boolean> { return this.completed$.asObservable(); }

  async start(): Promise<void> {
    this.connection = new signalR.HubConnectionBuilder()
      .withUrl('/chat')
      .withAutomaticReconnect()
      .configureLogging(signalR.LogLevel.Information)
      .build();
    this.connection.on('typing', (_id: string, on: boolean) => this.typing$.next(on));
    this.connection.on('partial', (_id: string, text: string) => {
      this.partial$.next((this.partial$.value || '') + text);
    });
    this.connection.on('completed', (_id: string) => this.completed$.next(true));
    await this.connection.start();
  }
  async ask(prompt: string): Promise<void> {
    this.partial$.next('');
    this.completed$.next(false);
    await this.connection?.invoke('AskAi', prompt);
  }
}
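The hub sends a messageId with every event, but the service above ignores it, so chunks from two overlapping prompts would interleave in the shared partial$ stream. A minimal sketch of a per-message accumulator that could back the service (class and method names are illustrative, not from the article):

```typescript
// Sketch: accumulate partial chunks per messageId so concurrent
// responses don't interleave. Names here are illustrative.
class MessageAccumulator {
  private messages = new Map<string, string>();

  // Append a chunk and return the full text so far for this message.
  appendPartial(messageId: string, chunk: string): string {
    const next = (this.messages.get(messageId) ?? '') + chunk;
    this.messages.set(messageId, next);
    return next;
  }

  // On the "completed" event: return the final text and release the buffer.
  complete(messageId: string): string {
    const full = this.messages.get(messageId) ?? '';
    this.messages.delete(messageId);
    return full;
  }
}
```

Wiring this into the connection.on('partial', …) handler keeps each messageId's text isolated, which matters once the UI allows sending a second prompt before the first finishes.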

3. ai-chat.component.ts

import { Component, OnInit } from '@angular/core';
import { AiStreamService } from './ai-stream.service';

@Component({ selector: 'app-ai-chat', templateUrl: './ai-chat.component.html', styleUrls: ['./ai-chat.component.css'] })
export class AiChatComponent implements OnInit {
  prompt = '';
  output = '';
  typing = false;
  done = false;
  constructor(private ai: AiStreamService) {}
  async ngOnInit() {
    await this.ai.start();
    this.ai.typing.subscribe(on => this.typing = on);
    this.ai.partial.subscribe(text => this.output = text);
    this.ai.completed.subscribe(done => this.done = done);
  }
  async send() {
    this.output = '';
    this.done = false;
    await this.ai.ask(this.prompt);
  }
}

4. ai-chat.component.html

<div class="chat">
  <div class="prompt">
    <input [(ngModel)]="prompt" placeholder="Ask me anything…" />
    <button (click)="send()">Send</button>
  </div>
  <div class="response">
    <pre>{{ output }}</pre>
    <div class="typing" *ngIf="typing">Assistant is typing…</div>
    <div class="done" *ngIf="done">✓ Completed</div>
  </div>
</div>

Streaming Modes & Content Filtering

  • Default: output is buffered and run through content filtering in chunks before it is sent, which adds some latency but guarantees only moderated text reaches the client.
  • Async filter: tokens stream to the client immediately while moderation runs asynchronously, so the client must handle delayed moderation actions (e.g., retracting text that has already been displayed).
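With the async filter, a moderation verdict can arrive after tokens have already been rendered. A minimal client-side sketch of that handling, assuming the server forwards a hypothetical moderation event with a verdict per message (the event name and verdict values are assumptions, not part of the article's hub):

```typescript
// Sketch: buffer that can retract already-displayed text when a
// delayed moderation verdict arrives. Verdict values are illustrative.
type Verdict = 'allowed' | 'blocked';

class ModeratedBuffer {
  private text = '';
  private verdict: Verdict = 'allowed';

  append(chunk: string): void {
    // Ignore late chunks once the message has been blocked.
    if (this.verdict === 'allowed') this.text += chunk;
  }

  // Called when the asynchronous moderation result arrives.
  applyVerdict(v: Verdict): void {
    this.verdict = v;
    if (v === 'blocked') this.text = '[response removed by content filter]';
  }

  render(): string { return this.text; }
}
```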

Best Practices

  • Batch updates to the DOM for efficiency.
  • Only log full responses server-side after completion.
  • Store connection strings in Azure Key Vault and rotate regularly. Authenticate with Entra ID (DefaultAzureCredential).

Scaling & Security

  • Use Azure SignalR Service for scale; avoid sticky sessions in distributed setups.
  • Configure CORS if hosting frontend and backend on different origins.

Author: pranav_pratik

This post appeared first on “Microsoft Tech Community”.