pranav_pratik demonstrates how to deliver a ChatGPT-style streaming AI experience in your own apps by integrating Azure OpenAI Service with real-time SignalR communication and an Angular frontend, covering the end-to-end architecture and best practices.

Real-Time AI Streaming with Azure OpenAI and SignalR

This guide outlines how to deliver incremental, live-streamed AI responses in your own applications, similar to ChatGPT, by combining Azure OpenAI Service, ASP.NET Core SignalR, and an Angular frontend.

Why Stream AI Responses?

Reduces perceived latency: Users see the answer as it’s generated, improving experience and engagement.
Mimics ChatGPT: The typing effect keeps users interested, especially for long-form answers.
Scales for enterprise: Azure SignalR Service manages high connection loads and simplifies scale-out.

Solution Overview

A SignalR Hub in ASP.NET Core calls Azure OpenAI with streaming enabled, sending partial results to clients.
The Angular client receives and renders partial output as it’s produced, optionally showing typing indicators.
Azure SignalR Service can be used to manage scale, removing the need for sticky sessions on the backend.

Architecture

Backend: .NET 8 API using ASP.NET Core SignalR and Azure OpenAI.
Frontend: Angular 16+ using the @microsoft/signalr client to receive real-time messages.
Optional: Azure SignalR Service for scalability.

Prerequisites

Azure OpenAI resource with a deployed chat model (e.g., gpt-4o or gpt-4o-mini)
.NET 8 and ASP.NET Core
Angular 16+

Backend Implementation

1. Install Required Packages

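dotnet add package Microsoft.AspNetCore.SignalR
dotnet add package Azure.AI.OpenAI --prerelease
dotnet add package Azure.Identity
dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.Extensions.AI.OpenAI --prerelease
dotnet add package Microsoft.Azure.SignalR

On .NET 8 the SignalR server libraries ship with the ASP.NET Core shared framework, so the first package reference is typically unnecessary. Microsoft.Azure.SignalR is only needed if you use Azure SignalR Service.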

2. Use DefaultAzureCredential for Authentication

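Authenticating through Entra ID with DefaultAzureCredential means you don’t store API keys in code or configuration. The identity the app runs as needs a role on the Azure OpenAI resource that permits inference calls, such as Cognitive Services OpenAI User.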

3. Program.cs Snippet

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddSignalR();
// builder.Services.AddSignalR().AddAzureSignalR(); // Uncomment for Azure SignalR scaling
builder.Services.AddSingleton<AiStreamingService>();
var app = builder.Build();
app.MapHub<ChatHub>("/chat");
app.Run();

4. AiStreamingService.cs (AI Streaming Logic)

Streams content from Azure OpenAI using modern .NET AI extensions:

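using System.Runtime.CompilerServices;
using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Extensions.AI;

public class AiStreamingService {
  private readonly IChatClient _chatClient;
  public AiStreamingService(IConfiguration config) {
    // Fail fast with a clear message if configuration is missing.
    var endpoint = new Uri(config["AZURE_OPENAI_ENDPOINT"]
      ?? throw new InvalidOperationException("AZURE_OPENAI_ENDPOINT is not set."));
    var deployment = config["AZURE_OPENAI_DEPLOYMENT"]
      ?? throw new InvalidOperationException("AZURE_OPENAI_DEPLOYMENT is not set.");
    var azureClient = new AzureOpenAIClient(endpoint, new DefaultAzureCredential());
    // AsIChatClient (Microsoft.Extensions.AI.OpenAI) adapts the OpenAI ChatClient to IChatClient.
    _chatClient = azureClient.GetChatClient(deployment).AsIChatClient();
  }
  public async IAsyncEnumerable<string> StreamReplyAsync(
      string userMessage,
      [EnumeratorCancellation] CancellationToken cancellationToken = default) {
    // Use the Microsoft.Extensions.AI message types that IChatClient expects.
    var messages = new List<ChatMessage> {
      new(ChatRole.System, "You are a helpful assistant."),
      new(ChatRole.User, userMessage)
    };
    // Named CompleteStreamingAsync in early previews of Microsoft.Extensions.AI.
    await foreach (var update in _chatClient.GetStreamingResponseAsync(messages, cancellationToken: cancellationToken)) {
      // update.Text is the text carried by this streaming update (may be empty).
      var chunk = update.Text;
      if (!string.IsNullOrEmpty(chunk))
        yield return chunk;
    }
  }
}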

5. ChatHub.cs (SignalR Hub)

Forwards incremental AI responses to the calling client:

using Microsoft.AspNetCore.SignalR;

public class ChatHub : Hub {
  private readonly AiStreamingService _ai;
  public ChatHub(AiStreamingService ai) => _ai = ai;
  public async Task AskAi(string prompt) {
    // messageId lets the client correlate events that belong to the same answer.
    var messageId = Guid.NewGuid().ToString("N");
    await Clients.Caller.SendAsync("typing", messageId, true);
    try {
      await foreach (var partial in _ai.StreamReplyAsync(prompt)) {
        await Clients.Caller.SendAsync("partial", messageId, partial);
      }
    } finally {
      // Clear the typing indicator even if the stream fails partway through.
      await Clients.Caller.SendAsync("typing", messageId, false);
    }
    await Clients.Caller.SendAsync("completed", messageId);
  }
}
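
The hub sends three events ("typing", "partial", "completed"), each tagged with a messageId. As a rough sketch of how a client could use that id to keep overlapping answers separate, the reducer below folds events into per-message state. The StreamEvent, MessageState, and applyEvent names are illustrative assumptions and not part of the original article; only the event names and argument order come from the hub code above.

```typescript
// Event shapes mirror the hub's SendAsync calls:
// ("typing", id, on), ("partial", id, text), ("completed", id).
type StreamEvent =
  | { kind: 'typing'; messageId: string; on: boolean }
  | { kind: 'partial'; messageId: string; text: string }
  | { kind: 'completed'; messageId: string };

// Per-message view state (hypothetical field names).
interface MessageState {
  text: string;
  typing: boolean;
  done: boolean;
}

type ChatState = Record<string, MessageState>;

const emptyMessage: MessageState = { text: '', typing: false, done: false };

// Pure reducer: returns a new state with the event applied to its message.
function applyEvent(state: ChatState, event: StreamEvent): ChatState {
  const current = state[event.messageId] ?? emptyMessage;
  switch (event.kind) {
    case 'typing':
      return { ...state, [event.messageId]: { ...current, typing: event.on } };
    case 'partial':
      return { ...state, [event.messageId]: { ...current, text: current.text + event.text } };
    case 'completed':
      return { ...state, [event.messageId]: { ...current, typing: false, done: true } };
  }
}

// Usage: fold a sequence of events into the final state.
const events: StreamEvent[] = [
  { kind: 'typing', messageId: 'a1', on: true },
  { kind: 'partial', messageId: 'a1', text: 'Hel' },
  { kind: 'partial', messageId: 'a1', text: 'lo' },
  { kind: 'completed', messageId: 'a1' },
];
const finalState = events.reduce(applyEvent, {} as ChatState);
```

Keeping the reducer pure makes it easy to unit test without a live SignalR connection, and the same function can sit behind an RxJS scan operator or a state store if the app already uses one.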

Frontend Implementation (Angular)

1. Install SignalR Package

npm i @microsoft/signalr

2. Create ai-stream.service.ts

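import { Injectable } from '@angular/core';
import * as signalR from '@microsoft/signalr';
import { BehaviorSubject, Observable } from 'rxjs';

@Injectable({ providedIn: 'root' })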
export class AiStreamService {
  private connection?: signalR.HubConnection;
  private typing$ = new BehaviorSubject<boolean>(false);
  private partial$ = new BehaviorSubject<string>('');
  private completed$ = new BehaviorSubject<boolean>(false);

  get typing(): Observable<boolean> { return this.typing$.asObservable(); }
  get partial(): Observable<string> { return this.partial$.asObservable(); }
  get completed(): Observable<boolean> { return this.completed$.asObservable(); }

  async start(): Promise<void> {
    this.connection = new signalR.HubConnectionBuilder()
      .withUrl('/chat')
      .withAutomaticReconnect()
      .configureLogging(signalR.LogLevel.Information)
      .build();
    this.connection.on('typing', (_id: string, on: boolean) => this.typing$.next(on));
    this.connection.on('partial', (_id: string, text: string) => {
      this.partial$.next((this.partial$.value || '') + text);
    });
    this.connection.on('completed', (_id: string) => this.completed$.next(true));
    await this.connection.start();
  }
  async ask(prompt: string): Promise<void> {
    this.partial$.next('');
    this.completed$.next(false);
    await this.connection?.invoke('AskAi', prompt);
  }
}
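
Called without arguments, withAutomaticReconnect() retries after 0, 2, 10, and 30 seconds and then stops. For long chat sessions you may want a custom policy. The sketch below is one possible exponential backoff, not the article's method. The RetryContext and RetryPolicy interfaces are local mirrors of the shapes the @microsoft/signalr client uses, declared here so the snippet runs on its own; createBackoffPolicy and its default values are assumptions.

```typescript
// Local mirror of the context object the SignalR client passes to a retry policy.
interface RetryContext {
  previousRetryCount: number;
  elapsedMilliseconds: number;
  retryReason: Error;
}

// Local mirror of the retry-policy contract: return a delay in ms, or null to stop retrying.
interface RetryPolicy {
  nextRetryDelayInMilliseconds(retryContext: RetryContext): number | null;
}

// Exponential backoff capped at maxDelayMs; gives up once maxElapsedMs has passed.
function createBackoffPolicy(
  baseDelayMs: number = 1000,
  maxDelayMs: number = 30000,
  maxElapsedMs: number = 120000,
): RetryPolicy {
  return {
    nextRetryDelayInMilliseconds(retryContext: RetryContext): number | null {
      if (retryContext.elapsedMilliseconds >= maxElapsedMs) {
        return null; // stop reconnecting
      }
      const delay = baseDelayMs * Math.pow(2, retryContext.previousRetryCount);
      return Math.min(delay, maxDelayMs);
    },
  };
}
```

In the service above, this would be passed as .withAutomaticReconnect(createBackoffPolicy()). Note that a reconnect creates a new connection, so an answer that was streaming when the connection dropped is unlikely to resume; the UI should probably let the user resend the prompt.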

3. ai-chat.component.ts

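import { Component, OnInit } from '@angular/core';
import { AiStreamService } from './ai-stream.service';

// The template uses [(ngModel)] and *ngIf, so the declaring module needs FormsModule and CommonModule.
@Component({ selector: 'app-ai-chat', templateUrl: './ai-chat.component.html', styleUrls: ['./ai-chat.component.css'] })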
export class AiChatComponent implements OnInit {
  prompt = '';
  output = '';
  typing = false;
  done = false;
  constructor(private ai: AiStreamService) {}
  async ngOnInit() {
    await this.ai.start();
    this.ai.typing.subscribe(on => this.typing = on);
    this.ai.partial.subscribe(text => this.output = text);
    this.ai.completed.subscribe(done => this.done = done);
  }
  async send() {
    this.output = '';
    this.done = false;
    await this.ai.ask(this.prompt);
  }
}

4. ai-chat.component.html

<div class="chat">
  <div class="prompt">
    <input [(ngModel)]="prompt" placeholder="Ask me anything…" />
    <button (click)="send()">Send</button>
  </div>
  <div class="response">
    <pre>{{ output }}</pre>
    <div class="typing" *ngIf="typing">Assistant is typing…</div>
    <div class="done" *ngIf="done">✓ Completed</div>
  </div>
</div>

Streaming Modes & Content Filtering

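Default: The service buffers output and runs content filtering on each chunk before releasing it, which adds some latency.
Asynchronous filter: Tokens stream immediately while moderation runs in parallel, so the client must handle moderation actions that arrive after text has already been displayed.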
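
With the asynchronous filter, a moderation decision can arrive after part of the answer is already on screen. One way to handle that on the client, sketched below under assumptions, is to keep a retracted flag per answer and replace the visible text when a late moderation signal arrives. The hub in this article does not emit such a signal; the backend would have to detect the filter result and forward it as an extra event. AnswerView, appendChunk, retract, and the notice text are all hypothetical names.

```typescript
// Hypothetical client-side state for one streamed answer.
interface AnswerView {
  text: string;
  retracted: boolean;
}

// Placeholder wording shown in place of withdrawn content (an assumption).
const RETRACTION_NOTICE = 'This response was withdrawn by the content filter.';

// Append a streamed chunk, ignoring chunks that arrive after a retraction.
function appendChunk(view: AnswerView, chunk: string): AnswerView {
  return view.retracted ? view : { ...view, text: view.text + chunk };
}

// Apply a late moderation signal: replace everything shown so far with the notice.
function retract(): AnswerView {
  return { text: RETRACTION_NOTICE, retracted: true };
}
```

Ignoring chunks after a retraction matters because events already in flight can still arrive once the moderation signal has been processed.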

Best Practices

Batch DOM updates instead of re-rendering on every token.
Log the full response server-side once, after completion, rather than logging each chunk.
Store connection strings in Azure Key Vault and rotate regularly. Authenticate with Entra ID (DefaultAzureCredential).
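
To illustrate the batching advice, here is a minimal sketch of a buffer that collects streamed chunks and hands them to a render callback at most once per scheduled flush. ChunkBatcher is a hypothetical helper, not part of the original article. The scheduler is injected so the class stays testable; in a browser you might pass (flush) => requestAnimationFrame(flush).

```typescript
// Coalesces streamed chunks and renders them at most once per scheduled flush.
class ChunkBatcher {
  private pending = '';
  private scheduled = false;
  private readonly render: (text: string) => void;
  private readonly schedule: (flush: () => void) => void;

  constructor(render: (text: string) => void, schedule: (flush: () => void) => void) {
    this.render = render;
    this.schedule = schedule;
  }

  // Queue a chunk; schedule a flush only if one is not already pending.
  push(chunk: string): void {
    this.pending += chunk;
    if (this.scheduled) return;
    this.scheduled = true;
    this.schedule(() => this.flush());
  }

  // Emit everything collected so far as a single render call.
  flush(): void {
    this.scheduled = false;
    if (this.pending === '') return;
    const text = this.pending;
    this.pending = '';
    this.render(text);
  }
}
```

In the Angular service, the "partial" handler could call push(text) and the render callback could append to the BehaviorSubject, so a burst of small tokens causes one change-detection pass instead of many.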

Scaling & Security

Use Azure SignalR Service for scale; it removes the need for sticky sessions in distributed setups.
Configure CORS if hosting frontend and backend on different origins.

Learn More

Author: pranav_pratik

This post appeared first on “Microsoft Tech Community”.