VAST.Audio Library

The VAST.Audio library provides acoustic echo cancellation and WebSocket-based PCM audio streaming for VASTreaming applications. It enables real-time audio communication with echo removal and network distribution of raw PCM audio.

Overview

Feature                     Description
Echo Cancellation           Acoustic echo cancellation for full-duplex audio
WebSocket PCM Server        Stream raw PCM audio via WebSocket
Publisher-Subscriber Model  Multiple clients per publishing point
Cross-Platform              Native library support for Windows, macOS, Linux

Requirements

  • .NET: .NET 6.0 or later
  • Platforms: Windows (x86, x64, ARM64), macOS, Linux, Android
  • Native Library Dependencies: VAST.Audio.Native* (platform-specific)
  • Dependencies: VAST.Common

Echo Cancellation

The IEchoCanceller interface provides acoustic echo cancellation for real-time audio communication. It removes speaker audio (render) that leaks into the microphone (capture), enabling clear full-duplex audio.

How It Works

┌─────────────────────────────────────────────────────────────┐
│  Speaker Output (Render)                                    │
│      ↓                                                      │
│  ┌─────────┐     Acoustic Path     ┌─────────┐              │
│  │ Speaker │ ──────────────────→   │  Mic    │              │
│  └─────────┘     (Echo)            └─────────┘              │
│      ↓                                 ↓                    │
│  AnalyzeRender()                   ProcessCapture()         │
│      ↓                                 ↓                    │
│  ┌─────────────────────────────────────────────┐            │
│  │          Echo Canceller                     │            │
│  │ (Builds echo model, subtracts from capture) │            │
│  └─────────────────────────────────────────────┘            │
│                        ↓                                    │
│              Clean Capture Audio                            │
└─────────────────────────────────────────────────────────────┘
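
To make the "builds echo model, subtracts from capture" step concrete, here is a minimal sketch of a textbook NLMS (normalized least mean squares) adaptive filter. It is not the algorithm used by VAST.Audio, which this document does not describe; the class name ToyNlmsEchoCanceller and its Process method are hypothetical and exist only for illustration.

```csharp
using System;

// Toy NLMS echo canceller. Illustration only: NOT the VAST.Audio implementation.
public sealed class ToyNlmsEchoCanceller
{
    private readonly float[] weights;  // estimated echo path (FIR taps)
    private readonly float[] history;  // recent render samples, newest first
    private readonly float stepSize;   // adaptation speed, 0 < stepSize < 2

    public ToyNlmsEchoCanceller(int taps, float stepSize = 0.5f)
    {
        weights = new float[taps];
        history = new float[taps];
        this.stepSize = stepSize;
    }

    // Takes one render sample and the capture sample recorded at the same time,
    // and returns the capture sample with the estimated echo subtracted.
    public float Process(float render, float capture)
    {
        // Shift the render history by one sample (Array.Copy handles overlap safely).
        Array.Copy(history, 0, history, 1, history.Length - 1);
        history[0] = render;

        // Predict the echo from the current model and measure the render energy.
        float estimate = 0f;
        float energy = 1e-6f; // small constant avoids division by zero
        for (int i = 0; i < weights.Length; i++)
        {
            estimate += weights[i] * history[i];
            energy += history[i] * history[i];
        }

        // The residual is both the cleaned output and the error that drives adaptation.
        float error = capture - estimate;
        float gain = stepSize * error / energy;
        for (int i = 0; i < weights.Length; i++)
        {
            weights[i] += gain * history[i];
        }
        return error;
    }
}
```

When the capture signal contains only a delayed, attenuated copy of the render signal, the residual returned by Process shrinks toward zero as the taps converge. A production canceller also has to cope with near-end speech, noise, and clock drift, which this sketch ignores.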

Features

Feature              Support
Sample Formats       S16 (16-bit integer), FLT (32-bit float)
Sample Rates         Any valid sample rate
Channels             Configurable per stream
Cancellation Levels  0-4 (higher = stronger)
Delay Compensation   Configurable render/capture delay

Creating an Echo Canceller

var canceller = VAST.Media.EchoCancellerFactory.Create();

// Configure media types
canceller.CaptureMediaType = new VAST.Common.MediaType
{
    ContentType = ContentType.Audio,
    SampleFormat = SampleFormat.S16,
    SampleRate = 48000,
    Channels = 1,
};

canceller.RenderMediaType = new VAST.Common.MediaType
{
    ContentType = ContentType.Audio,
    SampleFormat = SampleFormat.S16,
    SampleRate = 48000,
    Channels = 2,
};

// Set cancellation strength (0-4)
canceller.CancellationLevel = 3;

// Optional: Set delay if known (in 100-nanosecond units)
canceller.Delay = 500000; // 50ms
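
The 100-nanosecond unit used for Delay is the same resolution as a .NET TimeSpan tick, so the conversion from milliseconds can be expressed with TimeSpan.TicksPerMillisecond. The helper below is a small sketch; DelayUnits is a hypothetical name, and the type of the Delay property is not stated in this document, so a cast may be needed when assigning the result.

```csharp
using System;

// Hypothetical helper for converting between milliseconds and 100-ns units.
public static class DelayUnits
{
    // TimeSpan.TicksPerMillisecond is 10,000 because one tick is 100 ns.
    public static long FromMilliseconds(double milliseconds) =>
        (long)Math.Round(milliseconds * TimeSpan.TicksPerMillisecond);

    public static double ToMilliseconds(long units) =>
        (double)units / TimeSpan.TicksPerMillisecond;
}
```

For example, DelayUnits.FromMilliseconds(50) yields 500000, the value used in the configuration above.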

Processing Audio

The echo canceller requires both render and capture audio to function:

// Feed speaker output to build echo reference
canceller.AnalyzeRender(renderSample);

// Process microphone input to remove echo
if (canceller.IsReady)
{
    var cleanSample = canceller.ProcessCapture(captureSample);
    // cleanSample contains audio with echo removed
}

Cancellation Levels

Level  Description
0      Minimal cancellation
1      Light cancellation
2      Moderate cancellation
3      Strong cancellation (recommended)
4      Maximum cancellation

Higher levels remove echo more aggressively but may degrade voice quality. Start with level 3 and adjust based on your environment.

WebSocket PCM Server

The WsPcmServer enables streaming raw PCM audio over WebSocket connections using a publisher-subscriber model.

Note

WsPcmServer is not available as a standalone server. It operates as part of the VAST.Network.StreamingServer infrastructure and is instantiated by the StreamingServer automatically when enabled.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    WsPcmServer                              │
├─────────────────────────────────────────────────────────────┤
│  Publishing Point: /audio/stream1                           │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ Publisher ──────→ Client 1                           │   │
│  │           ──────→ Client 2                           │   │
│  │           ──────→ Client 3                           │   │
│  └──────────────────────────────────────────────────────┘   │
├─────────────────────────────────────────────────────────────┤
│  Publishing Point: /audio/stream2                           │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ Publisher ──────→ Client A                           │   │
│  │           ──────→ Client B                           │   │
│  └──────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘

Protocol

The WebSocket PCM protocol uses:

  • Subprotocol: vast-ws-pcm
  • First Frame: JSON text frame with stream descriptor
  • Subsequent Frames: Binary frames containing raw PCM samples
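
The frame sequence above can be sketched from the client side with the standard .NET ClientWebSocket class. This sketch assumes that a subscribing client receives the JSON descriptor as the first message from the server; the class name WsPcmClientSketch, the callbacks, and any URL passed to RunAsync are hypothetical and not part of VAST.Audio.

```csharp
using System;
using System.IO;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public static class WsPcmClientSketch
{
    public const string SubProtocol = "vast-ws-pcm";

    // Creates a client that requests the vast-ws-pcm subprotocol in the handshake.
    public static ClientWebSocket CreateClient()
    {
        var socket = new ClientWebSocket();
        socket.Options.AddSubProtocol(SubProtocol);
        return socket;
    }

    // Reads one complete message, which may arrive in several fragments.
    public static async Task<(WebSocketMessageType Type, byte[] Payload)> ReceiveMessageAsync(
        WebSocket socket, CancellationToken token)
    {
        var buffer = new byte[16 * 1024];
        using var message = new MemoryStream();
        WebSocketReceiveResult result;
        do
        {
            result = await socket.ReceiveAsync(new ArraySegment<byte>(buffer), token);
            message.Write(buffer, 0, result.Count);
        } while (!result.EndOfMessage);
        return (result.MessageType, message.ToArray());
    }

    // Connects, hands the descriptor JSON to onDescriptor, then forwards PCM payloads.
    public static async Task RunAsync(
        Uri uri, Action<string> onDescriptor, Action<byte[]> onPcm, CancellationToken token)
    {
        using var socket = CreateClient();
        await socket.ConnectAsync(uri, token);

        // Assumption: the first message is the JSON text frame with the descriptor.
        var first = await ReceiveMessageAsync(socket, token);
        if (first.Type != WebSocketMessageType.Text)
        {
            throw new InvalidDataException("Expected a JSON text frame first.");
        }
        onDescriptor(Encoding.UTF8.GetString(first.Payload));

        // Every following binary message carries raw PCM samples.
        while (socket.State == WebSocketState.Open)
        {
            var next = await ReceiveMessageAsync(socket, token);
            if (next.Type == WebSocketMessageType.Close) break;
            if (next.Type == WebSocketMessageType.Binary) onPcm(next.Payload);
        }
    }
}
```

The address passed to RunAsync depends on how the StreamingServer and the publishing point are configured, so it is left to the caller.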

Stream Descriptor Format

{
    "sample-format": "S16",
    "channels": 2,
    "sample-rate": 48000
}
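
A receiver needs the descriptor to interpret the binary frames that follow. The sketch below parses the descriptor with System.Text.Json and derives the frame size. PcmDescriptor is a hypothetical type, and treating "FLT" as the descriptor value for 32-bit float is an assumption based on the sample format names listed earlier; only "S16" appears in the example above.

```csharp
using System;
using System.Text.Json;

// Hypothetical holder for the three fields of the stream descriptor.
public readonly record struct PcmDescriptor(string SampleFormat, int Channels, int SampleRate)
{
    public static PcmDescriptor Parse(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var root = doc.RootElement;
        return new PcmDescriptor(
            root.GetProperty("sample-format").GetString()
                ?? throw new FormatException("sample-format must be a string"),
            root.GetProperty("channels").GetInt32(),
            root.GetProperty("sample-rate").GetInt32());
    }

    // Size of one sample of one channel. "FLT" is an assumed descriptor value.
    public int BytesPerSample => SampleFormat switch
    {
        "S16" => 2,
        "FLT" => 4,
        _ => throw new NotSupportedException($"Unknown sample format: {SampleFormat}"),
    };

    // One frame holds one sample for every channel.
    public int BytesPerFrame => BytesPerSample * Channels;

    public int BytesPerSecond => BytesPerFrame * SampleRate;
}
```

For the descriptor shown above (S16, 2 channels, 48000 Hz), BytesPerFrame is 4 and BytesPerSecond is 192000.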

Server Parameters

Configure the server using WsPcmServerParameters:

var parameters = new VAST.Audio.WsPcmServerParameters
{
    WsPcmPath = "/pcm"
};

Native Library Support

The library uses platform-specific native libraries for audio processing. The correct library is automatically selected based on the runtime platform:

Platform     Native Library
Windows x64  VAST.Audio.Native64.dll
Windows x86  VAST.Audio.Native32.dll
macOS        VAST.Audio.Native.dylib
Linux x64    VAST.Audio.Native64
Linux x86    VAST.Audio.Native32
Android      VAST.Audio.Android assembly
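
When checking a deployment, it can help to know which file the table implies for the current process. The sketch below encodes the table as a lookup using System.Runtime.InteropServices. It is not the library's actual loader, and NativeLibraryName is a hypothetical helper. Android is left out because it uses a separate assembly, and combinations not listed in the table throw.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical lookup that mirrors the table above.
public static class NativeLibraryName
{
    public static string Resolve(OSPlatform os, Architecture arch)
    {
        if (os == OSPlatform.Windows)
        {
            return arch switch
            {
                Architecture.X64 => "VAST.Audio.Native64.dll",
                Architecture.X86 => "VAST.Audio.Native32.dll",
                _ => throw new PlatformNotSupportedException($"Windows {arch} is not listed"),
            };
        }
        if (os == OSPlatform.OSX)
        {
            return "VAST.Audio.Native.dylib";
        }
        if (os == OSPlatform.Linux)
        {
            return arch switch
            {
                Architecture.X64 => "VAST.Audio.Native64",
                Architecture.X86 => "VAST.Audio.Native32",
                _ => throw new PlatformNotSupportedException($"Linux {arch} is not listed"),
            };
        }
        throw new PlatformNotSupportedException($"{os} is not listed");
    }

    // Applies the lookup to the operating system and architecture of this process.
    public static string ResolveForCurrentProcess()
    {
        foreach (var os in new[] { OSPlatform.Windows, OSPlatform.OSX, OSPlatform.Linux })
        {
            if (RuntimeInformation.IsOSPlatform(os))
            {
                return Resolve(os, RuntimeInformation.ProcessArchitecture);
            }
        }
        throw new PlatformNotSupportedException("Operating system is not listed");
    }
}
```

A startup check could compare the resolved name against the files in the application directory and log a clear message before the first audio call fails.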

Troubleshooting

Common Issues

Issue                        Cause                      Solution
Echo not removed             Canceller not ready        Ensure both AnalyzeRender and ProcessCapture are called
UnauthorizedAccessException  Missing license            Verify license includes Audio features
Native library not found     Missing dependencies       Deploy platform-specific native libraries
WebSocket 406 error          Protocol mismatch          Client must request vast-ws-pcm subprotocol
WebSocket 403 error          Authorization denied       Check Authorize event handler
WebSocket 409 error          Publishing point conflict  Publishing point already has a publisher

Echo Cancellation Tips

  1. Feed render audio first: Call AnalyzeRender() before ProcessCapture() to build the echo reference
  2. Match timing: Render and capture should be synchronized; use the Delay property to compensate for known latency
  3. Consistent sample rates: Different rates are supported, but matching rates give the best results
  4. 10ms chunks: The canceller processes audio in 10ms chunks internally
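
If you choose to hand the canceller buffers that line up with its internal 10ms chunks, the chunk size follows directly from the media type: sample rate / 100 frames, times channels, times bytes per sample. The sketch below regroups arbitrary byte buffers into 10ms chunks. TenMsChunker is a hypothetical helper, and nothing in this document says the canceller requires aligned input.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper that regroups PCM bytes into 10 ms chunks.
public sealed class TenMsChunker
{
    private readonly List<byte> pending = new();

    public TenMsChunker(int sampleRate, int channels, int bytesPerSample)
    {
        if (sampleRate % 100 != 0)
        {
            throw new ArgumentException("Sample rate must be divisible by 100 for whole 10 ms chunks.");
        }
        // 10 ms is 1/100 of a second, so one chunk holds sampleRate / 100 frames.
        ChunkBytes = sampleRate / 100 * channels * bytesPerSample;
    }

    public int ChunkBytes { get; }

    // Buffers the new data and returns every complete 10 ms chunk now available.
    // Leftover bytes stay buffered until the next call.
    public List<byte[]> Push(byte[] data)
    {
        pending.AddRange(data);
        var chunks = new List<byte[]>();
        while (pending.Count >= ChunkBytes)
        {
            chunks.Add(pending.GetRange(0, ChunkBytes).ToArray());
            pending.RemoveRange(0, ChunkBytes);
        }
        return chunks;
    }
}
```

For 48000 Hz mono S16 audio, one chunk is 480 samples, or 960 bytes; for 48000 Hz stereo S16 it is 1920 bytes.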
