VAST.Audio Library

The VAST.Audio library provides acoustic echo cancellation and WebSocket-based PCM audio streaming for VASTreaming applications. It enables real-time audio communication with echo removal and network distribution of raw PCM audio.

Overview

Feature	Description
Echo Cancellation	Acoustic echo cancellation for full-duplex audio
WebSocket PCM Server	Stream raw PCM audio via WebSocket
Publisher-Subscriber Model	Multiple clients per publishing point
Cross-Platform	Native library support for Windows, macOS, Linux, and Android

Requirements

.NET: .NET 6.0 or later
Platforms: Windows (x86, x64, ARM64), macOS, Linux, Android
Native Library Dependencies: VAST.Audio.Native* (platform-specific)
Dependencies: VAST.Common

Echo Cancellation

The IEchoCanceller interface provides acoustic echo cancellation for real-time audio communication. It removes speaker audio (render) that leaks into the microphone (capture), enabling clear full-duplex audio.

How It Works

┌─────────────────────────────────────────────────────────────┐
│  Speaker Output (Render)                                    │
│      ↓                                                      │
│  ┌─────────┐     Acoustic Path     ┌─────────┐              │
│  │ Speaker │ ──────────────────→   │  Mic    │              │
│  └─────────┘     (Echo)            └─────────┘              │
│      ↓                                 ↓                    │
│  AnalyzeRender()                   ProcessCapture()         │
│      ↓                                 ↓                    │
│  ┌─────────────────────────────────────────────┐            │
│  │          Echo Canceller                     │            │
│  │ (Builds echo model, subtracts from capture) │            │
│  └─────────────────────────────────────────────┘            │
│                        ↓                                    │
│              Clean Capture Audio                            │
└─────────────────────────────────────────────────────────────┘

Features

Feature	Support
Sample Formats	S16 (16-bit integer), FLT (32-bit float)
Sample Rates	Any valid sample rate
Channels	Configurable per stream
Cancellation Levels	0-4 (higher = stronger)
Delay Compensation	Configurable render/capture delay

Creating an Echo Canceller

var canceller = VAST.Media.EchoCancellerFactory.Create();

// Configure media types
canceller.CaptureMediaType = new VAST.Common.MediaType
{
    ContentType = ContentType.Audio,
    SampleFormat = SampleFormat.S16,
    SampleRate = 48000,
    Channels = 1,
};

canceller.RenderMediaType = new VAST.Common.MediaType
{
    ContentType = ContentType.Audio,
    SampleFormat = SampleFormat.S16,
    SampleRate = 48000,
    Channels = 2,
};

// Set cancellation strength (0-4)
canceller.CancellationLevel = 3;

// Optional: Set delay if known (in 100-nanosecond units)
canceller.Delay = 500000; // 50ms
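
The Delay value is expressed in 100-nanosecond units, which is the same resolution as a .NET TimeSpan tick. A minimal sketch of the conversion, assuming the latency is known in milliseconds; DelayUnits and FromMilliseconds are hypothetical helper names, not part of VAST.Audio:

```csharp
using System;

// Hypothetical helper, not part of VAST.Audio: converts a latency in
// milliseconds to the 100-nanosecond units used by the Delay property.
public static class DelayUnits
{
    public static long FromMilliseconds(double milliseconds)
    {
        // One TimeSpan tick is 100 ns, so Ticks is already in the right unit.
        return TimeSpan.FromMilliseconds(milliseconds).Ticks;
    }
}
```

With this helper, DelayUnits.FromMilliseconds(50) yields 500000, the value used in the example above. Whether Delay accepts a 64-bit integer directly is an assumption; cast as needed.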

Processing Audio

The echo canceller requires both render and capture audio to function:

// Feed speaker output to build echo reference
canceller.AnalyzeRender(renderSample);

// Process microphone input to remove echo
if (canceller.IsReady)
{
    var cleanSample = canceller.ProcessCapture(captureSample);
    // cleanSample contains audio with echo removed
}

Cancellation Levels

Level	Description
0	Minimal cancellation
1	Light cancellation
2	Moderate cancellation
3	Strong cancellation (recommended)
4	Maximum cancellation

Higher levels provide more aggressive echo removal but may affect voice quality. Start with level 3 and adjust based on your environment.

WebSocket PCM Server

The WsPcmServer enables streaming raw PCM audio over WebSocket connections using a publisher-subscriber model.

Note

WsPcmServer is not available as a standalone server. It operates as part of the VAST.Network.StreamingServer infrastructure and is instantiated by the StreamingServer automatically when enabled.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    WsPcmServer                              │
├─────────────────────────────────────────────────────────────┤
│  Publishing Point: /audio/stream1                           │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ Publisher ──────→ Client 1                           │   │
│  │           ──────→ Client 2                           │   │
│  │           ──────→ Client 3                           │   │
│  └──────────────────────────────────────────────────────┘   │
├─────────────────────────────────────────────────────────────┤
│  Publishing Point: /audio/stream2                           │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ Publisher ──────→ Client A                           │   │
│  │           ──────→ Client B                           │   │
│  └──────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘

Protocol

The WebSocket PCM protocol uses:

Subprotocol: vast-ws-pcm
First Frame: JSON text frame with stream descriptor
Subsequent Frames: Binary frames containing raw PCM samples
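
The frame order above can be sketched from the subscriber side with the standard .NET ClientWebSocket. This is a sketch under stated assumptions, not the library's own client: WsPcmFrameRouter and WsPcmClientSketch are hypothetical names, and the server address is whatever your StreamingServer exposes.

```csharp
using System;
using System.IO;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical client-side helper: enforces "JSON text frame first, binary PCM after".
public sealed class WsPcmFrameRouter
{
    public string? DescriptorJson { get; private set; }
    public long PcmBytesReceived { get; private set; }

    public void OnMessage(WebSocketMessageType type, byte[] payload)
    {
        if (DescriptorJson is null)
        {
            if (type != WebSocketMessageType.Text)
                throw new InvalidDataException("Expected the JSON stream descriptor first.");
            DescriptorJson = Encoding.UTF8.GetString(payload);
        }
        else if (type == WebSocketMessageType.Binary)
        {
            // Raw PCM samples; a real client would decode or play them here.
            PcmBytesReceived += payload.Length;
        }
    }
}

public static class WsPcmClientSketch
{
    public static async Task ReceiveAsync(Uri uri, WsPcmFrameRouter router, CancellationToken ct)
    {
        using var socket = new ClientWebSocket();
        socket.Options.AddSubProtocol("vast-ws-pcm"); // subprotocol named in this document
        await socket.ConnectAsync(uri, ct);

        var buffer = new byte[16 * 1024];
        while (socket.State == WebSocketState.Open)
        {
            // A WebSocket message may arrive in several fragments; collect them all.
            using var message = new MemoryStream();
            WebSocketReceiveResult result;
            do
            {
                result = await socket.ReceiveAsync(new ArraySegment<byte>(buffer), ct);
                if (result.MessageType == WebSocketMessageType.Close) return;
                message.Write(buffer, 0, result.Count);
            }
            while (!result.EndOfMessage);

            router.OnMessage(result.MessageType, message.ToArray());
        }
    }
}
```

Keeping the ordering check in a separate class makes it testable without a live server; the receive loop only reassembles messages and forwards them.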

Stream Descriptor Format

{
    "sample-format": "S16",
    "channels": 2,
    "sample-rate": 48000
}
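
A client can read this descriptor with System.Text.Json before interpreting the binary frames. The sketch below is illustrative only: PcmStreamDescriptor is a hypothetical client-side type, the "FLT" string is assumed to mirror the sample format name used elsewhere in this document, and samples are assumed to be interleaved.

```csharp
using System;
using System.Text.Json;

// Hypothetical client-side type; property names follow the JSON example above.
public sealed record PcmStreamDescriptor(string SampleFormat, int Channels, int SampleRate)
{
    public static PcmStreamDescriptor Parse(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement root = doc.RootElement;
        return new PcmStreamDescriptor(
            root.GetProperty("sample-format").GetString()
                ?? throw new FormatException("sample-format is null"),
            root.GetProperty("channels").GetInt32(),
            root.GetProperty("sample-rate").GetInt32());
    }

    // Bytes per sample: S16 is 16-bit integer, FLT is 32-bit float.
    // "FLT" as a descriptor value is an assumption.
    public int BytesPerSample => SampleFormat switch
    {
        "S16" => 2,
        "FLT" => 4,
        _ => throw new NotSupportedException($"Unknown sample format: {SampleFormat}"),
    };

    // Size of one frame (one sample per channel), assuming interleaved samples.
    public int BytesPerFrame => BytesPerSample * Channels;
}
```

For the descriptor shown above, BytesPerFrame is 4 (two channels of 2-byte samples), so a binary payload of N bytes would hold N / 4 frames.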

Server Parameters

Configure the server using WsPcmServerParameters:

var parameters = new VAST.Audio.WsPcmServerParameters
{
    WsPcmPath = "/pcm"
};

Native Library Support

The library uses platform-specific native libraries for audio processing. The correct library is automatically selected based on the runtime platform:

Platform	Native Library
Windows x64	VAST.Audio.Native64.dll
Windows x86	VAST.Audio.Native32.dll
macOS	VAST.Audio.Native.dylib
Linux x64	VAST.Audio.Native64
Linux x86	VAST.Audio.Native32
Android	VAST.Audio.Android assembly
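
When checking a deployment, it can help to know which file from the table the current process would need. The sketch below only mirrors the table; it does not reproduce the library's own selection logic, and NativeLibraryNames is a hypothetical helper. Android is left out because it ships as a managed assembly rather than a native file.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical deployment helper that mirrors the table above.
public static class NativeLibraryNames
{
    // Returns the expected file name, or null when the table has no entry.
    public static string? Resolve(OSPlatform os, Architecture arch)
    {
        if (os == OSPlatform.Windows)
        {
            return arch switch
            {
                Architecture.X64 => "VAST.Audio.Native64.dll",
                Architecture.X86 => "VAST.Audio.Native32.dll",
                _ => null,
            };
        }
        if (os == OSPlatform.OSX)
        {
            return "VAST.Audio.Native.dylib";
        }
        if (os == OSPlatform.Linux)
        {
            return arch switch
            {
                Architecture.X64 => "VAST.Audio.Native64",
                Architecture.X86 => "VAST.Audio.Native32",
                _ => null,
            };
        }
        return null;
    }

    // Convenience wrapper that inspects the running process.
    public static string? ResolveForCurrentProcess()
    {
        foreach (OSPlatform os in new[] { OSPlatform.Windows, OSPlatform.OSX, OSPlatform.Linux })
        {
            if (RuntimeInformation.IsOSPlatform(os))
            {
                return Resolve(os, RuntimeInformation.ProcessArchitecture);
            }
        }
        return null;
    }
}
```

A startup check could compare the result with the files present in the application directory and log a clear message before the first audio call fails.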

Troubleshooting

Common Issues

Issue	Cause	Solution
Echo not removed	Canceller not ready	Ensure both AnalyzeRender and ProcessCapture are called, and check IsReady before processing capture audio
UnauthorizedAccessException	Missing license	Verify license includes Audio features
Native library not found	Missing dependencies	Deploy platform-specific native libraries
WebSocket 406 error	Protocol mismatch	Client must request the vast-ws-pcm subprotocol
WebSocket 403 error	Authorization denied	Check Authorize event handler
WebSocket 409 error	Publishing point conflict	Publishing point already has a publisher

Echo Cancellation Tips

Feed render audio first: Call AnalyzeRender() before ProcessCapture() to build the echo reference
Match timing: Render and capture should be synchronized; use the Delay property to compensate for known latency
Consistent sample rates: While different rates are supported, matching rates give the best results
10ms chunks: The canceller processes audio in 10ms chunks internally
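
The 10 ms figure translates into a fixed number of samples and bytes for a given format, which may be useful when sizing buffers. The source does not state that callers must feed exact 10 ms multiples, so treat this as sizing arithmetic only; ChunkMath is a hypothetical helper.

```csharp
using System;

// Hypothetical helper for 10 ms chunk arithmetic.
public static class ChunkMath
{
    // Samples per channel in 10 ms. Integer division truncates for rates
    // that are not a multiple of 100 (for example 11025 Hz gives 110).
    public static int SamplesPerChannel(int sampleRate) => sampleRate / 100;

    // Total bytes in one 10 ms chunk across all channels.
    public static int ChunkBytes(int sampleRate, int channels, int bytesPerSample)
        => SamplesPerChannel(sampleRate) * channels * bytesPerSample;
}
```

For the capture format used earlier (48000 Hz, mono, S16) one chunk is 480 samples, or 960 bytes; the stereo S16 render format at the same rate is 1920 bytes per chunk.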
