Library Performance

VASTreaming libraries are engineered for high performance across a wide range of hardware, from resource-constrained embedded devices like Raspberry Pi to high-load server environments handling thousands of concurrent connections.

Latency

VASTreaming is optimized for low-latency streaming, which is critical for real-time applications such as video surveillance, live broadcasting, and interactive communications.

Protocol	Typical LAN Latency of VASTreaming Library
SRT	100-200 ms
RTSP/RTP	200-300 ms
RTMP	200-300 ms
WebRTC	200-300 ms
WebTransport	200-300 ms
MJPEG over HTTP	100-200 ms
HLS	15-30 seconds
LL-HLS	2-5 seconds
MPEG-DASH	5-20 seconds

For SRT, latency can be reduced below 100 ms on a LAN by using our custom protocol extensions.
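
When a latency budget drives the protocol choice, the table above can be used as a simple lookup. The following is a minimal sketch that encodes only the upper bound of each typical range listed above. The names `TYPICAL_LAN_LATENCY_MS` and `protocols_within` are illustrative and are not part of the VASTreaming API.

```python
from __future__ import annotations

# Upper bounds of the typical LAN latency ranges from the table above, in milliseconds.
TYPICAL_LAN_LATENCY_MS = {
    "SRT": 200,
    "RTSP/RTP": 300,
    "RTMP": 300,
    "WebRTC": 300,
    "WebTransport": 300,
    "MJPEG over HTTP": 200,
    "HLS": 30_000,
    "LL-HLS": 5_000,
    "MPEG-DASH": 20_000,
}


def protocols_within(budget_ms: int) -> list[str]:
    """Return the protocols whose typical upper-bound LAN latency fits the budget."""
    return [name for name, worst in TYPICAL_LAN_LATENCY_MS.items() if worst <= budget_ms]
```

For example, `protocols_within(300)` excludes all three segment-based protocols (HLS, LL-HLS, MPEG-DASH), since even LL-HLS is listed at 2-5 seconds.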

Throughput and Scalability

The following benchmarks demonstrate the performance levels achievable with VASTreaming libraries.

High-Performance Server Environment

Test configuration: Cloud server with 8 vCPUs and 16 GB RAM. CPU utilization remained below 70% at maximum load while maintaining good Quality of Service (QoS).

Protocol	Max Concurrent Sessions	Notes
RTSP	~900	Full video streaming
RTMP	~600	Full video streaming
HLS/DASH	~15,000	HTTP-based delivery
WebRTC	less than 100	Peer connections with TURN relay

Note that the WebRTC implementation is based on Google's native WebRTC library, whose performance is subpar and which is not suitable for high-performance servers.
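
These figures can serve as a starting point for rough capacity planning. The sketch below estimates how many server instances of the benchmarked size a target load would need. The `headroom` factor (using only 80% of the benchmarked limit) is an assumption made for illustration, not a VASTreaming recommendation, and the limits apply only to the 8 vCPU / 16 GB configuration tested above.

```python
import math

# Approximate per-server session limits from the table above (8 vCPUs, 16 GB RAM).
MAX_SESSIONS_PER_SERVER = {"RTSP": 900, "RTMP": 600, "HLS/DASH": 15_000}


def servers_needed(protocol: str, target_sessions: int, headroom: float = 0.8) -> int:
    """Estimate the instance count, using only a fraction (headroom) of the benchmarked limit."""
    usable = MAX_SESSIONS_PER_SERVER[protocol] * headroom
    return math.ceil(target_sessions / usable)
```

With these assumptions, 5,000 RTSP sessions would call for 7 instances (720 usable sessions each). Real deployments should be validated with load tests on the actual hardware and streams.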

Embedded Environment (Raspberry Pi 4)

Scenario	Max Concurrent Sessions
RTSP (audio only)	~40

Performance on embedded devices depends heavily on whether hardware acceleration is available and utilized.

Hardware Acceleration Impact

Hardware acceleration significantly improves performance for encoding and decoding operations:

Operation	Consumer grade CPU	Consumer grade GPU	Tesla T4	Tesla A16
H.264 1080p decode	1-2 streams	8-16 streams	~40 streams	~130 streams
H.264 1080p encode	1 stream	4-8 streams	~20 streams	~65 streams
H.265 4K decode	Limited	2-4 streams	No data	No data
H.265 4K encode	Not practical	1-2 streams	No data	No data

Actual consumer grade CPU and GPU performance varies based on specific hardware (NVIDIA GPU, Intel Quick Sync, Apple VideoToolbox, etc.) and encoding parameters.

Tesla GPU figures are taken from production environments, but actual results depend heavily on the input stream and encoding parameters.

Video Mixing Performance

The MixingSource class enables real-time compositing of multiple video streams into a single output, commonly used for video walls, picture-in-picture layouts, and multi-camera productions.

Single MixingSource

A single MixingSource instance can composite up to 32 concurrent video streams in one rendering pass. To maintain 30 fps output, limiting a single instance to 10-20 simultaneous videos is recommended.

Chained MixingSources

For applications requiring more than 32 video inputs, multiple MixingSource instances can be chained together. Each MixingSource composites its inputs into an intermediate output, which then feeds into a parent MixingSource for final composition.

Using this approach, 64 or more videos can be merged into a single video wall. For 64 videos, the best-performing layout is 8 first-level MixingSources, each handling 8 videos:

MixingSource 1 (8 videos) ──┐
MixingSource 2 (8 videos) ──┤
MixingSource 3 (8 videos) ──┤
MixingSource 4 (8 videos) ──┼──► Final MixingSource (64 videos output)
MixingSource 5 (8 videos) ──┤
MixingSource 6 (8 videos) ──┤
MixingSource 7 (8 videos) ──┤
MixingSource 8 (8 videos) ──┘

This hierarchical approach distributes the rendering load evenly and scales efficiently while maintaining real-time performance.
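
The grouping step of such a two-level layout can be sketched as follows. This only partitions a list of inputs into first-level groups and checks the 32-input limit stated above. It does not call the MixingSource API, and `plan_two_level_layout` is a hypothetical helper name.

```python
MAX_INPUTS_PER_MIXER = 32  # single-instance limit stated above


def plan_two_level_layout(inputs: list, group_size: int = 8) -> list:
    """Split inputs into first-level groups; a final mixer then combines one output per group."""
    if not 1 <= group_size <= MAX_INPUTS_PER_MIXER:
        raise ValueError("group_size must be between 1 and 32")
    groups = [inputs[i:i + group_size] for i in range(0, len(inputs), group_size)]
    if len(groups) > MAX_INPUTS_PER_MIXER:
        raise ValueError("too many groups for a single final mixer; add another level")
    return groups
```

For 64 inputs and the default group size, this yields the 8 x 8 arrangement shown in the diagram. Each group would be assigned to one first-level MixingSource, and the 8 intermediate outputs to the final one.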

Memory Management

VASTreaming employs a tiered memory pool architecture to minimize garbage collection pressure and heap fragmentation during long-running operation:

Small buffer pool (5 KB) - Metadata and control messages
Standard buffer pool (50 KB) - Compressed video frames
Large buffer pool (1 MB) - Uncompressed SD/HD video frames
Extra large buffer pool (>1 MB) - 4K and higher resolution frames

The library includes background monitoring that detects heap fragmentation and schedules Large Object Heap (LOH) compaction during periods of low activity, ensuring stable performance over extended operation.
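
To illustrate the idea of tiered pooling, here is a minimal, single-threaded sketch in Python. It is not the library's implementation: the class and method names are hypothetical, KB and MB are assumed to mean 1024 and 1024 x 1024 bytes, and the background fragmentation monitoring described above is not modeled. It shows only how a request is routed to a tier by size and how returned buffers are reused instead of reallocated.

```python
from collections import deque

# Tier capacities from the list above; requests above 1 MB go to the extra large tier.
TIERS = [("small", 5 * 1024), ("standard", 50 * 1024), ("large", 1024 * 1024)]


class TieredBufferPool:
    def __init__(self):
        self._free = {name: deque() for name, _ in TIERS}
        self._free["extra_large"] = deque()

    @staticmethod
    def tier_for(size: int) -> str:
        """Pick the smallest tier whose capacity covers the requested size."""
        for name, capacity in TIERS:
            if size <= capacity:
                return name
        return "extra_large"

    def rent(self, size: int) -> bytearray:
        """Return a pooled buffer of at least `size` bytes, allocating only if none fits."""
        tier = self.tier_for(size)
        free = self._free[tier]
        for _ in range(len(free)):
            buf = free.popleft()
            if len(buf) >= size:
                return buf
            free.append(buf)  # too small (extra large tier only); keep it pooled
        # Fixed tiers allocate their full capacity; the extra large tier allocates on demand.
        return bytearray(dict(TIERS).get(tier, size))

    def give_back(self, buf: bytearray) -> None:
        """Return a buffer to the tier that matches its length."""
        self._free[self.tier_for(len(buf))].append(buf)
```

Because buffers in the fixed tiers always have the tier's full capacity, any returned buffer can satisfy any later request in the same tier, which is what keeps allocation counts low during long-running operation.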

Performance Optimization Tips

To achieve optimal performance with VASTreaming libraries:

Choose the right protocol - Use UDP-based protocols (RTP, SRT) for lowest latency; use HTTP-based protocols (HLS, DASH) for maximum compatibility and scalability
Optimize GOP structure - Shorter GOPs enable faster channel switching and lower latency but reduce compression efficiency
Monitor memory usage - Use the built-in diagnostics to track buffer pool utilization and detect potential memory issues
Enable hardware acceleration - Use platform-specific extension assemblies (VAST.Common.Ext.Win32, etc.) to leverage GPU encoding/decoding when applicable
Scale horizontally - For very high loads, distribute connections across multiple server instances behind a load balancer
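
The GOP trade-off mentioned above can be quantified with simple arithmetic. The sketch below assumes that a viewer joining a stream must wait for the next keyframe before playback can start, and that the server does not cache the most recent keyframe. The function name is illustrative.

```python
def keyframe_wait_seconds(gop_frames: int, fps: float) -> tuple:
    """Return (average, worst-case) wait for the next keyframe when joining at a random time."""
    gop_duration = gop_frames / fps
    return gop_duration / 2, gop_duration
```

Under these assumptions, a 60-frame GOP at 30 fps gives a worst-case startup wait of 2 seconds, while a 30-frame GOP halves it to 1 second at the cost of more frequent, larger keyframes.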
