Library Performance
VASTreaming libraries are engineered for high performance across a wide range of hardware, from resource-constrained embedded devices like Raspberry Pi to high-load server environments handling thousands of concurrent connections.
Latency
VASTreaming is optimized for low-latency streaming, which is critical for real-time applications such as video surveillance, live broadcasting, and interactive communications.
| Protocol | Typical LAN Latency of VASTreaming Library |
|---|---|
| SRT | 100-200 ms |
| RTSP/RTP | 200-300 ms |
| RTMP | 200-300 ms |
| WebRTC | 200-300 ms |
| WebTransport | 200-300 ms |
| MJPEG over HTTP | 100-200 ms |
| HLS | 15-30 seconds |
| LL-HLS | 2-5 seconds |
| MPEG-DASH | 5-20 seconds |
For SRT, latency can be reduced below 100 ms on a LAN by using our custom protocol extensions.
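The table above can be turned into a simple lookup when choosing a protocol for a given latency budget. The following is a minimal sketch, not part of the VASTreaming API: the dictionary and the `protocols_within_budget` helper are hypothetical names, and the figures are just the typical LAN ranges listed above, converted to milliseconds.

```python
# Typical LAN latency ranges in milliseconds, copied from the table above.
# The names below are illustrative only and not part of any VASTreaming API.
TYPICAL_LAN_LATENCY_MS = {
    "SRT": (100, 200),
    "RTSP/RTP": (200, 300),
    "RTMP": (200, 300),
    "WebRTC": (200, 300),
    "WebTransport": (200, 300),
    "MJPEG over HTTP": (100, 200),
    "HLS": (15_000, 30_000),
    "LL-HLS": (2_000, 5_000),
    "MPEG-DASH": (5_000, 20_000),
}


def protocols_within_budget(budget_ms: int) -> list[str]:
    """Return protocols whose upper typical latency bound fits the budget."""
    return sorted(
        name
        for name, (_low, high) in TYPICAL_LAN_LATENCY_MS.items()
        if high <= budget_ms
    )


if __name__ == "__main__":
    # Protocols that typically stay within a 250 ms budget on a LAN.
    print(protocols_within_budget(250))
```

The helper compares against the upper bound of each range, which is the conservative choice. Real-world latency also depends on the network and encoder settings, so treat the result as a shortlist and not a guarantee.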
Throughput and Scalability
The following benchmarks demonstrate the performance levels achievable with VASTreaming libraries.
High-Performance Server Environment
Test configuration: Cloud server with 8 vCPUs and 16 GB RAM. CPU utilization remained below 70% at maximum load while maintaining good Quality of Service (QoS).
| Protocol | Max Concurrent Sessions | Notes |
|---|---|---|
| RTSP | ~900 | Full video streaming |
| RTMP | ~600 | Full video streaming |
| HLS/DASH | ~15,000 | HTTP-based delivery |
| WebRTC | <100 | Peer connections with TURN relay |
Note that the WebRTC implementation is based on Google's native WebRTC library, which has subpar performance and is not suitable for high-performance servers.
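For rough capacity planning, the per-instance figures above can be used to estimate how many server instances a target session count would need. This is a back-of-the-envelope sketch under stated assumptions: the `instances_needed` helper and the 20% headroom default are hypothetical, the limits apply only to hardware comparable to the test configuration, and WebRTC is omitted because the table gives no exact figure for it.

```python
import math

# Approximate per-instance session limits from the table above
# (8 vCPUs, 16 GB RAM, CPU below 70%). Illustrative names only.
MAX_SESSIONS_PER_INSTANCE = {
    "RTSP": 900,
    "RTMP": 600,
    "HLS/DASH": 15_000,
}


def instances_needed(protocol: str, target_sessions: int, headroom: float = 0.2) -> int:
    """Estimate server instances needed, keeping a fraction of capacity in reserve.

    `headroom` is an assumed safety margin, not a measured value.
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    usable = MAX_SESSIONS_PER_INSTANCE[protocol] * (1 - headroom)
    return max(1, math.ceil(target_sessions / usable))


if __name__ == "__main__":
    # 5,000 RTSP viewers with 20% headroom per instance.
    print(instances_needed("RTSP", 5000))
```

The estimate assumes load scales linearly across instances. Measure on your own hardware before committing to a deployment size.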
Embedded Environment (Raspberry Pi 4)
| Scenario | Max Concurrent Sessions |
|---|---|
| RTSP (audio only) | ~40 |
Performance on embedded devices depends heavily on whether hardware acceleration is available and utilized.
Hardware Acceleration Impact
Hardware acceleration significantly improves performance for encoding and decoding operations:
| Operation | Consumer-grade CPU | Consumer-grade GPU | Tesla T4 | Tesla A16 |
|---|---|---|---|---|
| H.264 1080p decode | 1-2 streams | 8-16 streams | ~40 streams | ~130 streams |
| H.264 1080p encode | 1 stream | 4-8 streams | ~20 streams | ~65 streams |
| H.265 4K decode | Limited | 2-4 streams | No data | No data |
| H.265 4K encode | Not practical | 1-2 streams | No data | No data |
Actual consumer-grade CPU and GPU performance varies with the specific hardware (NVIDIA GPU, Intel Quick Sync, Apple VideoToolbox, etc.) and encoding parameters.
Tesla GPU figures are taken from production environments, but actual results depend heavily on the input stream and encoding parameters.
Video Mixing Performance
The MixingSource class enables real-time compositing of multiple video streams into a single output, commonly used for video walls, picture-in-picture layouts, and multi-camera productions.
Single MixingSource
A single MixingSource instance can composite up to 32 concurrent video streams in one rendering pass. To maintain 30 fps output, we recommend limiting a single instance to 10-20 simultaneous videos.
Chained MixingSources
For applications requiring more than 32 video inputs, multiple MixingSource instances can be chained together. Each MixingSource composites its inputs into an intermediate output, which then feeds into a parent MixingSource for final composition.
Using this approach, 64 or more videos can be merged into a single video wall. For best performance with 64 inputs, use 8 first-level MixingSources, each handling 8 videos:
MixingSource 1 (8 videos) ──┐
MixingSource 2 (8 videos) ──┤
MixingSource 3 (8 videos) ──┤
MixingSource 4 (8 videos) ──┼──► Final MixingSource (64 videos output)
MixingSource 5 (8 videos) ──┤
MixingSource 6 (8 videos) ──┤
MixingSource 7 (8 videos) ──┤
MixingSource 8 (8 videos) ──┘
This hierarchical approach distributes the rendering load evenly and scales efficiently while maintaining real-time performance.
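The two-level arrangement can be planned with a few lines of arithmetic. The sketch below does not use the MixingSource API. It only shows one possible way to split inputs into groups and compute tile rectangles, assuming each first-level mixer renders one horizontal strip and the final mixer stacks the strips. `Tile`, `plan_groups`, and `grid_tiles` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Position and size of one video inside a mixer's output canvas."""
    x: int
    y: int
    width: int
    height: int


def plan_groups(inputs: list[str], group_size: int = 8) -> list[list[str]]:
    """Split inputs into first-level groups of at most `group_size` streams."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [inputs[i:i + group_size] for i in range(0, len(inputs), group_size)]


def grid_tiles(count: int, cols: int, canvas_w: int, canvas_h: int) -> list[Tile]:
    """Lay out `count` equal tiles in a grid with `cols` columns."""
    if count < 1 or cols < 1:
        raise ValueError("count and cols must be at least 1")
    rows = -(-count // cols)  # ceiling division
    tile_w, tile_h = canvas_w // cols, canvas_h // rows
    return [
        Tile((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
        for i in range(count)
    ]


if __name__ == "__main__":
    cameras = [f"cam{i}" for i in range(64)]
    groups = plan_groups(cameras)             # 8 groups of 8 inputs
    strip = grid_tiles(8, 8, 3840, 270)       # one first-level strip: 8 tiles side by side
    wall = grid_tiles(len(groups), 1, 3840, 2160)  # final mixer: 8 strips stacked
    print(len(groups), strip[-1], wall[-1])
```

With a 3840x2160 output, each first-level mixer produces a 3840x270 strip of eight 480x270 tiles, and the final mixer places eight such strips vertically. Other layouts, such as 4x2 blocks per group, work equally well with the same helpers.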
Memory Management
VASTreaming employs a tiered memory pool architecture to minimize garbage collection pressure and heap fragmentation during long-running operation:
- Small buffer pool (5 KB) - Metadata and control messages
- Standard buffer pool (50 KB) - Compressed video frames
- Large buffer pool (1 MB) - Uncompressed SD/HD video frames
- Extra large buffer pool (>1 MB) - 4K and higher resolution frames
The library includes background monitoring that detects heap fragmentation and schedules Large Object Heap (LOH) compaction during periods of low activity, ensuring stable performance over extended operation.
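To illustrate the idea of tiered pooling, here is a conceptual sketch in Python. It is not the library's implementation, which runs on .NET. The class name, the per-tier cache limit, and the interpretation of KB and MB as binary units are all assumptions made for the example.

```python
from collections import deque

# Tier capacities in bytes, mirroring the pool sizes listed above.
# Treating 1 KB as 1024 bytes is an assumption of this sketch.
TIERS = (
    ("small", 5 * 1024),
    ("standard", 50 * 1024),
    ("large", 1024 * 1024),
)
CAPACITY = dict(TIERS)


class TieredBufferPool:
    """Conceptual tiered pool: reuse fixed-size buffers instead of reallocating."""

    def __init__(self, max_cached_per_tier: int = 64) -> None:
        self._free = {name: deque() for name, _ in TIERS}
        self._max_cached = max_cached_per_tier

    @staticmethod
    def tier_for(size: int) -> str:
        """Pick the smallest tier whose capacity fits the requested size."""
        if size < 0:
            raise ValueError("size must be non-negative")
        for name, capacity in TIERS:
            if size <= capacity:
                return name
        return "extra_large"

    def rent(self, size: int) -> bytearray:
        """Return a buffer of at least `size` bytes, reusing a cached one if possible."""
        tier = self.tier_for(size)
        if tier == "extra_large":
            return bytearray(size)  # not pooled in this sketch
        free = self._free[tier]
        return free.pop() if free else bytearray(CAPACITY[tier])

    def give_back(self, buffer: bytearray) -> None:
        """Return a buffer to its tier; oversized or surplus buffers are dropped."""
        for name, capacity in TIERS:
            if len(buffer) == capacity:
                if len(self._free[name]) < self._max_cached:
                    self._free[name].append(buffer)
                return


if __name__ == "__main__":
    pool = TieredBufferPool()
    frame = pool.rent(30_000)   # served from the standard tier
    pool.give_back(frame)
    print(pool.rent(30_000) is frame)
```

Because every buffer in a tier has the same size, returned buffers can be reused for any later request in that tier. This is what keeps allocation churn and fragmentation low in long-running processes.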
Performance Optimization Tips
To achieve optimal performance with VASTreaming libraries:
- Choose the right protocol - Use UDP-based protocols (RTP, SRT) for lowest latency; use HTTP-based protocols (HLS, DASH) for maximum compatibility and scalability
- Optimize GOP structure - Shorter GOPs enable faster channel switching and lower latency but reduce compression efficiency
- Monitor memory usage - Use the built-in diagnostics to track buffer pool utilization and detect potential memory issues
- Enable hardware acceleration - Use platform-specific extension assemblies (VAST.Common.Ext.Win32, etc.) to leverage GPU encoding/decoding when applicable
- Scale horizontally - For very high loads, distribute connections across multiple server instances behind a load balancer
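The GOP trade-off mentioned above can be quantified with simple arithmetic. This sketch assumes a fixed GOP length and a viewer that joins at a uniformly random moment and must wait for the next keyframe before decoding can start. The helper name is hypothetical, and servers that cache the most recent GOP behave differently.

```python
def keyframe_wait_seconds(gop_frames: int, fps: float) -> tuple[float, float]:
    """Return (average, worst-case) wait for the next keyframe, in seconds.

    Assumes a fixed GOP length and a join time that is uniformly random
    within the GOP, so the average wait is half the GOP duration.
    """
    if gop_frames < 1 or fps <= 0:
        raise ValueError("gop_frames must be >= 1 and fps must be > 0")
    gop_duration = gop_frames / fps
    return gop_duration / 2, gop_duration


if __name__ == "__main__":
    # A 60-frame GOP at 30 fps versus a 15-frame GOP at 30 fps.
    print(keyframe_wait_seconds(60, 30))
    print(keyframe_wait_seconds(15, 30))
```

Halving the GOP length halves both figures, at the cost of more frequent keyframes and therefore a higher bitrate for the same quality.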