Advanced Capture and Mixing
The ExtendedCapturePage demonstrates multi-source video mixing with compositing, overlays, visual effects, and live streaming. It extends the Simple Capture page with the ability to combine multiple video and audio inputs into a single mixed output.
Overview
The ExtendedCapturePage performs the following:
- Enumerates video devices, audio devices, and displays
- Creates a MixingSource that composites multiple inputs into a single output
- Supports several simultaneous input sources: camera, microphone, screen capture, video file, network stream, user-generated frames, text overlay, and logo overlay
- Arranges sources into layers with configurable positioning, z-order, and visual effects
- Previews the mixed output locally or the direct camera feed
- Streams the mixed output to a remote server (RTMP, RTSP, etc.)
- Dynamically adds and removes sources at runtime without stopping sessions
Input Sources
The mixing source accepts the following inputs, each toggled independently via checkboxes:
| Source | Type | Description |
|---|---|---|
| Camera | IVideoCaptureSource2 | Hardware camera capture |
| Microphone | IAudioCaptureSource2 | Hardware microphone capture |
| Screen | IScreenCaptureSource | Display or window capture |
| Video file | IsoSource | MP4 file played in a loop |
| Network stream | IMediaSource | Remote stream (RTSP, RTMP, etc.) |
| User source | VirtualNetworkSource | Programmatically generated frames |
| Text overlay | Text content | Dynamic timestamp updated every second |
| Logo overlay | Image content | Static watermark image |
Creating the Mixing Source
```csharp
this.mixingSource = new VAST.Image.Mixing.MixingSource();
this.mixingSource.AddRef();

this.mixingSource.Parameters.VideoDecoderParameters.PreferredMediaFramework = videoFramework;
this.mixingSource.Parameters.VideoDecoderParameters.AllowHardwareAcceleration = allowHardwareAcceleration;
this.mixingSource.Parameters.AudioDecoderParameters.PreferredMediaFramework = audioFramework;
this.mixingSource.Parameters.AudioDecoderParameters.AllowHardwareAcceleration = false;

this.mixingSource.Parameters.VideoEncoderParameters.PreferredMediaFramework = videoFramework;
this.mixingSource.Parameters.VideoEncoderParameters.AllowHardwareAcceleration = allowHardwareAcceleration;
this.mixingSource.Parameters.AudioEncoderParameters.PreferredMediaFramework = audioFramework;
this.mixingSource.Parameters.AudioEncoderParameters.AllowHardwareAcceleration = false;
```
MixingSource composites all input sources into a single output with video and audio tracks. AddRef() is called because the mixing source is shared between the preview and streaming sessions.
Decoder and encoder parameters are configured with the selected framework. The encoding framework options are the same as in the Simple Capture page.
Configuring the Descriptor
The mixing scene is defined by a Descriptor that specifies global options, input sources, output tracks, and scene composition:
```csharp
VAST.Image.Mixing.Descriptor descriptor = new VAST.Image.Mixing.Descriptor
{
    AllowVideoProcessing = true,
    AllowAbsoluteTimestamps = true,
    Processing = new VAST.Image.Mixing.Processing
    {
        VideoProcessing = new VAST.Image.Mixing.VideoProcessing
        {
            Discard = !allowVideo,
        },
        AudioProcessing = new VAST.Image.Mixing.AudioProcessing
        {
            Discard = !allowAudio,
        }
    }
};
```
AllowVideoProcessing enables layer compositing and visual effects. AllowAbsoluteTimestamps enables absolute timestamp mode for accurate source synchronization. Discard controls whether video or audio output is generated.
The descriptor is then populated with output tracks, input sources, and scene composition before being applied via mixingSource.Update(descriptor).
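Assembled end to end, the update flow can be sketched as follows. `updateTracks` is a hypothetical helper name used here for illustration; `updateSources` and `updateScene` are the page's own methods, though their exact signatures may differ:

```csharp
// Sketch of the descriptor lifecycle (helper names partly hypothetical).
var descriptor = new VAST.Image.Mixing.Descriptor { /* global options as above */ };
this.updateTracks(descriptor);   // output tracks: H.264/AAC or uncompressed
this.updateSources(descriptor);  // input sources: camera, screen, file, ...
this.updateScene(descriptor);    // layers, positions, effects
this.mixingSource.Update(descriptor); // apply without restarting sessions
```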
Output Tracks
When streaming is active, the video track is configured for H.264 encoding:
```csharp
var track = new VAST.Image.Mixing.VideoTrack
{
    Index = trackIndex,
    Width = outputWidth,
    Height = outputHeight,
    Framerate = outputFramerate,
    Codec = VAST.Common.Codec.H264,
    Bitrate = videoBitrate,
    KeyframeInterval = keyframeInterval,
    Profile = 66, // baseline
    Level = 31,   // level 3.1
};
```
When only preview is active, the track uses uncompressed output to save encoding resources:
```csharp
track.Codec = VAST.Common.Codec.Uncompressed;
track.PixelFormat = VAST.Common.PixelFormat.None;
```
Audio track configuration follows the same pattern — AAC for streaming, PCM for preview only. On Android, the audio track uses the capture device's native sample rate and channels because audio resampling is not available on that platform.
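By analogy with the video track above, the streaming audio track might be configured like this. This is a sketch: `AudioTrack` and its `SampleRate` and `Channels` properties are assumed from the pattern, not confirmed API:

```csharp
// Sketch: AAC audio track while streaming (names assumed by analogy
// with VideoTrack, not confirmed API).
var audioTrack = new VAST.Image.Mixing.AudioTrack
{
    Index = audioTrackIndex,
    Codec = VAST.Common.Codec.AAC,
    Bitrate = audioBitrate,
    SampleRate = 48000,
    Channels = 2,
};
// Preview only: switch to uncompressed PCM to skip encoding
// audioTrack.Codec = VAST.Common.Codec.Uncompressed;
```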
Adding Sources
Each source is added to the descriptor's Sources list as a Source. The source index in this list is used later for layer composition.
Camera and Microphone
```csharp
this.activeVideoCaptureSource = VAST.Media.SourceFactory.CreateVideoCapture(
    this.videoDevice.DeviceId, this.videoCaptureMode);
this.activeVideoCaptureSource.Rotation = this.videoRotation;
this.activeVideoCaptureSource.AddRef();

this.videoCaptureSourceIndex = descriptor.Sources.Count;
descriptor.Sources.Add(new VAST.Image.Mixing.Source { MediaSource = this.activeVideoCaptureSource });
```
Capture sources are created via SourceFactory and added using the MediaSource property. AddRef() is called because capture sources are shared between the mixing source and the preview renderer.
Screen Capture
```csharp
this.activeScreenCaptureSource = VAST.Media.SourceFactory.CreateScreenCapture();
this.activeScreenCaptureSource.DeviceId = this.screen.DeviceId;
this.activeScreenCaptureSource.Region = new VAST.Common.Rect
{
    Left = this.screen.Location.Left, Top = this.screen.Location.Top,
    Right = this.screen.Location.Right, Bottom = this.screen.Location.Bottom
};
this.activeScreenCaptureSource.ShowMouse = true;
this.activeScreenCaptureSource.AddRef();

// Register with the mixer, mirroring the camera source above
this.screenCaptureSourceIndex = descriptor.Sources.Count;
descriptor.Sources.Add(new VAST.Image.Mixing.Source { MediaSource = this.activeScreenCaptureSource });
```
Available displays are enumerated via DisplayHelper. The screen capture source captures the selected display area with the mouse cursor visible.
Video File
```csharp
var fileSource = new VAST.File.ISO.IsoSource();
fileSource.Stream = stream;
fileSource.PlaybackRate = 1;
fileSource.Loop = true;

descriptor.Sources.Add(new VAST.Image.Mixing.Source { MediaSource = fileSource });
```
A video file (video.mp4) is loaded from the app package and played in an endless loop. On Android, the app package stream is not seekable, so it is copied into a MemoryStream first.
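The Android workaround can be sketched with standard .NET stream APIs (MAUI's `FileSystem.OpenAppPackageFileAsync`; the file name matches the demo):

```csharp
// Copy the non-seekable Android package stream into a seekable MemoryStream.
using var packageStream = await FileSystem.OpenAppPackageFileAsync("video.mp4");
var seekable = new MemoryStream();
await packageStream.CopyToAsync(seekable);
seekable.Position = 0; // rewind before handing the stream to IsoSource
fileSource.Stream = seekable;
```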
Network Stream
```csharp
descriptor.Sources.Add(new VAST.Image.Mixing.Source { Uri = tboxStreamingSource.Text });
```
A remote stream is added by URI. The mixing source handles connection, decoding, and synchronization automatically.
User Source (Manual Frame Pushing)
```csharp
this.userSource = new VAST.Network.VirtualNetworkSource();
this.userSource.AddRef();
this.userSource.AddStream(new VAST.Common.MediaType
{
    ContentType = VAST.Common.ContentType.Video,
    CodecId = VAST.Common.Codec.Uncompressed,
    PixelFormat = VAST.Common.PixelFormat.BGRA,
    Width = this.outputWidth,
    Height = this.outputHeight,
    Framerate = this.outputFramerate,
});
```
A VirtualNetworkSource is created for pushing user-generated frames. The demo generates an animated bouncing rectangle using SkiaSharp and pushes frames at 30 fps:
```csharp
VAST.Common.VersatileBuffer vb = VAST.Media.MediaGlobal.LockBuffer(imageSize);
vb.Append(bmpImage.GetPixels(), imageSize);
vb.Pts = vb.Dts = currentVideoTimestamp;
vb.StreamIndex = 0;
this.userSource.PushMedia(vb);
vb.Release();
```
Text Overlay
```csharp
descriptor.Sources.Add(new VAST.Image.Mixing.Source
{
    Content = DateTime.Now.ToString(),
    Format = "text",
    HorizontalAlignment = VAST.Common.HorizontalAlignment.Left,
    VerticalAlignment = VAST.Common.VerticalAlignment.Top,
    Decoration = new VAST.Image.Mixing.Decoration
    {
        FontFamily = "Calibri",
        Size = 30,
        Bold = true,
        Italic = true,
        Color = "#FFFFFF00",
        OutlineColor = "#FF000000",
        OutlineWidth = 1,
    }
});
```
Text sources use the Content property with Format = "text". The Decoration object configures font, size, style, color, and outline. A background task updates the text content every second by re-calling updateSources.
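The periodic refresh might be sketched as follows, assuming updateSources rebuilds the descriptor (regenerating the DateTime.Now content) and reapplies it via mixingSource.Update; the field names are illustrative:

```csharp
// Refresh the timestamp overlay once per second on a background task.
// stopRequested and textUpdateTask are hypothetical names for this sketch.
this.textUpdateTask = Task.Run(async () =>
{
    while (!this.stopRequested)
    {
        await Task.Delay(1000);
        // Rebuilding the source list regenerates the text content
        MainThread.BeginInvokeOnMainThread(() => this.updateSources());
    }
});
```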
Logo Overlay
```csharp
descriptor.Sources.Add(new VAST.Image.Mixing.Source
{
    Content = stream, // image stream
    Format = "image"
});
```
Image sources use the Content property with Format = "image" and a Stream containing the image data.
Scene Composition
After adding sources, the scene defines how they are composited into layers:
```csharp
descriptor.Processing.VideoProcessing.Mixing = new VAST.Image.Mixing.VideoMixing
{
    Type = VAST.Image.Mixing.VideoMixingType.All,
    Layers = new List<VAST.Image.Mixing.Layer>()
};
```
VideoMixingType.All composites all layers in order. Each Layer specifies which source it renders, its position, stretch mode, and optional visual effects.
Layer Layout
| Layer | Position | Description |
|---|---|---|
| Camera | Full frame | Primary background with brightness, contrast, and chroma key |
| Screen capture | Full frame | Display capture, z-order configurable relative to camera |
| Video file | Bottom-left quadrant | Picture-in-picture overlay |
| Network stream | Bottom-right quadrant | Picture-in-picture overlay |
| Text overlay | Bottom strip | Dynamic timestamp text |
| Logo overlay | Top-left corner | Static watermark (245x59 pixels) |
| User source | Full frame | Animated overlay with transparency |
Each layer uses LayoutType.Manual with explicit Location rectangles and StretchType.Preserve to maintain aspect ratio:
```csharp
new VAST.Image.Mixing.Layer
{
    Sources = new List<int>(new int[] { this.videoFileSourceIndex }),
    Layout = VAST.Image.Mixing.LayoutType.Manual,
    Stretch = VAST.Image.Mixing.StretchType.Preserve,
    Location = new VAST.Common.Rect(50, this.outputHeight / 2 + 50,
        this.outputWidth / 2 - 50, this.outputHeight - 50)
}
```
Visual Effects
The camera layer supports brightness, contrast, and chroma key (green screen) effects:
```csharp
new VAST.Image.Mixing.Layer
{
    Sources = new List<int>(new int[] { this.videoCaptureSourceIndex }),
    Layout = VAST.Image.Mixing.LayoutType.Manual,
    Stretch = VAST.Image.Mixing.StretchType.Preserve,
    Location = new VAST.Common.Rect(0, 0, this.outputWidth, this.outputHeight),
    BrightnessAdjustment = (float)this.sliderBrightness.Value,
    ContrastAdjustment = (float)this.sliderContrast.Value,
    ChromaKeyColor = chromaKeyColor,
    ChromaKeyThreshold = 0.07f + (float)this.sliderChromaKeyThreshold.Value / 200f,
    ChromaKeySmoothing = 0.05f + (float)this.sliderChromaKeySmoothing.Value / 200f,
}
```
These effects are adjustable at runtime via UI sliders. Changing a slider triggers updateScene which rebuilds the layer list and calls mixingSource.Update(descriptor) without recreating sources.
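A slider handler can be sketched as follows (the handler name and wiring are illustrative; ValueChangedEventArgs is the standard MAUI Slider event argument):

```csharp
// Any effect slider change rebuilds only the scene composition.
private void onEffectSliderChanged(object sender, ValueChangedEventArgs e)
{
    if (this.mixingSource == null)
        return; // nothing running yet
    this.updateScene(); // rebuilds Layers, then mixingSource.Update(descriptor)
}
```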
Audio Mixing
Audio uses a single-source mode that passes through the microphone input:
```csharp
descriptor.Processing.AudioProcessing.Mixing = new VAST.Image.Mixing.AudioMixing
{
    Type = VAST.Image.Mixing.AudioMixingType.Single,
    SourceIndex = this.audioCaptureSourceIndex,
};
```
Local Preview
```csharp
this.previewSession = new VAST.Media.MediaSession();
this.createMixingSource();
this.previewSession.AddSource(this.mixingSource);
this.previewSession.Start();
```
The preview session uses MediaSession with the mixing source. The preview renderer can display either the direct camera feed (lower latency, no effects) or the mixed output (all layers and effects visible):
```csharp
if (this.pickerRenderingSource.SelectedIndex == 0)
{
    // Direct camera preview
    this.activeVideoCaptureSource.Renderer = this.videoPreview.Renderer;
}
else
{
    // Mixed output preview
    this.mixingSource.Renderer = this.videoPreview.Renderer;
}
```
Streaming
```csharp
this.streamingSession = new VAST.Media.MediaSession();
this.createMixingSource();
this.streamingSession.AddSource(this.mixingSource);

VAST.Media.IMediaSink sink = VAST.Media.SinkFactory.Create(tboxServerUri.Text);
sink.Uri = tboxServerUri.Text;
this.streamingSession.AddSink(sink);
this.streamingSession.Start();
```
The streaming session connects the same mixing source to a network sink. When streaming starts, the mixing source output switches from uncompressed to H.264/AAC encoded. When streaming stops while preview remains active, the output switches back to uncompressed to save encoding resources.
Dynamic Source Updates
Sources can be added or removed at runtime via checkbox toggles without stopping sessions. Each change triggers updateSources which rebuilds the complete source list and scene composition, then applies it via mixingSource.Update(descriptor).
Scene-only changes (brightness, contrast, chroma key sliders) trigger updateScene which rebuilds only the layer composition without recreating sources.
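A checkbox handler might look like this (the handler name is illustrative; CheckedChangedEventArgs is the standard MAUI CheckBox event argument):

```csharp
// Toggling any source checkbox rebuilds the full source list and scene.
private void onSourceCheckedChanged(object sender, CheckedChangedEventArgs e)
{
    if (this.mixingSource == null)
        return; // sessions not started yet
    this.updateSources(); // recreates descriptor.Sources and layers, then Update()
}
```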
Resource Management
The mixing source uses reference counting to coordinate between preview and streaming sessions. Both sessions call AddSource(mixingSource), incrementing the reference count. When a session is disposed, the cleanup() method checks mixingSource.RefCount — resources are released only when the last session is stopped:
```csharp
if (this.mixingSource.RefCount > 1)
{
    // one or more media sessions are still active
    return;
}
this.mixingSource.Release();
```
Send Log
The page includes a Send Log button that uploads the application log file to VASTreaming support for diagnostics:
```csharp
await VAST.Common.License.SendLog("MAUI extended capture issue");
```
SendLog sends the current log file to the support server. A valid license key must be configured for this feature to work.
See Also
- Sample Applications — overview of all demo projects
- MAUI App Demo — parent page with app initialization and demo page overview