Animated Backgrounds in Video Calls & Virtual Meetings
Overview
Coworkers have been asking me how I get animated and moving backgrounds (and foregrounds) in my video conference calls and virtual meetings. So, I’ve created this article to give an overview of the process and tools. For this article’s background example I chose a video that won’t get me into copyright trouble. Its background is a recording of the city gates of Amnoon, a scene I captured while playing the game Guild Wars 2.
The method to create this effect is layering. In oversimplified terms, the background is the bottom layer. In my example I used a video instead of a static image. Layered on top of the background is the output from my webcam. And finally, the very top layer is a graphic with some text. I combined all these layers using the free software tool OBS (Open Broadcaster Software) and output that as a single video feed.
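To make the layering idea concrete, here is a minimal Python sketch of how stacked layers combine into one image. OBS handles all of this for you; the sketch only illustrates the standard “over” blend for a single pixel, and the function names and sample colors are my own illustrative choices rather than anything taken from OBS.

```python
def over(top, bottom):
    """Blend one RGBA pixel over another (straight alpha, channels 0.0-1.0)."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        # Both pixels are fully transparent, so nothing shows.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t, b):
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def flatten(layers):
    """Combine layers listed from bottom to top into a single pixel."""
    result = layers[0]
    for layer in layers[1:]:
        result = over(layer, result)
    return result


# Hypothetical sample pixels: an opaque blue background video, a webcam pixel
# that is fully transparent because my real background was removed there,
# and a half-transparent white graphic on top.
background = (0.0, 0.0, 1.0, 1.0)
webcam_cutout = (0.0, 0.0, 0.0, 0.0)
graphic = (1.0, 1.0, 1.0, 0.5)

print(flatten([background, webcam_cutout, graphic]))  # (0.5, 0.5, 1.0, 1.0)
```

Wherever the webcam layer is transparent, the background shows through, and the graphic tints whatever ends up beneath it.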
Since I’m not actually “broadcasting” or streaming (OBS is primarily for Twitch, YouTube, Facebook Live, etc.), I need something to convert the broadcast into a usable format. Therefore, I’ve installed the OBS plugin VirtualCam, which makes the OBS output look like a webcam to other programs. Instead of selecting my actual webcam as the input for my video-conferencing software, I set it to the virtual device named “OBS-Camera.” This technique works for Microsoft Teams, Zoom, Skype, Jitsi, and GoToMeeting. I’ve done so much testing that some of the results are fading into a memory fog. So, I’m only “mostly positive” that it will work with Google Hangouts, Google Meet, and Discord. The only “failure” I distinctly remember is for my doctor’s tele-medicine product, and that was because it would not let you choose an input device.
OBS is just one “link” in the chain of tools used to create my desired output, and each software tool runs on a fairly beefy computer (see the end of the article for details). Below is more detail on each link in this chain of tools. They are presented in a logical sequence: from the source to the destination. Along the way I’ll also mention some alternative tools that I’ve experimented with, since they may suit your needs better. And at the end of the article is a collection of miscellaneous notes, tips, and tricks.
The Chain of Tools
Environment
I’ve arranged my environment to improve the image and audio quality I can produce. I’ve added lamps and baffles to diffuse, bounce, and direct the light and to control its color temperature, not only on my face but also on the background (so it’s easier for the computer to cleanly “remove” my real background). For audio I do the same to optimize conditions: I set the timer on my air conditioner so it cools the room beforehand, and automatically turns off just prior to the start of the next meeting. I then turn down fans and other background noise. I’ve also covered the glass (bordering our front door) with a decorative overlay so the dog is oblivious to the comings and goings in our neighborhood. And finally, I close the office door if the grandchildren are visiting.
Audio hardware
My primary microphone is a dynamic mic with a cardioid sensitivity pattern (i.e., it minimizes off-axis and extraneous background sounds and focuses on just my voice). It’s a Samson Q2U in a Rycote InVision USM shock mount on a Gator Frameworks boom and stand. The mic can be connected via USB, but I’m using an XLR cable to a Zoom H6 acting as my computer’s Audio Interface. The Zoom H6 lets me mix multiple audio inputs, directly monitor the mic, control gain, enhance the audio (e.g., volume compression), etc.
Video hardware
Built-in webcams typically produce terrible video (grainy, choppy, dark, and low resolution) and at the wrong angle. Dell is notorious for their “nostril” cameras, which are mounted below the screen. Therefore, I began my journey with equipment already at hand. For video I used an iPhone XS Max because it has an awesome camera. To make my iPhone act as a webcam, I used Kinoni’s EpocCam app and PC software. To hold the phone at the proper height and angle, I used a Ram Mount X-grip with an extension arm and custom base (a glass brick filled with decorative river stones). A wireless Qi charging pad from Anker was stuck to the back of the X-grip to supply continuous power.
Although the iPhone was a high-quality solution, I wanted a dedicated webcam because I kept forgetting it was still mounted above and behind my monitor. Not only did I keep leaving it behind, it was also inconvenient to use the phone, as a phone, in this configuration. Although the cameras on iPads and iPod Touches are not as good as recent iPhones, they could be a dramatic improvement over the built-in webcams you’re using. EpocCam also works with Android devices and on macOS, and other users have told me about competing apps that do the same job.
Before buying a dedicated webcam, I also experimented with other “normal” cameras configured to work as webcams. One option was using the HAYOX capture device to convert HDMI output to USB input (e.g., when connecting a GoPro HERO8 Black in a Media Mod “cage”). But the latency and low-light performance were poor. I also converted a security camera I had on hand (the Wyze Pan Cam) into a webcam by applying a special firmware change. This was purely out of curiosity since the camera has an extremely large field of view that makes it undesirable except for the most desperate of users. You also lose the Pan/Tilt/Zoom controls and Infrared features, so it’s now restored back to its “security camera” configuration. And I can feed it into OBS as a secondary camera view using an iPad connected with an Apple HDMI adapter. It’s pretty cool, but not particularly useful for virtual meetings.
None of the above experiments compared to the performance of a dedicated webcam like the Logitech Brio (which is what I’m currently using). The less expensive Logitech C920s, C922, and StreamCam are also great alternatives. And I’ve craved pricier upgrades such as the HuddleCam HD or an Alpha-series Sony mirrorless camera (e.g., the wallet-busting a7S III). But those are more suited to professional streamers and media influencers who broadcast for a living.
The final piece of hardware equipment I have is a “green screen” (for chroma keying). I would NOT recommend it for most users, and I only use it for special situations. A lot of the software that I mention below can be used without it. Green screens can be tricky to set up because they must be evenly lighted (no shadows or brighter areas) and can “splash” a green glow onto the subject if the light angles or distance are wrong. And when the screen is far enough back from your position, then it has to be humongous to still fill the camera’s FOV (field of view)! I use the Valera Explorer 90 and even at this size it is a challenge to position so that it fills my webcam’s FOV. I wish it came with other color screens (chroma blue, neutral gray, and white), and I may make some by hand if the vendor doesn’t add them. I had originally contemplated a retractable ceiling-mounted backdrop that pulls down like a movie-projector screen. But I went with the Valera since it collapses easily and is small enough to store out-of-sight in a closet corner. If I were a professional streamer, and had a larger office/studio, then I’d probably go for a fixed screen (like this massive 8x8 foot backdrop) or perhaps a wall covered with special chroma green paint.
Software
I use multiple software programs to create the video feed used in virtual meetings. And the combination changes based on the look to be achieved. If I’m using a static image as my background, then no additional software is needed. Both Microsoft Teams and Zoom include excellent features that do background replacement.
A more complex composition, like my example video above, uses a few more tools. Let’s look at the layers (from front to back) and the tools used for each. In the foreground is a graphic with text that provides additional information. The broadcast industry calls this a “Lower Third” (or L3) since it typically appears at the bottom of the screen. In my example video above, my L3 is actually positioned in the upper right corner. It was created using the free art program Paint.NET but any graphics software (CorelDraw, Photoshop, Procreate, etc.) could be used. I save my L3 graphics in the PNG format for two reasons: it lets me save images with transparent backgrounds, and it does a superior job of compressing mostly solid, non-gradient colored shapes and text, which is what most L3s are.
Both L3 examples above have a section for a ticker (scrolling text). This text is layered over the L3 and comes from, and is configured in, OBS. The ticker is a “Text” layer with a “Scroll” filter added. Below is a screen shot from OBS for an AFK (Away From Keyboard) screen. At the bottom (second pane from the left) is the SOURCES pane and it shows the two layers that make up the preview being displayed. The bottom layer is named “Please Stand By TV” and it pulls in the background image. On top of that is the ticker: a layer named “AFK Text” which contains the “I will be back in just a moment” message (including settings for placement, color, font, size, speed, opacity, etc.).
For my example video at the top of this article there is a middle layer, which is me via the webcam. In OBS this is a “Video Capture Device” layer type. However, there is a software component that sits between the Logitech Brio and OBS. The “Logitech Camera Settings” application lets me adjust and optimize the camera’s video. I adjust saturation, white balance, contrast, etc. to match the background (whether moving or static). For example, if it’s a sunny beach scene then I would set the white balance to a warmer golden cast, increase the contrast, and bump up the brightness so it matches the scene. I also adjust my office lighting so that the shadows fall in the same direction as in the background. If the background is of a thunderstorm at sea, then I would match my image with a cooler white balance (i.e., a subtle blue cast), a darker exposure, and less contrast. I’ve also taken the opposite approach, and selected backgrounds that already match the lighting in my office. With more believable backgrounds (like a photo of an office or kitchen versus the cockpit of a spaceship), the matched lighting has been realistic enough to cause people to think I was actually in those locations!
In addition to the camera’s utility software, OBS can also apply filters, adjustments, and LUTs (Look-Up Tables). A LUT is customized to both your specific camera and to your specific lighting conditions. To create a LUT, you first capture an image from your camera, which is taken under set lighting conditions. Then use a photo (or video) editing program to make adjustments to the captured image until it looks best. The adjustments are not applied directly to the captured image. Instead they are put on a separate layer, and you’re viewing your image through the adjustment layer. (Think of it as if you were painting on a pane of glass that is sitting on top of a photo.) Next, you replace, cover, or hide your captured image with a LUT reference table (original and unmodified). The reference table is now sitting below those same adjustments. The results are flattened (the layers combined) and saved to a PNG image. This file is a custom LUT that can be applied to your camera’s output so all video gets the enhancements. Below is the before-and-after for a Wyze Pan Cam, a camera that’s optimized for security monitoring, not image quality. As you can see it adds a terrible yellow cast to the video, but with a custom LUT applied, the colors are much more natural.
For an animated, moving background I’ve been using YouTube videos. Yep, it’s just that simple! In OBS this is a “Browser” layer and would be positioned at the bottom. If you’re recording or photographing your own backgrounds (or choosing someone else’s), be mindful of the angle. In a meeting, your webcam is at eye level while you’re seated! So choose or take photos at that same height to create more natural backgrounds.
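The “bake the adjustments into a reference table” idea can be sketched in a few lines of Python. This is a deliberately simplified version: it uses one 256-entry table per color channel, whereas a real LUT image, as far as I understand it, maps complete colors. The function names and the gain values are hypothetical, chosen only to resemble countering a yellow cast.

```python
def identity_lut():
    """Unmodified reference table: every value maps to itself, per channel."""
    return [list(range(256)) for _ in range(3)]


def bake(lut, adjust):
    """Run the adjustment over the reference table instead of over a photo."""
    return [[adjust(channel, value) for value in table]
            for channel, table in enumerate(lut)]


def counter_yellow(channel, value):
    # Hypothetical adjustment: trim red a little and lift blue.
    gain = (0.95, 1.0, 1.15)[channel]
    return min(255, round(value * gain))


def apply_lut(lut, pixel):
    """Look up each channel of an (r, g, b) pixel in the baked table."""
    r, g, b = pixel
    return (lut[0][r], lut[1][g], lut[2][b])


custom = bake(identity_lut(), counter_yellow)
print(apply_lut(identity_lut(), (200, 180, 100)))  # (200, 180, 100)
print(apply_lut(custom, (200, 180, 100)))          # (190, 180, 115)
```

The adjustment runs only once, when the table is baked. After that, correcting each pixel of each video frame is just a lookup, which is why a LUT is cheap enough to apply to live video.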
Picking a good background is a balancing act. If it’s too plain, then the artificial outline—the edge where the computer cut you out from your real background—will be very noticeable. A bit of detail and texture in the background helps to hide that outline. If the scene is too busy and detailed, it becomes distracting and you blend with it instead of being in front of the background. In real TV studios they use a “hair light” to ensure distinction and depth—to make the person stand out from, instead of blend into, the background.
When removing and replacing your actual real-life background, you want it to be plain and as uniform as possible. A blank wall would be excellent. This helps the computer distinguish your outline from the background. Angle your lights and use baffles (I use foam core boards) so that the light falls on your face and shoulders but not on the wall behind you, which should end up slightly darker. To prevent “hot spots” and deep shadows on your face, bounce the light off the walls instead of pointing lights directly on yourself. This will soften and diffuse the lighting and create a more appealing appearance.
While on the topic of texture and detail in background images, it’s important to not go overboard. Some software cannot cope with an image that is too intricate. I had a photo of the interior of the NASA space station. The original was too complicated for Microsoft Teams to even display. Also, you don’t want to overload the software by having it constantly downscale large images. Photos from modern cameras and phones are massively oversized compared to a computer screen, so large that they can crash your software. To prevent big images from slowing down or crashing my software, I proactively resize my backgrounds to “Full HD” size, that is, 1920 pixels wide by 1080 pixels tall. (This is also the size that MS Teams would reduce a background image to, so it saves the time and effort required to convert it every time.) And while I’m cropping, sizing, and enhancing backgrounds, I will also flip them so the incoming light in the image matches my actual lighting. For example, a background photo may have a window (with incoming light) that is on the left. I will flip that photo so the window is on the right-hand side, like my real-life window.
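The arithmetic behind the crop-then-resize step is easy to get wrong, so here is a small Python sketch of it. It only computes which region of the photo to keep so the result has Full HD proportions; the actual cropping and scaling would still happen in whatever image editor you use. The function names are hypothetical, and the flip helper works on a plain list of pixel rows purely for illustration.

```python
def crop_box_for_full_hd(width, height, target_w=1920, target_h=1080):
    """Return (left, top, right, bottom) of the largest centered 16:9 crop."""
    if width * target_h > height * target_w:
        # Photo is wider than 16:9, so trim the sides.
        new_w = height * target_w // target_h
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Photo is taller than 16:9 (or already 16:9), so trim top and bottom.
    new_h = width * target_h // target_w
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)


def flip_horizontal(rows):
    """Mirror an image, given as a list of pixel rows, left to right."""
    return [row[::-1] for row in rows]


# A 4032 x 3024 photo (a typical 4:3 phone picture) keeps its full width
# and loses 378 rows from both the top and the bottom.
print(crop_box_for_full_hd(4032, 3024))    # (0, 378, 4032, 2646)
print(flip_horizontal([[1, 2, 3], [4, 5, 6]]))  # [[3, 2, 1], [6, 5, 4]]
```

Scaling the cropped region down to 1920 by 1080 afterward leaves nothing stretched, because the crop already has the right proportions.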
While we’re discussing software limits, don’t load too many backgrounds into your meeting software! I learned the hard way that MS Teams will crash if you have over 100 custom backgrounds. I now keep all my custom backgrounds in their own folder and only copy about 75 to MS Teams at a time. Below is a script I use to replace old custom backgrounds with a set of fresh “finished” images.
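REM Delete the old custom backgrounds from the Teams uploads folder without prompting.
DEL /Q C:\Users\Craig\AppData\Roaming\Microsoft\Teams\Backgrounds\Uploads\*.jpg
REM Copy in the fresh set of finished images, overwriting without prompting.
COPY /Y C:\Backgrounds\Finished\*.jpg C:\Users\Craig\AppData\Roaming\Microsoft\Teams\Backgrounds\Uploads\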
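The batch script copies every finished image, so keeping the count under the limit is up to me. As an alternative, here is a minimal Python sketch that does the same refresh but stops at 75 images. The function name and the choice to take the first 75 files in alphabetical order are my own assumptions; the folder paths in the usage comment are the ones from the script above.

```python
import shutil
from pathlib import Path


def refresh_backgrounds(source_dir, uploads_dir, limit=75):
    """Replace the .jpg files in uploads_dir with at most `limit` from source_dir."""
    source, uploads = Path(source_dir), Path(uploads_dir)

    # Remove the old custom backgrounds (only .jpg files are touched).
    for old in uploads.glob("*.jpg"):
        old.unlink()

    # Copy a capped, alphabetically ordered selection of finished images.
    chosen = sorted(source.glob("*.jpg"))[:limit]
    for image in chosen:
        shutil.copy2(image, uploads / image.name)
    return [image.name for image in chosen]


# Example call, using the same folders as the batch script:
# refresh_backgrounds(
#     r"C:\Backgrounds\Finished",
#     r"C:\Users\Craig\AppData\Roaming\Microsoft\Teams\Backgrounds\Uploads",
# )
```

The function returns the names it copied, which makes it easy to see which backgrounds made the cut.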
Tips, Tricks, & Notes
Use a wired Ethernet connection when video conferencing. It’s not only faster, but also more reliable and stable than Wi-Fi.
If you’re using a headset with a mic on the end of a boom arm, position the capsule so it’s below your chin, at your cheek bone, or pointed to the corner of your mouth. Most mics are sensitive enough to clearly pick up your voice even at a distance. Don’t put it in front of your mouth, below your nose, or near the path of your exhalations. Otherwise, all the gross heavy breathing noises will make you sound like a pervert.
When not speaking, be sure you’re on mute! In MS Teams you can tell who forgot and is creating interference and distraction by looking at the participant icons: people sending audio have a glowing ring around their icon/initials. This seems to happen more if the person called in on a phone (audio only) instead of joining with video using a computer. The meeting organizer or facilitator (or other participants, if the permission hasn’t been revoked) can then mute the offender.
MS Teams will shrink the video feeds, as needed, to fit all the participants on the screen. You can override this behavior and enlarge a person’s video. Right-click their image and select Pin. Their video will stay pinned regardless of who is talking. You can pin multiple participants as long as they fit on your screen. When you unpin a person, their video reverts to the default sizing.
MS Teams crops videos, as needed, to fit more videos on the screen. This can sometimes cut off part of a participant’s face. You can resize the image so that it shrinks-to-fit instead of cropping-to-fit: right-click the video and select “Fit to frame” to see the entire width of their video.
The Zoom H6 can be used with an iPhone/iPad, even without AA batteries! When the H6 boots up, select PC (instead of iPAD) as the connected device. Use a powered USB hub to make the connection and to supply power to the H6.
Advanced video processing can be intensive and requires a computer with sufficient capabilities. My Windows 10 PC has a Core i7-7700K at 4.2GHz, 32GB of RAM, a Samsung 850 Pro 512GB SSD, 4 Toshiba 7200RPM 500GB drives in a RAID 5 array, an ASUS Prime Z270-AR motherboard, an Anker 10-port powered USB3.0 hub, a 1000 watt Corsair power supply feeding dual video cards, and a Corsair Hydro H100i liquid cooling system to supplement two case fans to keep the whole thing from burning itself out. This computer sits next to a window air conditioner that counters all the heat coming out of this PC, the monitors, and accessories. Before the AC was installed the office could reach 80°F even in the dead of winter with all the heating vents closed.