Version 1.5, January 2017
Table of Contents
Introduction ............................................................................................................. 6
Virtual Reality Basics .............................................................................................. 7
Types of Virtual Reality ........................................................................................... 7
What is Game Based Virtual Reality?
Mirror Rigs ............................................................................................................ 14
Fisheye ................................................................................................................. 15
Light-field .............................................................................................................. 16
Photogrammetry ................................................................................................... 18
Stitching Approaches ............................................................................................ 18
Geometric ............................................................................................................. 19
Optical Flow .......................................................................................................... 20
Live Stitching ........................................................................................................ 21
Avoiding Artifacts .................................................................................................. 22
Resolution ............................................................................................................. 23
Drones .................................................................................................................. 31
Clean Plates ......................................................................................................... 32
Underwater ........................................................................................................... 33
HRTFs .................................................................................................................. 52
Caveat: personalization ........................................................................................ 52
B-format explained ............................................................................................... 54
B-format representations ...................................................................................... 54
B-format playback ................................................................................................. 55
Recording B-format ............................................................................................... 55
Mixing B-format .................................................................................................... 56
Available Tools ..................................................................................................... 64
Final Conform ....................................................................................................... 64
Rendering in 360 .................................................................................................. 64
Interactivity ........................................................................................................... 65
Appendix ............................................................................................................... 66
Guidelines for Avoiding Artifacts using the Jaunt ONE ........................................ 66
Distance from the camera rig ............................................................................... 66
Legal ..................................................................................................................... 68
Introduction
Virtual reality (VR) is truly a new medium. Along with the excitement at the creative possibilities,
there is also much confusion within the film industry on how best to shoot a compelling piece of
VR content. Questions regarding camera movement, blocking, lighting, stereoscopic 3D versus
mono, spatial sound capture, and interactivity all get asked repeatedly.
As Jaunt is at the forefront of cinematic virtual reality production, the purpose of this guide is to
share our experiences shooting a vast array of VR content with the wider community: what
works and what doesn't. We are not, however, trying to produce an exhaustive text on the
entirety of filmmaking, but rather to cover the additional complexities and challenges that
come with shooting in VR.
Much of what will be discussed is framed through the lens (so to speak) of the Jaunt ONE
camera system, as that is the rig with which we are most familiar, and we provide specific details
on it wherever applicable. The vast majority of this paper covers general VR shooting
techniques, however, and we attempt to keep the material as agnostic as possible.
Virtual reality technology, as well as the language of cinematic VR, is changing at a breakneck
pace, so we will endeavor to update this guide from time to time as new techniques present
themselves, new technology develops, and we receive feedback from the community.
We're interested to hear your feedback and what is working (or not) for your production teams.
To send feedback, please shoot an email to fieldguide@jauntvr.com
We hope you enjoy this guide.
Wikipedia https://en.wikipedia.org/wiki/Virtual_reality
This type of VR also lends itself to highly interactive content, with the Rift and Vive also offering
tracked hand controllers allowing you to pick up and move objects, wield a sword, shoot a gun,
and generally interact with the entire environment. It's very much like being dropped into a video
game.
Just because the first round of HMDs has been heavily targeted at gamers does not mean that
this type of technology is only for gaming. Game engines are just as capable of making an
interactive film or music video as they are a game, and excel at creating worlds you can visit that
are completely unlike real life.
Here you have the advantage of scenes looking completely real and not computer generated as
with game engines. Scenes are also usually captured with spatial sound microphones making
them sound just as real. If you hear a dog to your right and turn your head youll see the dog in
the spot the sound came from. Its as if you were dropped into a movie.
Unlike in game engines, however, you cannot move around the scene freely; you move only if
the camera was moved during filming. As new camera systems and acquisition technologies are
developed, eventually you will be able to move around filmed scenes as well. See below under
Types of VR Cameras for more on this.
Though not as interactive as full game-based environments, you can still add interactivity to
cinematic VR. Branching "Choose Your Own Adventure" stories, gaze detection, interactive
overlays and interfaces, audio or clip triggers, gestures, and even full CG integration are all
possible.
All of this leads to a completely new form of media: a blank canvas on which we've only just
begun to realize what's possible. The killer app in VR will be some combination of cinema,
gaming, and interactive theatre. Right now we're only in the dress rehearsal and anything is
possible. Even just five years from now, VR content will look nothing like it does today.
bigger. This is similar to the difference between a closeup in a 2D film versus something
actually getting closer and coming out at you in a 3D film.
With 360 stereoscopic 3D, on the other hand, you have full 3D in every direction and objects can
actually get closer to the viewer. This leads to a much more naturalistic and immersive feeling,
as this is how we actually experience things in real life. Imagine filming a shark underwater in
VR. For maximum impact you'd want the viewer to feel the shark actually getting up close and
personal. With stereoscopic 3D you can achieve that, while in mono, although still menacing,
the shark won't ever actually get any closer and you lose that sense of presence and
immersion, and the fear factor!
Wherever possible you should always strive to shoot in full 360 3D. Why doesn't everyone just
do that then? As you would expect, the camera rigs are more expensive, the stitching process is
much more complicated, and it's difficult to get good results without a lot of post effort and
dollars. Jaunt's Jaunt ONE camera and the Jaunt Cloud Services (JCS) are meant to ease this
process, greatly automating the entire effort.
360 Video
A note must be made about what we call 360 video. How is this different from VR? In an effort
to get people into VR and leverage the heavily trafficked platforms that exist now, many
companies, Facebook and Google's YouTube in particular, have started promoting 360 video.
This is video you can watch in a browser on the web and pan around the scene with your
mouse, or move and tilt your smartphone around to do the same in mobile apps.
As well, our Jaunt smartphone and web apps have this capability for those times or for those
users that do not yet have a VR viewing device to be able to experience the content in full 3D.
Brands and companies such as Facebook love 360 video as it allows them to leverage their
massive user bases on platforms that everyone is already using.
We must be careful, however, about calling this virtual reality. If too many users believe this is
the real deal, then they may think they have actually experienced virtual reality (with all of its
presence and immersion, in 3D with spatial sound) and come away unimpressed. We need to
make a clear distinction between 360 video and true VR, and use the former to activate viewers
fully into the latter, or risk VR dying an early death similar to what happened with broadcast 3D
TV.
Which leads us to what the requirements to be considered virtual reality actually are.
Minimum VR Requirements
You could talk to 100 people about what is essential to be considered virtual reality and get as
many answers. As we are looking for maximum immersion and presence (the feeling of actually
being there), Jaunt assumes a minimum of four things:
360 Equirectangular Images: a scene in which you can look around, up, and down a full 360
degrees. Some camera rigs have instead opted for 180, particularly cameras that are streaming
live, to reduce bandwidth and stitching complexity. However, as soon as you look behind you,
you're pulled right out of the scene. Oftentimes, to combat just having a black background
behind you, a graphic will be inserted, such as a poster frame from the show, a stat card from a
game, etc.
Stereoscopic 3D: one of the more contentious requirements, as many people are filming in
mono today since it is both cheaper and simpler to capture and stitch, per the reasons given
above. However, to truly get that sense of presence (of being present) that is the hallmark of VR,
you really need to shoot in stereoscopic 3D wherever possible. Stereo 3D vision is how we see
in real life and is equally important in VR.
Spatial 3D Sound: sound is always a hugely important part of any production; in VR it is critical.
Not only does it help with immersion, but it is one of the few cues, along with motion and light, to
get your viewer's attention for an important moment, as they could be looking anywhere.
Capturing spatial audio increases your sense of place.
Viewed in an HMD: finally, none of the above is any good unless you have a method of actually
viewing it. Though 360 video is often created for those without a viewing device and allows you
to pan around the image, it doesn't allow you to see in 3D or provide you with spatial audio
playback. For the full experience you really must use a proper HMD. The good news is you
don't need an expensive Oculus Rift or HTC Vive. There are some very inexpensive or even
free options on the market, with the selection increasing at a dizzying rate.
Types of HMDs
There are many different types of HMDs, or head mounted displays, that vary drastically in price
and capability, ranging from the very simple Google Cardboard to the Samsung Gear VR to the
Oculus Rift and HTC Vive. It was the cell phone and its suite of miniaturized components
(gyroscopes, accelerometers, small high-resolution screens) that led to the resurgence of viable
virtual reality and allowed Palmer Luckey to create the first Oculus headset, and it's the cellphone
that is the basis of all of them, even the high-end Rift and Vive.
The higher-end HMDs provide full body tracking, and some also include hand controllers,
creating a room-scale VR system that allows you to move about and interact with your
environment. But using just your cellphone, with some simple lenses housed in a cardboard or
plastic enclosure, gets you a pretty amazing experience. This will only get better as cell phone
manufacturers integrate better VR subsystems into their handsets.
The list of HMDs is growing at a breakneck pace, but for a good overall list of the current
HMDs on the market or in development, see the VR Times.
Camera
In this section we discuss the various types of VR camera rigs you will encounter, some of the
gotchas to be aware of with VR cinematography and how to avoid them, mounts and rigging
solutions, the importance of clean plates, and underwater and aerial VR shoots.
Types of VR Cameras
There are many types of camera systems for shooting VR, and the space is evolving rapidly.
Each has its own strengths and weaknesses, and we cover the major forms below. There are
many other forms of panoramic cameras, but we won't cover those that don't allow for video
capture, such as slit-scan cameras. Where possible, it's best to research and test each one
based on your own needs.
Panoptic
These camera systems are generally inspired by the visual systems of flying insects and
consist of many discrete camera modules arranged on a sphere, dome, or other shape. The
term comes from Panoptes, the hundred-eyed giant of Greek mythology. Jaunt's camera
systems, including the Jaunt ONE, are of this variety.
This is by far the most popular type of VR camera rig, and many companies have jumped into
the fray by designing lightweight rigs to support a variety of off-the-shelf camera modules,
usually the GoPro. Being small, lightweight, and relatively inexpensive, the GoPro has proved to
be the go-to camera for snapping together a VR camera rig. In fact, Jaunt's first production
camera, the GP16, consisted of sixteen GoPro cameras in a custom 3D-printed enclosure.
However, there are numerous problems with a GoPro-based system, including image quality,
heat dissipation, and most importantly lack of sync. When shooting VR it is crucial that all of
your camera modules are in lockstep so that overlapping images match precisely and can be
easily stitched together in post. Out of the box, GoPros have no built-in syncing capability, and
even when properly synced in post based on audio/visual cues they can drift over time.
JAUNT GP16 CAMERA
This isn't to pan GoPro cameras. The mere fact that they have enabled so many different VR
rigs is a feat unto itself, but they weren't originally conceived for this task and the cracks are
showing.
Jaunt has since moved on to twenty-four custom built camera modules in the Jaunt ONE that
provide four times the sensor size with better low light performance, higher dynamic range with
eleven stops of latitude, better color reproduction, global shutters to prevent tearing of fast
moving objects, and most importantly synced camera modules.
The number of cameras in any given system is a function of overlap. You need enough cameras
to provide sufficient overlap between images, between 15-20%, in order to properly stitch
adjacent frames together, and more if you want to produce a stereo stitch. The more cameras in
a rig, and the more closely they are spaced, the shorter the minimum distance to camera,
allowing subjects to get much closer before stitching falls apart. See Stitching Approaches and
Distance to Subject below for more information.
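As a rough illustration of the overlap tradeoff, the minimum camera count around the horizon for a given mutual overlap can be sketched as follows. The 120-degree field of view and 20% overlap are hypothetical values for illustration, not the specs of any particular rig:

```python
import math

def cameras_needed(fov_deg, overlap_frac):
    """Minimum cameras around the horizon so that adjacent views
    overlap by at least `overlap_frac` of each camera's field of view."""
    unique_coverage = fov_deg * (1 - overlap_frac)  # degrees of new scene per camera
    return math.ceil(360 / unique_coverage)

print(cameras_needed(120, 0.20))  # hypothetical 120-degree lenses, 20% overlap -> 4
```

Note this is only a mono lower bound; a stereo stitch needs far more shared coverage between neighbors, which is one reason real stereo rigs carry many more cameras than this simple count suggests.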
Mirror Rigs
Another common type of panoramic 360 camera is the mirror rig. This typically has a number of
cameras in a circular configuration shooting up into a collection of mirrors that face out into the
scene at an angle. A good example of this kind of rig is the Fraunhofer OmniCam.
These rigs can be either mono or stereo and are generally bigger and heavier than other types
of panorama rigs due to the mirrors. A big benefit of these rigs, however, is that the mirrors allow
the cameras to shoot into a virtual nodal point within the mirrors, providing minimal or no
parallax in the scene and making stitching very easy and relatively artifact-free.
Because of that, many of these rigs allow for realtime stitching and transmission of live 360
imagery. By having two cameras shoot into each mirror you can create a seamless stereo
stitch. The main drawbacks again are the size and weight of these rigs and the relatively
powerful computer they must be attached to for live stitching.
Fisheye
Many consumer panoramic cameras are of this variety because they are relatively cheap, small,
lightweight, and easily stitched, usually in-camera. Some use one lens, like the Kodak 360
Action Cam, and capture 180 degrees, while a two-lens system, like the Ricoh Theta, captures a
full 360 degrees by stitching the two halves together.
Though they are convenient and easily stitched, the quality of this type of camera is relatively
low. Many can stream to an iPhone or Android device, making them a good remote viewing
solution if your VR camera doesn't provide one. See below under Live Preview for more
information.
Prosumer versions of these types of cameras also exist with much larger lenses and sensors.
Unfortunately, all cameras of this type produce only monoscopic images and not stereoscopic
3D images, lessening the immersion for VR purposes.
Light-field
Light-field cameras are the latest technology to hit the panoramic market. They represent the
future of virtual reality filmmaking, though their practical use is still a ways off. Instead of focusing
light through a lens and onto a sensor, there are hundreds of tiny micro-lenses that capture light
rays from every conceivable direction.
Unlike a light-field still camera with its micro-array of lenses, most video-based light-field
cameras use numerous camera modules arranged in a grid or sphere configuration. With some
fancy processing these multiple video streams can be packed into a compressed light-field
format that enables you to move around the scene as the video plays, albeit within limits.
Your movement is limited roughly to the diameter of the sphere or the width of the grid from
which the scene was captured. You won't be fully walking around the room, but you will be able
to move your head and see shifting parallax, which can help minimize motion sickness in VR.
Unfortunately, the practical uses of these cameras in production are currently limited, as they
require a large array of computers attached to the camera for data capture and processing. In
addition, the light-field movie stream, even though it is compressed, is enormous, making
working with, downloading, or streaming it incredibly difficult at today's bandwidth limits.
Though light-field imagery is currently difficult to film, it is quite possible to render it out in CG
using technology developed by OTOY. For a good video describing holographic or light-field
video rendering, see OTOY's website.
Photogrammetry
To fully realize scene capture for VR you need to change your thinking entirely and move from
the current inside-out methodology to an outside-in perspective. That is, instead of filming with
an array of cameras that are facing out into the scene, surround the scene with an array of
cameras that are looking in.
Microsoft has created a video-based photogrammetry technology, called Free Viewpoint Video,
used to create holographic videos for its HoloLens augmented reality headset. An array of
cameras placed around a green screen stage captures video from many different angles, which
is then processed using advanced photogrammetry techniques to create a full 3D mesh with
projection-mapped textures of whatever is in the scene. Their technology uses advanced mesh
tessellation, smoothed mesh reduction, and compression to create scenes that you can actually
walk around in VR or AR.
For more information on the process see this Microsoft video on YouTube.
Another company working in this space, 8i, uses a similar array of cameras to capture what they
call volumetric video, stored in a proprietary compressed light-field format. This technology does
not create a full CG mesh (though that is an option) yet still allows you to walk about the
scene and observe it from any angle. For more info visit 8i.
Whatever the technology or approach, advanced realtime photogrammetry techniques will be an
important capture technology in the not-too-distant future, allowing you to fully immerse yourself
in any scene. As the technology improves and comes down in cost it will also allow consumers
to truly connect like never before through holographic video feeds and social environments.
For a list of the camera technologies mentioned in this section and additional information,
please visit The Full Dome Blog's Collection of 360 Video Rigs.
Stitching Approaches
Once you have shot the scene with your 360 camera, you'll need to stitch all the individual
cameras together to create a single, seamless 360 spherical image. Creating an image without
visible seams or artifacts is one of the more difficult and time-consuming issues in VR
filmmaking, particularly when creating a 3D image.
Jaunt's Cloud Services (JCS) has made this process nearly automatic and currently supports
the Jaunt ONE and Nokia Ozo cameras. There are a variety of approaches to stitching, outlined
below. Jaunt has experimented with several techniques but has currently settled on optical flow
as the technology that provides the best 3D with no seams and a minimum of artifacts.
Geometric
Geometric stitching is the approach used by most off-the-shelf software like Kolor Autopano and
was first used at Jaunt with our GP16 camera. In this approach, barrel distortion due to lensing
is corrected in each image, the images are aligned based on like points between them, and then
smoothly blended together. This creates a full 360x180 equirectangular unwrapped spherical
image.
GEOMETRIC STITCHING
Stitching in stereo 3D is more difficult and requires creating a virtual stereo camera pair within
the sphere, using slices of each image for the left- and right-eye virtual cameras to be stitched.
As mentioned, this is not always perfect and can lead to visible seams and also 3D artifacts
where portions of the scene are not at the correct depth. For that reason, Jaunt has moved on
to the following optical flow technique.
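Once the images are aligned and blended, every pixel in the output corresponds to a direction on the sphere. A minimal sketch of that direction-to-pixel mapping follows; the axis convention is an assumption, and a real stitcher bakes lens and rig calibration into this step:

```python
import math

def dir_to_equirect(x, y, z, width, height):
    """Map a 3D view direction to (u, v) pixel coordinates in a 360x180
    equirectangular image. Convention (an assumption): +z forward,
    +x right, +y up; u wraps around the horizon, v runs top to bottom."""
    n = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)      # -pi..pi, angle around the horizon
    lat = math.asin(y / n)      # -pi/2..pi/2, elevation
    u = (lon / (2 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v

print(dir_to_equirect(0, 0, 1, 3840, 1920))  # straight ahead lands at image center
```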
Optical Flow
Jaunt is currently using an optimized optical flow algorithm as the basis of its Jaunt Cloud
Services (JCS) online stitching platform, both for the Jaunt ONE and Nokia Ozo cameras.
Optical flow is a technique that has been used in computer graphics for some time. It has many
applications, including motion estimation, tracking and camera creation, matte creation, and
stereo disparity generation. At their core, optical flow algorithms calculate the movement of
every pixel in a scene, usually in a time-based analysis from frame to frame.
In the case of stitching a stereo panorama, the flow field is used to track distinct pixels
representing like visual elements in the scene between adjacent cameras. Using this in
conjunction with known physical camera geometry and lens characteristics, it is possible to
determine the distance of each pixel between the cameras and therefore a disparity (depth)
map of the entire scene.
2 Better Exploiting Motion for Better Action Recognition (PDF Download Available). Available from: https://www.researchgate.net/
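The disparity-to-depth relation behind this can be sketched with the classic pinhole-stereo formula; the disparity, baseline, and focal length values below are hypothetical, and a real rig also corrects for lens distortion and the cameras' angular offsets on the ring:

```python
def disparity_to_depth(disparity_px, baseline_m, focal_px):
    """Classic pinhole-stereo relation: depth = focal * baseline / disparity.
    A larger disparity (more pixel shift between cameras) means a closer object."""
    return focal_px * baseline_m / disparity_px

# Hypothetical numbers: a 4 px disparity seen by cameras 3 cm apart,
# with a 1000 px focal length, puts the point about 7.5 m away.
print(disparity_to_depth(4, 0.03, 1000))
```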
Though difficult to see in the picture below since it is not moving, you can see the warping and
distortion around the gentleman's nose. In the video, this area pops and wiggles back and forth
around his outline. Generally, the closer objects are to camera, the harder it is for adjacent
cameras to see similar points, which causes more estimation and larger halos. For more
information see Avoiding Artifacts and Distance to Subject below.
Unfortunately, it is nearly impossible for current algorithms to fully eliminate all errors, and some
post processing will be required to remove them. See the Post-Production section below for
more information on techniques.
Live Stitching
Live stitching uses methods similar to those above but obviously must perform them in realtime
so that a live 360 image can be transmitted. There are very few products that currently do live
stitching, or do it well. Most of them either operate in less than 360 (typically opting for a 180
approach) or in monoscopic instead of stereoscopic 3D.
One off-the-shelf solution is VideoStitch, which enables you to do live monoscopic stitching via
their Vahana VR product, with live stereo 3D on the horizon. It is a software solution that works
with a variety of VR camera rigs. See VideoStitch for more information.
Many live stitching solutions are of the aforementioned mirror rig variety as their configuration
allows for easier, quicker stitching. The Fraunhofer OmniCam is one such solution and has two
models, one for mono and one capable of live transmission of 3D 360 streams. See Mirror Rigs
above for more information.
Avoiding Artifacts
Each of the cameras and algorithms above has its own idiosyncrasies in terms of how well
it will stitch without introducing artifacts. Nearly all algorithms introduce some form of
undesirable artifact. These can be lessened if the scene is shot correctly, so it is worth your
while to investigate and test the capabilities of your particular camera/algorithm combination.
In the case of the Jaunt ONE camera and its optical-flow-based stitching algorithm, for example,
we sometimes see chattering halos around moving objects or objects too close to camera.
These often occur because the flow algorithm has a hard time finding similar pixels between
adjacent cameras in order to do the stitch. If a person is standing in front of a bright, blown-out
window, it's difficult for the algorithm to tell which pixel is which, as they are all of similar value
around the subject and there is no detail in that area.
Likewise, if there are too many points that look exactly the same, you can run into a similar
issue. If a person is standing in front of wallpaper with fine vertical stripes, the flow algorithm can
have a tough time figuring out which point to match among many points that look the same.
The solution then is to place your subject over a different portion of the background that does
not have similar repeating detail, in the case of the wallpaper, or to expose the camera down so
that there is more information within the window for the algorithm to discern.
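A toy one-dimensional version of the wallpaper problem shows why repeating patterns defeat matching: a patch taken from a periodic signal matches perfectly at many offsets, so the correspondence is ambiguous. This is only a sketch of the matching ambiguity, not of any particular stitcher's algorithm:

```python
import numpy as np

# A "striped wallpaper" background: the same 4-pixel pattern repeated.
signal = np.tile([0.0, 0.0, 1.0, 1.0], 10)
patch = signal[8:12]  # a patch we try to locate again in the signal

# Sum of squared differences at every candidate offset.
ssd = np.array([np.sum((signal[i:i + 4] - patch) ** 2)
                for i in range(len(signal) - 4)])

best = np.flatnonzero(ssd == ssd.min())
print(len(best))  # many equally perfect matches -> the match is ambiguous
```

A subject in front of a non-repeating background produces a single clear minimum instead, which is exactly why moving the subject or adding detail helps the stitch.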
You can also run into this problem with objects that are too close to camera. If an object is too
close, then one camera may see detail in the scene that is blocked by that close object in the
adjacent camera. In this case, there is no way for the algorithm to see around the object and it
must make pixels up. These estimated regions around such objects are the halos, and since
they are evaluated from frame to frame they may vary over time, resulting in their chattery
nature.
The solution to this problem is simple: move your subject back. For the Jaunt ONE camera, the
safe distance to the closest subject is 4-6 feet.
Algorithms are becoming smarter all the time, and artifacts are expected to be reduced if not
eliminated in the not-too-distant future. In the meantime, though you may never eliminate all
artifacts, you can drastically reduce them by taking some time to think about how your algorithm
will stitch the shot and compose your scene accordingly.
Resolution
A note about resolution: most individual camera modules in a VR rig, be they GoPros, machine
vision cameras, or custom modules as in the Jaunt ONE camera, are usually HD (1920 x 1080)
or 2k (2048 x 1556) resolution. When these images get stitched together they obviously create a
much larger panorama.
The Jaunt ONE camera, for instance, is capable of creating up to a 12k resolution image.
Working with an image of this size with today's technology, however, is completely impractical.
Just a couple of years ago the industry was working with HD TV or 2k film images. Then, in
rapid succession, the industry introduced stereo 3D, ultra-high definition (UHD) and 4k, high
dynamic range (HDR), wide color gamut, and high frame rate (HFR) imagery, each with the
potential to at least double the amount of data, if not worse.
Where each of the above was a concern on its own before, now we come to VR, which calls for
all of the above combined. And we need it all right now. But it is not possible to work with such
large files in post production with current technology; the CPUs and GPUs even on the fastest
machines can't keep up. Given this, Jaunt has currently limited the resolution we output to 3840
x 3840 stacked left/right-eye equirectangular stereo 3D at 60fps.
Even this size of file can be difficult to work with in post, and beefy machines are needed for
compositing, editing, color correction, and CGI at that resolution and frame rate. And although
we compress the final file delivered to the consumer, bandwidth around the world remains highly
variable and is also an issue.
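To see why even this reduced output format strains current pipelines, consider the uncompressed data rate of a 3840 x 3840, 60fps stream. The 8-bit RGB assumption is ours, chosen for simplicity; production pipelines often use higher bit depths, which only makes the numbers worse:

```python
width, height, fps = 3840, 3840, 60   # stacked L/R equirect output, per the text
bytes_per_pixel = 3                   # assuming uncompressed 8-bit RGB

frame_bytes = width * height * bytes_per_pixel
rate_gbit_s = frame_bytes * fps * 8 / 1e9
print(round(rate_gbit_s, 1))  # roughly 21.2 Gbit/s before any compression
```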
Ultimately, the biggest bottleneck is the resolution of the final display device. While there is
always a desire to future-proof your media, right now it is important to keep in mind that all of the
major HMDs, including the Rift, Vive, and PlayStation VR, currently only support around 1k per
eye. This will of course improve over time, but you want to balance the desire for massive
resolution with what is practical today. As technology improves we will be able to increase the
resolution and fidelity of our images.
Distance to Subject
With any type of rig that uses an array or ball of outward-facing cameras, as the Jaunt ONE
camera does, one of the chief constraints is the distance of the subject to camera. Get too close
and the stitch will fall apart and may be unusable. The closest distance you can achieve is a
factor of overlap: how many cameras you have and how closely they are spaced.
The issue is, the closer the subject comes to camera, the closer the cameras must be to each
other. Otherwise, one camera may see an object where its neighboring camera does not, like
part of a face in a closeup, for example.
Of course, the closer the cameras are to each other, the more cameras you need to cover the
full 360, and the smaller they physically must become. The smaller the cameras, the smaller
the sensors, and the worse the image quality and low-light sensitivity. Designing a proper VR
camera then becomes a game of optimization and tradeoffs.
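The spacing side of this tradeoff is easy to quantify: on a circular rig, the baseline between neighboring lens centers is the chord 2r sin(pi/n), so adding cameras to the same body directly shrinks the gaps that close subjects can hide in. The 10 cm ring radius below is an assumption for illustration, not a spec of any particular camera:

```python
import math

def adjacent_baseline(n_cameras, ring_radius_m):
    """Chord distance between neighboring lens centers on a circular rig."""
    return 2 * ring_radius_m * math.sin(math.pi / n_cameras)

# 16 cameras around the horizon on an assumed 10 cm radius body:
print(round(adjacent_baseline(16, 0.10) * 100, 1))  # ~3.9 cm between neighbors
```

Doubling the camera count on the same ring roughly halves this baseline, which is why denser rigs tolerate closer subjects, at the cost of smaller cameras and sensors as described above.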
The Jaunt ONE camera has a total of 24 lenses: 16 around the horizon, four facing up, and four
facing down. In this configuration, and with our optics, we can achieve a minimum distance to
subject of about 4 feet.
Ignore these distances at your peril, as it can cost you thousands of dollars in post trying to fix
such shots: you are literally trying to recreate missing information seen in one camera but
not in its neighbor. Many shots are simply not repairable.
Many scenes can benefit from getting much closer to camera than the minimum distance may
allow. One of the hallmarks of VR is being able to elicit visceral emotions from viewers by
bringing actors or objects right up to them. Unlike in 2D, where a close-up is really just a
"big-up" and not actually any closer at all, you can create real intimacy or anxiety in your viewers
by having someone step up to the camera, actually getting closer to their POV.
A way around these minimum distance limits is to shoot the main environment in 360 3D with
the VR camera, then use a traditional stereo camera rig to shoot the actor against green screen
in the same environment and lighting, and composite them into the 360 background in post.
This obviously takes additional equipment and more time, but can be worth the payoff when you
really need an extreme closeup.
Camera Motion
Of all the issues related to cinematic VR, none is more important than moving the camera, as it
has the potential of literally making your audience sick. Directors and filmmakers are used to
moving the camera to achieve dynamism within shots and as a tool to add extra emotion to
scenes. In 2D this rarely presents a problem, but in VR it can cause dizziness, disorientation,
nausea, or even vomiting. Not something you want to do to your audience! Special care must be
taken to ensure that doesn't happen while still enabling interesting camera moves.
Sickness in VR
Motion sickness in VR is thought to occur due to a disconnect between your visual system and
your vestibular system. Normally, when you move there is a corresponding action between what
you see and the fluid within your inner ear. In VR, unless you are actually moving in tandem
with what you are experiencing within the headset, there is a disconnect. This disconnect
produces a physiological effect similar to being poisoned, and your body reacts the same way.
For more on this, see the article on virtual reality sickness on Wikipedia.
This can happen in gaming or cinematic VR, but it is currently more prevalent in the cinematic
case due to technological limitations: you only experience movement if the camera was moved
by the filmmaker, yet you yourself are not physically moving. In room-scale gaming VR you are
very often self-directing your own movement in the virtual world through your actual physical
actions; since realtime gaming engines can generate everything on the fly, there is no
disconnect between the physical and the virtual.
Put yourself in a spaceship within a gaming engine, however, and the possibility of sickness
once again becomes very real, as you are now zipping through space with no corresponding
physical forces on you. To counter this, some of the emerging location-based VR experiences
are actually built around motion-based rides which mimic the moves within the virtual space.
In addition to providing a more thrilling experience, the possibility of feeling sick is diminished.
Minimal Bumps
You also want to make sure that there are minimal bumps or jostling of the camera during your moves; otherwise it will feel as though you are on a mountain bike going down a bumpy hill. You'll want a stabilization system that can mitigate these bumps during shooting, or post-stabilization software to remove as much as possible after the fact. However, due to the nature of spherical video your options in post are limited: you can't translate in spherical space to offset any bumps, and you'll have to live with any motion blur present in the video.
No Pans
You shouldn't pan (yaw) the camera about its vertical axis. This effectively forces a head turn on the viewer, which is very disconcerting in VR. You should instead allow the viewer to turn their head naturally within the environment and look where they choose.
If you need the viewer to look at a particular spot so as not to miss a crucial piece of action, use lighting, movement, or sound to guide their eye instead. Depending on your playback engine and resources, there is also the possibility of adding some interactivity to the piece such that particular pieces of action are only triggered when the viewer actually looks in that direction.
See Getting the Viewer's Attention under Directing the Action below.
Minimal Acceleration
Finally, you should limit the acceleration present in your camera moves. Fast acceleration and deceleration can definitely cause motion sickness.
Ideally you would have no acceleration at all, instead cutting into the shot with smooth, controlled motion already underway. If this is not possible, then very slow acceleration or deceleration is generally acceptable. Any rigs you use should be precise enough to allow for this without unneeded rubber-banding or sway. See below for more information.
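To make the slow-in, slow-out guidance concrete, here is a minimal sketch of an ease-in/ease-out (smoothstep) motion profile, which keeps velocity at zero at both ends of a move so acceleration ramps up and down gently. The function name and numbers are illustrative, not part of any rig's control software.

```python
# Smoothstep move profile: gentle acceleration at the start and end
# of a camera move. Illustrative only, not a real motion-control API.

def smoothstep_position(t, duration, distance):
    """Position along the track at time t for a move of `distance`
    metres over `duration` seconds, using the smoothstep curve
    3x^2 - 2x^3, so velocity is zero at both endpoints."""
    x = min(max(t / duration, 0.0), 1.0)  # normalised time, clamped to [0, 1]
    return distance * (3 * x**2 - 2 * x**3)

# Sample the move once per frame (24 fps) for a 2 m slide over 8 s.
positions = [smoothstep_position(f / 24, 8.0, 2.0) for f in range(8 * 24 + 1)]
```

A programmable slider or MoCo rig driven by a curve like this avoids the abrupt starts and stops that cheaper rigs produce.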
The Last Word
Of course, rules were made to be broken! All of the above should be heeded in most cases, but there are shots where you may be going for a little bit of motion sickness: being pushed off the side of a building, perhaps, or riding a rollercoaster. Here, the side effects of motion in virtual reality can actually work in your favor. Use sparingly.
Types of Mounts
Over the course of many projects we have experimented with all of the types of rigs below in an
attempt to create smooth, stable shots. There is no one best solution and all require some form
of rig removal in the end.
Tripod
The simplest and most widely used solution in VR: put the camera on sticks and don't move it. Here you have no possibility of motion sickness and a fairly compact form. The tripod will still be seen by the bottom cameras, but pulling in the legs and making the footprint as small as possible will minimize the ground occlusion. Clean plates taken with a DSLR camera can help with the removal. See Clean Plates below.
By attaching a sling to the tripod underneath the camera, it also provides a good place to stash additional equipment like sound gear or computers. We've even had camera operators contorting themselves to hide underneath the tripod in scenes where they would otherwise be seen!
Though this isn't the most dynamic of solutions, it roots the viewer in the scene and lets them fully explore it without distraction and with no chance of motion sickness. The vast majority of your shots will likely be of this form, with moving shots sprinkled in for extra effect, and in VR you really feel them, giving those an added sense of weight.
Dolly
As in 2D, the dolly can create some very smooth, beautiful moves, and it's very easy to mount most any VR camera on one. However, this is also one of the bulkier, most visible rigs available. Due to its size, and the fact that you have at least one operator controlling it, most likely on tracks, there will be a huge swath of the scene that is occluded. It will be virtually impossible to paint this out on any reasonable budget, making the dolly a poor solution for VR.
Remote Controlled Slider
The slider is a motorized length of track, available in various lengths and build qualities, with a mount that moves the camera along the track at various speeds. These can usually be controlled by a computer to create complex, repeatable moves.
Sliders vary drastically in quality, both in terms of track gauge and motor speed. You'll want one sturdy enough to support your camera of choice, with a motor that can drive the camera at a sufficient speed for the move without a lot of error. Cheaper rigs can produce significant recoil when stopping and starting, which is not ideal given the motion issues in VR.
Sliders are much smaller than dollies but generally have a fairly long run of track that will obviously be in the shot. There is no easy way of removing it other than painting it out. See Clean Plates below.
Motion Control Rig
Similar to the slider above, motion control rigs use a set of tracks and a motorized rig to reproduce moves accurately again and again. They are generally bulkier than a slider, being almost a combination of a slider and a dolly, but with many more degrees of freedom. They can take several forms; the first large-scale application was by Lucasfilm for Star Wars, to film the many passes necessary for the final composites in the films.
They can accommodate most any camera but are very bulky, and a big portion of the 360 scene will be blocked, making paint-out costly. However, if you need to composite many different layers of action with a single camera move in a mixed CG/live-action virtual reality scene, shooting those layers against green screen using a MoCo rig could be a good option, as the rig can be easily removed in that case.
Cable Cam
Provided that a cable cam rig is sturdy enough to support your VR camera, it is actually a pretty good solution for 360 shooting. The camera is mounted upside down to a mount on a small sliding cart attached to the cable. The rig is fairly small and, if outdoors, will mostly be framed against sky and therefore easily painted out.
Again, you'll want to make sure that the motor on the cart is precise enough to prevent sway when both starting and stopping; otherwise you'll have the dreaded swaying horizon lines and motion sickness. You can use the camera as counterbalance and also attach gyroscopic stabilizers to combat this, as seen here.
Here the weight of your camera is the major consideration, so make sure your cable can support it, or use smaller, lighter VR camera rigs, often of the GoPro variety.
Steadicam
The Steadicam is the workhorse of stabilized solutions for 2D productions. It is a mechanical solution that isolates the camera from the operator's movement. The operator wears a vest attached to an arm, which connects via a multi-axis, low-friction gimbal to the Steadicam armature, which has the camera on one end and a counterweight on the other.
This setup allows for very fluid movement of the operator without affecting the camera, yet easy camera moves with very little effort when desired. It produces very smooth and pleasing moves in VR. Unfortunately, it also means you have an operator extremely close to the camera in all of your shots. When viewers look behind them, they get an extreme closeup of the operator's face right in theirs. Very disconcerting, to say the least!
It is nearly impossible to remove the operator from the scene in post. We have gone so far as to attach a GoPro camera to the operator's back and attempt to patch that footage back into the scene in compositing. Unfortunately, the framing, lens, and differing perspective never allow a decent fit, and you're left blurring it into place. We have also tried blurring out the operator or replacing him with a black vignette. All far from ideal, and they remove the viewer from the experience.
If you are going to use a Steadicam, the best option is to place a graphic over the operator. This could be a static logo or an animated stat card for a player at an NFL game, for example. The viewer turns around, sees the graphic, and can then decide whether or not to turn back around later, yet they still remain grounded in the experience.
Gyroscopic Stabilizer
With virtual reality skyrocketing, and given the need for stabilized movement that does not occlude the scene, many other options are coming to market. The Mantis 360 from Motion Impossible is one of the more interesting solutions, as it combines a wheel-dampened remote-controlled dolly (buggy) with a gimbal-stabilized monopod for the VR camera.
This allows you to remotely move your camera with smooth, stabilized motion, no operator in view, and a very small footprint, smaller than even a tripod, allowing for easier ground replacement or smaller cover-up logos. Eventually you will be able to plot courses within a set or room and repeat them as though it were a MoCo rig.
For more information visit Motion Impossible.
Drones
Drones are a special category of camera rig. Originally it was thought to be a bad idea to mount a VR camera on a drone due to the potential for very bumpy, unstable movement. But it turns out that, if properly stabilized and flown according to the movement guidelines above, drones can make for some amazing aerial shots, making it seem as though you are flying.
Typically the camera is mounted inverted to the
drone with some kind of stabilization system as
seen in the picture to the side. You want the
stabilization to occlude as little of the scene as
possible. Obviously the top of the scene (bottom
of the camera if mounted upside down) will see
the drone but this is relatively easily painted out in
post as it is usually just sky.
As with the cable cam, weight is of primary importance here. The heavier your camera, the bigger the drone required. Bigger drones require a special pilot's license and carry additional FAA restrictions, which limit where and how you can fly them and increase the cost of operating them. For these shots you may need to use a smaller, lighter GoPro-based rig, as seen in the picture above with Jaunt's older GP16 rig.
Finally, most drones, particularly the bigger ones, drown out any audio. You will likely need to
replace the audio with a separate voice over or music track.
Clean Plates
All of the rigs above occlude the view of the scene from the VR camera in some capacity.
Usually the ground will be occluded from a tripod or slider or remote dolly and it will be
necessary to paint it back in or cover it up with a logo.
Logos are fine for quick turnarounds, travel, or news pieces, for example, but not ideal for more cinematic or dramatic content, where you'll want to paint out any rig in the scene. Depending on the rig, this could be relatively easy or very time-consuming and expensive.
To aid in any rig removal it is highly recommended that you shoot clean plates of the ground that
your rig is covering so that you can use them to paint it back in. The most common tool for this
is a DSLR still camera with a wide enough lens to cover the area of ground occluded by your rig
from the distance of the bottom of the VR camera rig.
In a pinch, even an iPhone can be used. Depending on the area covered by the rig, you may need to take multiple overlapping shots with your still camera. You'll also need to ensure that your feet or shadows do not end up in the shots. A simple still-camera rig for clean plates, consisting of sandbagged C-stands, can help get you out of your own way.
Obviously, if you have a moving shot, creating clean plates becomes more complicated.
Depending on the distance covered you may need to shoot many overlapping plates. It is
recommended that you overlap your plates by 15-20% to enable aligning and stitching these
plates together in post to create a ground patch that can be used to fully paint out the rig over
the full length of travel.
If the distance is long, as when remotely moving the Mantis rig above, it may be better to attach a GoPro or other small video camera to the back of the rig and either manually or procedurally pull still frames from it for ground-plane reconstruction in post.
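As a rough planning aid, the recommended 15-20% overlap translates into a simple plate count. The sketch below estimates how many stills cover a given length of travel; the function name and example numbers are illustrative assumptions, not production figures.

```python
# Rough estimate of how many overlapping clean plates cover a camera
# move, assuming each plate covers `plate_width_m` metres of ground
# and consecutive plates overlap by the given fraction.
import math

def plates_needed(travel_m, plate_width_m, overlap=0.2):
    """Number of stills needed to cover `travel_m` metres of travel."""
    step = plate_width_m * (1.0 - overlap)  # new ground gained per plate
    if travel_m <= plate_width_m:
        return 1
    return 1 + math.ceil((travel_m - plate_width_m) / step)

# e.g. a 10 m remote-dolly move with plates covering 1.5 m each
n = plates_needed(10.0, 1.5, overlap=0.2)  # 9 plates
```

Err on the side of more overlap: extra frames cost nothing on set but a gap in coverage ruins the patch.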
Water Issues
Working in and around water comes up constantly. Some of the most exciting shots in VR can actually be underwater: picture coming nose to nose with a great white shark or exploring the Great Barrier Reef. Unfortunately, the complexity of many VR camera rigs makes shooting in these environments cumbersome or impossible.
Rain
When shooting in 2D with a framed shot, rain is rarely an issue, as the camera can be placed under a waterproof tarp with a hood over the lens or some other such mechanism. Not so when shooting in 360: the tarp or hood would be visible in the shot and block much of the scene.
This makes shooting in rain, even a light rain, very difficult. Not only are many rigs not built to withstand water exposure, but even a few drops landing on one or more lenses would make that portion of the scene blurry and likely unstitchable.
Many a shoot has been called off due to inclement weather or a chance of rain where it's just not worth the potential damage to the camera. Likewise, depending on your camera rig, even snow might be too risky. In the Jaunt ONE's case, a light snow might be doable, but anything more and you risk water exposure and damage. In any event, within a matter of seconds snow would hit one of the lenses and the shot would be compromised.
Underwater
Obviously, shooting underwater is the ultimate test for any camera rig. For standard 2D video cameras (or even 3D cameras) there are many underwater housing options available that allow for full submersion. It's much more complicated to devise such an enclosure for VR rigs without it obscuring the scene or compromising the lenses. Typically these exist only for the smaller GoPro-type rigs; one such rig is the Kolor Abyss.
Unfortunately, most underwater VR rigs are currently mono only and don't support the somewhat larger 360 stereo 3D rigs, which can limit their impact. Nose to nose with a shark isn't as great if it never really seems to get any closer to you. More options need to become available in this space, and for a wider variety of rigs.
Even if you do have such a rig available, shooting underwater presents its own challenges. The sea is unpredictable and uncontrollable, and sea life likely won't respect whatever distance from camera your rig requires, likely making some shots unstitchable. It's also usually difficult to operate all the camera modules within the rig, and you must pay special attention to battery life and memory card space, as gaining access to these enclosures typically is not easy and takes a fair amount of time away from shooting.
Jaunt is currently investigating options for a waterproof enclosure for the Jaunt ONE camera
system as underwater shooting in full 360 3D VR is too ripe with possibilities to miss. Many
companies in the underwater camera housing space are also looking to get into VR and
devising solutions for a variety of rigs.
There is another option in certain circumstances. If you have control of the scene and there aren't too many objects moving in and out, sunlight and shadows changing, trees blowing, etc., then you might be able to shoot the scene in two halves: shooting the main action in 180 with crew and lights on the opposite side, then switching everything over and filming the second 180. These two sections are then comped together in post. You would actually film and stitch both shots fully in 360 but only use each half as needed in the comp to enable a proper blend.
Again, this only works if you have command of the scene. If lighting and shadows change or something moves, the two halves may not comp together properly, and the shot will be ruined unless you spend more time in post painting things out. This also won't work with a moving camera unless you are using a MoCo rig and the movement for the two halves is nearly exact. If it isn't, it will be impossible to blend the two due to parallax differences.
More generally you can use this technique to create clean plates and remove unwanted objects
by filming the scene with and without the objects present in their particular locations. By
continuing to roll until an unwanted object is out of the scene you can comp the clean portion
into the main action thereby removing that object. This is often used for removing crew and
other production gear when piggybacking onto an existing 2D production where control is
limited. Again this only works if you have a static camera.
Live Preview
So if you can't be near the camera while you're filming, how do you see what you're doing? The simple answer is that you need a live preview out of your camera. It seems like a simple feature, one 2D cameras have had for years in video village. Of course, being VR, things aren't that simple.
The first problem is that there are many camera modules in any given VR rig. In the Jaunt ONE there are 24 cameras, all shooting HD footage. This is a huge amount of data to manage even if cabled. And remember, having no "behind the camera" means you would see those cables traveling to video village. Any solution then must be wireless, which is currently impossible with that many cameras given current bandwidth limitations. As anyone who's worked with WiFi broadcast from a GoPro knows, there are many inherent limitations with wireless signals, not the least of which are distance and obstructions. If you're hiding behind a steel wall, say goodbye to WiFi.
Even if you were able to wirelessly receive all camera feeds, live stitching algorithms still aren't great and require extra processing horsepower on set. If you can't view a fully stitched 360 preview out of your camera, then which camera do you choose to monitor? Your rig must provide the ability to choose which camera or cameras to look at on the fly, or you must use another solution.
Luckily, there are several inexpensive solutions available in a pinch if your rig doesn't support some form of live preview. Inexpensive consumer cameras such as the aforementioned Ricoh Theta or Kodak 360 Action Cam can be placed on or near your camera; they stream a live-stitched 360 mono image to your iPhone or iPad over WiFi, enabling you to move around the scene and see what is happening. Coupled with standard wireless audio feeds, you can safely direct your actors. Again, these are consumer cameras, and WiFi and image quality are not the best, so be aware of the limitations.
Another, higher-quality solution is the Teradek Sphere. This allows you to connect up to 4 HDMI cameras (such as GoPros) to its hub and wirelessly transmit and stitch the 360 image directly on your iPhone or iPad. This solution is small enough to mount underneath your VR rig and provides a high-quality mono stitched image you can move around in.
The Jaunt ONE camera was designed to record directly onto SD
cards with no cables necessary making it largely autonomous. Simply
press the button to record and walk away. However, for the reasons
above, a wireless live preview capability will be added to the camera in
a future firmware update.
TERADEK SPHERE
have the desired effect. They might be looking off to the side, so when you cut they're looking at a chair and not your actor. See below for ways of Getting the Viewer's Attention.
Most important is the pace of the cuts. Every time you cut, it's as though you're teleporting the viewer to a different location, and that can be very jarring, especially if the pace is too quick, so you'll want to slow it way down. The viewer needs a good amount of time in each new position to fully immerse themselves, look around, and get their bearings. Cut too quickly and your viewer will be frantically looking about, trying to figure out what is going on and what to look at, all while you are tiring them out.
Though hard cuts can work and are effective for abrupt changes, it is generally much gentler and more effective to "blink": the scene dims to black and then comes back up on the new scene over the course of about a second or more. It's very much like blinking in real life and opening your eyes in a new location. Surprisingly effective. Spherical reveals or wipes in 360 are also a great way to gently unroll the next scene.
New methods of coverage need to be developed for VR that get around some of these limitations and are better suited to the medium. For instance, instead of cutting to a close-up from a more distant wide shot, stay in the wide shot and overlay a 2D or 3D inset close-up of the main subject. This not only keeps the viewer from feeling teleported but also provides visual interest by introducing other elements overlaid at different depths. You can even project these in post onto different objects in the scene, to make a video wall for example, as we did with a shot including Ryan Seacrest for Tastemade's A Perfect Day in LA.
Ultimately, many more innovations of this kind will be needed to evolve our language of storytelling in VR.
Not so in VR. The viewer is free to look in any direction they like, and there is no frame, which means it's entirely possible they may miss an important story point.
So how do you get the viewer's attention and keep it? Luckily, many of the same tricks from film still apply: motion, light, and sound.
Motion
As mentioned above, humans are finely attuned to motion and will generally gravitate to anything moving in the scene. Have a butterfly flit around the scene and most viewers will follow it. Set this up with enough preroll for them to see it and you can guide them precisely to what you want them to see. This is doubly true if you couple movement with the stereoscopic depth at your disposal: have the butterfly also fly towards the viewer and you are guaranteed to grab their interest.
Light
Light is also a motivating factor. Just as in a 2D frame, light can draw attention to objects or subjects. As viewers look about the scene, a ray of light highlighting something is a subjective clue that they should pay attention to that object. Similarly, dappled light can highlight and heighten different depth cues along with actual stereoscopic depth.
Sound
Sound is an incredibly important component of any piece of content, but exponentially so in VR because of its power to capture the viewer's attention. Many VR platforms, including the
Version 1.5, January 2017
Jaunt app, are capable of playing back spatial 3D audio in the ambisonic or even Dolby Atmos formats. These formats record and play back sound emanating from where it actually occurred in the scene. This gives you an extraordinary opportunity to use sound to direct the viewer's interest. Place a car crash behind them, with corresponding sound, and they are guaranteed to look. See the Sound section below for more information.
Interactive
Depending on how your content is being distributed, you may have some interactive capabilities at your disposal. If so, this is another great way, perhaps the best way, to make sure your viewers are looking where you want.
Most platforms or development environments use gaze detection to know exactly what portion of the 360 scene the viewer is looking at during any given moment. If you can harness that information interactively, you can do some very cool things. For one, if someone isn't looking at what they need to see for an important story point, you can pause or loop the scene until they do, and then trigger the scene to continue.
Also, as noted above, viewers may be looking somewhere else in the scene, so that when you cut to a new shot they aren't at all focused on what you need them to be. Using gaze detection, you can cut to the new shot and change its yaw value (the rotation of the 360 sphere) to match the object of focus in that scene to the direction in which the viewer is looking. Magic.
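In an equirectangular frame, that yaw change amounts to a horizontal pixel shift with wrap-around. Here is a minimal sketch in Python with NumPy; the function and the gaze values are hypothetical, and a real player would apply the rotation in its renderer rather than on pixels.

```python
# Re-orient an equirectangular frame at a cut so the new shot's point
# of interest lands where the viewer is already looking. Illustrative
# sketch; sign conventions vary between players.
import numpy as np

def rotate_yaw(frame, yaw_degrees):
    """Rotate an equirectangular frame (H x W x 3) about the vertical
    axis by yaw_degrees, via a wrapped horizontal shift."""
    h, w = frame.shape[:2]
    shift = int(round(yaw_degrees / 360.0 * w))
    return np.roll(frame, shift, axis=1)

# If the viewer is gazing at yaw 90 but the action in the incoming shot
# sits at yaw 30, rotate the incoming shot by the difference at the cut.
viewer_yaw, action_yaw = 90.0, 30.0
frame = np.zeros((1024, 2048, 3), dtype=np.uint8)
reoriented = rotate_yaw(frame, viewer_yaw - action_yaw)
```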
There are many more ways in which interactivity can skirt tricky issues in VR and engage
viewers more fully. See below in the Interactivity section for more information.
None of the Above
Finally, maybe you should just let go and let viewers look wherever they please! In this new medium it might be best to relinquish such strict control and let the audience have their own experience. Secondary actions in scenes give viewers more to look at and can enhance the narrative. This can lead to personalized experiences and repeat viewings. Tricky stuff, to be sure, and it needs to be planned from the scriptwriting phase, but this is the future of storytelling in VR.
it feels as though they are going to fall, as seen in this picture from Jaunt's North Face: Climb.
Placing the camera on the ground is also very unnatural and feels as though you are embedded or trapped within the floor, leading to a very claustrophobic feeling for some! While unnatural, this can be used to great effect in the right circumstances, as Doug Liman did in Jaunt's first thriller series, Invisible. Here, after the invisible killer's first kill, the viewer feels a sense of hopelessness and claustrophobia as they look eye level into the dead victim's face.
In general, for most circumstances, it's best to place the camera at the height of an average human and move up or down from there, remaining conscious of how this affects the viewer's perception and their identity. There are of course many situations where you will want to play with the viewer's emotions, making them feel small or powerful, and camera height is a great way to achieve this.
Another very important part of identity that should be mentioned is your body, or in this case the lack of it. In any of these setups it can be very disconcerting for the viewer to look down and not see the rest of their torso, arms, and legs. It can make the viewer feel like a disembodied ghost. Here again, interactivity and more advanced body tracking and sensors will enable us to overlay a CG avatar of the viewer's body, giving them a heightened sense of presence. Already, hand controllers for the HTC Vive enable this in real-time gaming engines, and it's only a matter of time before this becomes standard in cinematic VR as well. In the meantime, be aware that the lack of a body can reduce immersion.
Extreme Contrast
Unlike in traditional filmmaking, where you are generally exposing for only the section of the environment in frame, with 360 filming you need to account for the entire environment. In many cases, especially outside, that means you might have a very bright sunlit side and a darker shadow side.
Generally, your camera rig should allow for individual exposures for each of the cameras that make up the rig. Most often they would be set to auto exposure to adequately expose the scene through each camera. The stitching algorithm will then blend these exposures to make for a seamless stitch. You may want to lock a particular camera's exposure settings for a certain effect, or to keep the camera from cycling up and down, in the presence of strobe lights for example. Just beware that some cameras may blow out in highlight areas, or become underexposed with lots of noise and not enough detail in the shadow areas.
The Jaunt ONE camera with the Jaunt ONE Controller software is able to globally or individually control each camera's ISO and shutter speed to create a proper exposure around the camera while in manual mode. A fully automatic mode with control over Exposure Value (EV) bias is also available for each camera to auto expose the scene, and is the recommended mode for general use.
For more information please consult the Jaunt ONE Controller user guide.
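For reference, the arithmetic behind matching exposures across cameras is simple. A sketch of the standard ISO-adjusted exposure-value relation follows; the function name and example settings are illustrative, not the Jaunt ONE Controller's API.

```python
# ISO-adjusted exposure value: EV100 = log2(N^2 / t) - log2(ISO / 100).
# Two settings with the same EV100 suit equally bright scenes.
import math

def exposure_value(f_number, shutter_s, iso=100):
    """ISO-adjusted exposure value for a given aperture, shutter, ISO."""
    return math.log2(f_number**2 / shutter_s) - math.log2(iso / 100)

# Same aperture and shutter on both sides of the rig, but the shaded
# side raises ISO one stop, so its settings suit a scene one stop
# darker than the sunny side's.
sunny = exposure_value(2.8, 1 / 500, iso=100)
shade = exposure_value(2.8, 1 / 500, iso=200)
stops_apart = sunny - shade  # 1.0 stop
```

Halving the shutter time or raising ISO one stop each moves the result by exactly one EV, which is why per-camera stop adjustments are a natural way to balance the sunlit and shadow sides of a rig.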
Flares
Flares are caused when sunlight hits a lens and scatters, forming circles, streaks, haze, or a combination of these across the image. Generally these are undesirable, though certain filmmakers go to great lengths to include them. Just ask J.J. Abrams! In stereoscopic cinematography they can create stereo artifacts between the left and right eyes, as the lens refraction differs between adjacent cameras due to their varying angles to the light source, and they should generally be avoided.
Of course, in VR things are more difficult. With a traditional camera you can usually eliminate flares just by blocking, near the lens, the source of the light that's causing them. Generally this means mounting a flag or lens hood, a piece of dark cardboard or fabric, on or near the camera housing or on a C-stand to block the light. This is possible because you can usually mount the flag outside of the frame so it is not seen by the lens. With a 360 VR camera there is no frame, of course, and anything mounted around the camera will be seen and recorded. Thus, to eliminate flares in VR you need to get creative.
If possible, you should first try to position or rotate the camera in such a way as to minimize any flares. Hopefully your camera system is capable of quick 360 previews. With the Jaunt ONE, you can use the Jaunt ONE Controller software to shoot a quick 360 preview still frame or short video sequence to check for flares. It should be immediately obvious if any of the lenses are getting hit.
For more information see the Jaunt ONE Controller guide.
If you can't rotate or position the camera to eliminate or minimize the flares, then you should try to move it behind something within the scene that can block the light and act as an organic flag. A tree, a rock, a vehicle, a building, or even another person can go a long way towards blocking the light, and none of these would need to be removed later.
If you can't move or reposition the camera, then you may be forced to live with the flares, provided they aren't too distracting and don't cause discomfort in stereo due to left/right eye rivalry. Your final option, if they are too distracting, is to remove them in post by painting them out. In VR you have many adjacent cameras available, some of which may not have been hit by flares, which you can use as clean plates to clone into one or both eyes after stitching.
Rigging
Much like flares, rigging lights becomes a bigger issue in VR due to the lack of a frame and there being no "behind the camera". Big, expensive lighting rigs may become a thing of the past simply because you'll always need to hide them. DPs of the future will need to become much more adept at hiding their lighting and making it organic to the scene, or at taking better advantage of natural light.
One very useful tool in the VR DP's toolkit is the LED strip light. Many manufacturers are coming out with these in a variety of formats and configurations, and they can easily be placed on or around the camera housing or tripod without being seen in the shot. These can provide enough illumination to fill the scene or subject with light from the camera position.
Additional banks of these can be hidden within the scene to provide extra lighting such as key lights, rim lights, hair lights, etc. Many are fully dimmable and color-temperature selectable, some changeable to any of millions of colors. Some are capable of local dimming, where you can individually control each LED on the strip for various lighting effects. Some, like the Ninja lights above, can even be remote controlled via WiFi, giving you a full DMX controller on your iPhone or iPad to control each of the individual strips that make up your scene lighting.
If these solutions don't provide enough light, you can always bring in traditional set lighting and remove it in post. If you have full control over the environment, this is most easily done by shooting in 180 halves and then compositing them together in post. You would first shoot the action in the front 180 with the lighting behind the camera in the other 180. You would then switch this around and shoot the action or scenery in the other 180, with the lighting in reverse. This only works if you can control the environment well enough to stitch the two halves together later. If you are outside and something moves between the two halves, or a car disappears, or the lighting changes on you, this won't work. You'll also need to be able to separate the action and lighting into discrete halves, or it will become more difficult and expensive to blend the two in post.
Ultimately, the lighting community, on both the creative and hardware sides, is going to
need to get a lot more resourceful in terms of how they approach VR in order to simplify
shooting and reduce post costs while still creating beautifully lit scenes.
Binaural recording
One application of binaural audio is binaural recording, in which a pair of microphones are
placed into the ear canals of a person or a dummy head in order to record live sound, with the
physiological aspects of the head, torso, and ears affecting the left and right channels
accordingly. The resulting recording is intended to be played back over headphones, providing
not only an accurate stereo image, but the illusion of being present in the sonic environment. In
the real world, however, we are able to turn our head from side-to-side, changing the relative
positions of objects in the scene. Head-mounted displays allow this same head movement, so a
static binaural recording is not a technique that can be used for capturing audio for VR.
HRTFs
In order to provide the spatial cues of binaural audio and allow for head movement, VR audio
requires the use of head-related transfer functions, or HRTFs. An HRTF is a series of audio
impulse responses measured using a binaural recording apparatus, with each IR captured at a
different position relative to the head. When placing a sound into a VR scene, the following
attributes are considered:
1. The objects position within the scene
2. The users position within the scene
3. The direction the user is facing
A fully-featured object sound rendering system will utilize this information to select the
appropriate directional IR from the HRTF set and apply additional processing to create distance cues.
Furthermore, the size, shape, and material composition of the virtual environment can be
modeled to create an acoustic simulation for an even more convincing immersive experience. In
general, such advanced features are available only in game engines in which the experience is
generated in real-time. As will be discussed later, 360 video players generally employ a subset
of these features for the sake of being able to take advantage of bitstream formats such as AAC
audio, avoiding the need for large downloads.
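As an illustration of how a renderer might use the three attributes above, here is a minimal Python sketch that selects the nearest measured HRTF direction for a source and convolves it into a binaural pair. The function name, the dictionary-based HRTF set, and the simple inverse-distance gain are all assumptions for illustration only; production renderers interpolate between IRs, use proper spherical distance metrics, and model distance cues far more carefully.

```python
import numpy as np

def render_object_binaural(source, obj_pos, user_pos, user_yaw, hrtf_set):
    """Pick the nearest measured HRTF direction and convolve.

    hrtf_set maps (azimuth_deg, elevation_deg) -> (left_ir, right_ir).
    """
    # 1. Object position relative to the user
    rel = np.asarray(obj_pos, float) - np.asarray(user_pos, float)
    # 2. Direction of arrival, compensated for where the user is facing
    azimuth = np.degrees(np.arctan2(rel[1], rel[0])) - user_yaw
    elevation = np.degrees(np.arctan2(rel[2], np.hypot(rel[0], rel[1])))
    # 3. Nearest measured direction (a crude nearest-neighbour in degrees,
    #    standing in for a proper great-circle distance)
    key = min(hrtf_set, key=lambda d: (((d[0] - azimuth + 180) % 360 - 180) ** 2
                                       + (d[1] - elevation) ** 2))
    left_ir, right_ir = hrtf_set[key]
    # 4. Simple distance attenuation (inverse distance law)
    gain = 1.0 / max(np.linalg.norm(rel), 1.0)
    return (gain * np.convolve(source, left_ir),
            gain * np.convolve(source, right_ir))
```

The same structure extends to a full scene by summing the rendered left and right signals over all objects.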
Caveat: personalization
It should be noted that each of us has a unique physiology when it comes to our ear shape,
head size, and other physical attributes. Our brains are tuned to understand the signals coming
from our own ears, not the ears of another person or those of a dummy head. Therefore, the
effectiveness of a binaural recording or binaural audio achieved with HRTFs is reduced the
more the recording apparatus differs from our own body. Several approaches exist for
generalizing binaural audio as a one-size-fits-all solution, but the results vary from person to
person. The only way to fully achieve the illusion of auditory presence is by using personalized
HRTFs, but this is impractical for any kind of widespread adoption. The good news is that the
visual component of VR makes up for the inaccuracy of generalized HRTFs to a certain extent.
A spatial audio format suitable for cinematic VR must satisfy the following criteria:
- The format represents the full 360 scene, including height information
- The format can be streamed over a network
- The scene can be rotated using data from a head-mounted display
- The scene can be rendered for headphone playback
The rest of this chapter will discuss the following formats, which fulfill the above criteria:
- Ambisonic B-format via compressed PCM audio (e.g. AAC)
- Dolby Atmos via Dolby Digital Plus E-AC3
Another format is the Facebook 360 Spatial Workstation (formerly Two Big Ears' 3Dception),
which defines a format for cinematic VR which is not yet able to be streamed over a network. Its
toolset, however, allows for B-format conversion and can be used for the purpose of cinematic
VR production.
Also worth mentioning is so-called quad binaural, which is produced by the 3DIO Omni
microphone. This format provides four directions' worth of stationary binaural recordings which
can be cross-faded with head tracking data.
Increasing the order above TOA (third-order ambisonics) is possible, but for practical purposes it
is rather uncommon (fourth order requires 25 channels, and so on).
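The channel count grows quadratically with the ambisonic order, which is easy to verify (this one-liner is ours, not part of any particular toolset):

```python
def ambisonic_channels(order: int) -> int:
    # A full 3D ambisonic signal of order n carries (n + 1)^2 channels.
    return (order + 1) ** 2

# FOA (order 1) = 4 channels, TOA (order 3) = 16, fourth order already needs 25
```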
B-format explained
Ambisonics is typically encoded in B-format, which represents a collection of spherical
harmonics. Since the mathematics involved in understanding ambisonics is beyond the scope of
this guide, we prefer to use an analogy to a well-understood concept in sound reproduction:
B-format is to surround sound as mid/side is to left/right stereo.
In other words, similarly to how a mid/side recording can be converted to left/right stereo,
B-format can be converted to a surround sound speaker setup. In fact, FOA is itself an extension
of mid/side stereo.
From Wikipedia: "Ambisonics can be understood as a three-dimensional extension of M/S (mid/
side) stereo, adding additional difference channels for height and depth. The resulting signal set
is called B-format. Its component channels are labelled W for the sound pressure (the M in M/
S), X for the front-minus-back sound pressure gradient, Y for left-minus-right (the S in M/S) and
Z for up-minus-down."
The W signal corresponds to an omnidirectional microphone, whereas XYZ are the components
that would be picked up by figure-of-eight capsules oriented along the three spatial axes.
HOA involves more complex pressure gradient patterns than figure-of-eight microphones, but
the analogy holds.
It is important to understand that B-format is not a channel-based surround sound format such
as 5.1. Surround sound formats deliver audio channels intended to be played over speakers at
specific positions. B-format, on the other hand, represents the full sphere and can be decoded
to extract arbitrary directional components. This is sometimes useful in playing back ambisonics
over a speaker array, but for the purposes of VR, ambisonics to binaural conversion is
performed in the player application, such as the Jaunt VR app.
B-format representations
There are two predominant channel layouts for B-format. The most common is called
Furse-Malham (FuMa), which orders the four first-order channels W, X, Y, Z. HOA representations
using FuMa involve the use of a lookup table to map spherical harmonics to channel numbers.
The second representation is called Ambix, which has recently been established as a format
unto itself. Ambix orders the first-order components W, Y, Z, X. Ambix's channel ordering is
based on an equation that maps spherical harmonics to channel indices, and scales up to
arbitrarily higher orders without the need for a lookup table. For this reason, Ambix is the
preferred representation for HOA and has been adopted by Google and others as the de-facto
standard transmission format for ambisonics. Unfortunately, most production tools available
today were built with FuMa in mind, so conversions are necessary when interoperating with
Ambix tools and publishing to YouTube. At the time of writing, Jaunt VR continues to use FuMa.
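For first order, the FuMa-to-Ambix conversion amounts to a channel reorder plus a gain change on W, as the following sketch shows. The function names are ours; the sqrt(2) factor reflects the -3 dB scaling FuMa applies to W, while first-order X, Y, Z are otherwise identical in both conventions. Higher orders require full per-channel gain and reindexing tables, which this FOA-only sketch does not attempt.

```python
import numpy as np

def fuma_to_ambix_foa(fuma):
    """Convert first-order FuMa (W, X, Y, Z) to Ambix (W, Y, Z, X).

    fuma: array of shape (4, n) holding the four channels as rows.
    """
    w, x, y, z = np.asarray(fuma, dtype=float)
    # Boost W by sqrt(2) to undo FuMa's -3 dB convention, then reorder
    return np.stack([np.sqrt(2.0) * w, y, z, x])

def ambix_to_fuma_foa(ambix):
    """Inverse conversion: Ambix (W, Y, Z, X) back to FuMa (W, X, Y, Z)."""
    w, y, z, x = np.asarray(ambix, dtype=float)
    return np.stack([w / np.sqrt(2.0), x, y, z])
```

A round trip through both functions should return the original signal, which is a useful sanity check when wiring up a conversion chain for YouTube delivery.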
B-format playback
Unlike stereo or surround formats, B-format cannot simply be played back by mapping its
channels to speakers. B-format signals must be decoded to directional components, which can
be mapped to speakers or headphones. The most common method of B-format decoding
employs an algorithm to extract a virtual microphone signal with a specific polar pattern and
yaw/pitch orientation. For example, an FOA signal can be used to synthesize a cardioid
microphone pointing in any direction. HOA signals allow for beamforming narrower polar
patterns for more spatially precise decoding. In general, the higher the order, the narrower the
virtual mics, and the more speakers can be used to reproduce the sound field.
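The virtual-microphone extraction described above can be sketched as follows, assuming the FuMa convention (W carrying a -3 dB scaling) introduced earlier. The function name and the pattern parameterization are our own; real decoder plugins expose similar yaw/pitch/pattern controls.

```python
import numpy as np

def virtual_mic(b_format, yaw_deg, pitch_deg, pattern=0.5):
    """Extract a virtual microphone signal from FuMa FOA B-format.

    b_format: array of shape (4, n) holding W, X, Y, Z as rows.
    pattern: 1.0 = omni, 0.5 = cardioid, 0.0 = figure-of-eight.
    """
    w, x, y, z = b_format
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    # Unit vector the virtual capsule points along
    ux = np.cos(yaw) * np.cos(pitch)
    uy = np.sin(yaw) * np.cos(pitch)
    uz = np.sin(pitch)
    # Omni (pressure) part plus figure-of-eight (gradient) part
    return pattern * np.sqrt(2.0) * w + (1.0 - pattern) * (ux * x + uy * y + uz * z)
```

A cube decoder of the kind described in the next paragraph would call this eight times, once per cube vertex direction, before applying the corresponding HRTFs.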
In VR playback applications involving FOA, a cube-shaped decoder is often used, employing
eight virtual microphones. To achieve binaural localization, each virtual mic is processed
through an HRTF of the corresponding direction. The resulting processed signals are then
summed separately for the left and right ears. Prior to binaural rendering, the soundfield
is rotated using head tracking data from the head-mounted display.
Recording B-format
There are several microphones designed specifically for the purpose of capturing first-order
B-format.
Mixing B-format
When mixing for cinematic VR, traditional DAW workflows are preferred. Since 360 video is a
linear format just like any other video, it makes sense to use tools and workflows that are
already well established in the film and video production industry. Also, since 360 video is
linear, it can be streamed to the end-user's device without the need for downloading the entire
scene. This means we can deliver our spatial audio mixes as traditional PCM audio and
package it within MOV and MP4 containers along with h.264 video. This applies to Dolby Atmos
as well as ambisonic B-format.
This section covers some common activities you will encounter when mixing B-format:
1. Create an ambisonic mix using monophonic sound sources (e.g. lavaliers, sound effects)
2. Edit an ambisonic field recording
3. Adapt a surround sound recording to B-format
4. Match audio sources with 360 video
5. Combine the B-format audio mix with 360 video
In our walkthrough of these steps, we assume the use of the Reaper DAW from Cockos Inc.
Reaper is currently the best DAW for working with Ambisonics because it allows for tracks with
arbitrary channel counts. Other DAWs only support well-known surround sound channel
arrangements such as 5.1, while Reaper allows for the four and sixteen channel tracks required
to create FOA and TOA.
In addition to Reaper, you will need a suite of VST plug-ins that support B-format processing. A
number of high quality plug-in suites exist, but we recommend the following options:
- For TOA or FOA: Blue Ripple Sound TOA Core VST (Free)
- For FOA: ATK Community Ambisonic Toolkit
- For HOA or FOA: Ambix Plugin Suite from Matthias Kronlachner
The Blue Ripple Sound plugins will cover all of your needs up to third order processing
(including converting to/from Ambix), and other packages are available for more advanced
functionality. The ATK plugin suite is designed for Reaper only and is quite simple and user-friendly, but the range of functions is limited and you can only mix FOA. The Ambix Plugin
Suite has limited functionality, but can be used to assemble very high order ambisonics (up to
seventh order, with 64 channels per track). The Ambix converter plugin is particularly helpful
when going between different ambisonic representations. This guide assumes you are
producing FOA (four channels), but you can easily switch to TOA if using the Blue Ripple TOA
VST plugins.
Assuming you have some sound recordings from the set of a video shoot (lavs, booms, etc.),
the stems of a multitrack board feed from a live concert, or some sound effects to add to your
mix, you can insert these into your Reaper project using an ambisonic panner plugin. Panners
take a 1-channel input and provide control over yaw and pitch. Set your desired yaw and pitch
to match the object's position within the scene. The output of the panner will be a 4-channel B-format soundfield. Repeat this process for as many sources as you like.
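The panner's job can be sketched in a few lines, again assuming the FuMa convention used by Jaunt VR. The function name is hypothetical, and real panner plugins add controls such as source spread and near-field compensation that this sketch omits.

```python
import numpy as np

def pan_mono_to_foa(signal, yaw_deg, pitch_deg):
    """Encode a mono signal into FuMa first-order B-format (W, X, Y, Z)."""
    s = np.asarray(signal, dtype=float)
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    w = s / np.sqrt(2.0)                  # omni, with FuMa's -3 dB convention
    x = s * np.cos(yaw) * np.cos(pitch)   # front-minus-back
    y = s * np.sin(yaw) * np.cos(pitch)   # left-minus-right
    z = s * np.sin(pitch)                 # up-minus-down
    return np.stack([w, x, y, z])

# Several panned sources are mixed by simply summing their B-format outputs.
```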
In order to audition the mix over speakers or, ideally, in headphones, you will need a decoder
plugin inserted on the master bus. Make sure that the source track is a 4-channel track, and that
the master is also 4 channels. The binaural decoder plugin takes a 4-channel B-format input and
outputs 2-channel binaural. Route the master bus to your stereo audio device and you can
audition over headphones.
To simulate the effect of head rotation in VR, you can insert an ambisonic rotation plugin in the
chain before the binaural decoder. The rotation plugin will give you control over yaw, pitch, and
roll. It takes a 4-channel B-format input and produces a 4-channel B-format output.
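A pure yaw rotation of this kind has a particularly simple form: it is a 2D rotation of the X and Y channels, leaving W and Z untouched. The sketch below assumes FuMa FOA and a hypothetical function name; note that sign conventions differ between tools, and for head-tracking compensation you rotate the field by the negative of the measured head yaw.

```python
import numpy as np

def rotate_foa_yaw(b_format, yaw_deg):
    """Rotate a FuMa FOA soundfield about the vertical axis.

    A source at azimuth a ends up at azimuth a + yaw_deg.
    W (omni) and Z (vertical) are unaffected by a pure yaw rotation.
    """
    w, x, y, z = b_format
    c, s = np.cos(np.radians(yaw_deg)), np.sin(np.radians(yaw_deg))
    return np.stack([w, c * x - s * y, s * x + c * y, z])
```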
Finally, it can be very helpful to see the soundfield using an ambisonic visualizer plugin. This is
especially useful in debugging your channel routing configuration. Insert a visualizer plugin in
the chain after the rotation plugin and before the binaural decoder. Play a track with a single
panned mono source, and tweak the yaw parameter. You should see the visualizer heatmap
display move from side to side. Tweak the yaw parameter of the rotation plugin, and you should
see the same behavior. The Jaunt Player application also provides a heatmap function which
can be useful in visualizing your mix overlaid onto your video.
Become intimately familiar with panning, rotation, visualization, and decoder plugins. They will
serve as the bread and butter of your ambisonic production work.
When starting with A-format, it is probably easiest to convert to B-format as a separate task, and
work with the B-format audio file in your session. When you do this, be sure to disable any
decoder plugins that might be running on your master bus.
Editing a 4-channel B-format audio file is just like editing any other audio track. You can apply
volume changes, perform cuts, mix multiple sources, etc. Additionally, you can rotate the
recording to fix any camera/microphone alignment issues, or use virtual microphones to extract
specific directional components from the recording. The Blue Ripple TOA Manipulators VST
suite provides an abundance of tools for performing more advanced operations on your B-format
field recordings.
When adapting a surround sound recording to B-format, each full-range channel is panned at
its standard speaker azimuth:

Channel           Azimuth
Left              30°
Right             -30°
Center            0°
LFE               n/a
Left Surround     110°
Right Surround    -110°
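Under that approach, adapting a 5.1 recording amounts to encoding each full-range channel at its speaker azimuth and summing the results. The sketch below assumes the FuMa convention and hypothetical names; the LFE is usually mixed straight into W or omitted, and is left out here.

```python
import numpy as np

# Standard azimuths (degrees) for the five full-range 5.1 channels
SPEAKER_AZIMUTHS = {"L": 30, "R": -30, "C": 0, "Ls": 110, "Rs": -110}

def surround_to_foa(channels):
    """Encode a dict of 5.1 channel signals into FuMa FOA B-format.

    channels maps names from SPEAKER_AZIMUTHS to equal-length signals;
    missing channels are treated as silent.
    """
    n = len(next(iter(channels.values())))
    field = np.zeros((4, n))
    for name, azimuth in SPEAKER_AZIMUTHS.items():
        s = np.asarray(channels.get(name, np.zeros(n)), dtype=float)
        a = np.radians(azimuth)
        field += np.stack([s / np.sqrt(2.0),   # W, with FuMa's -3 dB scaling
                           s * np.cos(a),      # X (all speakers at zero elevation)
                           s * np.sin(a),      # Y
                           np.zeros(n)])       # Z
    return field
```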
Once you have validated that your mix sounds correct in VR, you will want to attach it to your
video. If using Jaunt Cloud Services, follow the workflow for uploading the mix as a master to
the project you are working on. You can then assign your mix to the cut in the cloud, and
download a transcode of the video for playback within the Jaunt Player. If you are not using
Jaunt Cloud Services, you can combine the mix with an MP4 video file using a muxing tool such
as iffmpeg. Jaunt's specs recommend converting to 4-channel AAC at a 320 kbps bitrate.
When your mix is properly combined with video, it should be contained within an MP4 file in
which stream 0 is an h.264 video transcode and stream 1 is your 4-channel AAC audio. This
video file will play in the Jaunt Player and can be viewed in VR using an Oculus Rift DK2 or
CV1. Use the heatmap overlay feature of the Jaunt Player to ensure your sound sources have
been properly panned relative to the picture.
Dolby Atmos
As an alternative to B-format, Dolby Atmos provides similar capabilities and fulfills the criteria for
a complete solution for cinematic VR audio. Unlike B-format, Atmos does not encode all the
scene information into a baked PCM audio soundfield. Instead, Atmos utilizes up to 118 object
tracks with spatial metadata in order to convey the full sound scene. For transmission, Atmos
printmasters are encoded to Dolby Digital Plus E-AC3, a codec that is supported by a wide
variety of software and hardware. Using the Dolby Atmos for Virtual Reality Applications
decoding library, E-AC3 streams can be decoded and rendered as binaural audio of very high
quality. Since B-format is becoming fairly widespread among 360 video players, the Dolby
Atmos tools also provide a B-format render from the print master, so you can use the Dolby
Atmos authoring tools whether or not your distribution platform supports E-AC3. The Jaunt VR
application supports Dolby Atmos, and Jaunt Cloud Services accepts audio masters in E-AC3
format.
The Atmos authoring tools are a suite of AAX plug-ins for ProTools. If you already mix for film
using ProTools, the Atmos workflow extends your existing setup in order to enable production
for VR experiences. Atmos does not yet natively support B-format inputs, so if you are working
with B-format field recordings you will need to convert these to a surround bed for inclusion
within your mix.
Please refer to Dolbys documentation for further details on producing VR audio with the Dolby
Atmos toolset.
Please refer to Facebooks documentation for further details on producing VR audio with the
Spatial Workstation tools.
Post-Production
(COMING SOON)
Fixing Stitching Artifacts
Editing
Working with Proxies
Available Tools
Dashwood Stereo VR Toolbox
Mettle Skybox 360
Post Stabilization
Color Correction
Final Conform
Rendering in 360
Interactivity
Appendix
Guidelines for Avoiding Artifacts using the Jaunt ONE
This appendix provides a checklist of best practices to effectively capture VR video with the
Jaunt ONE camera rig in conjunction with Jaunt Cloud Services (JCS) while avoiding artifacts.
Camera motion - Any motion other than constant velocity in a straight line can lead to
nausea.
Lens flares - Lens flares can cause inconsistencies between individual cameras and should
be avoided wherever possible.
Repeated texture - Repeated similar textures, such as highly repetitive wallpaper, can
cause temporal inconsistency and localized stitching artifacts.
Thin structures - Thin structures (e.g. ropes, tree branches) are hard to reconstruct
without artifacts. Artifacts can be reduced if thin structures are in front of and close to (at a
similar depth as) a bigger background object. Results also improve by increasing the distance
between thin objects and the camera rig. Objects in front of or behind thin objects may cause
artifacts (e.g. a person behind a mesh fence).
Semi-transparent surfaces - A single depth is estimated for each point in the scene, which
can lead to issues for semi-transparent surfaces.
Legal
Any third party marks or other third party intellectual property used herein are owned by their
respective owners. No right to reproduce or otherwise use such marks or property is granted
herein.
This guide is for informational and educational purposes only. No express or implied
representations or warranties are made herein and any such are expressly disclaimed.
Any use of the Jaunt ONE camera and any other products or devices referenced herein is subject
to separate specifications and use and safety requirements of Jaunt, Inc. and third party
manufacturers.
This Field Guide is not intended to provide legal or safety advice. See manufacturers'
specifications for further information.
Jaunt (word mark and logo) is a TM of Jaunt, Inc.
Use of the Jaunt ONE camera requires compliance with applicable laws.
No endorsement of third party products is intended by this Field Guide. Any critiques of third
party products are based solely on the opinions of the author(s) of this Field Guide.