
HOW IT WAS MADE
The Creation of The Sound Library
PRODUCTION
Recording
Location Recording
This stage consisted of recording several clips of natural soundscapes, of various lengths, at several locations, including:
- Crombie Country Park, Monikie
- Monikie Country Park, Monikie
- Craigmill Den, Carnoustie
- St. Vigeans Village, Arbroath
My goal at this stage was to collect several examples of specific types of birdsong at different points of the day. I felt there was a need for more varied birdsong, as many sound libraries offer a very limited selection, which leads to a large portion of dubbed video in large productions sharing the same sound effects. An example is the eagle / hawk sound most popularly found within the Assassin’s Creed games. Time of day was treated as an important factor during recording, as different birds are heard only at specific times of the day, e.g. small songbirds in the morning and owls at night.
Recording took place over several months and was completed using my personal TASCAM DR-40X. To align with broadcast industry practice, all clips were recorded as either stereo or mono files in the .WAV format, with a sample rate of 48 kHz and a bit depth of 24-bit.
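The recording format above can be checked across a folder of clips with a short script. The following is a minimal sketch, not the method used in this project; the function names are hypothetical, and it relies only on Python's standard `wave` module, which may reject some header variants that recorders write.

```python
import wave
from pathlib import Path

# Target format taken from the text: 48 kHz, 24-bit, mono or stereo.
TARGET_SAMPLE_RATE = 48_000
TARGET_SAMPLE_WIDTH_BYTES = 3  # 24 bits = 3 bytes per sample
ALLOWED_CHANNELS = (1, 2)


def check_wav_format(path):
    """Return a list of problems found in one WAV file (empty if none)."""
    problems = []
    with wave.open(str(path), "rb") as wav:
        if wav.getframerate() != TARGET_SAMPLE_RATE:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != TARGET_SAMPLE_WIDTH_BYTES:
            problems.append(f"bit depth is {wav.getsampwidth() * 8}-bit")
        if wav.getnchannels() not in ALLOWED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems


def check_folder(folder):
    """Map each non-compliant WAV file name in a folder to its problems."""
    report = {}
    for path in sorted(Path(folder).iterdir()):
        # Match .wav and .WAV alike.
        if path.is_file() and path.suffix.lower() == ".wav":
            problems = check_wav_format(path)
            if problems:
                report[path.name] = problems
    return report
```

An empty dictionary from `check_folder` means every clip in the folder matches the stated sample rate, bit depth and channel count.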

360 Video Recording
As mentioned, to provide evidence of how B-Format audio is appropriately used, it was decided that a series of 360 videos would be recorded at the various locations using a Ricoh Theta camera and a Zoom H3-VR ambisonic microphone.
As seen above, the camera and microphone were mounted to a tripod using an attachment supplied with the Zoom H3-VR. This allows both devices to be mounted as closely together as possible, so that the captured audio matches the image captured by the Ricoh camera. A spirit level on the tripod was used to ensure both devices were level; the microphone also has a small built-in level indicator, which was used as well.
All captured video clips were stitched using the Ricoh Theta application for Windows, which produced a full equirectangular file. Premiere Pro 2018 was then used to edit the video files and merge in all processed B-Format files.

Premiere can host 4-channel audio, but the audio tracks must first be customised so that it can be imported correctly. There is a built-in 360 video mode and a ‘monitor ambisonics’ setting. Text was added to provide the required information for each location. The final video was exported using standard settings for YouTube 360 distribution.

Processing
This stage began by locating all previously recorded audio files from past projects; each file was placed into a ProTools session for processing. This simple processing consisted of adjustments using the FabFilter ProQ3 EQ, D3 CL Compressor, and Maxim plug-ins.
As previously stated, all audio files must comply with the DPP file format standards and the EBU R128 standard, both of which can be followed easily within this ProTools session. To meet the loudness requirements set by the EBU, the Youlean Loudness Meter was used to ensure a maximum integrated reading of -23 LUFS, with no file exceeding a true peak of -1 dBTP. All files were exported from the session in the .WAV file format to remain compliant with the DPP.
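The arithmetic behind these limits is simple enough to sketch. The example below assumes the integrated loudness and true peak have already been measured (for instance, read from a loudness meter); it does not measure audio itself, and the function names are hypothetical.

```python
# Limits quoted in the text: integrated loudness no higher than -23 LUFS,
# true peak no higher than -1 dBTP.
MAX_INTEGRATED_LUFS = -23.0
MAX_TRUE_PEAK_DBTP = -1.0


def gain_to_target(measured_lufs, target_lufs=MAX_INTEGRATED_LUFS):
    """Static gain in dB that moves a measured loudness to the target."""
    return target_lufs - measured_lufs


def peak_after_gain(true_peak_dbtp, gain_db):
    """A static gain in dB shifts the true peak by the same amount."""
    return true_peak_dbtp + gain_db


def is_compliant(integrated_lufs, true_peak_dbtp):
    """Check a pair of meter readings against both limits."""
    return (integrated_lufs <= MAX_INTEGRATED_LUFS
            and true_peak_dbtp <= MAX_TRUE_PEAK_DBTP)
```

For instance, a file reading -19.5 LUFS with a true peak of -0.3 dBTP would need `gain_to_target(-19.5)`, i.e. -3.5 dB of gain, after which its true peak would sit at about -3.8 dBTP and both limits would be met.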
B-Format
To process ambisonic audio, ProTools Ultimate was used, as this version of Avid’s DAW supports multi-channel configurations, including:
- LCR
- Quad
- 5.0
- 5.1
- 7.0
- 7.1
- 7.0.2
- 7.1.2
- 1st Order Ambisonics
- 2nd Order Ambisonics
- 3rd Order Ambisonics
For this project, a 1st Order Ambisonics track was created to hold the 4-channel audio captured using the H3-VR. This process automatically creates a 1oA bus that can be used in conjunction with an Aux channel to downmix the ambisonic audio to stereo, allowing for monitoring on a traditional stereo output.
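To illustrate what a stereo downmix of first-order ambisonics involves, the sketch below decodes one AmbiX sample (channel order W, Y, Z, X with SN3D normalisation) using two virtual cardioid microphones. This is one common textbook approach and is an assumption for illustration; it is not a description of how ProTools performs its downmix. The function names and the default 90-degree spread are hypothetical choices.

```python
import math


def ambix_to_stereo(w, y, z, x, spread_deg=90.0):
    """Decode one first-order AmbiX sample to a (left, right) pair.

    Two virtual cardioids are aimed at +spread_deg (left) and
    -spread_deg (right) from straight ahead. The height channel Z is
    accepted for completeness but ignored by this horizontal decode.
    """
    angle = math.radians(spread_deg)
    # Cardioid aimed at azimuth a: 0.5 * (W + X*cos(a) + Y*sin(a))
    left = 0.5 * (w + x * math.cos(angle) + y * math.sin(angle))
    right = 0.5 * (w + x * math.cos(-angle) + y * math.sin(-angle))
    return left, right


def encode_ambix(sample, azimuth_deg, elevation_deg=0.0):
    """Encode a mono sample to first-order AmbiX, for trying the decoder.

    Azimuth is measured counter-clockwise, so +90 degrees is hard left.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample
    y = sample * math.sin(az) * math.cos(el)
    z = sample * math.sin(el)
    x = sample * math.cos(az) * math.cos(el)
    return w, y, z, x
```

With these definitions, a source encoded hard left lands almost entirely in the left output, while a source straight ahead is split equally between the two channels.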
Audio was then processed using the SoundField plug-in by RODE, which was used to ensure all audio was encoded correctly. Ambisonic audio can be encoded with three methods in this tool: NT-SF1, B-Format (FuMa), or B-Format (AmbiX). For this project, B-Format (AmbiX) was selected, as all audio was originally recorded and encoded by the Zoom H3-VR using this setting. Further processing included EQ adjustments using the FabFilter ProQ3, compression using the D3 CL, and level adjustments using the Maxim plug-in.
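The difference between the two B-Format conventions matters because a file decoded with the wrong one will have its channels swapped and its levels skewed. At first order, the commonly documented difference is channel order (FuMa: W, X, Y, Z; AmbiX: W, Y, Z, X) and a scaling of the W channel by 1/√2 in FuMa. The sketch below converts a single sample between the two under that assumption; the function names are hypothetical and it applies to first order only.

```python
import math

SQRT2 = math.sqrt(2.0)


def fuma_to_ambix(w, x, y, z):
    """First-order FuMa (W, X, Y, Z) to AmbiX (W, Y, Z, X).

    FuMa stores W attenuated by 1/sqrt(2), so it is scaled back up.
    """
    return (w * SQRT2, y, z, x)


def ambix_to_fuma(w, y, z, x):
    """First-order AmbiX (W, Y, Z, X) to FuMa (W, X, Y, Z)."""
    return (w / SQRT2, x, y, z)
```

Applying one conversion and then the other should return the original sample, which is a quick sanity check when moving material between tools that default to different conventions.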
All ambisonic audio was processed to maintain the same EBU and DPP standards as discussed previously.

Organising
The Universal Category System (UCS) is a public domain initiative formed by industry professionals. It offers a fantastic solution for accurately labelling audio files based on their contents and method of recording, allowing several similar audio files to each be given a unique line of metadata that enhances organisation and discoverability within the library.
The system takes the form of a spreadsheet containing a list of categories and subcategories. Each is accompanied by a list of descriptions and synonyms that help explain the meaning of each category and subcategory pairing. To find the right subcategory for a given audio file, one simply searches for keywords that best describe the nature of its contents. Each category and subcategory pairing has a shortened Category ID (CatID), which is used at the beginning of the line of metadata.
Example: an audio track of ambience on a farm would be assigned the category pairing Ambience / Farm. This is shortened to the CatID AMBFarm.
To make accurate use of the UCS naming convention, the application ‘UCS_FileNamer_Ver1.01’, published to allow users to create labels in the standardised format, was used. Each label includes the following data:
- Category ID
- User Category
- Vendor Category
- FX Name
- Creator or Recordist
- Equipment
- Additional Information
The user and vendor categories allow additional key characteristics to be listed at the beginning of the file name; for example, if multiple sounds within the same CatID share an aspect, a keyword can be used to link them or set them apart. The FX Name is a simple description of the sound itself; this is most likely what the audio file was originally labelled as. The Creator or Recordist and Equipment sections hold initials identifying who created the file and what equipment was used to record the sound. Additional Information was used to list any unique characteristics of the sound, such as location, time of day, audio format (Stereo, Mono, AmbiX), sample rate and bit depth.
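The way these fields combine into a file name can be sketched in a few lines. The delimiter layout below (underscores between fields, a hyphen attaching the user category to the CatID and the vendor category to the FX Name) is an assumption modelled on the commonly described UCS pattern and should be checked against the UCS documentation; it is not taken from the UCS_FileNamer application. The function name and all example values are hypothetical.

```python
def build_ucs_filename(cat_id, fx_name, creator, equipment,
                       user_category="", vendor_category="",
                       additional_info="", extension=".wav"):
    """Join the label fields into one file name (assumed layout)."""
    required = (cat_id, fx_name, creator, equipment)
    if not all(required):
        raise ValueError("CatID, FX Name, creator and equipment are required")

    fields = required + (user_category, vendor_category, additional_info)
    # Underscores separate fields, so they must not appear inside one.
    if any("_" in field for field in fields):
        raise ValueError("fields must not contain underscores")

    first = f"{cat_id}-{user_category}" if user_category else cat_id
    second = f"{vendor_category}-{fx_name}" if vendor_category else fx_name
    parts = [first, second, creator, equipment]
    if additional_info:
        parts.append(additional_info)
    return "_".join(parts) + extension


# Hypothetical example values, for illustration only.
name = build_ucs_filename(
    "AMBFarm", "Morning Birdsong", "AB", "DR40X",
    user_category="DAWN",
    additional_info="Monikie Stereo 48k 24b",
)
```

Here `name` is `AMBFarm-DAWN_Morning Birdsong_AB_DR40X_Monikie Stereo 48k 24b.wav`. Rejecting underscores inside a field keeps the name unambiguous when it is later split back into its parts.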
Audacity, the open-source DAW, was then used to monitor all audio a second time with the Youlean Loudness Meter, to ensure all standards were maintained. Audacity was also selected because, when exporting files, it allows further metadata to be added to the file properties; additional information about each file was added here.
All audio was then placed within one large folder and organised into sub-folders by Category ID. This resulted in 86 individual category sub-folders for 364 audio files.
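Sorting by Category ID lends itself to automation once files follow the naming convention. The sketch below is a hypothetical helper, not the process used for this library; it assumes the CatID is the text before the first underscore in the file name, with any hyphenated user category stripped off.

```python
import shutil
from pathlib import Path


def category_id(stem):
    """Extract the CatID from a file name without its extension."""
    first_field = stem.split("_", 1)[0]
    return first_field.split("-", 1)[0]


def sort_into_category_folders(library_dir):
    """Move each WAV file into a sub-folder named after its CatID.

    Returns a dict mapping each CatID to the number of files moved.
    """
    library = Path(library_dir)
    counts = {}
    # sorted() snapshots the listing, since files move during the loop.
    for path in sorted(library.iterdir()):
        if not path.is_file() or path.suffix.lower() != ".wav":
            continue
        cat = category_id(path.stem)
        target_dir = library / cat
        target_dir.mkdir(exist_ok=True)
        shutil.move(str(path), str(target_dir / path.name))
        counts[cat] = counts.get(cat, 0) + 1
    return counts
```

The returned counts give a quick check on the result: the number of keys is the number of category sub-folders, and the values sum to the number of files sorted.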
