


In file-based video processing workflows, content providers often standardize on the mezzanine file format used to share content with their customers and partners. When sharing video-on-demand (VOD) content, produced mezzanine files often have the same number of audio tracks regardless of how many are used by the actual content. For example, MXF media assets may be produced with eight mono audio tracks for content that contains only 2.0 audio, so only two tracks are actually used while the remaining tracks are filled with silent audio.

To build an automated video processing workflow that makes intelligent decisions based on audio, we need to detect silent audio tracks and their position in the source mezzanine assets. From the example above, if six audio tracks are identified as silent, the asset would be processed as a stereo 2.0 asset. On the other hand, if six tracks have audio and two are silent, the asset should be sent to a 5.1 workflow for processing.

In this post, we will create a workflow using AWS Elemental MediaConvert to analyze audio tracks of media assets and measure their loudness. The workflow is automated by AWS Lambda functions and triggered by file uploads to Amazon S3.

The Audio Normalization feature of MediaConvert makes it easy to correct and measure audio loudness levels, supporting the ITU-R BS.1770-1, -2, -3, and -4 standard algorithms. MediaConvert can be configured so that the selected algorithm only produces loudness measurements without correcting the audio. It also allows logging loudness levels and storing these logs in S3.

We will be using MediaConvert to analyze loudness levels of the first 60 seconds of media source assets. The workflow is flexible, so we can analyze more than 60 seconds if required. The produced loudness logs provide the Input Max Loudness value for each audio track, which we will compare to a threshold of -50 dB. If the Input Max Loudness value of an audio track is lower than -50 dB, that track is considered silent audio.

The workflow diagram for this solution is shown below:

1. A new source asset is uploaded to an Amazon S3 bucket.
2. S3 triggers the first AWS Lambda function (LambdaMediaConvert).
3. LambdaMediaConvert first runs MediaInfo on the source to determine the number of audio tracks, pushes a new job to MediaConvert, and stores job data in Amazon DynamoDB.
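
To make the measurement-only configuration more concrete, here is a minimal sketch of the job settings fragments such a workflow might build. The helper names (`build_measure_only_normalization`, `build_input_clipping`) are hypothetical and not part of this solution's code. The field names reflect our understanding of the MediaConvert job settings JSON (`AudioNormalizationSettings` and `InputClippings`) and should be checked against the MediaConvert API reference before use.

```python
from typing import Any, Dict

# Algorithm identifiers as we understand them from the MediaConvert job
# settings schema; verify against the API reference.
SUPPORTED_ALGORITHMS = {
    "ITU_BS_1770_1",
    "ITU_BS_1770_2",
    "ITU_BS_1770_3",
    "ITU_BS_1770_4",
}


def build_measure_only_normalization(algorithm: str = "ITU_BS_1770_3") -> Dict[str, Any]:
    """Return an AudioNormalizationSettings fragment that measures and logs
    loudness without correcting the audio."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"Unsupported loudness algorithm: {algorithm}")
    return {
        "Algorithm": algorithm,
        "AlgorithmControl": "MEASURE_ONLY",  # measure, do not correct
        "LoudnessLogging": "LOG",            # write loudness logs
        "PeakCalculation": "NONE",
    }


def build_input_clipping(seconds: int = 60) -> Dict[str, str]:
    """Return an InputClippings entry covering the first `seconds` of the input.

    Assumes the input timeline starts at 00:00:00:00; sources with a different
    embedded start timecode may need a zero-based timecode source instead.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return {
        "StartTimecode": "00:00:00:00",
        "EndTimecode": f"{hours:02d}:{minutes:02d}:{secs:02d}:00",
    }
```

In a full job definition, the first fragment would sit inside each audio description of the output and the second inside the input definition. Passing a larger `seconds` value is one way to analyze more than the first 60 seconds.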

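
The silence check and the routing decision described in this post can be sketched in a few lines of Python. This is an illustration only: it assumes the Input Max Loudness values have already been extracted from the loudness logs into a list with one value per track, and the function names and workflow labels (`stereo_2_0`, `surround_5_1`, `manual_review`) are hypothetical.

```python
from typing import List

# Threshold from this post: tracks whose Input Max Loudness is lower than
# -50 dB are treated as silent.
SILENCE_THRESHOLD_DB = -50.0


def find_silent_tracks(
    max_loudness_db: List[float],
    threshold_db: float = SILENCE_THRESHOLD_DB,
) -> List[int]:
    """Return the 1-based positions of tracks considered silent."""
    return [
        position
        for position, loudness in enumerate(max_loudness_db, start=1)
        if loudness < threshold_db
    ]


def choose_workflow(max_loudness_db: List[float]) -> str:
    """Pick a downstream workflow from the number of non-silent tracks."""
    silent = find_silent_tracks(max_loudness_db)
    active = len(max_loudness_db) - len(silent)
    if active == 2:
        return "stereo_2_0"
    if active == 6:
        return "surround_5_1"
    # Any other layout is left for a person or another rule to decide.
    return "manual_review"


# Eight mono tracks: two carry audio, six are silent.
stereo_asset = [-18.2, -17.9, -91.0, -91.0, -91.0, -91.0, -91.0, -91.0]
print(find_silent_tracks(stereo_asset))  # [3, 4, 5, 6, 7, 8]
print(choose_workflow(stereo_asset))     # stereo_2_0
```

Returning the positions of the silent tracks, not just their count, keeps the information a later step would need to select the right source tracks for the output.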