- Introduction to Jukin Media
- Overview: Amazon Web Services Media2Cloud
- Media2Cloud and the SCALE (Searchable Contextualized Assets for Library Enhancement) Project
- Blockers and Challenges Encountered
- TrackIt’s Expertise in M&E
Introduction to Jukin Media
Jukin Media is an online entertainment company that identifies shareable or otherwise compelling user-generated videos, enters into licensing partnerships with the video owners, then licenses the videos for use in advertisements, on TV shows, and in the news media in the US and throughout the world.
Jukin has spent more than 10 years making it easy for amateur video creators to monetize their content. The company markets their videos to media companies and advertisers looking for authentic content with a proven ability to engage audiences. Jukin Media’s library consists of more than 60,000 pieces of carefully curated, user-generated content (UGC). Jukin has paid more than $25 million in royalties to everyday video owners.
Overview: Amazon Web Services Media2Cloud
The AWS Media2Cloud solution is a Machine Learning / Artificial Intelligence serverless ingest and analysis framework for extracting content information that can quickly set up a baseline workflow for placing video and image assets and associated metadata under management control by an AWS customer.
The Media2Cloud service is roughly divided into three workflows: one that ingests source videos and images, one that analyzes and extracts machine learning metadata from your content, and one that creates and manages labeling jobs.
When you upload a video or image to the Amazon Simple Storage Service (S3) ingest bucket, the ingest workflow creates a standardized proxy file and thumbnails for analysis. The analysis workflow analyzes the videos and images and extracts metadata using AWS-native AI services.
The labeling workflow uses Amazon SageMaker Ground Truth to create labeling jobs that are used to tag faces to your face collection, for example.
The S3 ingest bucket has an Amazon S3 lifecycle policy that automates the migration of uploaded videos and images to Amazon S3 Glacier for archiving.The ingest workflow leverages AWS StepFunctions and AWS Lambda to orchestrate the ingest workflow and trigger AWS Elemental MediaConvert to create standardized proxy files and thumbnails of the uploaded videos and images for analysis. Proxy files are stored in an S3 proxy bucket, and media information is stored in Amazon DynamoDB. When video processing is completed, Amazon Simple Notification Service (Amazon SNS) sends notifications to subscribers.
An Amazon API Gateway RESTful API is used for searching results stored in an Amazon Elasticsearch Service (ES) cluster, and AWS IoT Core is used as a publish/subscribe message broker to periodically update workflow progress to connected web clients.
A simple web interface is also provided by Media2Cloud that makes it easy to upload, browse, search video and image files, extract metadata, and create and manage your labeling workforce. The web interface leverages Amazon Cognito for user authentication and is powered by web assets hosted in an Amazon S3 bucket.Amazon CloudFront is used to provide public access to the solution’s website bucket contents.
Media2Cloud and the SCALE (Searchable Contextualized Assets for Library Enhancement) Project
JukinMedia was trying to solve multiple challenges simultaneously using Media2Cloud:
Content identification and Metadata creation
Prior to the adoption of Media2Cloud, JukinMedia’s workflow did not have an automated means to determine whether a video was in their library. The company’s only means of sifting through its video files was to use a highly-ineffective keyword search mechanism which could only search through titles and other very limited metadata.
This led to a host of problems:
• Missed infringement discoveries
• Sourcing of the same videos multiple times
• Purchasing of the same video multiple times
• Missed self-servicing licensing opportunities
Media2Cloud turned out to be the obvious solution that would enableJukin Media to address this issue since it provides the automated creation of metadata using MediaInfo and Machine Learning services. Media2Cloud also collects meaningful additional metadata that could help Jukin Media better match videos against their library to avoid duplicates.
The Clearance Rating
Jukin Media needed a means to scan videos from any programming source to ensure appropriate clearance. This required the ability to detect logos, landmarks, location & music.
Media2Cloud’s Rekognition service would allow them to implement this capability. It provides face and object detection which can be adapted to match custom objects such as logos.
Jukin Media had standardized on Terraform as it’s provisioning tool. Media2Cloud, however, is natively implemented by AWS using a CloudFormation stack. This meant that the first step in the implementation process had to be the translation of each Media2Cloud CloudFormation stack into Terraform modules.
TrackIt translated the CloudFormation stacks and subsequently ran tests to make sure that they worked correctly. The effort was successful and the stack was then refactored to simplify the customization.
Blockers and Challenges Encountered
During the translation phase the TrackIt team experienced some difficulties with the CloudFormation stack because of its use of custom resources. TrackIt had to develop a number of workarounds and refactor some lambda functions.
For example, the custom_resource_function handles:
• The IoT detach policy function (used for notifications on the web application)
• The configure workteam function (used for the labeling jobs)
• The create/delete custom vocabulary function (used by Amazon Transcribe)
• The create index function (used to generated the manifest.js file used by the web application)
TrackIt used Terraform “null_resources” to execute script files to build and archive sources, provision lambda functions, and for the web application.
Redundant lambda functions that were previously used to format strings were also removed because they could be handled by Terraform directly. For example:
• The sanitize functions (that were replaced by variables and regular expressions check)
• The randomize name function (that were replaced by random_uuid Terraform resource)
All of the resource creation that could be handled by Terraform from the custom resources in the CloudFormation stack were migrated. For instance, the user pool domain of AWS Cognito along with other resources, such as those used for CloudFront domain creation.
TrackIt also removed redundant source code to make the solution easier to maintain. For example, bash scripts were used to replace interpolation strings.
TrackIt’s Expertise in M&E
TrackIt is an AWS Advanced Consulting Partner with decades of experience in the Media & Entertainment industry and a wealth of cloud technology design and deployment work performed for many media-centric companies.
TrackIt’s work with JukinMedia required in-depth knowledge and expertise in AWS Media & Entertainment workflows along with the ability to quickly address technical challenges that arose during implementation.
“TrackIt’s work with Amazon to help develop our media AI/ML capabilities really accelerated our efforts to transform our media processes using cutting edge technology. Our collaboration with TrackIt has been a pleasure and their expertise complements our own.” Kris Shinn, VP of Engineering, Jukin Media
Content identification and Metadata creation
Fully-functional AWS Media2Cloud workflow implemented using Terraform