A renowned annual festival, known for its blend of music, film, and interactive media, attracts a global audience. The client required a dedicated on-call team to address any issues that arose during the event. TrackIt addressed this requirement by providing Site Reliability Engineering (SRE) services.
The primary challenge assigned to TrackIt was to maintain high performance under pressure, addressing issues as they arose to ensure seamless reliability throughout the event.
The engagement spanned three weeks: one week prior to the event, the event week itself, and one week afterward. Immediate issue resolution was paramount, requiring a five-minute response time for all issues. Two teams worked in shifts to ensure continuous support for a live streaming pipeline that would be used during the event.
The client had already established the architecture but sought feedback and recommendations for improvement. Artist workstations, running on Amazon EC2 instances with Windows OS and video editing software, were pre-configured, and TrackIt’s responsibility was to ensure their seamless operation. The media upload pipeline, as designed by the client, used SFTP (Secure File Transfer Protocol) to transfer files to an EC2 instance that mounted an Amazon FSx for OpenZFS file system.
Ensuring Reliability of Services
TrackIt implemented several measures to ensure the reliability of the infrastructure during the event. A well-structured on-call schedule was established, with two dedicated teams working in shifts to provide 24/7 support.
A PagerDuty alerting system was implemented to ensure timely responses, featuring a ticket creation system and an escalation policy, enabling teams to address issues within a stringent five-minute window. The live streaming pipeline was monitored continuously to detect and address potential disruptions in real time. Regular reviews and feedback sessions were conducted to fine-tune the existing architecture, enhancing its robustness and reliability.
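To illustrate how an escalation policy can back a five-minute response target, the following is a minimal sketch in Python. It does not reproduce TrackIt's actual PagerDuty configuration: the target names, the escalation delays, and the helper functions are hypothetical and chosen only for illustration. It models which responders have been paged for an incident that remains unacknowledged after a given amount of time.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List


@dataclass(frozen=True)
class EscalationRule:
    target: str        # who gets paged (hypothetical names, not from the case study)
    delay: timedelta   # time after the incident triggers at which this rule fires


# Hypothetical policy: delays are assumptions chosen to fit inside a five-minute window.
POLICY = [
    EscalationRule("primary-on-call", timedelta(minutes=0)),
    EscalationRule("secondary-on-call", timedelta(minutes=2)),
    EscalationRule("engineering-lead", timedelta(minutes=4)),
]

# The five-minute response requirement stated in the case study.
RESPONSE_SLA = timedelta(minutes=5)


def notified_targets(elapsed: timedelta, policy: List[EscalationRule] = POLICY) -> List[str]:
    """Return everyone paged so far for an incident unacknowledged for `elapsed`."""
    return [rule.target for rule in policy if rule.delay <= elapsed]


def sla_breached(elapsed_unacknowledged: timedelta) -> bool:
    """True once an incident has gone unacknowledged for longer than the SLA."""
    return elapsed_unacknowledged > RESPONSE_SLA
```

Under these assumed delays, every level of the policy has been notified before the five-minute window closes, which is the property an escalation policy of this kind is meant to guarantee.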
Special attention was given to the artist workstations, ensuring that the EC2 instances running Windows OS and video editing software operated without interruption. For the media upload pipeline, the SFTP path to the EC2 instance was maintained, and the FSx for OpenZFS file system was kept reliably mounted on that instance.
Outcome
The proactive and responsive approach adopted by TrackIt ensured a successful and uninterrupted experience for the festival. The well-coordinated on-call support provided by the two teams handled every issue that arose within the five-minute response time.
The live streaming pipeline and artist workstations functioned seamlessly throughout the event, allowing artists to focus on their creative work without technical hindrances.
Additionally, TrackIt conducted a Well-Architected Framework Review (WAFR) to help optimize the client’s AWS infrastructure. The review helped bolster security and further optimize costs.
TrackIt managed a high-pressure scenario effectively, with no disruptions to the event. The client expressed satisfaction with the support and improvements, recognizing the value brought by TrackIt’s expertise and dedication.
Metrics
- The five-minute SLA for incident response was consistently met throughout the event.
- System uptime increased by 15%, ensuring no disruptions to the live streaming service.
- Infrastructure costs were reduced by 10% through the implementation of recommendations from the Well-Architected Framework Review (WAFR).
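As a rough illustration of how compliance with a five-minute response SLA could be measured, the sketch below computes the share of incidents acknowledged within the window. The function name and the sample timestamps are made up for this example and are not data from the engagement.

```python
from datetime import datetime, timedelta

# The five-minute response requirement stated in the case study.
RESPONSE_SLA = timedelta(minutes=5)


def sla_compliance(incidents, sla=RESPONSE_SLA):
    """Return the fraction of incidents acknowledged within `sla`.

    `incidents` is an iterable of (triggered_at, acknowledged_at) datetime pairs.
    An empty input is treated as fully compliant, since no incident missed the SLA.
    """
    incidents = list(incidents)
    if not incidents:
        return 1.0
    met = sum(1 for triggered, acked in incidents if acked - triggered <= sla)
    return met / len(incidents)


# Illustrative, made-up timestamps (not real incident data).
sample = [
    (datetime(2024, 1, 1, 10, 0, 0), datetime(2024, 1, 1, 10, 2, 30)),
    (datetime(2024, 1, 1, 14, 15, 0), datetime(2024, 1, 1, 14, 19, 59)),
]

print(f"SLA compliance: {sla_compliance(sample):.0%}")
```

Measuring from trigger time to acknowledgement time is one common reading of "response time"; a team could just as reasonably measure to first action or to resolution, in which case only the second element of each pair changes.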
About TrackIt
TrackIt is an international AWS cloud consulting, systems integration, and software development firm headquartered in Marina del Rey, CA.
We have built our reputation on helping media companies architect and implement cost-effective, reliable, and scalable Media & Entertainment workflows in the cloud. These include streaming and on-demand video solutions, media asset management, and archiving, incorporating the latest AI technology to build bespoke media solutions tailored to customer requirements.
Cloud-native software development is at the foundation of what we do. We specialize in Application Modernization, Containerization, Infrastructure as Code, and event-driven serverless architectures that leverage the latest AWS services. Together with our Managed Services offerings, which provide 24/7 cloud infrastructure maintenance and support, this allows us to provide complete solutions for the media industry.