Operationalize Twitter’s Observability Infrastructure
Crest Data offered a comprehensive solution for managing Twitter's Observability and Splunk Infrastructure.
Executive Summary
To enhance visibility and ensure reliability, Crest Data provided a comprehensive solution for managing Twitter's Observability and Splunk Infrastructure.
This solution integrated Splunk and an in-house observability platform to effectively monitor and maintain Twitter's platform performance.
Business Challenge
Manage and maintain 24x7 Splunk operations, including onboarding new data and modifying existing applications.
Effectively manage the massive volume of data ingested into Twitter’s Splunk cluster.
Reduce downtime during updates.
Provide effective on-call support and better handling of high-severity incidents for Splunk and the Observability platform.
Optimize infrastructure while reducing overall costs.
Customer Solution
Twitter's Observability infrastructure, which plays a crucial role in monitoring the platform, comprises Splunk and an in-house observability platform. Crest Data's solution covers the management of both.
To streamline configuration, Puppet handles Splunk Day 0 and Day 1 configurations once the infrastructure is deployed. Airflow automates Day 2 changes, including Splunk upgrades. Parameterized Jenkins pipelines facilitate the deployment of Puppet and Airflow.
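The case study does not describe how the automated upgrades are sequenced. As a minimal sketch of one common way to keep downtime low during a Day 2 change such as an upgrade, the plain-Python example below upgrades hosts in small batches and stops at the first batch that fails a health check. It is not Airflow code and not Crest Data's implementation; the names `rolling_upgrade`, `upgrade`, `healthy`, and the host names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Sequence


@dataclass
class UpgradeResult:
    upgraded: List[str]      # hosts in batches that passed the health check
    failed_batch: List[str]  # the batch that failed, empty if none did


def batches(hosts: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` hosts."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(hosts), size):
        yield list(hosts[start:start + size])


def rolling_upgrade(
    hosts: Sequence[str],
    batch_size: int,
    upgrade: Callable[[str], None],   # hypothetical: performs the upgrade on one host
    healthy: Callable[[str], bool],   # hypothetical: post-upgrade health check
) -> UpgradeResult:
    """Upgrade hosts batch by batch; halt at the first unhealthy batch.

    Only `batch_size` hosts are touched at a time, so the rest of the
    cluster keeps serving while each batch is upgraded and verified.
    """
    upgraded: List[str] = []
    for batch in batches(hosts, batch_size):
        for host in batch:
            upgrade(host)
        if not all(healthy(host) for host in batch):
            # Stop here so remaining hosts stay on the known-good version.
            return UpgradeResult(upgraded, batch)
        upgraded.extend(batch)
    return UpgradeResult(upgraded, [])
```

In an orchestrator such as Airflow, each batch would typically map to a task or task group, with the health check gating the next batch; that mapping is an assumption here, not something the case study states.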
Crest Data handled incidents effectively and leveraged automation to reduce their recurrence and ensure system stability.
Crest Data re-architected the Observability and Splunk infrastructure to minimize costs and maximize ROI.
The Crest Difference
What distinguishes Crest Data is our ability to operate and maintain Twitter's Splunk and Observability infrastructure while leveraging open-source tools like Airflow and automation tools like Puppet. Through this approach, we achieve large-scale automation: managing the environment effectively, preventing configuration deviations, and streamlining the deployment of changes. We also conduct thorough root cause analyses of incidents to promote stability and reduce the overall number of incidents. Leveraging our expertise in Splunk, we re-architected the infrastructure and optimized ROI.