Databricks: Splunk Integration for Security Use Cases
Crest developed Databricks notebooks to collect and parse AWS CloudTrail, AWS VPC logs, and Syslog data from S3 buckets into the Databricks environment for further processing.
Executive Summary
Databricks customers previously needed to manually configure S3 bucket integrations and create Databricks notebooks from scratch to collect and parse data into the Databricks environment. The Databricks notebooks Crest created simplify this operation and reduce the manual tasks customers must perform to configure data collection and parsing. The Splunk app lets customers run Databricks queries and jobs directly from Splunk search, reducing their dependency on access to a Databricks instance.
Business Challenge
Databricks customers previously needed to manually configure S3 bucket integrations and create Databricks notebooks from scratch to collect and parse AWS CloudTrail, AWS VPC logs, and Syslog data into the Databricks environment for further processing and analytics. Customers also had to manually create jobs and queries from the Databricks UI in order to run them.
Customer Solution
Crest Data wrote Databricks notebooks to collect and parse AWS CloudTrail, AWS VPC logs, and Syslog data from S3 buckets into the Databricks environment for further processing. Crest also created notebooks to push specified data from Databricks into Splunk for ingestion, and to pull data from Splunk into tables in the Databricks environment. In addition, Crest helped build a Splunk app for Databricks that lets Splunk admins query Databricks tables and execute Databricks jobs and notebooks using custom commands from Splunk.
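For illustration, here is a minimal sketch of the collection step as it might look inside a Databricks notebook (where spark is predefined). The bucket path, glob pattern, and table name are placeholder assumptions rather than the actual notebook code; CloudTrail delivers gzipped JSON files whose top-level "Records" array holds the individual events.

    from pyspark.sql.functions import explode, col

    # Read CloudTrail's gzipped JSON delivery files from S3 (placeholder bucket and path).
    raw = spark.read.json(
        "s3://example-cloudtrail-bucket/AWSLogs/*/CloudTrail/*/*/*/*/*.json.gz"
    )

    # Flatten the top-level "Records" array so each CloudTrail event becomes one row.
    events = raw.select(explode(col("Records")).alias("record")).select("record.*")

    # Persist the parsed events as a table for downstream processing and analytics.
    events.write.format("delta").mode("append").saveAsTable("security.cloudtrail_events")

The push direction is not detailed in this case study; one common mechanism is Splunk's HTTP Event Collector (HEC), sketched below under that assumption. The host, token, sourcetype, and table name are placeholders.

    import json
    import requests

    HEC_URL = "https://splunk.example.com:8088/services/collector/event"  # placeholder host
    HEC_TOKEN = "00000000-0000-0000-0000-000000000000"  # placeholder HEC token

    def push_to_splunk(rows, sourcetype="databricks:cloudtrail"):
        # Send each row to Splunk as one HEC event.
        headers = {"Authorization": "Splunk " + HEC_TOKEN}
        for row in rows:
            payload = {"event": row, "sourcetype": sourcetype}
            resp = requests.post(HEC_URL, headers=headers, data=json.dumps(payload, default=str))
            resp.raise_for_status()

    # Example: push a small slice of the parsed CloudTrail table.
    rows = [r.asDict(recursive=True)
            for r in spark.table("security.cloudtrail_events").limit(100).collect()]
    push_to_splunk(rows)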
The following custom commands were implemented:
databricksquery: query data present in a Databricks table from Splunk.
databricksrun: submit a one-time notebook run without creating a job on Databricks.
databricksjob: trigger an already created Databricks job from Splunk.
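For illustration, the commands would be invoked from the Splunk search bar along these lines; the argument names shown (cluster, query, notebook_path, job_id) and the values are assumptions about the app's syntax, not details confirmed by this case study.

    | databricksquery cluster="security-cluster" query="SELECT * FROM security.cloudtrail_events LIMIT 10"
    | databricksrun notebook_path="/Shared/parse_vpc_logs" cluster="security-cluster"
    | databricksjob job_id=42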
The Crest Difference
The Databricks notebooks reduced the manual effort of collecting and parsing data from S3 buckets.
The custom commands in Splunk made it possible to run queries and jobs directly from Splunk, reducing the dependency on access to a Databricks instance.