OpenStack service controllers produce large amounts of log data, and processing these logs can be a time-consuming and difficult task. Data processing applications excel at automating these types of tasks and at revealing underlying trends in the data produced.
We will discuss how to configure your stack to produce log data that can be consumed and analyzed in real time by Spark applications running on the OpenStack Data Processing service (sahara). These techniques can be used to inspect the current status and health of your services, shortening the time between detecting a problem and resolving it. This also gives developers a platform for exploring new transformations of log data that support use cases such as failure analysis, performance enhancement, and fraudulent activity detection.
This talk will cover a broad range of OpenStack and Apache technologies used in the development of this solution. We will discuss the following as they pertain to the greater problem space:
- Sending service log data to Zaqar and Manila
- Launching Spark clusters with Sahara
- Visualizing and experimenting on data with Zeppelin
- Creating Spark jobs to deploy through Sahara
- Storing processed data in Trove databases
There are many ways to improve log processing in OpenStack. Attendees should expect to learn how they can improve the stability and performance of their stacks through the advanced techniques discussed in this session.