Real-time Threat Detection Using Machine Learning and Apache Kafka
Can we use machine learning to detect security threats in real time? As organizations increasingly rely on distributed systems, it is becoming more important to quickly analyze the traffic that passes through them. Confluent Hackathon ’22 finalist Géraud Dugé de Bernonville (Data Consultant, Zenika Bordeaux) shares how his team used TensorFlow (machine learning) and Neo4j (graph database) to analyze network traffic data and detect threats in real time. What started as a research and development exercise turned into ZIEM, a full-blown internal project using ksqlDB to manipulate, export, and visualize data from Apache Kafka®.
Géraud and his team noticed that large amounts of data passed through their network, and they were curious to see if they could detect threats as they happened. As a hackathon project, they built ZIEM, a network mapping and intrusion detection platform that quickly generates network diagrams. The system captures network packets into Kafka, processes the data in ksqlDB, and uses a Neo4j Sink Connector to send it to a Neo4j instance. In the Neo4j Browser, users can see instant network diagrams showing who's on the network, which lets them spot anomalies in real time.
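To make the network-mapping step concrete, here is a minimal Python sketch of the kind of aggregation described above: collapsing individual packet records into one weighted edge per source and destination pair, which is the shape a graph database needs to draw a network diagram. It is not ZIEM's code. The `Packet` fields, `build_edges`, and `hosts` are hypothetical names chosen for illustration, and in the real pipeline this work is done in ksqlDB and Neo4j rather than in Python.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet summary; ZIEM's actual schema may differ."""
    src_ip: str
    dst_ip: str
    size_bytes: int


def build_edges(packets: Iterable[Packet]) -> dict:
    """Collapse packets into one weighted edge per (src_ip, dst_ip) pair."""
    counts = Counter()
    volume = Counter()
    for p in packets:
        key = (p.src_ip, p.dst_ip)
        counts[key] += 1             # number of packets seen on this edge
        volume[key] += p.size_bytes  # total bytes carried on this edge
    return {k: {"packets": counts[k], "bytes": volume[k]} for k in counts}


def hosts(edges: dict) -> set:
    """Every address that appears as a source or a destination."""
    return {ip for pair in edges for ip in pair}


# Made-up sample traffic, for illustration only.
sample = [
    Packet("10.0.0.1", "10.0.0.2", 120),
    Packet("10.0.0.1", "10.0.0.2", 80),
    Packet("10.0.0.3", "10.0.0.2", 40),
]
edges = build_edges(sample)
```

Each key in `edges` corresponds to a relationship between two host nodes, and `hosts(edges)` gives the list of who is on the network. An address that was not there a moment ago, or an edge whose byte count jumps, is the kind of change a viewer would look for in the diagram.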
The Ziem project was initially conceived as an experiment to explore the potential of using Kafka for data processing and manipulation. However, it soon became apparent that there was great potential for broader applications (banking, security, etc.). As a result, the focus shifted to developing a tool for exporting data from Kafka, which is helpful in transforming data for deeper analysis, moving it from one database to another, or creating powerful visualizations.
Géraud goes on to talk about how the success of this project has helped them better understand the potential of using Kafka for data processing. Zenika plans to continue working to build a pipeline that can handle more robust visualizations, expose more learning opportunities, and detect patterns.
EPISODE LINKS
- Ziem Project on GitHub
- ksqlDB 101 course
- ksqlDB Fundamentals: How Apache Kafka, SQL, and ksqlDB Work together ft. Simon Aubury
- Real-Time Stream Processing, Monitoring, and Analytics with Apache Kafka
- Application Data Streaming with Apache Kafka and Swim
- Watch the video version of this podcast
- Kris Jenkins’ Twitter
- Streaming Audio Playlist
- Join the Confluent Community
- Learn more with Kafka tutorials, resources, and guides at Confluent Developer
- Live demo: Intro to Event-Driven Microservices with Confluent
- Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)
Chapters
1. Intro (00:00:00)
2. What is the Ziem Project? (00:05:03)
3. How do you use ksqlDB? (00:12:15)
4. Creating network visualizations with Neo4j and Neovis.js (00:14:00)
5. Machine learning plans with Ziem (00:17:50)
6. Supervised vs. non-supervised machine learning (00:20:07)
7. Future use cases for Ziem (00:22:30)
8. How to get started with TensorFlow (00:24:29)
9. It's a wrap! (00:27:33)