[This is a free translation of a text published here in Portuguese]
Anyone curious about Big Data and data management in general always tries to keep up with new solutions in this world, such as CosmosDB and ElasticSearch, among others. In this post, the focus is on one of these new solutions: Apache Kafka.
Kafka – What is it?
Kafka is a distributed streaming platform. And what does that mean? Basically, it’s a streaming platform that can grow and shrink according to demand.
What’s the use?
A streaming platform, as defined in the documentation, should be able to do basically three things:
- It allows you to publish and subscribe to a stream of records.
- It allows you to save your data “in a fault-tolerant way”, that is, if a failure happens, the information is not lost.
- It lets you process a stream of data in real time.
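To make these three points a bit more concrete, here is a minimal sketch in plain Python. It is not Kafka's API and it does not talk to a real broker: `ToyTopic`, `ToySubscriber` and `count_words` are hypothetical names made up for this illustration, and the "replicas" are just in-memory lists. The idea is only to show publish/subscribe, keeping data after a failure, and processing records as they arrive.

```python
from collections import defaultdict


class ToyTopic:
    """Toy in-memory stand-in for a topic (NOT Kafka's API)."""

    def __init__(self, replicas=2):
        # Each replica holds a full copy of the append-only log.
        self.replicas = [[] for _ in range(replicas)]

    def publish(self, record):
        # Append the record to every replica that is still alive.
        alive = [log for log in self.replicas if log is not None]
        if not alive:
            raise RuntimeError("all replicas lost")
        for log in alive:
            log.append(record)
        return len(alive[0]) - 1  # offset of the new record

    def fail_replica(self, index):
        # Simulate losing one copy of the data.
        self.replicas[index] = None

    def read(self, offset):
        # Serve reads from the first replica that is still alive.
        for log in self.replicas:
            if log is not None:
                return log[offset:]
        raise RuntimeError("all replicas lost")


class ToySubscriber:
    """Remembers its own offset, so each record is read only once."""

    def __init__(self, topic):
        self.topic = topic
        self.offset = 0

    def poll(self):
        records = self.topic.read(self.offset)
        self.offset += len(records)
        return records


def count_words(records, counts):
    # Process the records one by one as they arrive.
    for record in records:
        for word in record.split():
            counts[word] += 1
    return counts


topic = ToyTopic(replicas=2)
subscriber = ToySubscriber(topic)
counts = defaultdict(int)

topic.publish("hello kafka")
first_batch = subscriber.poll()
count_words(first_batch, counts)

topic.fail_replica(0)           # one copy is lost...
topic.publish("hello streams")  # ...but publishing still works
second_batch = subscriber.poll()
count_words(second_batch, counts)

print(dict(counts))
```

In a real cluster these jobs are done by brokers, partitions and client libraries, but the shape is the same: producers append to a log, consumers read from their own position, and copies of the log survive when a machine fails.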
And why is Kafka so good? Because it is simple. Kafka did not start out as a giant framework full of commands/functions (and still is not one), but as a simple and easy-to-use tool (although there are some very complex concepts to grasp once you dig into it to do more than a few simple tasks).
From my own experience with Kafka, you can learn it and use it in a basic way in one week. Of course, you’re going to make a lot of mistakes and probably create more data than you need xD (as I did while I was learning), but it’s going to be an awesome experiment with a tool that is already a top-level project at Apache.
I hope you enjoyed it! If you have experience with Kafka, share it here! 🙂