Why Kafka is such a big deal for those who enjoy Big Data

[This is a free translation of a text published here in Portuguese]

Anyone who is curious about Big Data and data management in general always tries to stay ahead of the new solutions in this world, such as CosmosDB and ElasticSearch, among others. In this post, the focus is on one of these solutions: Apache Kafka.

Kafka – What is it?

Kafka is a distributed streaming platform. And what does that mean? Basically, it’s a streaming platform that runs across a cluster of servers and can grow and shrink according to demand.

What’s the use?

According to the documentation, a streaming platform should be able to do three things:

  1. It allows you to publish and subscribe to a stream of records.
  2. It allows you to store your data “in a fault-tolerant way”, that is, the information is not lost even if a failure happens (for example, a server going down).
  3. It allows you to process a stream of data in real time.
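
To make the three capabilities above a bit more concrete, here is a minimal sketch in Python. It is a toy, in-memory model and not Kafka's actual API: the `ToyLog` class, its `publish` and `poll` methods, and the record strings are all made up for illustration. It only shows the idea of an append-only log where each subscriber keeps its own reading position. It does not model fault tolerance, which in a real cluster comes from copying data across several servers.

```python
from collections import defaultdict


class ToyLog:
    """Hypothetical in-memory stand-in for one stream: an append-only list."""

    def __init__(self):
        self._records = []                # records stay in the log after being read
        self._offsets = defaultdict(int)  # next position to read, per subscriber

    def publish(self, record):
        """Append a record and return its offset (its position in the log)."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, subscriber):
        """Return the records this subscriber has not seen yet."""
        start = self._offsets[subscriber]
        batch = self._records[start:]
        self._offsets[subscriber] = len(self._records)
        return batch


log = ToyLog()
log.publish("page_view:/home")
log.publish("page_view:/pricing")

# Publish and subscribe: two independent subscribers each receive every record.
analytics = log.poll("analytics")
billing = log.poll("billing")

# Processing a stream: new records are handled as they arrive.
log.publish("page_view:/docs")
new_for_analytics = log.poll("analytics")
paths = [record.split(":", 1)[1] for record in analytics + new_for_analytics]
```

Because reading never removes anything from the log, a subscriber that joins late (or falls behind) can still catch up from its own position, which is the main difference from a classic queue where a message disappears once it is consumed.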

And why is Kafka so good? Because it is simple. Kafka was not (and still is not) a giant framework full of commands and functions, but a simple, easy-to-use tool (although there are some very complex concepts to grasp once you dig into it to do more than a few simple tasks).

From my own experience with Kafka, you can learn it and use it in a basic way in one week. Of course, you’re going to make a lot of mistakes and probably create more data than you need xD (as I did while I was learning), but it’s going to be an awesome experiment with a tool that is already a top-level project at Apache.

I hope you enjoyed it! If you have experience with Kafka, share it here! 🙂
