
Feb 17 2014

Stream-based real-time architecture against a Lambda architecture

This is less of an informative post than usual. I’m reposting a question I asked on Programmers Stack Exchange, hoping to trigger some interesting discussion.
I’m not an expert in real-time architectures; I’d only like to share a couple of personal considerations and see what others would suggest or point out.

Let’s imagine we’d like to design a real-time analytics system. Following Nathan Marz’s definition of the Lambda architecture, in order to serve the data we would need a batch processing layer (e.g. Hadoop) that continuously recomputes views from a dataset containing all the data, and a so-called speed layer (e.g. Storm) that constantly processes only the most recent part of that data: the events that came in after the last full recomputation of the batch layer. You query the system by merging the results of the two layers.
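
To make the query-time merge concrete, here is a minimal sketch in plain Python. It is not how Hadoop or Storm would be wired up; it only mimics the two layers in memory. The event shape, the `cutoff` timestamp and the function names (`batch_view`, `speed_view`, `query`) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical events as (timestamp, key) pairs; the data is made up.
events = [
    (1, "page_a"), (2, "page_b"), (3, "page_a"),
    (4, "page_a"), (5, "page_b"),
]

def batch_view(all_events, cutoff):
    """Batch layer stand-in: recompute counts from scratch over all
    events up to and including the last recomputation time."""
    return Counter(key for ts, key in all_events if ts <= cutoff)

def speed_view(all_events, cutoff):
    """Speed layer stand-in: count only the events that arrived after
    the last full batch recomputation."""
    return Counter(key for ts, key in all_events if ts > cutoff)

def query(key, batch, speed):
    """Serve a query by merging the results of the two layers."""
    return batch[key] + speed[key]

cutoff = 3  # assumed time of the last full batch recomputation
batch = batch_view(events, cutoff)
speed = speed_view(events, cutoff)

page_a_total = query("page_a", batch, speed)
page_b_total = query("page_b", batch, speed)
```

Counts are used here because they merge by simple addition. A view that is not additive (distinct users, for instance) would need a different merge step, which is one of the places where the two-layer design gets more demanding.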

