David is Coding

Just another techie blog

Real-Time Twitter Analysis 4: Displaying the Results

In the previous post, we processed a stream of tweets in real time with Spark Streaming in order to calculate some information such as top lists and counters. Now it is time to display this data in a form that is easier for humans to consume. Throughout this post, we’ll create a simple web-based dashboard by using […]

Real-Time Twitter Analysis 3: Tweet Analysis on Spark

We already have a Twitter stream ingested into our cluster using Flume and Kafka, as described in my previous post. The next step is to process and analyze tweets taken from a Kafka topic with Apache Spark Streaming. Our goal here is to make some calculations on top of the received tweets in order […]

Real-Time Twitter Analysis 1: Introduction

After setting up Cloudera’s Quickstart VM, as described in my previous post, it’s time to get some hands-on experience with data engineering. For this purpose, I opted to perform a real-time sentiment analysis on this social network. The idea is to put into play different tools and skills I acquired during the Big Data […]