I wanted to use the now generally available Cloud Spanner database to write an app that would track stock prices and social media sentiment to identify potential correlations. To test the validity of this approach, I put together a Go app that subscribes to the Twitter stream for all companies defined in the Stocks table and scores each event against the Google NLP API, comparing user sentiment with the stock ask price from the Yahoo API.
As part of my ramp-up on Google APIs, I wanted to create a project that would give me some practical exercise in the context of a real application. TFeel (short for Twitter Feeling) is a simple sentiment analysis over Twitter data for specific Twitter search terms using Google Cloud services:
I had the opportunity to attend Google Next this year. The week after the event, I joined Google. Here are some quick notes in no particular order: Registration was a pain, with long lines. It was my first tech conference where I had to go through a metal detector.
Data is growing at an exponential pace. Based on recent numbers from IDC, the total amount of data in 2015 (4.4ZB) will grow to 44ZB in 2020. Frankly, how much is in a zettabyte is almost inconsequential. It is the fact that all of the data generated since the beginning of time (at least the electronic part) will grow 10x in just the next four years that’s shocking!
The presentation that goes along with this post is available here. In my last post I went over the value cycle of machine-generated data. In this post, I want to follow up with a few ideas on how to further amplify the value of that data by expanding its context beyond the walls of the owning organization, in a construct we came to know as Data Exchange, and list a few innovation opportunities in each one of these areas.
This is one of those posts you write on your phone while getting sprayed for 11+ hours with bathroom chemicals in a minimally reclining seat in the second-to-last row of a transatlantic flight. Still, I’m going to try to be as constructive as my thumbs allow.
Over the holidays, as many of us do, I embarked on a little extracurricular development effort I called thingz.io. I was driven by a pattern I’d observed in Data Center (DC) monitoring products, although that pattern also exists in many of today’s Internet of Things (IoT) solutions.
As part of my recent solution review, I wanted to compare a few performance metrics specific to multi-node data service deployments on different clouds. This post is about my experience with Google Compute Engine (GCE) as part of that evaluation.
I am excited to share with you today that starting Monday I will be joining the Big Data team at Intel. Yes, Intel. While not traditionally known for its software offerings, Intel has recently entered the Big Data space by introducing their own, 100% open source Hadoop distribution with unique security and monitoring features.
The “high-priests” of Big Data have spoken. Hadoop Distributed File System (HDFS) is now the de facto standard platform for data storage. You may have heard this “heresy” uttered before. But, for me, it wasn’t until the recent Strata conference that I began to really understand how prevalent this opinion actually is.