Over the last decade, access to best-of-breed data technologies has become easier, mainly because of the increasing popularity of open source software (OSS). While this holds true in other areas like operating systems, application servers, development frameworks or even monitoring tools, it is perhaps most prevalent in the area of data.
Over eight months ago, I joined Intel to work on their next-generation data analytics platform. In large part, my decision was based on Intel’s desire to address the “voodoo magic” in Big Data: the complexities that require deep technical skills and prevent domain experts from gaining access to large volumes of data.
Last week I had a chance to attend the 3rd AWS re:Invent conference in Vegas. I’m not a big fan of that city myself, but, as in previous years, re:Invent did not disappoint. Much coverage has been dedicated to the newly introduced services; I won’t bore you with that.
After a pretty positive experience with InfluxDB, I wanted to create a super simple telemetry producer (this one in Node.js) to spotlight a few types of time series queries supported in InfluxDB. (Source code is available on GitHub.) To get live data for this demo, I created a simple script that generates metric data for CPU Utilization and Free Memory on your local machine at 1-second resolution.
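A producer of this kind could be sketched roughly as follows. This is not the code from the GitHub repository: the function names (`cpuSnapshot`, `cpuUtilization`, `startProducer`) and the metric field names are assumptions made for illustration, and the `emit` callback stands in for whatever actually writes points to InfluxDB. It relies only on Node's built-in `os` module.

```javascript
const os = require("os");

// Sum idle and total CPU ticks across all cores.
// `cpus` defaults to the live reading but can be injected for testing.
function cpuSnapshot(cpus = os.cpus()) {
  let idle = 0;
  let total = 0;
  for (const cpu of cpus) {
    for (const ticks of Object.values(cpu.times)) {
      total += ticks;
    }
    idle += cpu.times.idle;
  }
  return { idle, total };
}

// Percentage of non-idle CPU time between two snapshots.
// Returns 0 when no ticks have elapsed.
function cpuUtilization(prev, curr) {
  const totalDelta = curr.total - prev.total;
  if (totalDelta <= 0) return 0;
  const idleDelta = curr.idle - prev.idle;
  return 100 * (1 - idleDelta / totalDelta);
}

// Call `emit` with one metric point every `intervalMs` milliseconds
// (hypothetical point shape; adapt it to your database client).
function startProducer(emit, intervalMs = 1000) {
  let prev = cpuSnapshot();
  return setInterval(() => {
    const curr = cpuSnapshot();
    emit({
      time: Date.now(),
      cpu_utilization: cpuUtilization(prev, curr),
      free_memory: os.freemem(),
    });
    prev = curr;
  }, intervalMs);
}
```

To try it locally, something like `const timer = startProducer(console.log);` prints one point per second, and `clearInterval(timer)` stops it. CPU utilization is computed from the difference between consecutive snapshots because `os.cpus()` reports cumulative ticks since boot, not an instantaneous value.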
Data science is not a new discipline. In some sense it even pre-dates application development. You wouldn’t know that by looking at the average enterprise nowadays. The popular perception is that enterprises today struggle to deliver any tangible value from Big Data.
Seems like every week we hear about yet another new open source Machine/Deep Learning library or Analytical Framework. Talking to people at Strata this week only confirmed for me that, in the midst of what can only be described as a virtual gluttony of open-source software, there is a massive number of organizations that find it increasingly hard to implement these technologies.
As a long-term Cloud storage user, I recently wanted to re-evaluate my options. New content management providers had become available and I wanted to make sure I wasn’t missing out on the shiny new tech out there. As I was considering the pros and cons of each option, I realized how much my personal attitude towards cloud data storage had shifted over the last few years.
About a year and a half ago, I wrote about Big Data Opportunities, focusing primarily on Leveraging Unused Data, Driving New Value From Summary Data and Harnessing Heterogeneous Data Sensors (more recently known as the Internet of Things). Since that post, the data space has exploded with numerous solutions addressing many of these areas.
In general, Platform as a Service (PaaS) is developed by developers for developers. Of course they’re going to love it. It enables them to focus on the nuances of their applications, not on the pointless day-to-day activities that so often take their time away from solving real problems.