Mark Chmarny

How Lufthansa has made me question the value of Star Alliance

This is one of those posts you write on your phone while getting sprayed for 11+ hours with bathroom chemicals in a minimally reclining seat in the second-to-last row of a transatlantic flight. Still, I’m going to try to be as constructive as my thumbs allow. ...

Vision of smarter thingz - project in adaptive metric flow modeling

Over the holidays, as many of us do, I embarked on a little extracurricular development effort I called thingz.io. I was driven by a pattern I’d observed in Data Center (DC) monitoring products, a pattern that also exists in many of today’s Internet of Things (IoT) solutions. ...

My experience with Google Compute Engine

As part of my recent solution review, I wanted to compare a few performance metrics specific to multi-node data service deployment on different clouds. This post is about my experience with Google Compute Engine (GCE) as part of that evaluation. ...

Thinking Big about Data at Intel

I am excited to share with you today that starting Monday I will be joining the Big Data team at Intel. Yes, Intel. While not traditionally known for its software offerings, Intel has recently entered the Big Data space by introducing their own, 100% open source Hadoop distribution with unique security and monitoring features. ...

HDFS has won, now de facto standard for centralized data storage

The “high-priests” of Big Data have spoken. Hadoop Distributed File System (HDFS) is now the de facto standard platform for data storage. You may have heard this “heresy” uttered before. But, for me, it wasn’t until the recent Strata conference that I began to really understand how prevalent this opinion actually is. ...

Don't use yesterday's database to develop tomorrow's solutions

We are in the midst of a drastic shift in the application development landscape. Developers entering the market today use different tools and follow different patterns. One of the core patterns of online application development today is cloud-scale design. While architectures traditionally relied on more powerful servers, today that approach simply does not scale. We have reached the point where, in many cases, no servers are powerful enough, or their cost would be prohibitive. Given unpredictable usage patterns, today’s online applications must also be flexible enough to absorb demand spikes and run efficiently during periods of low utilization. ...

Data-related investments shift from tech to skills — talent new differentiator

Over the last decade, access to best-of-breed data technologies has become easier. This is due mainly to the increasing popularity of open source software (OSS). While this phenomenon holds true in other areas like operating systems, application servers, development frameworks or even monitoring tools, it is perhaps most prevalent in the area of data. ...

Intel doubles down on open, easier to use, and performance-optimized Hadoop

Over eight months ago, I joined Intel to work on their next-generation data analytics platform. In large part, my decision was based on Intel’s desire to address the “voodoo magic” in Big Data: the complexities that require deep technical skills and prevent domain experts from gaining access to large volumes of data. The idea was that by leveraging the distributed data processing capabilities of Apache Hadoop, and combining them with Intel’s breadth of infrastructure experience, we could make Big Data analytics more accessible and therefore more prevalent. ...

Smaller, single-purpose, atomic functions core ingredient of tomorrow’s computing

Last week I had a chance to attend the 3rd AWS re:Invent conference in Vegas. I’m not a big fan of that city myself, but, as in previous years, re:Invent did not disappoint. Much coverage has been dedicated to the newly introduced services; I won’t bore you with that. Instead, I wanted to share with you a few higher-level thoughts I captured at the event. ...

Time series data management using InfluxDB

After a pretty positive experience with InfluxDB, I wanted to create a super simple telemetry producer (this one in Node.js) to spotlight a few types of time series queries supported in InfluxDB. (Source code available on GitHub.) To get live data for this demo, I created a simple script that generates metric data for CPU Utilization and Free Memory on your local machine at 1-second resolution. ...
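
To give a rough idea of what such a producer might look like, here is a minimal sketch in Node.js. It is not the script from the GitHub repository: the function names (`cpuTotals`, `cpuUtilization`, `sample`, `startProducer`), the metric names, and the shape of each point are assumptions made for illustration, and the sketch only collects samples rather than writing them to InfluxDB. It relies solely on Node's built-in `os` module.

```javascript
const os = require("os");

// Sum idle and total CPU ticks across all cores from an os.cpus() snapshot.
function cpuTotals(cpus) {
  let idle = 0;
  let total = 0;
  for (const cpu of cpus) {
    for (const key of Object.keys(cpu.times)) {
      total += cpu.times[key];
    }
    idle += cpu.times.idle;
  }
  return { idle, total };
}

// CPU utilization (0-100) between two snapshots; 0 if no ticks elapsed.
function cpuUtilization(prev, next) {
  const totalDelta = next.total - prev.total;
  if (totalDelta <= 0) return 0;
  const idleDelta = next.idle - prev.idle;
  return 100 * (1 - idleDelta / totalDelta);
}

// Build one pair of points. The point shape is hypothetical,
// not InfluxDB's wire format.
function sample(prev, next, now) {
  return [
    { metric: "cpu_utilization", time: now, value: cpuUtilization(prev, next) },
    { metric: "free_memory", time: now, value: os.freemem() },
  ];
}

// Call emit(points) every intervalMs (1 second by default).
// Returns the timer so the caller can stop it with clearInterval.
function startProducer(emit, intervalMs = 1000) {
  let prev = cpuTotals(os.cpus());
  return setInterval(() => {
    const next = cpuTotals(os.cpus());
    emit(sample(prev, next, Date.now()));
    prev = next;
  }, intervalMs);
}
```

Calling `startProducer(points => console.log(JSON.stringify(points)))` would print one pair of points per second. Swapping the callback for a function that writes to a database is where an InfluxDB client would plug in.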