Intel doubles down on open, easier-to-use, and performance-optimized Hadoop

Over eight months ago, I joined Intel to work on their next-generation data analytics platform. In large part, my decision was based on Intel’s desire to address the “voodoo magic” in Big Data: the complexities that demand deep technical skills and keep domain experts from gaining access to large volumes of data. The idea was that by leveraging the distributed data processing capabilities of Apache Hadoop, and combining them with Intel’s breadth of infrastructure experience, we could make Big Data analytics more accessible and therefore more prevalent.

Last week, Intel demonstrated just how serious it is about this vision by announcing a strategic partnership with Cloudera, the largest distributor of Hadoop.

Much has already been written about this partnership. To me, this investment, Intel’s single largest in data center technology, demonstrates the level of Intel’s commitment to deliver on the promise of an open, performance-optimized platform for big data analytics. As part of this deal, Cloudera will optimize its software to take greater advantage of the features found in Intel processors, which already power the majority of data centers.

One of the fastest-growing areas and biggest opportunities for Hadoop optimization is the Internet of Things (IoT). Whether it is edge signal aggregation, stream processing, or scalable storage later, the use of Hadoop in IoT currently demands a substantial layer of specialized code. The problem is that this software layer is too complex to develop. Intel’s collaboration with Cloudera will greatly simplify the analysis of machine-generated data and become an intrinsic part of the next-generation IoT analytics platform.

The wider Hadoop ecosystem will benefit too.

The leaders of both companies are already talking about a two-year optimization roadmap and their commitment to release these enhancements upstream to the open source community.

By making this platform generally available, Intel will ensure that in the near future, you will be able to build innovative data solutions that are less expensive and easier to implement, while still benefiting from rapid performance improvements in Hadoop.