Red Hat Connects Open Source Software-Defined Storage to Hadoop
Red Hat (RHT) wants to integrate open source distributed computing and software-defined storage more tightly than ever. On Monday, Oct. 28, it debuted a plug-in that connects Apache Hadoop to the GlusterFS file system, tying these two major open source platforms even closer together.
The connector, which Red Hat has plainly dubbed the Apache Hadoop Plug-in, allows the seamless transfer of data that lives in GlusterFS distributed software-defined storage clusters to Hadoop, where it can be processed on a distributed computing network. Red Hat announced the plug-in on the first day of the Strata Conference + Hadoop World event in New York, which is taking place this week.
From a technical standpoint, the plug-in promises to reduce the overhead involved in Hadoop and GlusterFS deployments. It makes it possible to feed data directly to Hadoop using a variety of popular protocols, including NFS, SMB and object access through OpenStack Swift, as well as through the Hadoop file system API. It also lets Hadoop users take advantage of the fault-tolerance features in GlusterFS, adding data assurance to the distributed-computing functionality of Hadoop.
From the perspective of the channel, the plug-in is also significant because it knits the open source ecosystem surrounding distributed computing and software-defined storage even more closely together. As massively popular open source technologies that play an important role in building infrastructure for the cloud and Big Data, Hadoop and GlusterFS are key to the business of open source vendors such as Red Hat and its wide network of partners. Making the two platforms easier to connect will make them even more attractive solutions.
It's no surprise, by the way, that Red Hat has chosen to focus on making GlusterFS easier to deploy. The company has been betting on the software-defined storage platform since it acquired Gluster in 2011, and its investment in Hadoop integration should help it compete with other open source vendors, including Canonical, that have invested in the ecosystem surrounding Ceph, an open source distributed storage system that serves as an alternative to GlusterFS.
Red Hat also released code that integrates GlusterFS with the Hadoop distribution from Intel (INTC), which is designed for Big Data analytics. Together with the generic Hadoop plug-in for GlusterFS, the integration with Intel's Hadoop platform gives Red Hat another strength in the data analytics world.