Open Source Ceph FS Popular in the Cloud, Big Data
Ceph, the open source distributed file system, has only recently become ready for production deployment. Thanks to a "Ceph Census" conducted a couple of weeks ago, however, the channel now has a concrete view of just how many organizations are using Ceph, and how they're doing so. And the results suggest that, while Ceph does not dominate the market just yet, it is becoming an increasingly important part of the distributed-computing world.
The census results, which the Ceph team released to the public March 1, are based on voluntary responses to a survey of Ceph developers and users conducted Feb. 13-18, 2013. The most important findings include:
- Twenty-six percent of survey respondents are currently using Ceph in production environments, while 44 percent are evaluating it for potential production use and another 30 percent are running Ceph in pre-production. It seems clear, then, that even within the Ceph community, users are still learning how best to take advantage of the platform, but expanded production deployment appears to be on the way.
- The most popular use case for Ceph was private clouds (37 percent), followed by backup (20 percent), public clouds (16 percent) and Big Data (14 percent). That surprises me a bit, since I would have expected Big Data deployments, where Ceph offers some technical advantages over the traditional Hadoop Distributed File System (HDFS), to account for a larger slice of the pie. But as Ceph gains greater exposure, it likely will cut into HDFS's footprint in the Big Data scene. (Side note: It's also difficult to say, based on the survey results, what exactly Ceph is being used for in private and public clouds, whose purposes can vary widely.)
- OpenStack is the most popular cloud stack in which Ceph is deployed, although other platforms, including Proxmox and Apache CloudStack, are in use as well within the Ceph community. In any case, OpenStack's dominance is not surprising, since it is rapidly emerging as the open source channel's preferred cloud stack.
As a rough indication of Ceph's overall market share, the census also found that it is being used in 21 production clusters, which account for a total raw storage capacity of 1,154 TB. Since the survey was voluntary and was conducted only within Ceph community channels, however, it's hard to extrapolate from that figure to say exactly where Ceph ranks among other distributed file systems.
Still, all available indicators suggest that Ceph's popularity is established and growing. That's good news for Inktank, the company founded in 2011 to provide support services related to Ceph. It also bodes well for the open source community in general, for which Ceph constitutes another rapidly evolving tool for building flexible Big Data and cloud infrastructures based on open source technology.