Cloudera Search Adds Search Engine Ease to Data on HDFS and Apache HBase

Cloudera, a solutions provider for analytic data management powered by Apache Hadoop, has announced the public beta of Cloudera Search, the company's fully integrated search engine for interactive exploration of data stored in the Hadoop Distributed File System (HDFS) and Apache HBase.

CJ Arlotta, Associate Editor

June 5, 2013

Cloudera CEO Mike Olson said the company is "bringing the band back together."

Cloudera, a provider of analytic data management powered by Apache Hadoop, has announced the public beta of Cloudera Search, its first fully integrated search engine for interactive exploration of data stored in the Hadoop Distributed File System (HDFS) and Apache HBase. The new solution is designed to simplify Big Data systems by giving non-technical users an easier way to search for data stored in Hadoop, the company said. What else is provided in Cloudera Search? We’ll provide the details.

Cloudera Search includes Apache Solr and other search-related open source projects to support a comprehensive big data infrastructure. Besides being easier for non-techies to use, the simplified system can reduce the cost of maintaining the many separate systems businesses may use to execute queries.

Cloudera CEO Mike Olson said in his prepared remarks that the idea behind Cloudera Search was to enable those without any programming expertise to obtain the insight they need from their assets.

According to Cloudera, the solution offers the following key features:

  • Scalable, reliable index storage in HDFS: integrates index storage and serving directly into HDFS;

  • Batch indexing via MapReduce: creates indexes for data stored in HDFS and HBase with the scalability and robustness of MapReduce;

  • Real-time indexing at collection: makes an event searchable as it is stored into Hadoop through near real-time indexing features powered by Apache Flume;

  • Easy interaction and data exploration via Cloudera Hue: provides a plug-in application for Hue and easy-to-install capabilities for standard Hue servers to query data and view result files, and enables faceted exploration;

  • Simplified field extraction and cross-platform data processing: allows for quick and easy field extraction of any data stored in HDFS using optimized Hadoop file formats such as Apache Avro and promotes reusable configurations and processing activities with the new processing framework, Cloudera Morphlines; and

  • Unified management and monitoring with Cloudera Manager: provides a centralized management and monitoring experience that makes it as easy to deploy, configure and monitor search services as it is to manage CDH deployments and other services on the Hadoop cluster.

“Year after year we continue to push the boundaries of what is possible with Hadoop; we have the best minds in data management focused on advancing business transformation,” Olson said.

Cloudera Search is immediately available for Cloudera Enterprise subscribers as a supplemental module.

TechNavio reported in April 2013 that the global Hadoop-as-a-service (HaaS) market will grow tenfold over the next few years and top $1.9 billion in 2016.

About the Author(s)

CJ Arlotta

Associate Editor, Nine Lives Media, a division of Penton Media
