DH Kass, Senior Contributing Blogger

June 17, 2015

IBM Invests $300 Million in Apache Spark for Data Analytics

IBM (IBM) said it will invest some $300 million over the next few years in the Apache Spark open source real-time data analysis platform, in an endorsement that includes contributing software coders, certain technology and education programs.

Calling the University of California, Berkeley-developed Apache Spark the “most important new open source project in a decade that is being defined by data,” IBM said it will embed Spark into its Analytics and Commerce platforms and offer the technology as a cloud service on its Bluemix platform.

“Anyone that’s going to be using data in the future is going to be leveraging Spark,” Rob Thomas, IBM Analytics product development vice president, told the Wall Street Journal.

The vendor also said it will assign some 3,500 IBM researchers and developers to work on Spark-related projects at more than a dozen labs worldwide, donate its SystemML machine learning technology to the Spark open source ecosystem, and educate more than one million data scientists and data engineers on Spark through partnerships with AMPLab, DataCamp, MetiStream, Galvanize and Big Data University MOOC.

IBM also plans to open a Spark Technology Center in San Francisco aimed at the data science and developer community. The center will be staffed initially by 20 people but is expected to grow to as many as 300 over time, according to Beth Smith, IBM Analytics Platform general manager.

“IBM has been a decades-long leader in open source innovation,” Smith said. “We believe strongly in the power of open source as the basis to build value for clients, and are fully committed to Spark as a foundational technology platform for accelerating innovation and driving analytics across every business in a fundamental way.”

“Our clients will benefit as we help them embrace Spark to advance their own data strategies to drive business transformation and competitive differentiation,” said Smith.

Spark was developed at UC Berkeley’s Algorithms, Machines and People Lab (AMPLab).

Ion Stoica, chief executive of Databricks, which was spawned from AMPLab two years ago and offers a cloud service based on Spark, told the New York Times that IBM’s investment is a “great validation for Spark,” adding that the “magnitude is impressive.”

About the Author(s)

DH Kass

Senior Contributing Blogger, The VAR Guy
