Building on its investment in Apache Spark, IBM on Tuesday launched the Data Science Experience, a cloud-based, real-time, high-performance analytics development environment. The new open and collaborative environment is designed to make it easier and faster for developers on the IBM Cloud Bluemix platform to embed data and machine learning into cloud applications.
IBM made a $300 million investment in Spark last year to grow the ecosystem of what the company calls the “analytics operating system.” To that end, the company has made contributions to SparkR, SparkSQL, and Apache SparkML to extend Spark to the R statistical programming language. This will give R data scientists faster access to more data, and therefore more immediate and accurate insights, IBM says.
“With Apache Spark, we see an opportunity to significantly transform the role of the data scientist by providing access to curated data sets, open source tools and a collaborative platform to accelerate innovation,” said Bob Picciano, Senior Vice President, IBM Analytics. “IBM’s Data Science Experience is the killer enterprise app for Apache Spark, and gives data scientists new opportunities to deliver insight-driven models to developers, and opens the door for unprecedented innovation from the open source community.”
The Data Science Experience includes 250 curated data sets and open source tools along with the collaborative workspace. Open source resources from H2O, RStudio, Jupyter Notebooks, and Apache Spark are available alongside those from IBM in the secure, managed environment, and the company is continuing to collaborate with Galvanize, H2O.ai, Lightbend, and RStudio “to promote an integrated and unified data science ecosystem,” according to the announcement. IBM also announced it will join the R Consortium to help bring big data to enterprises.
“Just as IBM played a critical role in the development of Computer Science, we can see many similarities today. Computer Science went mainstream with the introduction of the PC,” said Picciano. “With Data Science, the major roadblock is having access to large data sets and having the ability to work with so much data. With today’s announcement, clients can have both.”
Among hyperscale cloud providers, Google launched a pair of big data services last August to appeal to enterprises and the developers building applications for them.