Apache Hadoop 3 has been released

In late 2017, the long-awaited Apache Hadoop 3.0.0 was released: a new version of the Hadoop framework designed to process large amounts of both structured and unstructured data. Our big data solutions specialist Johnson Darkwah explains what is new about it and why we can't wait to start working with it.

22 January 2018

Author: Gauss Algorithmic
Category: Blog

Why is it so important that the new version has been released?

Apache Hadoop 3.0.0 is a brand-new release; commercial distributions should start shipping it around the middle of 2018. Hadoop is a technology that has strongly influenced the way we work with big data, and now its new version, with crucial updates, is out.

Which new feature do you consider the most important?

I think the most important new feature is HDFS Erasure Coding, which will cut disk space usage by about 50%. That means one very important thing: lower data storage costs.
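For the curious, here is a back-of-the-envelope look at where that figure comes from. This is only a rough sketch, assuming the traditional 3x HDFS replication factor and a Reed-Solomon RS-6-3 erasure coding policy (6 data blocks plus 3 parity blocks, one of the policies available in Hadoop 3); the exact saving depends on the chosen policy and the data.

```python
# Rough sketch of the "about 50%" saving: 3x replication vs. RS-6-3 erasure coding.

logical_data_tb = 100.0  # hypothetical amount of user data

# 3-way replication: every block is stored three times.
replicated_raw_tb = logical_data_tb * 3               # 300 TB on disk

# RS-6-3 erasure coding: 6 data blocks carry 3 extra parity blocks (1.5x overhead).
erasure_coded_raw_tb = logical_data_tb * (6 + 3) / 6  # 150 TB on disk

savings = 1 - erasure_coded_raw_tb / replicated_raw_tb
print(f"Raw storage with 3x replication: {replicated_raw_tb:.0f} TB")
print(f"Raw storage with RS-6-3 coding:  {erasure_coded_raw_tb:.0f} TB")
print(f"Disk space saved:                {savings:.0%}")  # -> 50%
```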


So is it primarily about higher speed and lower costs?

In recent years, Hadoop has typically been installed as on-premises software in customers' data centers, so it was necessary to update Hadoop and push it forward. Storing data in certain cloud storage services now offers a better cost-to-performance ratio, so it is only logical that Hadoop aimed at cutting down its initial and operating expenses.


What other new features does the new version bring?

The changes in YARN are important for us, specifically the support for Docker containers, which will hopefully enable faster development on Hadoop.


What does the new version bring to customers?

Faster, cheaper and more secure data processing on all levels. Customers should know that at Gauss Algorithmic we are closely monitoring the development of the new Hadoop release, and once we consider the system mature enough, we will add it to our portfolio and start offering it as part of our data lake. We have a thing for Hadoop: we have been using it in most of our solutions, and our Czech and Slovak customers will be the first to be offered version 3.


What role does Cloudera play here?

Although Hadoop is an open-source technology, Cloudera strongly influences the direction of its development. Moreover, Doug Cutting, one of the founders of Hadoop and a board member of the Apache Software Foundation, works for Cloudera. Gauss Algorithmic is also Cloudera's official partner for Central and Eastern Europe, which enables us to deliver robust solutions in a very short time.


More information about the new version can be found on the Cloudera blog or on the Apache Hadoop website.


Are you interested in big data solutions, Hadoop, our collaboration with Cloudera or anything else?

Contact Us – Gauss Algorithmic
