Saturday 31 January 2015

Hadoop Introduction


Today, many organizations are facing the Big Data problem. Managing and processing Big Data can incur a lot of challenges for traditional data processing platforms such as relational database systems. Hadoop was designed to be a distributed and scalable system for dealing with Big Data problems.

The design, implementation, and deployment of a Big Data platform require a clear definition of the Big Data problem by system architects and administrators. A Hadoop-based Big Data platform uses Hadoop as the data storage and processing engine.

Apache Hadoop is a free and open source implementation of frameworks for reliable, scalable, distributed computing and data storage. It enables applications to work with thousands of nodes and petabytes of data, and as such is a great tool for research and business operations. Hadoop was inspired by Google’s MapReduce and Google File System papers.

No comments:

Post a Comment