How to Learn Hadoop

Big data is one of the hottest areas in IT, so everyone wants to learn Hadoop. I am constantly being asked how to…

Hadoop Distributions

Below are the companies that offer commercial implementations of, and/or support for, Apache Hadoop, which is the base for all of them. Cloudera offers CDH (Cloudera's Distribution including Apache Hadoop) and…

Hadoop Commands

hadoop command [genericOptions] [commandOptions]

hadoop fs
Usage: java FsShell
  [-ls <path>]
  [-lsr <path>]
  [-df [<path>]]
  [-du <path>]
  [-dus <path>]
  [-count[-q] <path>]
  [-mv <src> <dst>]
  [-cp <src> <dst>]
  [-rm [-skipTrash] <path>]…
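As a rough illustration of how a few of the fs options listed above fit together, here is a minimal sketch. The HDFS paths and the `run` helper are hypothetical and exist only for this example. By default the script only prints each command; it executes them only when `DRY_RUN=0` is set and a `hadoop` binary is found on the PATH.

```shell
#!/bin/sh
# Hypothetical HDFS file paths, used only for illustration.
SRC=/user/demo/input.txt
DST=/user/demo/input-copy.txt

# Hypothetical helper: print the command, and run it only when
# DRY_RUN=0 and a hadoop binary is available on the PATH.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ] && command -v hadoop >/dev/null 2>&1; then
        "$@"
    fi
}

run hadoop fs -ls "$SRC"                 # list the file
run hadoop fs -du "$SRC"                 # show its size
run hadoop fs -cp "$SRC" "$DST"          # copy it within HDFS
run hadoop fs -mv "$DST" "$DST.old"      # rename the copy
run hadoop fs -rm -skipTrash "$DST.old"  # delete the copy, bypassing the trash
```

Printing before executing makes it easy to review a destructive step such as `-rm -skipTrash` before letting it touch a real cluster.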

What is Hadoop

Apache Hadoop is an open-source software framework, written in Java by Doug Cutting and Michael J. Cafarella, that supports data-intensive distributed applications. It is licensed under the Apache v2 license.…

Running Hadoop on CentOS Linux (Single-Node Cluster)

Follow the steps below:
1) Install the CentOS ISO image in VM Workstation.
2) Download Java 1.6 based on whether CentOS is 32-bit or 64-bit; for 32…

Running Hadoop on CentOS Linux (Multi-Node Cluster)

In this tutorial I will describe the steps required to set up a distributed, multi-node Apache Hadoop cluster backed by the Hadoop Distributed File System (HDFS),…