MapReduce Overview

What is MapReduce? MapReduce is a programming model for processing huge amounts of data quickly by spreading the work across many machines. As its name suggests, you can divide it into Map…
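To make the idea concrete, here is a minimal single-process sketch of a word count written in the map/reduce style. It is an illustration only, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for this example, and a real cluster would run the map and reduce steps in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(line):
    # Map step: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Group all values by key, standing in for the framework's
    # shuffle/sort stage between the map and reduce steps.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce step: collapse all values for one key into a single total.
    return key, sum(values)


def word_count(lines):
    # Run map over every line, group the pairs, then reduce each group.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample input, chosen only for illustration.
counts = word_count(["big data big cluster", "data node"])
print(counts)
```

Because each call to `map_phase` looks at a single line and each call to `reduce_phase` looks at a single key, both steps could in principle be distributed without changing the logic.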

Dancing with Sqoop

What is Sqoop? Sqoop is a command-line application for transferring data between relational databases and Hadoop. Put another way, you use Sqoop to import data from an RDBMS into Hadoop (HDFS) and to export it back. You can use…

The Hadoop Distributed File System

Introduction: HDFS, the Hadoop Distributed File System, is a distributed file system designed to hold very large amounts of data (terabytes or even petabytes), and provide high-throughput access to this…

What is Unstructured Data

The phrase "unstructured data" usually refers to information that doesn't reside in a traditional row-column database. As you might expect, it's the opposite of structured data -- the data stored…