Monday, February 16, 2015

Hadoop Single Node Setup

System requirements:

  • A working Java installation (Oracle JDK 7 is used in this guide)
  • An SSH server and client (the server is installed in step 6)

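Before starting, a quick sanity check (a minimal sketch; the exact version strings depend on your installed packages):

    # verify Java is installed and on the PATH
    $ java -version
    # verify the ssh client is present (the server is installed in step 6)
    $ which ssh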
1. cd /usr/local

2. wget http://mirror.metrocast.net/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz

3. tar -xzvf hadoop-2.6.0.tar.gz    (this unpacks into /usr/local/hadoop-2.6.0)

4. cd /usr/local/hadoop-2.6.0

5. Add a new user

     $ groupadd hadoop
     $ useradd -m -g hadoop hduser

   To change a user's primary group:
     usermod -g primarygrpname username
   To change a user's secondary group:
     usermod -G secondarygrpname username
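   To confirm the user and group came out as intended, a quick check (illustrative output; your uid/gid numbers will differ):

     $ id hduser
     uid=1001(hduser) gid=1001(hadoop) groups=1001(hadoop)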

6. Install ssh-server
  $ apt-get install openssh-server

7. Generate an ssh key
$ su - hduser
$ ssh-keygen -t rsa -P ""
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
$ ssh hduser@localhost
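If that last ssh still prompts for a password, the usual culprit is permissions on the .ssh directory; a hedged fix, assuming the default home directory layout:

$ chmod 700 $HOME/.ssh
$ chmod 600 $HOME/.ssh/authorized_keys
$ ssh hduser@localhost    # should now log in without a password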






  • Disabling IPv6
  • Open the config file: sudo gedit /etc/sysctl.conf
  • Add these 3 lines at the end of the file:

        # disable ipv6
        net.ipv6.conf.all.disable_ipv6 = 1
        net.ipv6.conf.default.disable_ipv6 = 1
        net.ipv6.conf.lo.disable_ipv6 = 1

  • After adding these lines, reload the settings with: sudo sysctl -p
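  • To verify that IPv6 is really off (1 means disabled):

        $ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
        1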

    • Configuring the Hadoop configuration files
            Change directory using cd /usr/local/hadoop-2.6.0/etc/hadoop

        $ vi hdfs-site.xml
         
    <configuration>
    <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
    </property>
    </configuration>
    • Create the required directories (see the mkdir commands later in this post)
    • Format the file system
    • cd /usr/local/hadoop-2.6.0
    • ./bin/hdfs namenode -format

    • Go to sbin and start all daemons
    • cd /usr/local/hadoop-2.6.0/sbin
    • $ ./start-all.sh
      (start-all.sh is deprecated in Hadoop 2.x; ./start-dfs.sh followed by ./start-yarn.sh is the current equivalent)
    • To check whether all daemons are running:
    • $ jps
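    On a healthy single-node setup, jps should list roughly these processes (illustrative output; PIDs will differ):

    4866 NameNode
    5012 DataNode
    5230 SecondaryNameNode
    5405 ResourceManager
    5530 NodeManager
    5861 Jps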

    If any daemon doesn't start, start it manually:
        hadoop-daemon.sh start namenode
        hadoop-daemon.sh start datanode
        yarn-daemon.sh start resourcemanager
        yarn-daemon.sh start nodemanager
        mr-jobhistory-daemon.sh start historyserver

        Hadoop Web Interfaces
            NameNode - http://localhost:50070/
            Secondary NameNode - http://localhost:50090/
            ResourceManager - http://localhost:8088/
            If a page doesn't load, use jps to check which daemons are actually running.
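        A quick way to probe the NameNode UI without a browser (a small curl sketch; 200 means the web server is up):
            $ curl -s -o /dev/null -w "%{http_code}\n" http://localhost:50070/
            200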


    Run these as root so hduser owns the installation before the daemons are started:
    $ chown -R hduser:hadoop /usr/local/hadoop-2.6.0
    $ chmod -R +x /usr/local/hadoop-2.6.0
    Setting Global Variables
    $ vi /home/hduser/.bashrc
    export HADOOP_PREFIX=/usr/local/hadoop-2.6.0
    export HADOOP_HOME=/usr/local/hadoop-2.6.0
    export HADOOP_MAPRED_HOME=${HADOOP_HOME}
    export HADOOP_COMMON_HOME=${HADOOP_HOME}
    export HADOOP_HDFS_HOME=${HADOOP_HOME}
    export YARN_HOME=${HADOOP_HOME}
    export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
    # Native Path
    export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib"
    #Java path
    export JAVA_HOME='/usr/lib/jvm/java-7-oracle'
    # Add Hadoop bin/ directory to PATH

    export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/sbin

    $ vi /home/hduser/.profile
    Add the same export lines as in .bashrc above (HADOOP_PREFIX through PATH), then reload both files:
    $ source ~/.bashrc
    $ source ~/.profile

    $ vi /usr/local/hadoop-2.6.0/etc/hadoop/hadoop-env.sh
    export JAVA_HOME=/usr/lib/jvm/java-7-oracle
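    With the environment in place, confirm the shell can find Hadoop (the first line of output should name the version from the tarball above):
    $ hadoop version
    Hadoop 2.6.0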



    $ vi yarn-site.xml
    <configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    </property>
    <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    </configuration>

    $ vi core-site.xml
    <configuration>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/usr/local/hadoop-2.6.0/tmp</value>
      <description>A base for other temporary directories.</description>
    </property>

    <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:54310</value>
      <description>The name of the default file system.  A URI whose
      scheme and authority determine the FileSystem implementation.  The
      uri's scheme determines the config property (fs.SCHEME.impl) naming
      the FileSystem implementation class.  The uri's authority is used to
      determine the host, port, etc. for a filesystem.</description>
    </property>

    </configuration>
    (Note: fs.default.name still works but is deprecated in Hadoop 2.x; the current property name is fs.defaultFS.)

    $ cp mapred-site.xml.template mapred-site.xml
    $ vi mapred-site.xml
    <configuration>
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
      <description>The runtime framework for executing MapReduce jobs.
      This setup starts the YARN daemons, so jobs should run on YARN;
      the old MR1 property mapred.job.tracker is ignored under YARN.
      </description>
    </property>

    </configuration>
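    Before formatting HDFS, it is worth checking that the edited XML files are well-formed. A sketch using xmllint (part of libxml2; it may need to be installed separately):

    $ cd /usr/local/hadoop-2.6.0/etc/hadoop
    $ xmllint --noout core-site.xml hdfs-site.xml yarn-site.xml mapred-site.xml && echo "all config files OK"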

    Create the local directories that HDFS will use for NameNode and DataNode storage:
    $ mkdir -p $HADOOP_HOME/yarn_data/hdfs/namenode
    $ mkdir -p $HADOOP_HOME/yarn_data/hdfs/datanode
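    If you want HDFS to use these directories rather than the hadoop.tmp.dir default, the two standard properties below can be added to hdfs-site.xml (a sketch; the paths assume the mkdir commands above):

    <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:/usr/local/hadoop-2.6.0/yarn_data/hdfs/namenode</value>
    </property>
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>file:/usr/local/hadoop-2.6.0/yarn_data/hdfs/datanode</value>
    </property>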

    Offline Image Viewer Guide

    The Offline Image Viewer (oiv) inspects an HDFS fsimage file without needing a running NameNode. The fsimage files live in the NameNode's current directory; listing it shows something like:

    -rw-r--r-- 1 hduser hadoop  100722 Oct  7 20:49 fsimage_0000000000000008804
    -rw-r--r-- 1 hduser hadoop      62 Oct  7 20:49 fsimage_0000000000000008804.md5
    drwxrwxr-x 3 hduser hduser    4096 Oct  8 22:49 ..
    -rw-r--r-- 1 hduser hadoop  100722 Oct  8 22:49 fsimage_0000000000000008805
    -rw-r--r-- 1 hduser hadoop      62 Oct  8 22:49 fsimage_0000000000000008805.md5
    -rw-rw-r-- 1 hduser hduser     202 Oct  8 22:49 VERSION
    -rw-r--r-- 1 hduser hadoop       5 Oct  8 22:49 seen_txid
    -rw-r--r-- 1 hduser hadoop 1048576 Oct  8 22:49 edits_inprogress_0000000000000008806
    drwxrwxr-x 2 hduser hduser   12288 Oct  8 22:49 .
    hduser@ubuntu:/usr/local/hadoop-2.6.0/tmp/dfs/name/current$ cat fsimage_0000000000000008805.md5
    929bde84fb1432baba3228dc78b3b6d8 *fsimage_0000000000000008805
    hduser@ubuntu:/usr/local/hadoop-2.6.0/tmp/dfs/name/current$ hdfs oiv -i fsimage_0000000000000008805
    15/10/08 23:02:31 INFO offlineImageViewer.FSImageHandler: Loading 2 strings
    15/10/08 23:02:31 INFO offlineImageViewer.FSImageHandler: Loading 1273 inodes.
    15/10/08 23:02:31 INFO offlineImageViewer.FSImageHandler: Loading inode references
    15/10/08 23:02:31 INFO offlineImageViewer.FSImageHandler: Loaded 0 inode references
    15/10/08 23:02:31 INFO offlineImageViewer.FSImageHandler: Loading inode directory section
    15/10/08 23:02:31 INFO offlineImageViewer.FSImageHandler: Loaded 164 directories
    15/10/08 23:02:31 INFO offlineImageViewer.WebImageViewer: WebImageViewer started. Listening on /127.0.0.1:5978. Press Ctrl+C to stop the viewer.
    15/10/08 23:04:27 INFO offlineImageViewer.FSImageHandler: 200 method=GET op=GETFILESTATUS target=/user/hduser
    15/10/08 23:04:27 INFO offlineImageViewer.FSImageHandler: 200 method=GET op=LISTSTATUS target=/user/hduser
    15/10/08 23:04:51 INFO offlineImageViewer.FSImageHandler: 200 method=GET op=GETFILESTATUS target=/user/hduser
    15/10/08 23:04:51 INFO offlineImageViewer.FSImageHandler: 200 method=GET op=LISTSTATUS target=/user/hduser
    15/10/08 23:05:41 INFO offlineImageViewer.FSImageHandler: 200 method=GET op=GETFILESTATUS target=/user/hduser
    15/10/08 23:05:42 INFO offlineImageViewer.FSImageHandler: 200 method=GET op=LISTSTATUS target=/user/hduser
    15/10/08 23:05:42 INFO offlineImageViewer.FSImageHandler: 200 method=GET op=LISTSTATUS target=/user/hduser/input
    15/10/08 23:05:42 INFO offlineImageViewer.FSImageHandler: 200 method=GET op=LISTSTATUS target=/user/hduser/input1
    15/10/08 23:05:42 INFO offlineImageViewer.FSImageHandler: 200 method=GET op=LISTSTATUS target=/user/hduser/input2
    15/10/08 23:05:42 INFO offlineImageViewer.FSImageHandler: 200 method=GET op=LISTSTATUS target=/user/hduser/input3
    15/10/08 23:06:47 INFO offlineImageViewer.FSImageHandler: 200 method=GET op=LISTSTATUS target=/
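    While the viewer is running, a second shell can browse the image over the WebHDFS API; the LISTSTATUS lines above correspond to requests like this one:

    $ hdfs dfs -ls webhdfs://127.0.0.1:5978/user/hduser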


