
Installing Hadoop
Hortonworks Hadoop, Version 1.0
Mogulla, Deepak Reddy
April 29, 2015

Table of Contents

    Get Linux platform ready
    Update Linux
    Update/install Java
    Setup SSH certificates
    Setup SSH connection from parent OS
    Download Hadoop in Linux
    Configure files in Hadoop
    Updating Hadoop configuration files
        hadoop-env.sh
        core-site.xml
        hdfs-site.xml
        mapred-site.xml
    Fire up Hadoop
    Shutdown Hadoop

Get Linux platform ready

Download virtual machine software (Oracle VirtualBox or VMware Fusion) if running Windows, macOS, or another OS. Install the virtual machine software and prepare it to install Ubuntu or another Linux distribution, either on a VM or directly on the machine, depending on your choice.

If installing Ubuntu on a VM, download the latest .iso file from the Ubuntu website and move it to the Desktop (for easy access). Create a new VM and select the .iso file as the installation media; Ubuntu then installs automatically.

Update Linux

Open the terminal application in Ubuntu and run the commands to update Linux:

    sudo apt-get update
    sudo apt-get upgrade
    sudo apt-get dist-upgrade

Update/install Java

Check the Java version:

    java -version

If Java is not installed, download and install it with:

    sudo apt-get install default-jdk

Check the Java version again with the same command as in the first step, then set up the JAVA_HOME variable. To find where Java is installed, print the path with:

    update-alternatives --config java

JAVA_HOME is everything before "/jre/bin/java". In this case JAVA_HOME is /usr/lib/jvm/java-7-openjdk-amd64, so set it with:

    export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
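The "everything before /jre/bin/java" rule above can be scripted instead of copied by hand. A minimal sketch using shell suffix stripping; the path below is the example from this section, hard-coded as a stand-in for whatever `update-alternatives --config java` actually prints on your machine:

```shell
# Derive JAVA_HOME by stripping the trailing /jre/bin/java suffix.
# JAVA_PATH is a hard-coded stand-in; on a real machine paste in the
# path reported by `update-alternatives --config java`.
JAVA_PATH=/usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
JAVA_HOME=${JAVA_PATH%/jre/bin/java}   # ${var%pattern} removes the suffix
echo "$JAVA_HOME"
```

On the example path this prints /usr/lib/jvm/java-7-openjdk-amd64, the value exported above.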

Setup SSH certificates

The commands below set up SSH keys so Hadoop can access its nodes over ssh without asking for passwords:

    sudo apt-get install ssh
    sudo apt-get install rsync
    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Setup SSH connection from parent OS

On OS X, the connection can be made with ssh in the Terminal:

    ssh username@<ip-address-of-Linux-machine>

On Windows, use PuTTY to set up the connection.

Download Hadoop in Linux

Check the latest stable release on the Apache Hadoop download page, copy a mirror URL, and browse its contents to find the tar.gz file for Hadoop. At the time this document was written the current release was hadoop-2.6.0.tar.gz. Use wget to fetch the tarball (substitute the mirror URL you copied), then untar the package:

    wget <mirror-url>/hadoop-2.6.0.tar.gz
    tar xfz hadoop-2.6.0.tar.gz

Move the untarred directory to /usr/local:

    sudo mv hadoop-2.6.0 /usr/local/hadoop

Add a dedicated group to run this Hadoop directory:

    sudo addgroup hadoop

Make the chosen user the owner of all the files, with group hadoop:

    sudo chown -R hduser:hadoop /usr/local/hadoop
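The download-and-untar flow above can be rehearsed without a network connection. A sketch that builds a tiny local tarball as a stand-in for hadoop-2.6.0.tar.gz and extracts it with the same `tar xfz` flags used in the text:

```shell
# Rehearse the untar step in a scratch directory; the tarball here is a
# locally created stand-in, not the real Hadoop release.
tmp=$(mktemp -d); cd "$tmp"
mkdir hadoop-2.6.0 && echo demo > hadoop-2.6.0/README
tar cfz hadoop-2.6.0.tar.gz hadoop-2.6.0 && rm -r hadoop-2.6.0
tar xfz hadoop-2.6.0.tar.gz     # same flags as in the text
ls hadoop-2.6.0                 # the extracted directory reappears
cd / && rm -rf "$tmp"
```

The same pattern applies to the real tarball: after extraction you get a hadoop-2.6.0 directory, which is then moved to /usr/local/hadoop.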

Configure files in Hadoop

Open the .bashrc file in editing mode:

    nano ~/.bashrc

Add the following at the end of the file:

    #HADOOP VARIABLES START
    export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
    export HADOOP_INSTALL=/usr/local/hadoop
    export PATH=$PATH:$HADOOP_INSTALL/bin
    export PATH=$PATH:$HADOOP_INSTALL/sbin
    export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
    export HADOOP_COMMON_HOME=$HADOOP_INSTALL
    export HADOOP_HDFS_HOME=$HADOOP_INSTALL
    export YARN_HOME=$HADOOP_INSTALL
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
    #HADOOP VARIABLES END

Save the file with CTRL+X, then reload it so the updated variables take effect:

    source ~/.bashrc

Updating Hadoop configuration files

hadoop-env.sh

Open the file in edit mode:

    nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh

Where it says "# The java implementation to use", add the line:

    export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

Save the file with CTRL+X.

core-site.xml

Open the file in edit mode:

    nano /usr/local/hadoop/etc/hadoop/core-site.xml

Add the following at the end of the file:

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>

Save the file with CTRL+X.

hdfs-site.xml

Create the namenode and datanode directories, then change the owner of these directories (this tutorial uses the user deepak):

    sudo mkdir -p /usr/local/hadoop_dir/hdfs/namenode
    sudo mkdir -p /usr/local/hadoop_dir/hdfs/datanode
    sudo chown -R deepak:deepak /usr/local/hadoop_dir/hdfs/namenode
    sudo chown -R deepak:deepak /usr/local/hadoop_dir/hdfs/datanode

Open the file in edit mode:

    nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml

Add the following at the end of the file:

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop_dir/hdfs/namenode</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop_dir/hdfs/datanode</value>
      </property>
    </configuration>

Save the file with CTRL+X.
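A quick sanity check for these XML edits is to grep the configured value back out of the file. The sketch below writes the core-site.xml content shown above to a temp file and extracts the NameNode URI; it only checks the file's text, not Hadoop's interpretation of it:

```shell
# Write the core-site.xml fragment to a temp file and pull the
# fs.default.name value back out with sed.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
# Keep only the text between <value> and </value>.
sed -n 's/.*<value>\(.*\)<\/value>.*/\1/p' "$tmp"
rm -f "$tmp"
```

On the real file, pointing the same sed at /usr/local/hadoop/etc/hadoop/core-site.xml should print hdfs://localhost:9000 if the edit was saved correctly.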

mapred-site.xml

Create the actual file from the shipped template, then open it in edit mode:

    cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
    nano /usr/local/hadoop/etc/hadoop/mapred-site.xml

Add the following at the end of the file:

    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
      </property>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>

Save the file with CTRL+X.

Fire up Hadoop

Format the namenode before starting Hadoop:

    hdfs namenode -format

Start Hadoop:

    start-dfs.sh

Check the instances running in Hadoop:

    jps

Shutdown Hadoop

    stop-all.sh
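After start-dfs.sh, the jps output should include at least NameNode, DataNode, and SecondaryNameNode. That check can be scripted; in the sketch below, jps_output is a hard-coded stand-in for the real `jps` output so the snippet runs without Hadoop installed:

```shell
# Count the expected HDFS daemons in (simulated) jps output.
# jps_output is a made-up stand-in for $(jps).
jps_output="1234 NameNode
2345 DataNode
3456 SecondaryNameNode"
# All three lines match: SecondaryNameNode also contains "NameNode".
echo "$jps_output" | grep -c -E "NameNode|DataNode"
```

On a live single-node setup, replacing the stand-in with `jps_output=$(jps)` and checking that the count is 3 is a quick way to confirm the HDFS daemons came up.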

