MapReduce, Hadoop And Amazon AWS - University Of California, Irvine


MapReduce, Hadoop and Amazon AWS
Yasser Ganjisaffar
http://www.ics.uci.edu/~yganjisa
February 2011

What is Hadoop?

A software framework that supports data-intensive distributed applications. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google's MapReduce and Google File System (GFS). Hadoop is a top-level Apache project being built and used by a global community of contributors, using the Java programming language. Yahoo! has been the largest contributor to the project, and uses Hadoop extensively across its businesses.

Who uses Hadoop?

http://wiki.apache.org/hadoop/PoweredBy

Who uses Hadoop?

Yahoo!
– More than 100,000 CPUs in 36,000 computers.

Facebook
– Used in reporting/analytics and machine learning, and also as a storage engine for logs.
– A 1100-machine cluster with 8800 cores and about 12 PB raw storage.
– A 300-machine cluster with 2400 cores and about 3 PB raw storage.
– Each (commodity) node has 8 cores and 12 TB of storage.

Very Large Storage Requirements

Facebook has Hadoop clusters with 15 PB of raw storage (15,000,000 GB). No single storage device can handle this amount of data. We need a large set of nodes, each storing part of the data.

HDFS: Hadoop Distributed File System

[Figure: a Client sends (1) the filename and index to the Namenode; the Namenode replies with (2) the Datanodes and Block id; the Client then (3) reads the data blocks directly from the Data Nodes.]

Terabyte Sort Benchmark

http://sortbenchmark.org/

Task: sorting 100 TB of data and writing the results to disk (10^12 records, each 100 bytes). Yahoo's Hadoop cluster is the current winner:
– 173 minutes
– 3452 nodes x (2 quad-core Xeons, 8 GB RAM)

This is the first time that a Java program has won this competition.

Counting Words by MapReduce

Split: the input

Hello World Bye World
Hello Hadoop Goodbye Hadoop

is split into two parts: "Hello World Bye World" and "Hello Hadoop Goodbye Hadoop".

Counting Words by MapReduce (Node 1)

Mapper: "Hello World Bye World" → (Hello, 1), (World, 1), (Bye, 1), (World, 1)
Sort & Merge: (Bye, 1), (Hello, 1), (World, 1, 1)
Combiner: (Bye, 1), (Hello, 1), (World, 2)

Counting Words by MapReduce (shuffle)

Node 1 combiner output: (Bye, 1), (Hello, 1), (World, 2)
Node 2 combiner output: (Goodbye, 1), (Hadoop, 2), (Hello, 1)
Sort & Merge across nodes, then Split among the reducers:
– Reducer 1 receives: (Bye, 1), (Goodbye, 1), (Hadoop, 2)
– Reducer 2 receives: (Hello, 1, 1), (World, 2)

Counting Words by MapReduce (reduce)

Node 1 Reducer: (Bye, 1), (Goodbye, 1), (Hadoop, 2) → written to disk as part-00000:
Bye 1
Goodbye 1
Hadoop 2

Node 2 Reducer: (Hello, 1, 1), (World, 2) → written to disk as part-00001:
Hello 2
World 2
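The data flow in the slides above can be sketched in plain Java with no Hadoop dependency: map each split to (word, 1) pairs, sort/merge and combine per node, then reduce across nodes. This is only a single-process illustration of the flow, not Hadoop itself:

```java
import java.util.*;

public class WordCountFlow {
    // "Map": emit a (word, 1) pair for every token in a split.
    static List<Map.Entry<String, Integer>> map(String split) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String w : split.split("\\s+")) out.add(new AbstractMap.SimpleEntry<>(w, 1));
        return out;
    }

    // "Sort & Merge" + "Combiner": sum the counts per word on one node.
    static SortedMap<String, Integer> combine(List<Map.Entry<String, Integer>> pairs) {
        SortedMap<String, Integer> merged = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) merged.merge(p.getKey(), p.getValue(), Integer::sum);
        return merged;
    }

    // "Reduce": merge the combiner outputs from all nodes into the final counts.
    static SortedMap<String, Integer> reduce(List<SortedMap<String, Integer>> combined) {
        SortedMap<String, Integer> result = new TreeMap<>();
        for (SortedMap<String, Integer> c : combined)
            c.forEach((w, n) -> result.merge(w, n, Integer::sum));
        return result;
    }

    public static void main(String[] args) {
        SortedMap<String, Integer> counts = reduce(Arrays.asList(
                combine(map("Hello World Bye World")),
                combine(map("Hello Hadoop Goodbye Hadoop"))));
        System.out.println(counts); // {Bye=1, Goodbye=1, Hadoop=2, Hello=2, World=2}
    }
}
```

The output matches the part-00000/part-00001 contents shown above, merged into one map.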

Writing Word Count in Java

Download the Hadoop core release from the Apache mirrors (under .../hadoop/core/). It would be something like:
– hadoop-0.20.2.tar.gz

Unzip the package and extract:
– hadoop-0.20.2-core.jar

Add this jar file to your project class path.

Warning! Most of the sample code on the web is for older versions of Hadoop.

Word Count: MapperSource files are available at: http://www.ics.uci.edu/ v1-src.zip

Word Count: Reducer

Word Count: Main Class
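The Mapper, Reducer, and Main Class slides above were code screenshots that did not survive transcription. What follows is a sketch of a word-count job against the Hadoop 0.20 "new" API (org.apache.hadoop.mapreduce), matching the hadoop-0.20.2-core.jar mentioned earlier; the class and method bodies are the standard tutorial pattern, not necessarily the slides' exact code (the slides use the package edu.uci.hadoop), and compiling/running it requires the Hadoop jar and a cluster or local Hadoop install:

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Word Count: Mapper — tokenize each input line, emit (word, 1).
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Word Count: Reducer — sum the 1s for each word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) sum += val.get();
      result.set(sum);
      context.write(key, result);
    }
  }

  // Word Count: Main Class — configure and submit the job.
  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // the combiner reuses the reducer
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```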

My Small Test Cluster

3 nodes:
– 1 master (IP address: 50.17.65.29)
– 2 slaves

Copy your jar file to the master node:
– Linux: scp WordCount.jar john@50.17.65.29:WordCount.jar
– Windows (you need to download pscp.exe): pscp.exe WordCount.jar john@50.17.65.29:WordCount.jar

Log in to the master node:
– ssh john@50.17.65.29

Counting words in the U.S. Constitution!

Download the text version:
wget http://www.usconstitution.net/const.txt

Put the input text file on HDFS:
hadoop dfs -put const.txt const.txt

Run the job:
hadoop jar WordCount.jar edu.uci.hadoop.WordCount const.txt word-count-result

Counting words in the U.S. Constitution!

List my files on HDFS:
– hadoop dfs -ls

List files in the word-count-result folder:
– hadoop dfs -ls word-count-result/

Counting words in the U.S. Constitution!

Download the results from HDFS:
hadoop dfs -cat word-count-result/part-r-00000 > word-count.txt

Sort and view the results:
sort -k2 -n -r word-count.txt | more

Hadoop Map/Reduce - Terminology

Running "Word Count" across 20 files is one job. The Job Tracker initiates some number of map tasks and some number of reduce tasks. For each map task at least one task attempt will be performed; more if a task fails (e.g., a machine crashes).

High Level Architecture of MapReduce

[Figure: a Master Node coordinates several Slave Nodes; each Slave Node runs a TaskTracker, which manages the Tasks executing on that node.]

High Level Architecture of Hadoop

[Figure: two layers. In the MapReduce layer, the Master Node runs the JobTracker and each Slave Node runs a TaskTracker. In the HDFS layer, the Master Node runs the NameNode and each Slave Node runs a DataNode.]

Web-based User Interfaces

JobTracker: http://50.17.65.29:9100/
NameNode: http://50.17.65.29:9101/

Hadoop Job Scheduling

A FIFO queue matches incoming jobs to available nodes:
– No notion of fairness
– Never switches out a running job

Warning! Start your job as soon as possible.

Reporting Progress

If your tasks don't report anything in 10 minutes, they will be killed by Hadoop!

Source files are available at: http://www.ics.uci.edu/ v2-src.zip
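A minimal sketch of how a long-running map task can keep itself alive under the 0.20 API: call context.progress() (and optionally context.setStatus) inside the loop that does the slow work. The expensiveLookup helper here is hypothetical, standing in for any per-record work that can take minutes:

```java
// Inside a Mapper<Object, Text, Text, IntWritable> subclass; ONE is an
// IntWritable(1) field as in the word-count example.
@Override
public void map(Object key, Text value, Context context)
    throws IOException, InterruptedException {
  for (String token : value.toString().split("\\s+")) {
    String enriched = expensiveLookup(token); // hypothetical slow call
    context.progress();                       // tell Hadoop the task is alive
    context.setStatus("processed " + token);  // optional human-readable status
    context.write(new Text(enriched), ONE);
  }
}
```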

Distributed File Cache

The Distributed Cache facility allows you to transfer files from the distributed file system to the local file system (for reading only) of all participating nodes before the beginning of a job.
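A hedged sketch of the 0.20-era DistributedCache calls (the file path is made up for illustration): register the file on the job's Configuration before submitting, then look up the local copies inside a task:

```java
// Before submitting the job: ship an HDFS file to every node.
Configuration conf = new Configuration();
DistributedCache.addCacheFile(new URI("/user/john/stopwords.txt"), conf);

// Inside a Mapper's setup(Context): find the local copies.
Path[] cached = DistributedCache.getLocalCacheFiles(context.getConfiguration());
```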

TextInputFormat

[Figure: a Split is consumed by a LineRecordReader, which produces (offset1, line1), (offset2, line2), (offset3, line3), ...]

For more complex inputs, you should extend:
– InputSplit
– RecordReader
– InputFormat
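What the LineRecordReader hands to the mapper can be sketched in plain Java: for each line of the split, the key is the byte offset where the line starts and the value is the line's text. This is only an illustration of the record shape, not the Hadoop class itself:

```java
import java.util.*;

public class LineRecords {
    // Produce (offset-of-line-start, line-text) records from a text split,
    // mimicking the pairs TextInputFormat's LineRecordReader emits.
    static SortedMap<Long, String> read(String split) {
        SortedMap<Long, String> records = new TreeMap<>();
        long offset = 0;
        for (String line : split.split("\n", -1)) {
            records.put(offset, line);
            offset += line.length() + 1; // +1 for the '\n' separator
        }
        return records;
    }

    public static void main(String[] args) {
        read("We the People\nof the United States").forEach(
                (off, line) -> System.out.println(off + "\t" + line));
    }
}
```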

Part 2: Amazon Web Services(AWS)

What is AWS?

A collection of services that together make up a cloud computing platform:
– S3 (Simple Storage Service)
– EC2 (Elastic Compute Cloud)
– Elastic MapReduce
– Email Service
– SimpleDB
– Flexible Payments Service
– ...

Case Study: Yelp

Yelp uses Amazon S3 to store daily logs and photos, generating around 100 GB of logs per day. Features powered by Amazon Elastic MapReduce include:
– People Who Viewed This Also Viewed
– Review highlights
– Autocomplete as you type on search
– Search spelling suggestions
– Top searches
– Ads

Yelp runs approximately 200 Elastic MapReduce jobs per day, processing 3 TB of data.

Amazon S3

– Data storage in Amazon data centers
– Web service interface
– 99.99% monthly uptime guarantee
– Storage cost: $0.15 per GB/month

S3 is reported to store more than 102 billion objects as of March 2010.

Amazon S3

You can think of S3 as a big HashMap where you store your files with a unique key:
– HashMap: key → File
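That analogy can be made concrete with a toy sketch: a bucket behaves like a map from a unique object key to the stored bytes. The key name below is made up for illustration; real S3 access goes through the S3 web service API, not a local map:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Toy model of an S3 bucket: key -> object bytes, nothing more.
public class ToyBucket {
    private final Map<String, byte[]> objects = new HashMap<>();

    public void put(String key, byte[] data) { objects.put(key, data); }
    public byte[] get(String key)            { return objects.get(key); }

    public static void main(String[] args) {
        ToyBucket bucket = new ToyBucket();
        bucket.put("logs/2011-02-01.txt",
                   "GET /index.html 200".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(bucket.get("logs/2011-02-01.txt"),
                                      StandardCharsets.UTF_8));
    }
}
```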

References

Hadoop Project Page: http://hadoop.apache.org/
Amazon Web Services: http://aws.amazon.com/

