The Hadoop Ecosystem - York University

Upload by : Farrah Jaffe

The Hadoop Ecosystem
EECS 4415 Big Data Systems
Tilemachos Pechlivanoglou
tipech@eecs.yorku.ca

A lot of tools designed to work with Hadoop

HDFS, MapReduce
Hadoop Distributed File System (HDFS)
– Core Hadoop component
– Distributed storage and I/O for Hadoop
MapReduce
– Core Hadoop component
– Software framework for data processing
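To make the MapReduce processing model concrete, here is a minimal sketch in plain Python, not Hadoop code. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are assumptions chosen for illustration; in a real cluster the framework runs these phases in parallel over data blocks stored in HDFS.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle and reduce in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data systems", "big data tools"])
print(counts)
```

The map and reduce functions only ever see one line or one key at a time, which is what lets a framework spread them across many machines.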

YARN
Yet Another Resource Negotiator
– Resource allocation and scheduling
– Core Hadoop component
Components: ResourceManager, NodeManager
– ResourceManager: receives processing requests and passes the parts of each request to the corresponding NodeManagers. It has a Scheduler that allocates resources and time based on application requirements, and an ApplicationsManager that monitors running jobs
– NodeManager: handles requests at every DataNode
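The division of labour between ResourceManager and NodeManager can be sketched as a toy model in plain Python. This is not the YARN API: the class fields, the `schedule` method and the "most free memory" placement policy are all assumptions for illustration, and real YARN schedulers are considerably more involved.

```python
from dataclasses import dataclass, field

@dataclass
class NodeManager:
    """Toy stand-in for one worker node that hosts containers."""
    name: str
    free_memory_mb: int
    containers: list = field(default_factory=list)

    def launch(self, app_id, memory_mb):
        # Reserve memory on this node and record the running container.
        self.free_memory_mb -= memory_mb
        self.containers.append((app_id, memory_mb))

class ResourceManager:
    """Toy stand-in for the cluster-wide scheduler."""
    def __init__(self, nodes):
        self.nodes = nodes

    def schedule(self, app_id, memory_mb):
        """Place one container on the node with the most free memory.

        Returns the chosen node name, or None if no node has capacity.
        """
        candidates = [n for n in self.nodes if n.free_memory_mb >= memory_mb]
        if not candidates:
            return None
        node = max(candidates, key=lambda n: n.free_memory_mb)
        node.launch(app_id, memory_mb)
        return node.name

rm = ResourceManager([NodeManager("node-1", 4096), NodeManager("node-2", 3072)])
placements = [rm.schedule("app-1", 2048) for _ in range(3)]
placements.append(rm.schedule("app-2", 4096))  # no node has 4096 MB left
print(placements)
```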

Apache Pig
SQL-like command structure in Hadoop
– Much more condensed (roughly 10 Pig Latin lines do the work of about 200 MapReduce lines)
– Allows actions like grouping, filtering etc.
– Developed by Yahoo
Pig Runtime and Pig Latin language
– Analogy to Java: Pig Runtime is like the JVM, Pig Latin is like Java
– The compiler internally converts Pig Latin to MapReduce

Apache Hive
SQL queries in Hadoop:
– Uses Hive Query Language (HQL), very similar to SQL
– Highly scalable, supports both batch and real-time processing
– Supports the standard SQL data types and most SQL commands
JDBC/ODBC driver and Hive Command Line:
– Java Database Connectivity (JDBC), Open Database Connectivity (ODBC)
– Used to establish a connection with the data storage
– Developed by Facebook

Apache Mahout
Machine Learning in Hadoop
– Provides built-in algorithms for machine learning problems
– Executed through a command line
Supported algorithms:
– Collaborative filtering: mines patterns/behaviors to make predictions and recommendations (e.g. Amazon product recommendation)
– Clustering: finds groups of similar data (e.g. recommending groups in social media)
– Classification: classifies and categorizes data into various sub-departments (e.g. identifying objects in image recognition)
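As an illustration of the collaborative filtering idea, the following plain-Python sketch recommends items that frequently co-occur with what a user already has. It does not use Mahout; the function names and the sample `baskets` data are hypothetical, and item co-occurrence is only one simple variant of the technique.

```python
from collections import defaultdict
from itertools import permutations

def cooccurrence(baskets):
    """Count how often each ordered pair of items appears for the same user."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in baskets.values():
        for a, b in permutations(set(items), 2):
            counts[a][b] += 1
    return counts

def recommend(user, baskets, top_n=2):
    """Score unseen items by co-occurrence with the user's own items."""
    counts = cooccurrence(baskets)
    owned = set(baskets[user])
    scores = defaultdict(int)
    for item in owned:
        for other, n in counts[item].items():
            if other not in owned:
                scores[other] += n
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

baskets = {
    "alice": ["book", "lamp"],
    "bob": ["book", "lamp", "desk"],
    "carol": ["book", "desk", "chair"],
    "dave": ["book"],
}
suggestions = recommend("dave", baskets)
print(suggestions)
```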

Apache Spark
Framework for real-time data analytics
– Executes in-memory computations for high-speed data processing (up to 100x faster than MapReduce for some in-memory workloads)
– Written in Scala, but supports many languages
Contains high-level libraries, processing based on DataFrames

Apache HBase
Non-relational distributed database (NoSQL)
– Supports any type of data; values are stored as uninterpreted byte arrays
– Provides fault tolerance and fast retrieval of data
– Open source, based on Google’s BigTable
Runs on top of Hadoop, provides BigTable-like capabilities
– Written in Java
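The BigTable-style data model can be pictured as a sorted map from row key to column to timestamped value. The sketch below is a toy in-memory model in plain Python, not the HBase client API; `ToyTable`, its methods and the sample keys are assumptions for illustration.

```python
from collections import defaultdict

class ToyTable:
    """Rows -> 'family:qualifier' columns -> timestamped byte values."""
    def __init__(self):
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, column, value, timestamp):
        # Each write adds a new version instead of overwriting the old one.
        self.rows[row_key][column][timestamp] = value

    def get(self, row_key, column):
        """Return the newest version of a cell, or None if it is absent."""
        versions = self.rows.get(row_key, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self, start_row, stop_row):
        """Yield row keys in sorted order within [start_row, stop_row)."""
        for key in sorted(self.rows):
            if start_row <= key < stop_row:
                yield key

table = ToyTable()
table.put(b"user#002", "info:name", b"Bob", 1)
table.put(b"user#001", "info:name", b"Alice", 1)
table.put(b"user#001", "info:name", b"Alicia", 2)
table.put(b"video#001", "meta:title", b"Intro", 1)

latest = table.get(b"user#001", "info:name")
user_rows = list(table.scan(b"user#", b"user$"))
print(latest, user_rows)
```

Because rows are kept in key order, a range scan over a key prefix touches only neighbouring rows, which is one reason row key design matters in this kind of store.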

Apache ZooKeeper, Oozie
ZooKeeper: Hadoop job coordination
– Coordination between different distributed Hadoop jobs/services
– Things like addresses, start-up/shutdown, configurations
– Used at Rackspace, Yahoo, eBay
Oozie: Hadoop clock/alarm
– Oozie Workflow: a sequence of actions to be performed
– Oozie Coordinator: triggers job execution when data is available
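The difference between an Oozie Workflow and an Oozie Coordinator can be sketched in plain Python. This is only a conceptual model (real Oozie jobs are defined in XML, not Python); `run_workflow`, `coordinator_tick` and the file name `events.csv` are hypothetical.

```python
import os
import tempfile

def run_workflow(actions):
    """Workflow: run actions one after another, collecting their results."""
    return [action() for action in actions]

def coordinator_tick(required_paths, actions):
    """Coordinator: start the workflow only once every input path exists."""
    if all(os.path.exists(p) for p in required_paths):
        return run_workflow(actions)
    return None

workdir = tempfile.mkdtemp()
input_path = os.path.join(workdir, "events.csv")
actions = [lambda: "ingest done", lambda: "aggregate done"]

# Input is missing, so nothing runs.
before = coordinator_tick([input_path], actions)

with open(input_path, "w") as f:
    f.write("id,value\n1,42\n")

# Input is now present, so the workflow runs in order.
after = coordinator_tick([input_path], actions)
print(before, after)
```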

Apache Flume, Sqoop
Flume: Unstructured data ingestion
– Handles the entry of data in the system
– Collects, aggregates and moves large amounts of data
– Handles real-time input streams
Sqoop: Import/export structured data
– Also handles data ingestion
– Moves data from RDBMS or enterprise data warehouses to HDFS or vice versa

Apache Solr & Lucene
Searching and indexing
– Used for different data search tasks
– Solr is the application, Lucene is the engine/kernel
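The core data structure behind this kind of search engine is the inverted index, which maps each term to the documents that contain it. The sketch below is a minimal plain-Python version under simplifying assumptions (whitespace tokenization, lowercase matching, AND queries only); it does not use the Lucene or Solr APIs, and the sample documents are made up.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """AND query: ids of documents that contain every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

docs = {
    1: "Hadoop stores data in HDFS",
    2: "Spark processes data in memory",
    3: "HDFS replicates data blocks",
}
index = build_index(docs)
hits = search(index, "hdfs data")
print(sorted(hits))
```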

Apache Ambari
Managing the whole ecosystem
Hadoop cluster provisioning
– Step-by-step process for installing Hadoop on many hosts
– Handles Hadoop cluster configurations
Hadoop cluster management
– Provides a central management service for starting, stopping and re-configuring Hadoop services
Hadoop cluster monitoring
– Dashboard for monitoring cluster health and status
– Ambari Alert framework for notifying if something is wrong

Honorable mentions
– Avro: data serialization (JSON)
– Cassandra: reliable NoSQL distributed database
– Cloudera: Hadoop environment management, commercial vendor
– Chukwa: data collection system
– Impala: analytic database
– Kafka: Hadoop messaging
– Tajo: robust big data relational and distributed data warehouse
– Tez: generalized data-flow programming framework

An example Hadoop system

Thank you!
Based on: http://www.bmc.com/guides/hadoop-ecosystem.html
