Machine Learning On Spark - UC Berkeley AMP Camp


Machine Learning on Spark
Shivaram Venkataraman, UC Berkeley

Machine learning: at the intersection of Computer Science and Statistics

Machine learning applications: spam filters, click prediction, recommendations, search ranking

Machine learning techniques: classification, regression, clustering, active learning, collaborative filtering

Implementing Machine Learning
§ Machine learning algorithms are
- Complex, multi-stage
- Iterative
§ MapReduce/Hadoop unsuitable
§ Need efficient primitives for data sharing

Machine Learning using Spark
§ Spark RDDs → efficient data sharing
§ In-memory caching accelerates performance
- Up to 20x faster than Hadoop
§ Easy-to-use high-level programming interface
- Express complex algorithms in ~100 lines

K-Means Clustering using Spark
Focus: implementation and performance

Clustering
Grouping data according to similarity
E.g. an archaeological dig (finds plotted as Distance East vs. Distance North)

K-Means Algorithm
Benefits: popular, fast, conceptually straightforward

K-Means: preliminaries
Data: collection of values (feature vectors)
data = lines.map(line => parseVector(line))
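The `parseVector` helper on the slide is not shown; a minimal plain-Python sketch of what it presumably does (the name `parse_vector` and the whitespace-separated input format are assumptions):

```python
def parse_vector(line):
    """Parse a whitespace-separated line of numbers into a list of floats."""
    return [float(x) for x in line.split()]

# Plain-list analogue of: data = lines.map(line => parseVector(line))
lines = ["1.0 2.0", "3.5 4.5"]
data = [parse_vector(line) for line in lines]
```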

K-Means: preliminaries
Dissimilarity: squared Euclidean distance
dist = p.squaredDist(q)
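The squared Euclidean distance used here is simple to write out; a plain-Python analogue of `p.squaredDist(q)` (the function name is an assumption):

```python
def squared_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors:
    sum of (p_i - q_i)^2 over all components."""
    return sum((a - b) ** 2 for a, b in zip(p, q))
```

Using the squared distance avoids a square root per comparison; the nearest center under squared distance is the same as under true Euclidean distance.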

K-Means: preliminaries
K = number of clusters
Data assignments to clusters: S1, S2, . . ., SK

K-Means Algorithm
Initialize K cluster centers
Repeat until convergence:
- Assign each data point to the cluster with the closest center
- Assign each cluster center to be the mean of its cluster’s data points

K-Means Algorithm
Initialize K cluster centers:
centers = data.takeSample(false, K, seed)
Repeat until convergence:
- Assign each data point to the cluster with the closest center
- Assign each cluster center to be the mean of its cluster’s data points
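`takeSample(false, K, seed)` draws K elements without replacement, seeded for reproducibility; a plain-Python analogue (the helper name `init_centers` is an assumption):

```python
import random

def init_centers(data, k, seed):
    """Pick K distinct points as initial centers, mirroring
    takeSample(false, K, seed): sampling without replacement, seeded."""
    return random.Random(seed).sample(data, k)

points = [[0.0], [1.0], [2.0], [3.0]]
centers = init_centers(points, 2, seed=42)
```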

K-Means Algorithm
Initialize K cluster centers:
centers = data.takeSample(false, K, seed)
Repeat until convergence:
- Assign each data point to the cluster with the closest center:
closest = data.map(p => (closestPoint(p, centers), p))
- Assign each cluster center to be the mean of its cluster’s data points
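The `closestPoint` helper is assumed to return the index of the nearest center; a plain-Python sketch of the map step (the name `closest_point` is an assumption):

```python
def closest_point(p, centers):
    """Index of the center nearest to p under squared Euclidean distance."""
    def sq(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centers)), key=lambda i: sq(p, centers[i]))

# Plain-list analogue of: closest = data.map(p => (closestPoint(p, centers), p))
centers = [[0.0, 0.0], [10.0, 10.0]]
data = [[1.0, 1.0], [9.0, 8.0]]
closest = [(closest_point(p, centers), p) for p in data]
```

Keying each point by its cluster index is what lets the next step group points per cluster.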

K-Means Algorithm
Initialize K cluster centers:
centers = data.takeSample(false, K, seed)
Repeat until convergence:
closest = data.map(p => (closestPoint(p, centers), p))
pointsGroup = closest.groupByKey()

K-Means Algorithm
Initialize K cluster centers:
centers = data.takeSample(false, K, seed)
Repeat until convergence:
closest = data.map(p => (closestPoint(p, centers), p))
pointsGroup = closest.groupByKey()
newCenters = pointsGroup.mapValues(ps => average(ps))
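The `groupByKey` plus `mapValues(average)` pair can be mirrored without Spark; a plain-Python sketch (the `average` helper is an assumption, since the slide does not define it):

```python
from collections import defaultdict

def average(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(xs) / len(points) for xs in zip(*points)]

closest = [(0, [1.0, 1.0]), (0, [3.0, 3.0]), (1, [10.0, 8.0])]

# groupByKey: collect the points assigned to each cluster index
points_group = defaultdict(list)
for idx, p in closest:
    points_group[idx].append(p)

# mapValues(average): the new center is the mean of each group
new_centers = {idx: average(ps) for idx, ps in points_group.items()}
```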

K-Means Algorithm
Initialize K cluster centers:
centers = data.takeSample(false, K, seed)
Repeat until convergence: while(dist(centers, newCenters) > ɛ)
closest = data.map(p => (closestPoint(p, centers), p))
pointsGroup = closest.groupByKey()
newCenters = pointsGroup.mapValues(ps => average(ps))

K-Means Source
centers = data.takeSample(false, K, seed)
while (d > ɛ) {
  closest = data.map(p => (closestPoint(p, centers), p))
  pointsGroup = closest.groupByKey()
  newCenters = pointsGroup.mapValues(ps => average(ps))
  d = distance(centers, newCenters)
  centers = newCenters.map( )
}
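The whole loop above can be mirrored in plain Python (no Spark) as a sketch of the algorithm; the `average` and `squared_dist` helpers and the keep-old-center fallback for an emptied cluster are assumptions not spelled out on the slide:

```python
import random

def squared_dist(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def average(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(xs) / len(points) for xs in zip(*points)]

def kmeans(data, k, seed=0, eps=1e-6):
    """Plain-Python k-means mirroring the Spark source above."""
    centers = random.Random(seed).sample(data, k)
    d = float("inf")
    while d > eps:
        # Assign each point to its closest center (the map step)
        groups = {}
        for p in data:
            i = min(range(k), key=lambda j: squared_dist(p, centers[j]))
            groups.setdefault(i, []).append(p)
        # Recompute each center as the mean of its points
        # (groupByKey + mapValues(average)); keep the old center
        # if a cluster received no points (an assumption).
        new_centers = [average(groups[i]) if i in groups else centers[i]
                       for i in range(k)]
        # Total center movement decides convergence
        d = sum(squared_dist(c, n) for c, n in zip(centers, new_centers))
        centers = new_centers
    return centers
```

On two well-separated 1-D clumps such as `[[0.0], [0.2], [10.0], [10.2]]` with `k=2`, the centers settle at the two clump means regardless of which points are sampled first.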

Ease of use
§ Interactive shell: useful for featurization and pre-processing data
§ Lines of code for K-Means:
- Spark: 90 lines (part of the hands-on tutorial!)
- Hadoop/Mahout: 4 files, 300 lines

Performance
[Figure: iteration time (s) for Logistic Regression and K-Means vs. number of machines (25, 50, 100), comparing Hadoop, HadoopBinMem, and Spark; Zaharia et al.]

Conclusion
§ Spark: framework for cluster computing
§ Fast and easy machine learning programs
§ K-Means clustering using Spark
§ Hands-on exercise this afternoon!
Examples and more: www.spark-project.org
