Lecture 3 Aggregation And Least Squares - Historyofdsc


Lecture 3: Aggregation and Least Squares
History of Data Science, Spring 2022 @ UC San Diego
Suraj Rampure

Announcements
Homework 3 is released and is due on Sunday, April 17th at 11:59PM. Homework 1 is graded! Make sure to look at the solutions, posted on Slack and on the course website.

Agenda
Pythagorean means. Tycho Brahe’s use of the mean. A precursor to least squares: Boscovich’s method. Legendre, Gauss, and least squares.

Means

The concept of the “arithmetic mean” was known to the Pythagoreans; in fact, they are known for establishing three types of means. However, means were not used for the purposes of summarizing data until much, much later.

Pythagorean means
From Archytas (member of the Pythagorean school of thought)¹:
“There are three 'means' in music: one is the arithmetic, the second is the geometric, and the third is the subcontrary, which they call 'harmonic'. The arithmetic mean is when there are three terms showing successively the same excess: the second exceeds the third by the same amount as the first exceeds the second. In this proportion, the ratio of the larger numbers is less, that of the smaller numbers greater.”
In modern notation, the arithmetic mean of $a$ and $b$ is the $c$ satisfying $c - a = b - c$, i.e. $c = \frac{a + b}{2}$. In general, the arithmetic mean of $a_1, a_2, \dots, a_n$ is $\frac{a_1 + a_2 + \dots + a_n}{n}$.
1. http://www.cs.uni.edu/ campbell/stat/pyth.html

Pythagorean means
From Archytas (member of the Pythagorean school of thought)¹:
“The geometric mean is when the second is to the third as the first is to the second; in this, the greater numbers have the same ratio as the smaller numbers.”
In modern notation, the geometric mean of $a$ and $b$ is the $c$ satisfying $\frac{a}{c} = \frac{c}{b}$, i.e. $c = \sqrt{ab}$.
“The subcontrary, which we call harmonic, is as follows: by whatever part of itself the first term exceeds [...] of the third. In this proportion, the ratio of the larger numbers is larger, and of the lower numbers less.”
1. http://www.cs.uni.edu/ campbell/stat/pyth.html

General harmonic mean: $\frac{n}{\frac{1}{a_1} + \frac{1}{a_2} + \dots + \frac{1}{a_n}}$.
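The three Pythagorean means can be compared side by side with a short Python sketch. The function names and the sample values 2 and 8 are illustrative assumptions, not something taken from the slides.

```python
import math

def arithmetic_mean(values):
    # Sum of the values divided by how many there are.
    return sum(values) / len(values)

def geometric_mean(values):
    # n-th root of the product; assumes all values are positive.
    return math.prod(values) ** (1 / len(values))

def harmonic_mean(values):
    # n divided by the sum of reciprocals; assumes no value is zero.
    return len(values) / sum(1 / v for v in values)

# Illustrative (made-up) inputs.
a, b = 2.0, 8.0
am = arithmetic_mean([a, b])
gm = geometric_mean([a, b])
hm = harmonic_mean([a, b])
print(am, gm, hm)
```

For positive inputs the harmonic mean is never larger than the geometric mean, which in turn is never larger than the arithmetic mean.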

Tycho Brahe
Recall, Tycho Brahe (1546-1601) was a Danish astronomer.¹ He was a pioneer in measuring the positions of stars in the night sky, without the use of telescopes. Kepler used Brahe’s data when creating his laws of planetary motion. He is also one of the earliest scientists documented as having used the mean to combine observations.² He also supposedly lost his nose in a fight and wore a fake nose.
Figure: Tycho Brahe’s triangular sextant.
1. anish-astronomer
2. Pearson and Kendall, Studies in the History of Probability and Statistics, p122-123

Right ascension
One of the earliest documented examples of combining observations is in the work of Tycho Brahe, who was measuring the right ascension of α Arietis (a star). Right ascension is the celestial equivalent of longitude on Earth. It is measured in units of time, relative to when a reference point (the “vernal equinox”) passes overhead. For example, if an object’s right ascension is 2 hours and 15 minutes, you will see it pass directly above you 2 hours and 15 minutes after the reference point does. This is similar to GMT-8 meaning “8 hours behind Greenwich Mean Time.”
Figure label: apparent path of the sun.

Brahe collected several measurements for the right ascension of α Arietis from 1582-1588, with the goal of coming up with a single value. He selected 3 values from 1582, and 12 values from the next 6 years, each of which was the mean of two other observations. Question: how do we interpret these numbers and verify that he did indeed take the mean of each pair?
Source: Pearson and Kendall, Studies in the History of Probability and Statistics, p122-123

Aside: measuring time in degrees
Right ascension is measured in time, and can vary from 0 hours to 24 hours (because one rotation of the Earth takes 24 hours). A circle has 360º in it, so one way of describing time is using 360º = 24 hours. This means that 15º = 1 hour, and 1º = 4 minutes. We can further subdivide each degree into 60 arcminutes, denoted by ’, and each arcminute into 60 arcseconds, denoted by ”. As an example, let’s try and convert the following measurement into regular minutes: 82º 15’ 10”

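Worked example: 82º 15’ 10”
(1) Convert 82º 15’ 10” to degrees: $82 + \frac{15}{60} + \frac{10}{3600}$ degrees.
(2) Convert degrees to minutes, using 1º = 4 minutes: $4 \left( 82 + \frac{15}{60} + \frac{10}{3600} \right) = 328 + 1 + \frac{40}{3600} \approx 329$ minutes.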

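Worked example: 34º 13’ 4”
(1) Convert to degrees: $34 + \frac{13}{60} + \frac{4}{3600}$ degrees.
(2) Convert to minutes: $4 \left( 34 + \frac{13}{60} + \frac{4}{3600} \right) \approx 136.87$ minutes, i.e. a little over 2 hours, 16 minutes.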
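The degree-to-time conversion can be sketched in Python as follows. The function names are hypothetical; the only facts used are 360º = 24 hours (so 1º = 4 minutes), 60 arcminutes per degree, and 60 arcseconds per arcminute.

```python
def dms_to_degrees(degrees, arcminutes, arcseconds):
    # 1 degree = 60 arcminutes = 3600 arcseconds.
    return degrees + arcminutes / 60 + arcseconds / 3600

def dms_to_time_minutes(degrees, arcminutes, arcseconds):
    # 360 degrees = 24 hours, so 1 degree = 4 minutes of time.
    return 4 * dms_to_degrees(degrees, arcminutes, arcseconds)

# The measurement from the aside: 82 degrees, 15 arcminutes, 10 arcseconds.
minutes = dms_to_time_minutes(82, 15, 10)
hours, leftover_minutes = divmod(minutes, 60)
print(minutes, hours, leftover_minutes)
```

The result is 329 minutes plus a small fraction (40/3600 of a minute), i.e. about 5 hours and 29 minutes.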

Back to Brahe’s data
Now that we know how to interpret these numbers, we can verify that the operation Brahe used on each pair was the mean. Strategy: to compute mean(d1, d2), convert d1 and d2 to minutes (i.e. regular numbers) and compute their mean, then convert the mean back into degrees-arcminutes-arcseconds. Let’s try this in a Jupyter Notebook!
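A minimal sketch of this strategy is below. It is not the course notebook: the function names are hypothetical, the two observations are made-up values rather than Brahe’s data, and it converts to arcseconds instead of minutes (an assumption made so the arithmetic stays in whole numbers; any single unit works the same way).

```python
def dms_to_arcseconds(degrees, arcminutes, arcseconds):
    # Collapse a degrees-arcminutes-arcseconds triple into one number.
    return degrees * 3600 + arcminutes * 60 + arcseconds

def arcseconds_to_dms(total):
    # Undo the conversion above.
    degrees, rest = divmod(total, 3600)
    arcminutes, arcseconds = divmod(rest, 60)
    return degrees, arcminutes, arcseconds

def mean_dms(d1, d2):
    # Average two measurements in a single unit, then convert back.
    total = (dms_to_arcseconds(*d1) + dms_to_arcseconds(*d2)) / 2
    return arcseconds_to_dms(total)

# Hypothetical pair of observations (NOT Brahe's actual values).
obs1 = (25, 59, 50)
obs2 = (26, 1, 20)
result = mean_dms(obs1, obs2)
print(result)
```

To check one of Brahe’s rows, replace `obs1` and `obs2` with the pair from the table and compare `result` with the value he recorded.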

Reducing observational error
The values in the right-most column are far less spread out than the values in the middle column. As such, Brahe used the mean to eliminate systematic errors.¹ The final right ascension that Brahe reported was 26º 0’ 30”, which is very close to both the mean of all 15 numbers in the right column and the mean of just the bottom 12. Per his biographer¹, the correct value of the right ascension of α Arietis at the time was 26º 0’ 45”, which is quite close.
1. Pearson and Kendall, Studies in the History of Probability and Statistics, p122-123

The mean and least squares

For context
Without proper context, it may not be clear what aggregation (e.g. taking the mean or median of a set of values) has to do with least squares (which you learned in DSC 10 is the foundation of linear regression). This connection is made clearer in DSC 40A. We’ll spend a little bit of time providing this context, as we move into the origins of least squares.

Making predictions
As you’ve seen in DSC 10, the slope and intercept of the line of best fit come from finding the values of $a$ and $b$ that minimize mean squared error:
$MSE = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - (a + b x_i) \right)^2$
What if we want to use a simpler prediction technique: what if we want to make a constant prediction for each observation? To do this, we’d need to find the constant $c$ that minimizes mean squared error:
$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - c)^2$
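Both kinds of prediction can be tried out in a few lines of Python. This is a sketch under stated assumptions: the data are made up, the function names are hypothetical, and the slope and intercept use the standard closed-form least squares solution, with $a$ as the intercept and $b$ as the slope.

```python
def fit_line(xs, ys):
    # Closed-form least squares: b = S_xy / S_xx, a = y_bar - b * x_bar.
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

def mse_line(a, b, xs, ys):
    # Mean squared error of the predictions a + b * x.
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def mse_constant(c, ys):
    # Mean squared error of predicting the same constant c every time.
    return sum((y - c) ** 2 for y in ys) / len(ys)

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
line_error = mse_line(a, b, xs, ys)
constant_error = mse_constant(sum(ys) / len(ys), ys)
print(a, b, line_error, constant_error)
```

The fitted line can never have a larger mean squared error than the best constant, because a constant prediction is the special case $b = 0$.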

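For data $y_1, y_2, \dots, y_n$, define $R(c) = \frac{1}{n} \sum_{i=1}^{n} (y_i - c)^2$. The value of $c$ that minimizes this squared error is the mean $\bar{y}$, which makes the mean the “least squares” aggregation.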
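For the constant-prediction case, the minimizer can be found with a short calculus argument. The sketch below uses the standard derivation and writes $R(c)$ for the mean squared error of the constant prediction $c$; the slides may have presented it differently.

```latex
\begin{align*}
R(c) &= \frac{1}{n} \sum_{i=1}^{n} (y_i - c)^2 \\
\frac{dR}{dc} &= -\frac{2}{n} \sum_{i=1}^{n} (y_i - c) = 0
  \quad \Longrightarrow \quad \sum_{i=1}^{n} y_i = n c
  \quad \Longrightarrow \quad c^* = \frac{1}{n} \sum_{i=1}^{n} y_i = \bar{y} \\
\frac{d^2R}{dc^2} &= 2 > 0 \quad \text{(so } c^* \text{ is a minimum)}
\end{align*}
```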

Other types of error
Why do we minimize mean squared error? Instead of squaring the errors before taking the mean, is there another operation we could apply?
We could’ve used the absolute value instead, giving the mean absolute error: $\frac{1}{n} \sum_{i=1}^{n} |y_i - c|$.
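The choice of error changes which constant is best. The sketch below searches a grid of candidate constants on made-up data containing one outlier; the function names and the grid are assumptions for illustration. It relies on two standard facts: squared error is minimized by the mean, and absolute error is minimized by the median.

```python
import statistics

def mean_squared_error(c, ys):
    return sum((y - c) ** 2 for y in ys) / len(ys)

def mean_absolute_error(c, ys):
    return sum(abs(y - c) for y in ys) / len(ys)

# Made-up data with one large outlier.
ys = [1, 2, 3, 4, 100]

# Candidate constants 0.0, 0.1, ..., 100.0.
candidates = [i / 10 for i in range(0, 1001)]

best_for_mse = min(candidates, key=lambda c: mean_squared_error(c, ys))
best_for_mae = min(candidates, key=lambda c: mean_absolute_error(c, ys))

print(best_for_mse, statistics.mean(ys))
print(best_for_mae, statistics.median(ys))
```

The outlier pulls the squared-error minimizer far from most of the data, while the absolute-error minimizer stays at the middle value.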

Mean squared error vs. sum of squared errors
Minimizing mean squared error is the same as minimizing the sum of squared errors. Key idea: the value of $x$ that minimizes $f(x)$ is the same value of $x$ that minimizes $c \cdot f(x)$, if $c$ is some positive constant. Here the sum is $n$ times the mean, so minimizing either one is the same task. Many of the original authors we will study aimed to minimize the sum of squared errors, not the mean.

Boscovich’s method

The length of a meridian arc
A meridian arc is a curve drawn between two points on the surface of the Earth that have the same longitude. In the mid-1700s, geodesists were concerned with studying the shape of Earth. Earth is an ellipsoid that is slightly flatter at the poles than it is at the equator. Their goal at the time was to determine the relationship between the length of one degree of latitude near the North Pole and the length of one degree of latitude elsewhere on Earth. To do this, they measured the lengths of several meridian arcs.

Boscovich’s data
Roger Joseph Boscovich (1711-1787) was a Dalmatian astronomer, mathematician, and Jesuit priest. He obtained data containing the length of one degree of latitude at five different spots on Earth.
Annotations on the slide: Croatia, Ecuador, Finland.
Source: Stigler, Studies in the History of Probability and Statistics, p. 43

The model
Notation: $a$ is the arc length (known), $\theta$ is the latitude (known), and $z$ and $y$ are unknown.
A rough approximation for the length of an arc is $a = z + y \sin^2 \theta$, where $z$ is the length of a degree at the equator and $y$ is the “excess”. If $y = 0$, then the Earth is a perfect sphere, and meridian arcs are of the same length ($z$) at any latitude. If $y > 0$, the Earth is flatter towards the poles, and meridian arcs range from length $z$ at the equator to length $z + y$ at the North Pole.

An abundance of data
Recall the model $a = z + y \sin^2 \theta$. If Boscovich had just 2 observations, he’d have a system of two equations and two unknowns, and would be able to solve for $z$ and $y$. However, he had 5 observations, and had to deduce a method of computing $z$ and $y$ using all 5 observations. Ideas?
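With only two observations, the model $a = z + y \sin^2 \theta$ can be solved exactly. The sketch below does this by elimination; the function name and the two observations are hypothetical and are not Boscovich’s measurements.

```python
import math

def solve_two_observations(theta1_deg, a1, theta2_deg, a2):
    # Model: a = z + y * sin^2(theta). Two observations give two equations.
    s1 = math.sin(math.radians(theta1_deg)) ** 2
    s2 = math.sin(math.radians(theta2_deg)) ** 2
    # Subtract the equations to eliminate z, then back-substitute.
    y = (a2 - a1) / (s2 - s1)
    z = a1 - y * s1
    return z, y

# Hypothetical observations: one at the equator, one at the pole.
z, y = solve_two_observations(0, 100.0, 90, 101.0)
print(z, y)
```

With more than two observations the equations generally cannot all hold at once, which is why a rule for choosing $z$ and $y$ is needed.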

Boscovich’s method
For each of our five observations $(\theta_i, a_i)$, we can write $a_i = z + y \sin^2 \theta_i$. Boscovich described a method for selecting $z$ and $y$:
1. For each $i$, write $e_i = a_i - z - y \sin^2 \theta_i$.
2. Choose $z$ and $y$ such that $\sum_i e_i = 0$ and the sum of absolute errors $\sum_i |e_i|$ is minimized.
What does this resemble?
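One way to carry out these two steps numerically is sketched below. It is a modern reformulation, not Boscovich’s own procedure: requiring the errors to sum to zero forces the fitted line through the point of averages, so only $y$ remains to be chosen, and the sum of absolute errors is then a convex, piecewise-linear function of $y$ whose minimum sits at one of its kinks. The function name and the data are hypothetical.

```python
import math

def boscovich_fit(thetas_deg, arcs):
    xs = [math.sin(math.radians(t)) ** 2 for t in thetas_deg]
    n = len(xs)
    x_bar = sum(xs) / n
    a_bar = sum(arcs) / n

    # Errors summing to zero means z = a_bar - y * x_bar,
    # so each error becomes e_i = v_i - y * u_i with centered data.
    us = [x - x_bar for x in xs]
    vs = [a - a_bar for a in arcs]

    def sum_abs_errors(y):
        return sum(abs(v - y * u) for u, v in zip(us, vs))

    # The objective is piecewise linear in y with kinks at v_i / u_i,
    # so it is enough to check those candidate values.
    candidates = [v / u for u, v in zip(us, vs) if u != 0]
    y = min(candidates, key=sum_abs_errors)
    z = a_bar - y * x_bar
    return z, y

# Hypothetical latitudes (degrees) and arc lengths, NOT Boscovich's data.
thetas = [0, 20, 40, 60, 80]
arcs = [100.0, 100.3, 100.8, 101.4, 102.0]

z, y = boscovich_fit(thetas, arcs)
errors = [a - z - y * math.sin(math.radians(t)) ** 2 for t, a in zip(thetas, arcs)]
print(z, y, sum(errors))
```

The sketch assumes the latitudes are not all identical; otherwise there are no candidate values to check.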

Least squares

Legendre
Adrien-Marie Legendre (1752-1833) was a French mathematician who was also active in the field of geodesy.¹ In 1791, the French Academy of Sciences defined a meter as being one ten-millionth of the length of the meridian arc starting at the North Pole, passing through Paris, and ending at the equator. He helped measure the length of a meter.
1. Legendre

Legendre’s least squares
In an 1805 paper about measuring the orbits of comets, Legendre published an appendix titled “Sur la Méthode des moindres quarrés”, which detailed a general procedure for estimating coefficients of linear equations. He wrote (translated):
“Of all the principles which can be proposed for [making estimates from a sample], I think there is none more general, more exact, and more easy of application, than that of which we have made use which consists of rendering the sum of the squares of the errors a minimum.”

Gauss
Carl Friedrich Gauss (1777-1855)¹ was a German mathematician, and is one of the most accomplished mathematicians of all time. He is known for developing or contributing to least squares; the normal (Gaussian) distribution; algebra and number theory (he supposedly summed the positive integers between 1 and 100 very quickly); and electromagnetism. Not Gaussian elimination!
1. h-Gauss

Gauss and least squares
In 1809, Gauss published “Theory of the Motion of the Heavenly Bodies Moving About the Sun in Conic Sections”, and in it he used the method of least squares to calculate the shapes of orbits. Legendre published about least squares in 1805, 4 years before. However, Gauss claimed to have known about least squares in 1795. Evidence: Gauss was able to predict the precise location of planetoid Ceres using his method of least squares. Ceres was observed starting on January 1st, 1801, for a period of 40 days. Several astronomers competed to predict where it would be spotted again, and Gauss’ guess was the only correct one.²
1. h-Gauss
2. uss-and-the-method-of-least-squares

Error distributions
One of the key differences between the approaches to least squares by Gauss and Legendre was that Gauss linked the theory of least squares to probability theory. Specifically, he posed the least squares model
$y_i = a + b x_i + \epsilon_i$
where $\epsilon_i$ is a random variable that follows the following error distribution:
$\phi(x; \mu, \sigma) = \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}}$
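The link between this error distribution and least squares can be made explicit with a short likelihood calculation. The sketch below is the standard modern argument, not necessarily the route Gauss took, and it assumes the errors $\epsilon_i$ are independent with $\mu = 0$ and a common $\sigma$.

```latex
\begin{align*}
L(a, b) &= \prod_{i=1}^{n} \frac{1}{\sqrt{2 \pi \sigma^2}}
  \exp\!\left( -\frac{(y_i - a - b x_i)^2}{2 \sigma^2} \right) \\
\log L(a, b) &= -\frac{n}{2} \log(2 \pi \sigma^2)
  - \frac{1}{2 \sigma^2} \sum_{i=1}^{n} (y_i - a - b x_i)^2
\end{align*}
```

Only the last sum depends on $a$ and $b$, and it enters with a negative sign, so maximizing the likelihood is the same as minimizing the sum of squared errors $\sum_{i=1}^{n} (y_i - a - b x_i)^2$.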

We will study Gauss’ derivation of the (now-called) Gaussian/normal distribution in Lecture 5.

Summary, next time

Many of the advances in aggregation and statistical estimation from the 1500s to the 1800s were motivated by geodesy and astronomy: Tycho Brahe’s use of the mean, Boscovich’s method regarding meridian arcs, and Legendre’s method of least squares. Next time: Percentiles and regression.
