Spatial Big Data - University Of Helsinki


Spatial Big Data
Joe Niemi

Contents
1) Introduction
   - what is Spatial Big Data?
   - motivation
   - use cases
2) Cloud partitioning
3) PAIRS (A scalable Spatial Big Data analytics platform)
4) AQWA (Adaptive Query-Workload-Aware partitioning of Spatial Big Data)
5) Summary

1. Introduction

Spatial Data
- All types of data objects or elements that have geographical information present
- Enables the global finding and locating of individuals or devices
- Also known as geospatial data, spatial information, geographic information

Spatial Data
- Raster data
  - Geoimages (obtained by satellites, for example)
  - 3D objects
- Vector data
  - Points, lines, polygons
- Graph data
  - Road networks (an edge is a road segment and a node is an intersection)
- Topological coverage

Topological Coverage
- Contains both the location and attribute data

Spatial Big Data
- Spatial Big Data exceeds the capacity of commonly used spatial computing systems due to volume, variety and velocity
- Spatial Big Data comes from many different sources
  - satellites, drones, vehicles, geosocial networking services, mobile devices, cameras
- A significant portion of big data is in fact spatial big data

Types of Spatial Big Data
- Speed every minute for every road segment
- GPS trace data from cell phones
- Engine measurements of fuel consumption (can be estimated from fuel levels, distance travelled, and engine idling from engine RPM)
- Greenhouse gas emissions
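The estimate mentioned in the engine-measurement bullet can be sketched as simple arithmetic. This is a minimal illustration, assuming fuel levels in litres, distance in kilometres, and telemetry samples given as (rpm, speed) pairs; the function names and the sample format are hypothetical and do not come from the slides.

```python
def fuel_per_100km(fuel_start_l, fuel_end_l, distance_km):
    """Average consumption in litres per 100 km from two fuel-level readings."""
    if distance_km <= 0:
        raise ValueError("distance_km must be positive")
    return 100.0 * (fuel_start_l - fuel_end_l) / distance_km


def idling_seconds(samples, sample_period_s=1.0):
    """Time with the engine running (rpm > 0) while the vehicle is stationary.

    samples: iterable of (rpm, speed_kmh) pairs taken every sample_period_s seconds
    (an assumed telemetry format).
    """
    return sum(sample_period_s for rpm, speed_kmh in samples if rpm > 0 and speed_kmh == 0)
```

For instance, a trip that burns 6 litres over 100 km gives 6 l/100 km, and two one-second samples with a running engine at zero speed count as two seconds of idling.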

Motivation
SBD or GIS (Geographic Information System) helps with
- Better decision making
- Saving costs through greater efficiency
From ArcGIS: “Just about every problem and situation has a location aspect.”
- analyze spatial connections
- get information in real time
- spot location-related patterns that might previously have been undetected

Use cases for Spatial Big Data
1) Eco routing
2) Tracking endangered species
3) Better crop production, reducing costs
4) Detecting extreme events

Eco routing
- Next generation routing service
  - avoids congestion
  - reduces idling at red lights
  - avoids left turns
- Estimation: in 2020 about 600 billion is saved annually in terms of fuel and time
- Takes into account various datasets
  - real-time and historic traffic data of engine measurements
  - speed limits
  - road types
  - “rush hour vs non-rush hour”
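A routing service of this kind can be pictured as a shortest-path search whose edge weight is estimated fuel use rather than distance. The sketch below assumes a hypothetical graph format in which each road segment carries a length and two per-kilometre fuel rates (rush hour and non-rush hour). It uses plain Dijkstra and leaves out turn penalties and red lights, so it illustrates only the idea, not the service described in the slides.

```python
import heapq


def eco_route(graph, source, target, rush_hour):
    """Dijkstra over a road graph where edge cost is estimated fuel in litres.

    graph: {node: [(neighbour, length_km, litres_per_km_normal, litres_per_km_rush), ...]}
    Returns (path, total_litres), or (None, inf) if target is unreachable.
    """
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        for nbr, length_km, normal, rush in graph.get(node, []):
            new_cost = cost + length_km * (rush if rush_hour else normal)
            if new_cost < best.get(nbr, float("inf")):
                best[nbr] = new_cost
                prev[nbr] = node
                heapq.heappush(heap, (new_cost, nbr))
    if target not in best:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], best[target]
```

With a congested direct segment and a longer but free-flowing detour, the same call can return different routes depending on the `rush_hour` flag, which is the "rush hour vs non-rush hour" distinction the slide lists.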

Eco routing [figure]

Tracking endangered species
- Movebank: a free online database of animal tracking data
- 2013: 970 studies over 250 contributors, 41,170 tracks and 61 million locations

Better crop production
“If you can grow crop fast in these circumstances, query for similar places”

Detecting extreme events
- Earthquakes
- Wildfires
- Flooding
- Other calamities
How to detect
- Built-in motion detectors in mobile phones
- Unstructured data sets, such as tweets, can be used

Future
- New datasets: need to rapidly integrate new datasets and algorithms
- Computational cost increases as the diversity of Spatial Big Data grows
- Easy to collect: sensors (or sensor networks) are becoming more and more common (Internet of Things)

Features of Spatial Big Data
- Access to the data depends on the local time of day where it is used
- Changes dynamically
- Recent Spatial Big Data is usually generated at a very high speed

Challenges of Spatial Big Data
1) Retaining computational efficiency
2) Storing Spatial Big Data in the cloud
3) Repartitioning is needed when new data is added to Spatial Big Data or old data changes

2. Cloud partitioning

Cloud partitioning of Spatial Big Data
- If partitions are not being accessed, servers remain idle and the user is still charged.
- Most of the existing partitioning approaches co-locate frequently accessed data together to minimize distributed transactions
- Cloud providers often offer time-based pricing models: users are charged even when servers idle or have low CPU usage

Bad example: partitioning of Spatial Big Data
- 5 servers store data in Europe, 5 servers store data in USA
- half of the servers are idle for almost a day.

Good example: partitioning of Spatial Big Data
- 10 servers store data with diverse access patterns to minimize server idle time
- Main drawback: lag or latency problems due to data communication cost
- We need a cache for servers in Europe to contain frequently accessed data partitions in USA and vice versa

Good example: partitioning of Spatial Big Data
- 6 servers store data with diverse access patterns to minimize server idle time
- Main drawback: lag or latency problems due to data communication cost
- We need a cache for servers in Europe to contain frequently accessed data partitions in USA and vice versa

Efficient partitioning method
1) Split the dataset into partitions based on spatial proximity
   - maximizes query throughput
2) Find partitions with diverse access patterns and combine them
   - minimizes server idle time and maximizes server utilization
- A flatness metric is used to find the best possible pair. It shows how diverse the access patterns are.
- A tabu search algorithm is used that takes into account the history of moves and prevents non-improving moves from happening
- Saves up to 40% cost
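Step 2 can be illustrated with a small sketch. The slides do not define the flatness metric, so the code assumes one plausible choice: the mean-to-peak ratio of the combined hourly load, where 1.0 means the server is equally busy in every hour. It also swaps the tabu search for a simple greedy pairing to stay short, and the partition names and hourly patterns (on one shared clock) are hypothetical.

```python
from itertools import combinations


def flatness(pattern):
    """Mean-to-peak ratio of an hourly load pattern (assumed metric, 1.0 = perfectly flat)."""
    peak = max(pattern)
    return sum(pattern) / (len(pattern) * peak) if peak > 0 else 0.0


def combine(a, b):
    """Hour-by-hour load of a server hosting both partitions."""
    return [x + y for x, y in zip(a, b)]


def greedy_pairs(patterns):
    """Greedily pair partitions so that each pair's combined load is as flat as possible.

    patterns: {partition_name: [load_hour_0, ..., load_hour_23]}
    This is a greedy stand-in for the tabu search mentioned in the slides.
    """
    remaining = set(patterns)
    pairs = []
    while len(remaining) >= 2:
        p, q = max(
            combinations(sorted(remaining), 2),
            key=lambda pq: flatness(combine(patterns[pq[0]], patterns[pq[1]])),
        )
        pairs.append((p, q))
        remaining -= {p, q}
    return pairs
```

A partition that is busy for only half of the day scores 0.5 on its own. Pairing it with a partition that is busy during the other half gives a combined score of 1.0, which corresponds to the "good example" of mixing European and US data on the same servers.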

An easier way to maximize server utilization
- In Amazon, based on user-defined rules, scale down to a cheaper server if CPU usage is less than 40 percent
- Does not take into account server idle time (they still have to pay for the cheapest server)
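The rule and its limitation can be written down in a few lines. This is only a sketch of the rule as stated on the slide: the tier names are hypothetical, and no real cloud provider API is used.

```python
# Hypothetical server tiers, ordered from most to least expensive.
TIERS = ["large", "medium", "small"]


def next_tier(current, cpu_usage_percent, threshold=40.0):
    """Move one step cheaper if CPU usage is below the threshold, otherwise stay."""
    i = TIERS.index(current)
    if cpu_usage_percent < threshold and i < len(TIERS) - 1:
        return TIERS[i + 1]
    return current
```

Note that `next_tier("small", 0)` still returns `"small"`: a completely idle server ends up on the cheapest tier but is never released, which is the drawback the slide points out.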

3. PAIRS (A scalable Spatial Big Data analytics platform)

PAIRS
- PAIRS: Physical Analytics Integrated Repository and Services
- is a cloud service deployed on top of Hadoop and HBase
- Automatically updates, joins and homogenizes historical and real-time spatial big data that is then available for real-time modeling and analytics
- Data is indexed globally
- Data queries of an area or a single point are parallelized by MapReduce
  - for example, a query for a single point (latitude, longitude) for a data layer with daily information for a 10-year period can be retrieved in less than 1 second.

Global indexing [figure]

PAIRS
- Eliminates data preprocessing by having all data layers curated and homogenized before being uploaded to the platform
- Data curation means “organization and integration of data collected from various sources so that the value of the data is maintained over time, and the data remains available for reuse and preservation”
- The challenging task is to process unstructured data

PAIRS
- PAIRS architecture as a cloud service, where a query retrieves metadata from a relational database (PostgreSQL) and pulls spatial data from HBase

4. AQWA (Adaptive Query-Workload-Aware partitioning of Spatial Big Data)

Motivation
- Existing cluster-based systems for processing spatial big data use static partitioning methods that cannot efficiently react to data changes
- SpatialHadoop supports static partitioning to handle spatial big data
- Static partitioning copes badly with a changing query workload

Overview of AQWA
Two main components:
1) a k-d tree of the data
2) a set of main-memory structures
   - statistics of data distribution and the queries to data
   - flushed to disk in case of a system failure

Overview of AQWA
Four processes:
1) Initialization
2) Query Execution
3) Data Acquisition
4) Repartitioning

Partitioning of AQWA
“Partitioned areas that are queried with high frequency need to be partitioned much more often in comparison to other less queried areas”
- significant savings in query processing time

Partitioning of AQWA
An example of a k-d tree with 7 leaf partitions [figure]
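To make the k-d tree idea concrete, here is a minimal sketch of a k-d tree whose leaves act as partitions. It assumes 2-D points, median splits that alternate between x and y, and a hypothetical `capacity` parameter for the maximum leaf size. It is not AQWA's partitioner, which chooses splits from the query workload rather than from the median.

```python
def build_kd(points, capacity, depth=0):
    """Split points at the median, alternating x and y, until each leaf fits capacity."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    if len(points) <= capacity:
        return {"leaf": True, "points": list(points)}
    axis = depth % 2
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {
        "leaf": False,
        "axis": axis,
        "split": pts[mid][axis],  # left holds values <= split, right holds values >= split
        "left": build_kd(pts[:mid], capacity, depth + 1),
        "right": build_kd(pts[mid:], capacity, depth + 1),
    }


def leaves_overlapping(node, lo, hi):
    """Yield the leaf partitions that may hold points inside the rectangle [lo, hi]."""
    if node["leaf"]:
        yield node
        return
    axis, split = node["axis"], node["split"]
    if lo[axis] <= split:
        yield from leaves_overlapping(node["left"], lo, hi)
    if hi[axis] >= split:
        yield from leaves_overlapping(node["right"], lo, hi)


def count_leaves(node):
    """Number of leaf partitions below node."""
    return 1 if node["leaf"] else count_leaves(node["left"]) + count_leaves(node["right"])
```

A range query only has to read the leaves returned by `leaves_overlapping`, so a query that falls inside one small partition avoids scanning the rest of the data.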

Partitioning of AQWA
- Repartitioning of the spatial big data helps with query workload

Partitioning of AQWA
1) How do I know how many queries overlap a square?
2) Why not split all of the data into small pieces?
3) How to efficiently determine the best split?

1) How do I know how many queries overlap a square?
- You can get the answer in constant time O(1)
- For each grid cell, main memory holds the query count and the data item count
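One standard way to get constant-time totals over a rectangular block of grid cells is a 2-D prefix-sum table, sketched below for per-cell data item counts. This is an assumption about how such a lookup could work, not a description of AQWA's actual structures. Counting distinct queries that overlap a region needs extra bookkeeping so that a query spanning several cells is not counted more than once, which is not shown here.

```python
def build_prefix(counts):
    """Return a table where prefix[r][c] is the sum of counts[0..r-1][0..c-1]."""
    rows, cols = len(counts), len(counts[0])
    prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            prefix[r + 1][c + 1] = (
                counts[r][c] + prefix[r][c + 1] + prefix[r + 1][c] - prefix[r][c]
            )
    return prefix


def region_sum(prefix, r0, c0, r1, c1):
    """Sum over grid cells r0..r1 and c0..c1 (inclusive) using four table lookups."""
    return prefix[r1 + 1][c1 + 1] - prefix[r0][c1 + 1] - prefix[r1 + 1][c0] + prefix[r0][c0]
```

Building the table costs one pass over the grid, and after that every `region_sum` call touches exactly four entries regardless of how large the region is.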

2) Why not just split all of the data into small pieces?
- Main memory becomes a performance bottleneck
- We have a max size for each partition (the block size, for example 128 MB in HDFS, is the minimum size for a partition)

3) How to efficiently determine the best split?
- Priority queue
- History of all queries that have been processed
- Time-fading weights to avoid unnecessary partitioning
- Cost function integrates the data distribution and the query workload
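The interplay of these ingredients can be sketched in one dimension. The code assumes a plausible cost model in which every query that overlaps a partition scans all of its points, so cost is points times query weight, and it assumes exponential decay with a hypothetical `half_life` for the time-fading weights. The slides do not give the exact formulas, and the priority queue of candidate partitions is left out.

```python
def faded_count(query_times, now, half_life):
    """Weight each past query by 0.5 ** (age / half_life) so that old queries fade out."""
    return sum(0.5 ** ((now - t) / half_life) for t in query_times)


def cost(points, queries, lo, hi):
    """Assumed cost of partition [lo, hi): points inside times weight of overlapping queries.

    queries: list of (q_lo, q_hi, weight) closed intervals.
    """
    n = sum(1 for x in points if lo <= x < hi)
    w = sum(weight for q_lo, q_hi, weight in queries if q_lo < hi and q_hi >= lo)
    return n * w


def best_split(points, queries, lo, hi):
    """Return (split, cost) for the cheapest split, or (None, cost) if no split helps."""
    best = (None, cost(points, queries, lo, hi))
    for s in sorted(set(x for x in points if lo < x < hi)):
        c = cost(points, queries, lo, s) + cost(points, queries, s, hi)
        if c < best[1]:
            best = (s, c)
    return best
```

When all queries hit one corner of a partition, the best split isolates that corner so the remaining points are no longer scanned. With fading weights, a burst of old queries eventually stops justifying a split, which is how unnecessary partitioning is avoided in this sketch.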

Summary
Usage of spatial big data depends on
- the location of the user
- the time of day of access
Most of the spatial big data is dynamic
- the query workload of spatial big data can change and you should react to it
- new data is applied on an hourly / daily basis
Spatial big data has many different use cases

Summary
To efficiently handle spatial big data
- the data should have diverse access patterns in each cluster
- it needs to be repartitioned according to query workload changes
  - areas that are queried with high frequency should be partitioned more often in comparison to less queried areas
  - avoid partitioning from scratch
  - use history of the workload with fading weights

References
- Spatial big-data challenges intersecting mobility and cloud computing. Authors: Shekhar, Shashi and Gunturi, Viswanath and Evans, Michael R and Yang, KwangSoo. Year 2012
- Geospatial big data: challenges and opportunities. Authors: Lee, Jae-Gil and Kang, Minseo. Year 2015
- PAIRS: A scalable geo-spatial data analytics platform. Authors: Klein, Levente J and Marianno, Fernando J and Albrecht, Conrad M and Freitag, Marcus and Lu, Siyuan and Hinds, Nigel and Shao, Xiaoyan and Bermudez Rodriguez, Sergio and Hamann, Hendrik F. Year 2015
- Cost-efficient partitioning of spatial data on cloud. Authors: Akdogan, Afsin and Indrakanti, Saratchandra and Demiryurek, Ugur and Shahabi, Cyrus. Year 2015
- AQWA: adaptive query workload aware partitioning of big spatial data. Authors: Aly, Ahmed M and Mahmood, Ahmed R and Hassan, Mohamed S and Aref, Walid G and Ouzzani, Mourad and Elmeleegy, Hazem and Qadah, Thamir. Year 2015

Questions?
