Introduction And Database Technology - Universiteit Leiden


Introduction and Database Technology
By E.M. Bakker
September 11, 2012, Databases and Data Mining

DBDM Introduction
Databases and Data Mining projects at LIACS:
- Biological and medical databases and data mining: CMSB (Phenotype Genotype), DIAL CGH DB
- Cyttron: Visualization of the Cell
- GRID computing: VLe (Virtual Lab e-Science environments), DAS3/DAS4 supercomputer
Research on fundamentals of databases and data mining:
- Database integration
- Data mining algorithms
- Content-based retrieval

DBDM
Databases (Chapters 1-7):
- The Evolution of Database Technology
- Data Preprocessing
- Data Warehouse (OLAP) & Data Cubes
- Data Cubes Computation
- Grand Challenges and State of the Art

DBDM
Data Mining (Chapters 8-11):
- Introduction and Overview of Data Mining
- Data Mining Basic Algorithms
- Mining data streams
- Mining Sequence Patterns
- Graph Mining

DBDM
Further Topics
- Mining object, spatial, multimedia, text and Web data
  - Mining complex data objects
  - Spatial and spatiotemporal data mining
  - Multimedia data mining
  - Text mining
  - Web mining
- Applications and trends of data mining
  - Mining business & biological data
  - Visual data mining
  - Data mining and society: privacy-preserving data mining

Evolution of Database Technology

Evolution of Database Technology
- 1960s: (Electronic) data collection, database creation, IMS (hierarchical database system by IBM) and network DBMS
- 1970s: Relational data model, relational DBMS implementation
- 1980s: RDBMS, advanced data models (extended-relational, OO, deductive, etc.); application-oriented DBMS (spatial, scientific, engineering, etc.)

Evolution of Database Technology
- 1990s: Data mining, data warehousing, multimedia databases, and Web databases
- 2000s:
  - Stream data management and mining
  - Data mining and its applications
  - Web technology
  - XML
  - Data integration
  - Social networks
  - Cloud computing
  - Global information systems

The Future of the Past
- The Past and Future of 1997: "Database Systems: A Textbook Case of Research Paying Off." By J.N. Gray, Microsoft, 1997.
- The Future of 1996: "Database Research: Achievements and Opportunities Into the 21st Century." By Silberschatz, M. Stonebraker, J. Ullman (Eds.). SIGMOD Record, Vol. 25, No. 1, pp. 52-63, March 1996.
- "'One Size Fits All': An Idea Whose Time Has Come and Gone." By M. Stonebraker, U. Cetintemel. Proceedings of the 2005 International Conference on Data Engineering, April 2005, http://ww.cs.brown.edu/ ugur/fits all.pdf

"Database Systems: A Textbook Case of Research Paying Off", J.N. Gray, zowska/cra/database.html

Industry Profile (1994) (1/2)
- The database industry had $7 billion in revenue in 1994, growing at 35% per year.
- Second only to operating system software.
- All of the leading corporations are US-based: IBM, Oracle, Sybase, Informix, Computer Associates, and Microsoft.
- Specialty vendors: Tandem (fault-tolerant transaction processing systems); AT&T-Teradata (data mining systems).

Industry Profile (1994) (2/2)
- Small companies for application-specific databases: text retrieval, spatial and geographical data, scientific data, image data, etc.
- Emerging group of companies: object-oriented databases.
- Desktop databases: an important market focused on extreme ease-of-use, small size, and disconnected operation.

Worldwide Vendor Revenue Estimates from RDBMS Software, Based on Total Software Revenue, 2006 (Millions of Dollars)
[Table columns: Company; 2006; 2006 Market Share (%); 2005; 2005 Market Share (%); 2005-2006 Growth. Rows end with Other Vendors and Total. The cell values are not preserved in this transcription.]
Source: Gartner Dataquest (June 2007)

Historical Perspective
36 years of Database Research
Period 1960 - 1996

Historical Perspective (1960-)
- Companies began automating their back-office bookkeeping in the 1960s.
- COBOL and its record-oriented file model were the work-horses of this effort.
- Typical work-cycle:
  1. a batch of transactions was applied to the old-tape-master
  2. a new-tape-master was produced
  3. printout for the next business day.
- COBOL: COmmon Business-Oriented Language (COBOL 2002 standard)

COBOL
A quote by Prof. dr. E.W. Dijkstra, 18 June 1975:
"The use of COBOL cripples the mind; its teaching should, therefore, be regarded as a criminal offence."
But: in 2012 there are still vacancies available for COBOL programmers.

COBOL Code (just an example!)

       01  LOAN-WORK-AREA.
           03  LW-LOAN-ERROR-FLAG    PIC 9(01)        COMP.
           03  LW-LOAN-AMT           PIC 9(06)V9(02)  COMP.
           03  LW-INT-RATE           PIC 9(02)V9(02)  COMP.
           03  LW-NBR-PMTS           PIC 9(03)        COMP.
           03  LW-PMT-AMT            PIC 9(06)V9(02)  COMP.
           03  LW-INT-PMT            PIC 9(01)V9(12)  COMP.
           03  LW-TOTAL-PMTS         PIC 9(06)V9(02)  COMP.
           03  LW-TOTAL-INT          PIC 9(06)V9(02)  COMP.
      *
       004000-COMPUTE-PAYMENT.
      *
           MOVE 0 TO LW-LOAN-ERROR-FLAG.
           IF (LW-LOAN-AMT = ZERO)
               OR (LW-INT-RATE = ZERO)
               OR (LW-NBR-PMTS = ZERO)
               MOVE 1 TO LW-LOAN-ERROR-FLAG
               GO TO 004000-EXIT.
           COMPUTE LW-INT-PMT = LW-INT-RATE / 1200
               ON SIZE ERROR
                   MOVE 1 TO LW-LOAN-ERROR-FLAG
                   GO TO 004000-EXIT.

Historical Perspective (1970’s)
- Transition from handling transactions in daily batches to systems that managed an on-line database that could capture transactions as they happened.
- At first these systems were ad hoc.
- Late in the 60’s, "network" and "hierarchical" database products emerged.
- A network data model standard (DBTG) was defined, which formed the basis for most commercial systems during the 1970’s.
- In 1980 DBTG-based Cullinet was the leading software company.

Network Model
- Hierarchical model: a tree of records, with each record having one parent record and many children.
- Network model: each record can have multiple parent and child records, i.e. a lattice of records.
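
The difference between the two models can be made concrete with a small sketch. The `Record` class, the `link` helper, and the record names below are hypothetical illustrations, not part of any DBTG or IMS product: the hierarchical variant refuses a second parent, while the network variant accepts it.

```python
class Record:
    """A record that knows its parent and child records (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.parents = []
        self.children = []


def link(parent, child, single_parent=False):
    """Connect two records; with single_parent=True the tree rule is enforced."""
    if single_parent and child.parents:
        raise ValueError(f"{child.name} already has a parent")
    parent.children.append(child)
    child.parents.append(parent)


# Hierarchical model: Employee can hang under exactly one parent record.
dept_h, proj_h, emp_h = Record("Department"), Record("Project"), Record("Employee")
link(dept_h, emp_h, single_parent=True)
try:
    link(proj_h, emp_h, single_parent=True)
    second_parent_allowed = True
except ValueError:
    second_parent_allowed = False

# Network model: the same Employee record is owned by Department and Project.
dept_n, proj_n, emp_n = Record("Department"), Record("Project"), Record("Employee")
link(dept_n, emp_n)
link(proj_n, emp_n)
network_parents = [p.name for p in emp_n.parents]
```

In the hierarchical case, an employee who works on a project has to be duplicated under the project or reached by some other means; the network model stores the record once and lets both owners point to it.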

Historical Perspective
DBTG problems:
- DBTG used a procedural language that was low-level and record-at-a-time.
- The programmer had to navigate through the database, following pointers from record to record.
- If the database was redesigned, then all the old programs had to be rewritten.

The "relational" data model
The "relational" data model, introduced by Ted Codd in his landmark 1970 article "A Relational Model of Data for Large Shared Data Banks", was a major advance over DBTG.
- The relational model unified data and metadata, so there is only one form of data representation.
- A non-procedural data access language based on algebra or logic.
- The data model is easier to visualize and understand than the pointers-and-records-based DBTG model.
- Programs are written in terms of the "abstract model" of the data, rather than the actual database design, so programs are insensitive to changes in the database design.
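
A minimal sketch of the contrast, using Python's standard `sqlite3` module for the relational side. The pointer-linked `employees` records, the table layout, and both function names are assumptions made for illustration; real DBTG programs used a COBOL-embedded navigation language, which is only imitated here.

```python
import sqlite3

# Navigational style (hypothetical records): the department points to its
# first employee, and each employee record points to the next one.
employees = {
    1: {"name": "Ann", "salary": 5200, "next": 2},
    2: {"name": "Bob", "salary": 3900, "next": 3},
    3: {"name": "Cy", "salary": 6100, "next": None},
}
department = {"name": "Research", "first_employee": 1}


def well_paid_navigational(dept, threshold):
    """Follow pointers record by record, as a DBTG-style program would."""
    names = []
    ptr = dept["first_employee"]
    while ptr is not None:
        rec = employees[ptr]
        if rec["salary"] > threshold:
            names.append(rec["name"])
        ptr = rec["next"]
    return names


# Relational style: state what is wanted and let the DBMS decide how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Ann", "Research", 5200), ("Bob", "Research", 3900), ("Cy", "Research", 6100)],
)


def well_paid_relational(dept_name, threshold):
    """Declarative query: no pointers, no record-at-a-time loop."""
    rows = conn.execute(
        "SELECT name FROM employee WHERE dept = ? AND salary > ? ORDER BY name",
        (dept_name, threshold),
    )
    return [row[0] for row in rows]
```

If the pointer layout of `employees` changes, `well_paid_navigational` has to be rewritten; the SQL query only depends on the table's column names.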

The "relational" data model success
- Both industry and university research communities embraced the relational data model and extended it during the 1970s.
- It was shown that a high-level relational database query language could give performance comparable to the best record-oriented database systems. (!)
- This research produced a generation of systems and people that formed the basis for IBM's DB2, Ingres, Sybase, Oracle, Informix and others.

The "relational" data model success: SQL
- The SQL relational database language was standardized between 1982 and 1986.
- By 1990, virtually all database systems provided an SQL interface (including network, hierarchical and object-oriented database systems).

Ingres at UC Berkeley in 1972 (1/2)
Inspired by Codd's work on the relational database model, Stonebraker, Rowe, Wong, and others started a project that resulted in:
- the design and build of a relational database system
- the query language (QUEL)
- relational optimization techniques
- a language binding technique
- storage strategies
- pioneering work on distributed databases

Ingres at UC Berkeley in 1972 (2/2)
The academic system evolved into Ingres from Computer Associates.
Nowadays: PostgreSQL; also the basis for a new object-relational system.
Further work on:
- distributed databases
- database inference
- active databases (automatic response)
- extensible databases.

IBM: System R (1/2)
- Codd's ideas were inspired by the problems with the DBTG network data model and with IBM's product based on this model (IMS).
- Codd's relational model was very controversial:
  - too simplistic
  - could never give good performance.
- IBM Research chartered a 10-person effort to prototype a relational system; the result was a prototype, System R (which evolved into the DB2 product).

IBM: System R (2/2)
- Defined the fundamentals of:
  - query optimization,
  - data independence (views),
  - transactions (logging and locking), and
  - security (the grant-revoke model).
- SQL from System R became more or less the standard.
- The System R group's further research:
  - distributed databases (project R*) and
  - object-oriented extensible databases (project Starburst).
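
Two of these fundamentals, views and transactions, can be tried out with Python's standard `sqlite3` module. This is only a sketch under assumed names (`account`, `owner_balance`, `transfer`, `balances`); SQLite has no GRANT/REVOKE, so the security model is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)", [(1, "Ann", 100), (2, "Bob", 50)]
)
# Data independence: applications read the view, not the base table.
conn.execute("CREATE VIEW owner_balance AS SELECT owner, balance FROM account")
conn.commit()


def transfer(src, dst, amount):
    """Move money atomically: either both updates are kept or neither is."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            (remaining,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if remaining < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
        return True
    except ValueError:
        return False


def balances():
    """Read through the view; the base table could be redesigned behind it."""
    return dict(conn.execute("SELECT owner, balance FROM owner_balance"))
```

A failed transfer leaves no trace because the first `UPDATE` is rolled back together with the rest of the transaction.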

The database research agenda of the 1980’s
Extending relational databases:
- geographically distributed databases
- parallel data access.
Theoretical work on distributed databases led to prototypes, which in turn led to products.
- Note: Today, all the major database systems offer the ability to distribute and replicate data among nodes of a computer network.
Execution of each of the relational data operators in parallel gives hundred-fold and thousand-fold speedups.
- Note: The results of this research appear nowadays in the products of several major database companies. Especially beneficial for data warehousing and decision support systems; effective application in the area of OLTP is challenging.

USA funded database research, period 1970 - 1996:
- Projects at UCLA led to Teradata.
- Projects at CCA (SDD-1, Daplex, Multibase, and HiPAC): distributed database technology, object-oriented database technology.
- Projects at Stanford: deductive database technology, data integration technology, query optimization technology.
- Projects at CMU: general transaction models, leading to Transarc Corporation.

The Future of 1997 (Gray): Conclusions (1/4)
- Database systems continue to be a key aspect of Computer Science & Engineering.
- Representing knowledge within a computer is one of the central challenges of the field.
- Database research has focused primarily on this fundamental issue.

The Future of 1997 (Gray): Conclusions (2/4)
There continues to be active and valuable research on:
- representing and indexing data,
- adding inference to data search: inductive reasoning,
- compiling queries more efficiently,
- executing queries in parallel.

The Future of 1997 (Gray): Conclusions (3/4)
There continues to be active and valuable research on:
- integrating data from heterogeneous data sources,
- analyzing performance, and
- extending the transaction model to handle long transactions and workflow (transactions that involve human as well as computer steps).
The availability of very-large-scale (tertiary) storage devices has prompted the study of models for queries on very slow devices.

The Future of 1997 (Gray): Conclusions (4/4)
- Unifying object-oriented concepts with the relational model.
- New datatypes (image, document, drawing) are best viewed as the methods that implement them rather than the bytes that represent them.
- By adding procedures to the database system, one gets active databases, data inference, and data encapsulation. The object-oriented approach is an area of active research.

The Future of 1996
Database Research: Achievements and Opportunities Into the 21st Century.
Silberschatz, M. Stonebraker, J. Ullman (Eds.)
SIGMOD Record, Vol. 25, No. 1, pp. 52-63, March 1996

New Database Applications (1996)
- EOSDIS (Earth Observing System Data and Information System)
- Electronic Commerce
- Health-Care Information Systems
- Digital Publishing
- Collaborative Design

EOSDIS (Earth Observing System Data and Information System)
Challenges:
- Providing on-line access to petabyte-sized databases and managing tertiary storage effectively.
- Supporting thousands of information consumers with a very heavy volume of information requests, including ad-hoc requests and standing orders for daily updates.
- Providing effective mechanisms for browsing and searching for the desired data.

Electronic Commerce
- Heterogeneous information sources must be integrated. For example, something called a "connector" in one catalog may not be a "connector" in a different catalog.
- This "schema integration" is a well-known and extremely difficult problem.
Electronic commerce needs:
- Reliable
- Distributed
- Authentication
- Funds transfer.

Health-Care Information Systems
Transforming the health-care industry to take advantage of what is now possible will have a major impact on costs, and possibly on quality and ubiquity of care as well.
Problems to be solved:
- Integration of heterogeneous forms of legacy information.
- Access control to preserve the confidentiality of medical records.
- Interfaces to information that are appropriate for use by all health-care professionals.

Digital Publishing
- Management and delivery of extremely large bodies of data at very high rates. Typical data consists of very large objects in the megabyte to gigabyte range (1996).
- Delivery with real-time constraints.
- Protection of intellectual property, including cost-effective collection of small payments and inhibitions against reselling of information.
- Organization of and access to overwhelming amounts of information.

The Information Superhighway
Databases and database technology will play a critical role in this information explosion. Already Webmasters (administrators of World-Wide-Web sites) are realizing that they are database administrators.

Support for Multimedia Objects (1996)
- Tertiary storage (for petabyte storage): tape silos, disk juke-boxes.
- New data types: the operations available for each type of multimedia data, and the resulting implementation tradeoffs; the integration of data involving several of these new types.
- Quality of service: timely and realistic presentation of the data? Graceful degradation of service? Can we interpolate or extrapolate some of the data? Can we reject new service requests or cancel old ones?
- Multi-resolution queries
- Content-based retrieval
- User interface support

New Research Directions (1996)
- Problems associated with putting multimedia objects into DBMSs.
- Problems involving new paradigms for distribution of information.
- New uses of databases: data mining, data warehouses, repositories.
- New transaction models: workflow management, alternative transaction models.
- Problems involving ease of use and management of databases.

Conclusions of the Forum (1996)
The database research community has a foundational role in creating the technological infrastructure from which database advancements evolve.
- New research mandate because of the explosions in hardware capability, hardware capacity, and communication (including the internet or "web" and mobile communication).
- The explosion of digitized information requires the solution to significant new research problems:
  - support for multimedia objects and new data types
  - distribution of information
  - new database applications
  - workflow and transaction management
  - ease of database management and use

"One Size Fits All": An Idea Whose Time Has Come and Gone.
M. Stonebraker, U. Cetintemel
Proceedings of the 2005 International Conference on Data Engineering, April 2005
http://ww.cs.brown.edu/ ugur/fits all.pdf

DBMS Services Overview (figure)

DBMS: "One size fits all."
A single code line with all DBMS services solves:
- Cost problem: maintenance costs of a single code line.
- Compatibility problem: all applications will run against the single code line.
- Sales problem: easier to sell a single code line solution to a customer.
- Marketing problem: a single code line has an easier market positioning than multiple code line products.

DBMS: "One size fits all."
- To avoid these problems, all the major DBMS vendors have followed the adage "put all wood behind one arrowhead".
- In this paper it is argued that this strategy has failed already, and will fail more dramatically off into the future.

Data Warehousing
- In the early 1990’s, a new trend appeared: enterprises wanted to gather together data from multiple operational databases into a data warehouse for business intelligence purposes.
- A typical large enterprise has 50 or so operational systems, each with an on-line user community who expect fast response time.
- System administrators were (and still are) reluctant to allow business-intelligence users onto the same systems, fearing that the complex ad-hoc queries from these users will degrade response time for the on-line community.
- In addition, business-intelligence users often want to see historical trends, as well as correlate data from multiple operational databases. These features are very different from those required by on-line users.

Data Warehousing
Data warehouses are very different from Online Transaction Processing (OLTP) systems:
- OLTP systems have been optimized for updates, as the main business activity is typically to sell a good or service.
- In contrast, the main activity in data warehouses is ad-hoc queries, which are often quite complex.
- Hence, periodic load of new data interspersed with ad-hoc query activity is what a typical warehouse experiences.

Data Warehousing
The standard wisdom in data warehouse schemas is to create a fact table: "who, what, when, where" about each operational transaction.
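
As a rough illustration of such a fact table, the sketch below builds a tiny star schema with Python's standard `sqlite3` module. All table names, column names, and sample rows are invented for the example; the slide prescribes only the "who, what, when, where" idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables describe the entities a transaction refers to.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product  (product_id  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE store    (store_id    INTEGER PRIMARY KEY, city TEXT);

    -- Fact table: one row per operational transaction.
    CREATE TABLE sales_fact (
        customer_id INTEGER REFERENCES customer,  -- who
        product_id  INTEGER REFERENCES product,   -- what
        sale_date   TEXT,                         -- when
        store_id    INTEGER REFERENCES store,     -- where
        amount      INTEGER
    );

    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO product  VALUES (1, 'book'), (2, 'pen');
    INSERT INTO store    VALUES (1, 'Leiden'), (2, 'Delft');
    INSERT INTO sales_fact VALUES
        (1, 1, '2012-09-10', 1, 20),
        (2, 1, '2012-09-10', 2, 20),
        (1, 2, '2012-09-11', 1, 5);
    """
)


def revenue_by_city():
    """A typical ad-hoc warehouse query: join the fact table to a dimension."""
    return conn.execute(
        """
        SELECT s.city, SUM(f.amount)
        FROM sales_fact AS f JOIN store AS s ON s.store_id = f.store_id
        GROUP BY s.city
        ORDER BY s.city
        """
    ).fetchall()
```

The fact table stays narrow (keys plus measures), and every descriptive attribute lives in a dimension table that queries join in as needed.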

Data Warehousing
- Data warehouse applications run much better using bit-map indexes.
- OLTP (Online Transaction Processing) applications prefer B-tree indexes.
- Materialized views are a useful optimization tactic in data warehousing, but not in OLTP worlds.

Data Warehousing: Bitmaps
As a first approximation, most vendors have a warehouse DBMS (bit-map indexes, materialized views, star schemas and optimizer tactics for star schema queries) and an OLTP DBMS (B-tree indexes and a standard cost-based optimizer), which are united by a common parser.
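
To show why bit-map indexes suit read-mostly warehouse queries, here is a minimal sketch that stores one bitmap per column value in a Python integer. The sample rows and the helper names `build_bitmap_index` and `matching_rows` are assumptions; production systems use compressed bitmaps rather than plain integers.

```python
from collections import defaultdict

rows = [
    {"gender": "M", "region": "north"},  # row 0
    {"gender": "F", "region": "south"},  # row 1
    {"gender": "F", "region": "north"},  # row 2
    {"gender": "M", "region": "south"},  # row 3
    {"gender": "F", "region": "north"},  # row 4
]


def build_bitmap_index(table, column):
    """Return {value: bitmap}; bit i is set when table[i][column] == value."""
    index = defaultdict(int)
    for i, row in enumerate(table):
        index[row[column]] |= 1 << i
    return dict(index)


def matching_rows(bitmap):
    """Decode a bitmap into the positions of the rows whose bit is set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


gender_index = build_bitmap_index(rows, "gender")
region_index = build_bitmap_index(rows, "region")

# WHERE gender = 'F' AND region = 'north' becomes a single bitwise AND.
female_north = gender_index["F"] & region_index["north"]
```

Combining predicates costs one bitwise operation per word of bitmap, which is cheap for ad-hoc queries; an update, by contrast, may have to touch several bitmaps, which is one reason OLTP systems prefer B-trees.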

Emerging Applications
Some other examples that show why conventional DBMSs will not perform well on the current emerging applications.

Emerging Sensor Based Applications
- Sensoring an army battalion of 30000 humans and 12000 vehicles: x · 10^6 sensors
- Monitoring traffic
- Amusement park tags
- Health care
- Library books
- Etc.

Emerging Sensor Based Applications
- There is widespread speculation that conventional DBMSs will not perform well on this new class of monitoring applications.
- For example: on Linear Road, traditional solutions are nearly an order of magnitude slower than a special purpose stream processing engine.

Example: An existing application: financial-feed processing
Most large financial institutions subscribe to feeds that deliver real-time data on market activity, specifically:
- news
- consummated trades
- bids and asks
- etc.
For example:
- Reuters
- Bloomberg
- Infodyne

Example: An existing application: financial-feed processing
Financial institutions have a variety of applications that process such feeds. These include systems that:
- produce real-time business analytics,
- perform electronic trading,
- ensure legal compliance of all trades with the various company and SEC rules,
- compute real-time risk and market exposure to fluctuations in foreign exchange rates.
The technology used to implement this class of applications is invariably "roll your own", because no good off-the-shelf system software products exist.

Example: An existing application: financial-feed processing
Example: detect problems in streaming stock ticks.
- Specifically, there are 4500 securities, 500 of which are "fast moving".
Defined by rules:
- A stock tick on one of these securities is late if it occurs more than five seconds after the previous tick from the same security.
- The other 4000 symbols are slow moving, and a tick is late if 60 seconds have elapsed since the previous tick.
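
The two lateness rules can be sketched in a few lines of Python. The function name `late_ticks`, the `(symbol, timestamp)` tuple layout, and the sample symbols are assumptions for illustration; the sketch also only notices a late tick once that tick arrives, whereas a real stream engine would typically raise an alert from a timer.

```python
FAST_LIMIT_SECONDS = 5.0   # rule for the "fast moving" securities
SLOW_LIMIT_SECONDS = 60.0  # rule for all other securities


def late_ticks(ticks, fast_symbols):
    """Return (symbol, time, gap) for every tick that arrives late.

    `ticks` is an iterable of (symbol, timestamp_in_seconds) in time order.
    """
    last_seen = {}  # symbol -> timestamp of its previous tick
    late = []
    for symbol, t in ticks:
        limit = FAST_LIMIT_SECONDS if symbol in fast_symbols else SLOW_LIMIT_SECONDS
        previous = last_seen.get(symbol)
        if previous is not None and t - previous > limit:
            late.append((symbol, t, t - previous))
        last_seen[symbol] = t
    return late


sample = [
    ("IBM", 0.0), ("XYZ", 1.0), ("IBM", 4.0),
    ("IBM", 10.0), ("XYZ", 50.0), ("XYZ", 111.5),
]
alerts = late_ticks(sample, fast_symbols={"IBM"})
```

The only state kept is one timestamp per symbol, so the check runs on the incoming stream without storing the ticks first.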

Stream Processing (figure)

Performance
- The example application was implemented in the StreamBase stream processing engine (SPE) [5], which is basically a commercial, industrial-strength version of Aurora [8, 13].
- On a 2.8 GHz Pentium processor with 512 Mbytes of memory and a single SCSI disk, the workflow in the previous figure can be executed at 160,000 messages per second, before CPU saturation is observed.
- In contrast, StreamBase engineers could only get 900 messages per second from an implementation of the same application using a popular commercial relational DBMS.

Why? Outbound vs Inbound Processing
RDBMS (outbound processing) versus StreamBase (inbound processing) (figure)

Inbound Processing (figure)

Outbound vs Inbound Processing
- DBMSs are optimized for outbound processing.
- Stream processing engines are optimized for inbound processing.
- Although it seems conceivable to construct an engine that is either an inbound or an outbound engine, such a design is clearly a research project.

Other Issues: Correct Primitives for Streams
- SQL systems contain a sophisticated aggregation system, whereby a user can run a statistical computation over groupings of the records from a table in a database. When the execution engine processes the last record in the table, it can emit the aggregate calculation for each group of records.
- However, streams can continue forever and there is no notion of "end of table". Consequently, stream processing engines extend SQL with the notion of time windows.
- In StreamBase, windows can be defined based on clock time, number of messages, or breakpoints in some other attribute.
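
A minimal sketch of a clock-time window, written as a plain Python generator rather than in StreamBase's own language (which is not shown in these slides). The function name `tumbling_window_avg` and the `(timestamp, value)` layout are assumptions. Each window's average is emitted as soon as a message arrives that belongs to a later window, so no "end of table" is needed.

```python
def tumbling_window_avg(stream, width):
    """Yield (window_start, average) whenever a window of `width` seconds closes.

    `stream` yields (timestamp, value) pairs in time order and may be endless.
    """
    window_start = None
    total = 0.0
    count = 0
    for t, value in stream:
        if window_start is None:
            window_start = t - (t % width)  # align the first window
        while t >= window_start + width:    # the current window has closed
            if count:
                yield (window_start, total / count)
            window_start += width
            total, count = 0.0, 0
        total += value
        count += 1
    # The last window is still open here, so it is intentionally not emitted.


messages = [(0, 10), (3, 20), (9, 30), (10, 40), (25, 50), (31, 60)]
averages = list(tumbling_window_avg(messages, width=10))
```

Windows based on a message count or on a breakpoint in another attribute follow the same pattern, with a different condition for closing the window.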

Other Issues: Integration of DBMS Processing and Application Logic (1/2)
- Relational DBMSs were all designed to have client-server architectures.
- In this model, there are many client applications, which can be written by arbitrary people, and which are therefore typically untrusted.
- Hence, for security and reliability reasons, these client applications are run in a separate address space from the DBMS.

Other Issues: Integration of DBMS Processing and Application Logic (2/2)
- In an embedded processing model, it is reasonable to freely mix application logic, control logic, and DBMS logic.
- This is what StreamBase does.

Other Issues: High Availability
- It is a requirement of many stream-based applications to have high availability (HA) and stay up 7x24.
- Standard DBMS logging and crash recovery mechanisms are ill-suited for the streaming world.
- The obvious alternative to achieve high availability is to use techniques that rely on Tandem-style process pairs.
- Unlike traditional data-processing applications that require precise recovery for correctness, many stream-processing applications can tolerate and benefit from weaker notions of recovery.

Other Issues: Synchronization
- Traditional DBMSs use ACID transactions to provide, for example, isolation between concurrent transactions submitted by multiple users (heavy-weight).
- In streaming systems, which are not multi-user, a concept like isolation can be effectively achieved through simple critical sections, which can be implemented through light-weight semaphores.
- ACID: Atomicity, Consistency, Isolation (transactions can be executed in isolation), Durability.
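
The light-weight alternative can be sketched with a lock from Python's standard `threading` module. The `TickCounter` class and the `run` helper are hypothetical; they stand in for operator threads inside a single-user stream engine that share a piece of state.

```python
import threading


class TickCounter:
    """Shared per-symbol counts, protected by one light-weight lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.counts = {}

    def record(self, symbol):
        with self._lock:  # critical section: read, modify, write
            self.counts[symbol] = self.counts.get(symbol, 0) + 1


def run(n_threads=4, n_ticks=1000):
    """Let several threads update the same counter concurrently."""
    counter = TickCounter()

    def worker():
        for _ in range(n_ticks):
            counter.record("IBM")

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.counts
```

There is no logging, no lock manager, and no rollback here; the critical section gives isolation for this one update and nothing more, which is the trade-off the slide describes.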

One Size Fits All?

One Size Fits All? Conclusions
- Data warehouses: store data by column rather than by row; read oriented.
- Sensor networks: flexible light-weight database abstractions, such as TinyDB; data movement vs data storage.
- Text search: standard RDBMS too heavy-weight and inflexible.
- Scientific databases: multi-dimensional indexing, application-specific aggregation techniques.
- XML: how to store and manipulate XML data.
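
The first conclusion, column versus row storage, can be illustrated with plain Python lists. The sample data and function names are made up; the sketch only shows the access pattern and does not measure performance.

```python
FIELDS = ("date", "city", "product", "amount")

# Row store: each record is kept together, which suits updates of whole records.
row_store = [
    ("2012-09-10", "Leiden", "book", 20),
    ("2012-09-10", "Delft", "book", 20),
    ("2012-09-11", "Leiden", "pen", 5),
]

# Column store: each attribute is kept together, which suits read-mostly scans.
column_store = {
    name: [row[i] for row in row_store] for i, name in enumerate(FIELDS)
}


def total_amount_rows(rows):
    """Touches every field of every row to read a single attribute."""
    return sum(row[FIELDS.index("amount")] for row in rows)


def total_amount_columns(columns):
    """Reads only the one column the query needs."""
    return sum(columns["amount"])
```

An aggregate over one attribute scans a single contiguous list in the column layout, while inserting a new sale has to append to every column, which matches the "read oriented" remark above.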

eScience and The Fourth Paradigm
www.fourthparadigm.org (2009)

Four Science Paradigms (J. Gray, 2007)
- Thousand years ago: science was empirical, describing natural phenomena.
- Last few hundred years: theoretical branch, using models, generalizations.
  $$\left(\frac{\dot{a}}{a}\right)^2 = \frac{4\pi G\rho}{3} - K\frac{c^2}{a^2}$$
- Last few decades: a computational branch, simulating complex phenomena.
- Today: data exploration (eScience), unifying theory, experiment, and simulation:
  - Data captured by instruments or generated by simulator
  - Processed by software
  - Information/knowledge stored in computer
  - Scientist analyzes database / files using data management and statistics

The eScience Challenge
Novel tools needed for:
- Data Capturing
- Data Curation
- Data Analysis
- Data Communication and Publication Infrastructure

Gray’s Laws
Database-centric Computing in Science
How to approach data engineering challenges related to large scale scientific datasets [1]:
- Scientific computing is becoming increasingly data intensive.
- The solution is in a "scale-out" architecture.
- Bring computations to the data, rather than data to the computations.
- Start the design with the "20 queries."
- Go from "working to working."
[1] A.S. Szalay, J.A. Blakeley, The 4th Paradigm, 2009

VLDB 2010
- J. Cho, H. Garcia Molina, Dealing with Web Data: History and Look Ahead, VLDB 2010, Singapore, 2010.
- D. Srivastava, L. Golab, R. Greer, T. Johnson, J. Seidel, V. Shkapenyuk, O. Spatscheck, J. Yates, Enabling Real Time Data Analysis, VLDB 2010, Singapore, 2010.
- P. Matsudaira, High-End Biological Imaging Generates Very Large 3D and Dynamic Datasets, VLDB 2010, Singapore, 2010.

VLDB 2011
Keynotes:
- T. O’Reilly, Towards a Global Brain.
- D. Campbell, Is it still "Big Data" if it fits in my pocket?
Novel subjects:
- Social Networks, MapReduce (Hadoop), Crowdsourcing, and Mining
- Information Integration and Information Retrieval
- Schema Mapping, Data Exchange, Disambiguation of Named Entities
- GPU Based Architecture and Column-store indexing

VLDB 2012 Keynotes
[...]g the Spatial Web
- Billions of queries/week
- Web objects that are near the location where the query was issued
- New challenges on:
  - Spatial web data management
  - Relevance ranking based on text and location
  - Low latency
Data Science for Smart Systems
- Integration of information and control in complex systems
- Smart bridges, transportation systems, health care systems, supply chains, etc.
- Data characteristics: heterogeneous, volatile, uncertain
- New data management techniques
- New data analytics

VLDB 2012: 10 Year Best Paper Award
Approximate Frequency Counts over Data Streams, G. Singh Manku (Google Inc. USA), R. Motwani
Data stream algorithms research started in the late 90s:
- Sensor networks
- Stock data
- Security monitoring
- Recently: personal data stream analysis
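
The slide names the paper but not its content. As a hedged illustration, the sketch below follows the Lossy Counting scheme that the paper is known for, as far as it can be summarized here: the stream is cut into buckets of width ceil(1/epsilon), each entry keeps a count `f` and a maximum undercount `delta`, and entries with `f + delta` at or below the current bucket number are pruned at bucket boundaries. The function names and the sample stream are assumptions.

```python
import math


def lossy_count(stream, epsilon):
    """One pass over `stream`; returns ({item: (f, delta)}, n)."""
    width = math.ceil(1 / epsilon)  # bucket width
    entries = {}
    n = 0
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in entries:
            f, delta = entries[item]
            entries[item] = (f + 1, delta)
        else:
            # The item may have been seen and pruned up to bucket - 1 times.
            entries[item] = (1, bucket - 1)
        if n % width == 0:  # bucket boundary: prune infrequent entries
            entries = {
                e: (f, d) for e, (f, d) in entries.items() if f + d > bucket
            }
    return entries, n


def frequent_items(entries, n, support, epsilon):
    """Items whose counted frequency is at least (support - epsilon) * n."""
    return {e for e, (f, _) in entries.items() if f >= (support - epsilon) * n}


# "a" is every 2nd item, "b" every 4th, and the rest occur once each.
stream = [
    "a" if i % 2 == 0 else "b" if i % 4 == 1 else f"rare{i}" for i in range(100)
]
entries, n = lossy_count(stream, epsilon=0.1)
heavy = frequent_items(entries, n, support=0.2, epsilon=0.1)
```

The memory used depends on `epsilon` and the stream's distribution rather than on the number of distinct items, which is what makes the approach usable on sensor, stock, or security streams like those listed above.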

VLDB 2012 Subjects
- Spatial Queries
- Map Reduce
- Big Data
- Cloud Databases
- Crowdsourcing
- Social Networks and Mobility in the Cloud
- eHealth
- Web databases
- Mobility
- Data Semantics and Data Mining
- Parallel and Distributed Databases
- Graphs
- String and Sequence Processing
- Privacy
- Probabilistic Databases
- Data Flow; Hardware; Indexing; Query Optimization; Streams

VLDB 2012 Spatial Queries
T. Lappas et al., On the Spatiotemporal Burstiness of Terms.
- Burst identification
- Spatial, temporal
- Unusually high frequency

September 11, 2012 Databases and Data Mining 2 DBDM Introduction Databases and Data Mining Projects at LIACS Biological and Medical Databases and Data Mining CMSB (Phenotype Genotype), DIAL CGH DB Cyttron: Visualization of the Cell GRID Computing VLe: Virtual Lab e-Science environments DAS3/DAS4 super computer Research on Fundamentals of Databases and Data

Related Documents:

Resultaten Gepromoveerdenonderzoek 2019

Rijksuniversiteit Groningen, Technische Universiteit Delft, Technische Universiteit Eindhoven, Tilburg University, Universiteit Leiden, Universiteit Maastricht, Universiteit Twente, Universiteit Utrecht, Universiteit van Amsterdam, Vrije Universiteit Amsterdam en Wageningen Universiteit. Zij hebben medewerking aan het onderzoek verleend door

40 Views

3y ago
