Seven Databases In Seven Weeks, Second Edition


Extracted from: Seven Databases in Seven Weeks, Second Edition: A Guide to Modern Databases and the NoSQL Movement

This PDF file contains pages extracted from Seven Databases in Seven Weeks, Second Edition, published by the Pragmatic Bookshelf. For more information or to purchase a paperback or PDF copy, please visit http://www.pragprog.com.

Note: This extract contains some colored text (particularly in code listings). This is available only in online versions of the books. The printed versions are black and white. Pagination might vary between the online and printed versions; the content is otherwise identical.

Copyright © 2018 The Pragmatic Programmers, LLC. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form, or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior consent of the publisher.

The Pragmatic Bookshelf
Raleigh, North Carolina

Seven Databases in Seven Weeks, Second Edition: A Guide to Modern Databases and the NoSQL Movement

Luc Perkins
with Eric Redmond and Jim R. Wilson

The Pragmatic Bookshelf
Raleigh, North Carolina

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and The Pragmatic Programmers, LLC was aware of a trademark claim, the designations have been printed in initial capital letters or in all capitals. The Pragmatic Starter Kit, The Pragmatic Programmer, Pragmatic Programming, Pragmatic Bookshelf, PragProg and the linking g device are trademarks of The Pragmatic Programmers, LLC.

Every precaution was taken in the preparation of this book. However, the publisher assumes no responsibility for errors or omissions, or for damages that may result from the use of information (including program listings) contained herein.

Our Pragmatic books, screencasts, and audio books can help you and your team create better software and have more fun. Visit us at https://pragprog.com.

The team that produced this book includes:

Publisher: Andy Hunt
VP of Operations: Janet Furlow
Managing Editor: Brian MacDonald
Supervising Editor: Jacquelyn Carter
Series Editor: Bruce A. Tate
Copy Editor: Nancy Rapoport
Indexing: Potomac Indexing, LLC
Layout: Gilson Graphics

For sales, volume licensing, and support, please contact support@pragprog.com.
For international rights, please contact rights@pragprog.com.

Copyright © 2018 The Pragmatic Programmers, LLC. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form, or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior consent of the publisher.

Printed in the United States of America.
ISBN-13: 978-1-68050-253-4
Encoded using the finest acid-free high-entropy binary digits.
Book version: P1.0—April 2018

Day 2: Indexing, Aggregating, Mapreduce

Increasing MongoDB's query performance is the first item on today's docket, followed by some more powerful and complex grouped queries. Finally, we'll round out the day with some data analysis using mapreduce.

Indexing: When Fast Isn't Fast Enough

One of Mongo's useful built-in features is indexing in the name of enhanced query performance—something, as you've seen, that's not available on all NoSQL databases. MongoDB provides several of the best data structures for indexing, such as the classic B-tree as well as other additions, such as two-dimensional and spherical GeoSpatial indexes.

For now, we're going to do a little experiment to see the power of MongoDB's B-tree index by populating a series of phone numbers with a random country prefix (feel free to replace this code with your own country code). Enter the following code into your console. This will generate 100,000 phone numbers (it may take a while), between 1-800-555-0000 and 1-800-565-0000.

> populatePhones = function(area, start, stop) {
    for (var i = start; i < stop; i++) {
      var country = 1 + ((Math.random() * 8) << 0);
      var num = (country * 1e10) + (area * 1e7) + i;
      var fullNumber = "+" + country + " " + area + "-" + i;
      db.phones.insert({
        _id: num,
        components: {
          country: country,
          area: area,
          prefix: (i * 1e-4) << 0,
          number: i
        },
        display: fullNumber
      });
      print("Inserted number " + fullNumber);
    }
    print("Done!");
  }

Click HERE to purchase this book now.
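To see why the generated _id values look the way they do, the arithmetic in populatePhones can be checked in plain JavaScript (Node.js, no database needed). This is our own illustration, not part of the book's code:

```javascript
// Plain-JS check of the arithmetic used by populatePhones.
// country * 1e10 + area * 1e7 + i packs the three components
// into a single integer _id, and << 0 truncates a float.
var country = 1;      // would normally be random, between 1 and 8
var area = 800;
var i = 5550000;

var num = (country * 1e10) + (area * 1e7) + i;
console.log(num);                // 18005550000 (matches the output below)

// (i * 1e-4) << 0 recovers the 3-digit prefix from the 7-digit number
console.log((i * 1e-4) << 0);    // 555
```

The `<< 0` trick works because the shift operator coerces its operand to a 32-bit integer, discarding the fractional part.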

Run the function with a three-digit area code (like 800) and a range of seven-digit numbers (5,550,000 to 5,650,000—please verify your zeros when typing).

> populatePhones(800, 5550000, 5650000)   // This could take a minute
> db.phones.find().limit(2)
{ "_id" : 18005550000, "components" : { "country" : 1, "area" : 800,
  "prefix" : 555, "number" : 5550000 }, "display" : "+1 800-5550000" }
{ "_id" : 88005550001, "components" : { "country" : 8, "area" : 800,
  "prefix" : 555, "number" : 5550001 }, "display" : "+8 800-5550001" }

Whenever a new collection is created, Mongo automatically creates an index by the _id. These indexes can be found in the system.indexes collection. The following query shows all indexes in the database:

> db.getCollectionNames().forEach(function(collection) {
    print("Indexes for the " + collection + " collection:");
    printjson(db[collection].getIndexes());
  });

Most queries will include more fields than just the _id, so we need to make indexes on those fields.

We're going to create a B-tree index on the display field. But first, let's verify that the index will improve speed. To do this, we'll first check a query without an index. The explain() method is used to output details of a given operation.

> db.phones.find({ display: "+1 800-5650001" }).
    explain("executionStats").executionStats
{
  "executionTimeMillis": 52,
  "executionStages": {
    "executionTimeMillisEstimate": 58,
  }
}

Your output will differ from ours, and only a few fields from the output are shown here, but note the executionTimeMillisEstimate field—the milliseconds needed to complete the query—will likely be in the double digits.

We create an index by calling ensureIndex(fields, options) on the collection. The fields parameter is an object containing the fields to be indexed against. The options parameter describes the type of index to make. In this case, we're building a unique index on display that should just drop duplicate entries.

> db.phones.ensureIndex(
    { display : 1 },
    { unique : true, dropDups : true }
  )
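Conceptually, a unique index behaves like a map from the indexed value to the document: an insert whose key is already present is rejected rather than stored twice. A plain-JavaScript sketch of that behavior (Node.js, not the mongo shell; the helper name uniqueInsert is ours, not a MongoDB API):

```javascript
// Plain-JS sketch of unique-index behavior: the index maps the
// indexed value (display) to its document; a second insert with
// the same display value is rejected as a duplicate key.
var displayIndex = new Map();

function uniqueInsert(doc) {
  if (displayIndex.has(doc.display)) {
    return false;                  // duplicate key: insert rejected
  }
  displayIndex.set(doc.display, doc);
  return true;
}

console.log(uniqueInsert({ _id: 1, display: "+1 800-5550000" })); // true
console.log(uniqueInsert({ _id: 2, display: "+1 800-5550000" })); // false
console.log(displayIndex.size);                                   // 1
```

The same map structure also explains the speedup you're about to see: a lookup by display consults the index directly instead of scanning every document.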

Now try find() again, and check explain() to see whether the situation improves.

> db.phones.find({ display: "+1 800-5650001" }).
    explain("executionStats").executionStats
{
  "executionTimeMillis" : 0,
  "executionStages": {
    "executionTimeMillisEstimate": 0,
  }
}

The executionTimeMillisEstimate changed from 52 to 0—an infinite improvement (52 / 0)! Just kidding, but the query is now orders of magnitude faster. Mongo is no longer doing a full collection scan but instead walking the tree to retrieve the value. Importantly, scanned objects dropped from 109999 to 1—since it has become a single unique lookup.

explain() is a useful function, but you'll use it only when testing specific query calls. If you need to profile in a normal test or production environment, you'll need the system profiler.

Let's set the profiling level to 2 (level 2 stores all queries; profiling level 1 stores only slower queries greater than 100 milliseconds) and then run find() as normal.

> db.setProfilingLevel(2)
> db.phones.find({ display : "+1 800-5650001" })

This will create a new object in the system.profile collection, which you can read as any other table to get information about the query, such as a timestamp for when it took place and performance information (such as executionTimeMillisEstimate as shown). You can fetch documents from that collection like any other:

> db.system.profile.find()

This will return a list of objects representing past queries. This query, for example, would return stats about execution times from the first query in the list:

> db.system.profile.find()[0].execStats
{
  "stage" : "EOF",
  "nReturned" : 0,
  "executionTimeMillisEstimate" : 0,
  "works" : 0,
  "advanced" : 0,
  "needTime" : 0,
  "needYield" : 0,
  "saveState" : 0,
  "restoreState" : 0,
  "isEOF" : 1,
  "invalidates" : 0
}

Like yesterday's nested queries, Mongo can build your index on nested values. If you wanted to index on all area codes, use the dot-notated field representation: components.area. In production, you should always build indexes in the background using the { background : 1 } option.

> db.phones.ensureIndex({ "components.area": 1 }, { background : 1 })

If we find() all of the system indexes for our phones collection, the new one should appear last. The first index is always automatically created to quickly look up by _id, and the other two we added ourselves.

> db.phones.getIndexes()
[
  {
    "v" : 2,
    "key" : {
      "_id" : 1
    },
    "name" : "_id_",
    "ns" : "book.phones"
  },
  {
    "v" : 2,
    "unique" : true,
    "key" : {
      "display" : 1
    },
    "name" : "display_1",
    "ns" : "book.phones"
  },
  {
    "v" : 2,
    "key" : {
      "components.area" : 1
    },
    "name" : "components.area_1",
    "ns" : "book.phones",
    "background" : 1
  }
]

Our book.phones indexes have rounded out quite nicely.
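The dot-notated path in "components.area" resolves exactly the way you'd expect: split on the dots and walk the nested objects one level at a time. A minimal plain-JavaScript sketch (Node.js, not the mongo shell; the helper name resolvePath is ours, not a MongoDB API):

```javascript
// Plain-JS sketch of dot-notation resolution: split the path on "."
// and descend through the nested objects step by step.
function resolvePath(doc, path) {
  return path.split('.').reduce(function (obj, key) {
    return obj === undefined ? undefined : obj[key];
  }, doc);
}

// Document shape mirrors the phone records built earlier
var phone = {
  _id: 18005550000,
  components: { country: 1, area: 800, prefix: 555, number: 5550000 },
  display: "+1 800-5550000"
};

console.log(resolvePath(phone, "components.area"));    // 800
console.log(resolvePath(phone, "components.missing")); // undefined
```

A missing intermediate field simply yields undefined, which is also how Mongo treats documents that lack the indexed path.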

We should close this section by noting that creating an index on a large collection can be slow and resource-intensive. Indexes simply "cost" more in Mongo than in a relational database like Postgres due to Mongo's schemaless nature. You should always consider these impacts when building an index: create indexes at off-peak times, run index creation in the background, and run it manually rather than relying on automated index creation. There are plenty more indexing tricks and tips online, but these are the basics that may come in handy the most often.

Mongo's Many Useful CLI Tools

Before we move on to aggregation in Mongo, we want to briefly tell you about the other shell goodies that Mongo provides out-of-the-box in addition to mongod and mongo. We won't cover them in this book but we do strongly recommend checking them out, as they together make up one of the most amply equipped CLI toolbelts in the NoSQL universe.

Command       Description
mongodump     Exports data from Mongo into .bson files. That can mean entire collections or databases, filtered results based on a supplied query, and more.
mongofiles    Manipulates large GridFS data files (GridFS is a specification for BSON files exceeding 16 MB).
mongooplog    Polls operation logs from MongoDB replication operations.
mongorestore  Restores MongoDB databases and collections from backups created using mongodump.
mongostat     Displays basic MongoDB server stats.
mongoexport   Exports data from Mongo into CSV (comma-separated value) and JSON files. As with mongodump, that can mean entire databases and collections or just some data chosen on the basis of query parameters.
mongoimport   Imports data into Mongo from JSON, CSV, or TSV (tab-separated value) files. We'll use this tool on Day 3.
mongoperf     Performs user-defined performance tests against a MongoDB server.
mongos        Short for "MongoDB shard," this tool provides a service for properly routing data into a sharded MongoDB cluster (which we will not cover in this chapter).
mongotop      Displays usage stats for each collection stored in a Mongo database.
bsondump      Converts BSON files into other formats, such as JSON.

For more in-depth info, see the MongoDB reference documentation.

Aggregated Queries

MongoDB includes a handful of single-purpose aggregators: count() provides the number of documents included in a result set (which we saw earlier), distinct() collects the result set into an array of unique results, and aggregate() returns documents according to a logic that you provide.

The queries we investigated yesterday were useful for basic data extraction, but any post-processing would be up to you to handle. For example, say you wanted to count the phone numbers greater than 5599999 or provide nuanced data about phone number distribution in different countries—in other words, to produce aggregate results using many documents. As in PostgreSQL, count() is the most basic aggregator. It takes a query and returns a number (of matching documents).

> db.phones.count({ 'components.number': { $gt : 5599999 } })
50000

The distinct() method returns each matching value (not a full document) where one or more exists. We can get the distinct component numbers that are less than 5,550,005 in this way:

> db.phones.distinct('components.number',
    { 'components.number': { $lt : 5550005 } })
[ 5550000, 5550001, 5550002, 5550003, 5550004 ]

The aggregate() method is more complex but also much more powerful. It enables you to specify a pipeline-style logic consisting of stages such as: $match filters that return specific sets of documents; $group functions that group based on some attribute; a $sort logic that orders the documents by a sort key; and many others (see the aggregation pipeline reference at https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline/).

You can chain together as many stages as you'd like, mixing and matching at will. Think of aggregate() as a combination of WHERE, GROUP BY, and ORDER BY clauses in SQL. The analogy isn't perfect, but the aggregation API does a lot of the same things.

Let's load some city data into Mongo. There's an included mongoCities100000.js file containing insert statements for data about nearly 100,000 cities. Here's how you can execute that file in the Mongo shell:

> load('mongoCities100000.js')
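For intuition, the semantics of count() and distinct() can be sketched in plain JavaScript (Node.js, not the mongo shell) over a small in-memory array standing in for the components.number field; the data and names here are ours:

```javascript
// Plain-JS sketch of what count() and distinct() compute, using a
// tiny in-memory stand-in for the components.number field.
var numbers = [5550000, 5550001, 5550001, 5550002, 5650000, 5650000];

// count({ ... $gt ... }) ~ how many values satisfy a predicate
var count = numbers.filter(function (n) { return n > 5550001; }).length;

// distinct(...) ~ the unique values among those that match
var distinct = numbers
  .filter(function (n) { return n < 5550002; })
  .filter(function (n, i, arr) { return arr.indexOf(n) === i; });

console.log(count);    // 3
console.log(distinct); // [ 5550000, 5550001 ]
```

The second filter keeps only the first occurrence of each value, which is the deduplication distinct() performs for you on the server.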

Here's an example document for a city:

{
  "_id" : ObjectId("5913ec4c059c950f9b799895"),
  "name" : "Sant Julià de Lòria",
  "country" : "AD",
  "timezone" : "Europe/Andorra",
  "population" : 8022,
  "location" : {
    "longitude" : 42.46372,
    "latitude" : 1.49129
  }
}

We could use aggregate() to, for example, find the average population for all cities in the Europe/London timezone. To do so, we could match all documents where timezone equals Europe/London, and then add a group stage that produces one document with an _id field with a value of averagePopulation and an avgPop field that displays the average value across all population values in the collection:

> db.cities.aggregate([
    {
      $match: {
        'timezone': {
          $eq: 'Europe/London'
        }
      }
    },
    {
      $group: {
        _id: 'averagePopulation',
        avgPop: {
          $avg: '$population'
        }
      }
    }
  ])
{ "_id" : "averagePopulation", "avgPop" : 23226.22149712092 }

We could also match all documents in that same timezone, sort them in descending order by population, and then project documents that only contain the name and population fields:

> db.cities.aggregate([
    {
      // same $match statement as in the previous aggregation operation
    },
    {
      $sort: {
        population: -1
      }
    },
    {
      $project: {
        _id: 0,
        name: 1,
        population: 1
      }
    }
  ])

You should see results like this:

{ "name" : "City of London", "population" : 7556900 }
{ "name" : "London", "population" : 7556900 }
{ "name" : "Birmingham", "population" : 984333 }
// many others

Experiment with it a bit—try combining some of the stage types we've already covered in new ways—and then delete the collection when you're done, as we'll add the same data back into the database using a different method on Day 3.

> db.cities.drop()

This provides a very small taste of Mongo's aggregation capabilities. The possibilities are really endless, and we encourage you to explore other stage types. Be forewarned that aggregations can be quite slow if you add a lot of stages and/or perform them on very large collections. There are limits to how well Mongo, as a schemaless database, can optimize these sorts of operations. But if you're careful to keep your collections reasonably sized and, even better, structure your data to not require bold transformations to get the outputs you want, then aggregate() can be a powerful and even speedy tool.
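The $match-then-$group averaging pipeline can be mimicked in plain JavaScript (Node.js; array methods instead of server-side stages, and a tiny hand-made sample of our own rather than the cities collection) to see what each stage contributes:

```javascript
// Plain-JS sketch of the $match -> $group ($avg) pipeline, run over a
// small hand-made sample instead of the real cities collection.
var cities = [
  { name: "London",     timezone: "Europe/London",  population: 7556900 },
  { name: "Birmingham", timezone: "Europe/London",  population: 984333 },
  { name: "Andorra",    timezone: "Europe/Andorra", population: 8022 }
];

// $match: keep only documents in the Europe/London timezone
var matched = cities.filter(function (c) {
  return c.timezone === "Europe/London";
});

// $group with $avg: collapse the matched documents into one result
var avgPop = matched.reduce(function (sum, c) {
  return sum + c.population;
}, 0) / matched.length;

console.log({ _id: "averagePopulation", avgPop: avgPop });
// { _id: 'averagePopulation', avgPop: 4270616.5 }
```

Each pipeline stage consumes the previous stage's output, which is why filter-then-reduce is a faithful (if far less scalable) model of $match-then-$group.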
