DATA WAREHOUSING PROFESSIONALS - Baylor University


DATA WAREHOUSING FUNDAMENTALS FOR IT PROFESSIONALS
Second Edition
PAULRAJ PONNIAH


Copyright © 2010 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Ponniah, Paulraj.
Data warehousing fundamentals for IT professionals / Paulraj Ponniah. -- 2nd ed.
p. cm.
Previous ed. published under title: Data warehousing fundamentals.
Includes bibliographical references and index.
ISBN 978-0-470-46207-2 (cloth)
1. Data warehousing. I. Ponniah, Paulraj. Data warehousing fundamentals. II. Title.
QA76.9.D37P66 2010
005.74'5 -- dc22
2009041789

Printed in the United States of America
10 9 8 7 6 5 4 3 2 1

To
Vimala, my loving wife
and to
Joseph, David, and Shobi,
my dear children

CONTENTS

PREFACE

PART 1 OVERVIEW AND CONCEPTS

1 THE COMPELLING NEED FOR DATA WAREHOUSING
  CHAPTER OBJECTIVES
  ESCALATING NEED FOR STRATEGIC INFORMATION
    The Information Crisis
    Technology Trends
    Opportunities and Risks
  FAILURES OF PAST DECISION-SUPPORT SYSTEMS
    History of Decision-Support Systems
    Inability to Provide Information
  OPERATIONAL VERSUS DECISION-SUPPORT SYSTEMS
    Making the Wheels of Business Turn
    Watching the Wheels of Business Turn
    Different Scope, Different Purposes
  DATA WAREHOUSING: THE ONLY VIABLE SOLUTION
    A New Type of System Environment
    Processing Requirements in the New Environment
    Strategic Information from the Data Warehouse
  DATA WAREHOUSE DEFINED
    A Simple Concept for Information Delivery
    An Environment, Not a Product
    A Blend of Many Technologies
  THE DATA WAREHOUSING MOVEMENT
    Data Warehousing Milestones
    Initial Challenges
  EVOLUTION OF BUSINESS INTELLIGENCE
    BI: Two Environments
    BI: Data Warehousing and Analytics
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

2 DATA WAREHOUSE: THE BUILDING BLOCKS
  CHAPTER OBJECTIVES
  DEFINING FEATURES
    Subject-Oriented Data
    Integrated Data
    Time-Variant Data
    Nonvolatile Data
    Data Granularity
  DATA WAREHOUSES AND DATA MARTS
    How Are They Different?
    Top-Down Versus Bottom-Up Approach
    A Practical Approach
  ARCHITECTURAL TYPES
    Centralized Data Warehouse
    Independent Data Marts
    Federated
    Hub-and-Spoke
    Data-Mart Bus
  OVERVIEW OF THE COMPONENTS
    Source Data Component
    Data Staging Component
    Data Storage Component
    Information Delivery Component
    Metadata Component
    Management and Control Component
  METADATA IN THE DATA WAREHOUSE
    Types of Metadata
    Special Significance
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

3 TRENDS IN DATA WAREHOUSING
  CHAPTER OBJECTIVES
  CONTINUED GROWTH IN DATA WAREHOUSING
    Data Warehousing has Become Mainstream
    Data Warehouse Expansion
    Vendor Solutions and Products
  SIGNIFICANT TRENDS
    Real-Time Data Warehousing
    Multiple Data Types
    Data Visualization
    Parallel Processing
    Data Warehouse Appliances
    Query Tools
    Browser Tools
    Data Fusion
    Data Integration
    Analytics
    Agent Technology
    Syndicated Data
    Data Warehousing and ERP
    Data Warehousing and KM
    Data Warehousing and CRM
    Agile Development
    Active Data Warehousing
  EMERGENCE OF STANDARDS
    Metadata
    OLAP
  WEB-ENABLED DATA WAREHOUSE
    The Warehouse to the Web
    The Web to the Warehouse
    The Web-Enabled Configuration
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

PART 2 PLANNING AND REQUIREMENTS

4 PLANNING AND PROJECT MANAGEMENT
  CHAPTER OBJECTIVES
  PLANNING YOUR DATA WAREHOUSE
    Key Issues
    Business Requirements, Not Technology
    Top Management Support
    Justifying Your Data Warehouse
    The Overall Plan
  THE DATA WAREHOUSE PROJECT
    How is it Different?
    Assessment of Readiness
    The Life-Cycle Approach
  THE DEVELOPMENT PHASES
    Adopting Agile Development
  THE PROJECT TEAM
    Organizing the Project Team
    Roles and Responsibilities
    Skills and Experience Levels
    User Participation
  PROJECT MANAGEMENT CONSIDERATIONS
    Guiding Principles
    Warning Signs
    Success Factors
    Anatomy of a Successful Project
    Adopt a Practical Approach
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

5 DEFINING THE BUSINESS REQUIREMENTS
  CHAPTER OBJECTIVES
  DIMENSIONAL ANALYSIS
    Usage of Information Unpredictable
    Dimensional Nature of Business Data
    Examples of Business Dimensions
  INFORMATION PACKAGES: A USEFUL CONCEPT
    Requirements Not Fully Determinate
    Business Dimensions
    Dimension Hierarchies and Categories
    Key Business Metrics or Facts
  REQUIREMENTS GATHERING METHODS
    Types of Questions
    Arrangement of Questions
    Interview Techniques
    Adapting the JAD Methodology
    Using Questionnaires
    Review of Existing Documentation
  REQUIREMENTS DEFINITION: SCOPE AND CONTENT
    Data Sources
    Data Transformation
    Data Storage
    Information Delivery
    Information Package Diagrams
    Requirements Definition Document Outline
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

6 REQUIREMENTS AS THE DRIVING FORCE FOR DATA WAREHOUSING
  CHAPTER OBJECTIVES
  DATA DESIGN
    Structure for Business Dimensions
    Structure for Key Measurements
    Levels of Detail
  THE ARCHITECTURAL PLAN
    Composition of the Components
    Special Considerations
    Tools and Products
  DATA STORAGE SPECIFICATIONS
    DBMS Selection
    Storage Sizing
  INFORMATION DELIVERY STRATEGY
    Queries and Reports
    Types of Analysis
    Information Distribution
    Real Time Information Delivery
    Decision Support Applications
    Growth and Expansion
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

PART 3 ARCHITECTURE AND INFRASTRUCTURE

7 ARCHITECTURAL COMPONENTS
  CHAPTER OBJECTIVES
  UNDERSTANDING DATA WAREHOUSE ARCHITECTURE
    Architecture: Definitions
    Architecture in Three Major Areas
  DISTINGUISHING CHARACTERISTICS
    Different Objectives and Scope
    Data Content
    Complex Analysis and Quick Response
    Flexible and Dynamic
    Metadata-Driven
  ARCHITECTURAL FRAMEWORK
    Architecture Supporting Flow of Data
    The Management and Control Module
  TECHNICAL ARCHITECTURE
    Data Acquisition
    Data Storage
    Information Delivery
  ARCHITECTURAL TYPES
    Centralized Corporate Data Warehouse
    Independent Data Marts
    Federated
    Hub-and-Spoke
    Data-Mart Bus
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

8 INFRASTRUCTURE AS THE FOUNDATION FOR DATA WAREHOUSING
  CHAPTER OBJECTIVES
  INFRASTRUCTURE SUPPORTING ARCHITECTURE
    Operational Infrastructure
    Physical Infrastructure
  HARDWARE AND OPERATING SYSTEMS
    Mainframes
    Open System Servers
    NT Servers
    Platform Options
    Server Hardware
  DATABASE SOFTWARE
    Parallel Processing Options
    Selection of the DBMS
  COLLECTION OF TOOLS
    Architecture First, Then Tools
    Data Modeling
    Data Extraction
    Data Transformation
    Data Loading
    Data Quality
    Queries and Reports
    Dashboards
    Scorecards
    Online Analytical Processing (OLAP)
    Alert Systems
    Middleware and Connectivity
    Data Warehouse Administration
  DATA WAREHOUSE APPLIANCES
    Evolution of DW Appliances
    Benefits of DW Appliances
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

9 THE SIGNIFICANT ROLE OF METADATA
  CHAPTER OBJECTIVES
  WHY METADATA IS IMPORTANT
    A Critical Need in the Data Warehouse
    Why Metadata Is Vital for End-Users
    Why Metadata Is Essential for IT
    Automation of Warehousing Tasks
    Establishing the Context of Information
  METADATA TYPES BY FUNCTIONAL AREAS
    Data Acquisition
    Data Storage
    Information Delivery
  BUSINESS METADATA
    Content Overview
    Examples of Business Metadata
    Content Highlights
    Who Benefits?
  TECHNICAL METADATA
    Content Overview
    Examples of Technical Metadata
    Content Highlights
    Who Benefits?
  HOW TO PROVIDE METADATA
    Metadata Requirements
    Sources of Metadata
    Challenges for Metadata Management
    Metadata Repository
    Metadata Integration and Standards
    Implementation Options
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

PART 4 DATA DESIGN AND DATA PREPARATION

10 PRINCIPLES OF DIMENSIONAL MODELING
  CHAPTER OBJECTIVES
  FROM REQUIREMENTS TO DATA DESIGN
    Design Decisions
    Dimensional Modeling Basics
    E-R Modeling Versus Dimensional Modeling
    Use of CASE Tools
  THE STAR SCHEMA
    Review of a Simple STAR Schema
    Inside a Dimension Table
    Inside the Fact Table
    The Factless Fact Table
    Data Granularity
  STAR SCHEMA KEYS
    Primary Keys
    Surrogate Keys
    Foreign Keys
  ADVANTAGES OF THE STAR SCHEMA
    Easy for Users to Understand
    Optimizes Navigation
    Most Suitable for Query Processing
    STARjoin and STARindex
  STAR SCHEMA: EXAMPLES
    Video Rental
    Supermarket
    Wireless Phone Service
    Auction Company
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

11 DIMENSIONAL MODELING: ADVANCED TOPICS
  CHAPTER OBJECTIVES
  UPDATES TO THE DIMENSION TABLES
    Slowly Changing Dimensions
    Type 1 Changes: Correction of Errors
    Type 2 Changes: Preservation of History
    Type 3 Changes: Tentative Soft Revisions
  MISCELLANEOUS DIMENSIONS
    Large Dimensions
    Rapidly Changing Dimensions
    Junk Dimensions
  THE SNOWFLAKE SCHEMA
    Options to Normalize
    Advantages and Disadvantages
    When to Snowflake
  AGGREGATE FACT TABLES
    Fact Table Sizes
    Need for Aggregates
    Aggregating Fact Tables
    Aggregation Options
  FAMILIES OF STARS
    Snapshot and Transaction Tables
    Core and Custom Tables
    Supporting Enterprise Value Chain or Value Circle
    Conforming Dimensions
    Standardizing Facts
    Summary of Family of STARS
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

12 DATA EXTRACTION, TRANSFORMATION, AND LOADING
  CHAPTER OBJECTIVES
  ETL OVERVIEW
    Most Important and Most Challenging
    Time Consuming and Arduous
  ETL REQUIREMENTS AND STEPS
    Key Factors
  DATA EXTRACTION
    Source Identification
    Data Extraction Techniques
    Evaluation of the Techniques
  DATA TRANSFORMATION
    Data Transformation: Basic Tasks
    Major Transformation Types
    Data Integration and Consolidation
    Transformation for Dimension Attributes
    How to Implement Transformation
  DATA LOADING
    Applying Data: Techniques and Processes
    Data Refresh Versus Update
    Procedure for Dimension Tables
    Fact Tables: History and Incremental Loads
  ETL SUMMARY
    ETL Tool Options
    Reemphasizing ETL Metadata
    ETL Summary and Approach
  OTHER INTEGRATION APPROACHES
    Enterprise Information Integration (EII)
    Enterprise Application Integration (EAI)
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

13 DATA QUALITY: A KEY TO SUCCESS
  CHAPTER OBJECTIVES
  WHY IS DATA QUALITY CRITICAL?
    What Is Data Quality?
    Benefits of Improved Data Quality
    Types of Data Quality Problems
  DATA QUALITY CHALLENGES
    Sources of Data Pollution
    Validation of Names and Addresses
    Costs of Poor Data Quality
  DATA QUALITY TOOLS
    Categories of Data Cleansing Tools
    Error Discovery Features
    Data Correction Features
    The DBMS for Quality Control
  DATA QUALITY INITIATIVE
    Data Cleansing Decisions
    Who Should Be Responsible?
    The Purification Process
    Practical Tips on Data Quality
  MASTER DATA MANAGEMENT (MDM)
    MDM Categories
    MDM Benefits
    MDM and Data Warehousing
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

PART 5 INFORMATION ACCESS AND DELIVERY

14 MATCHING INFORMATION TO THE CLASSES OF USERS
  CHAPTER OBJECTIVES
  INFORMATION FROM THE DATA WAREHOUSE
    Data Warehouse Versus Operational Systems
    Information Potential
    User-Information Interface
    Industry Applications
  WHO WILL USE THE INFORMATION?
    Classes of Users
    What They Need
    How to Provide Information
  INFORMATION DELIVERY
    Queries
    Reports
    Analysis
    Applications
  INFORMATION DELIVERY TOOLS
    The Desktop Environment
    Methodology for Tool Selection
    Tool Selection Criteria
    Information Delivery Framework
  INFORMATION DELIVERY: SPECIAL TOPICS
    Business Activity Monitoring (BAM)
    Dashboards and Scorecards
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

15 OLAP IN THE DATA WAREHOUSE
  CHAPTER OBJECTIVES
  DEMAND FOR ONLINE ANALYTICAL PROCESSING
    Need for Multidimensional Analysis
    Fast Access and Powerful Calculations
    Limitations of Other Analysis Methods
    OLAP is the Answer
    OLAP Definitions and Rules
    OLAP Characteristics
  MAJOR FEATURES AND FUNCTIONS
    General Features
    Dimensional Analysis
    What Are Hypercubes?
    Drill Down and Roll Up
    Slice and Dice or Rotation
    Uses and Benefits
  OLAP MODELS
    Overview of Variations
    The MOLAP Model
    The ROLAP Model
    ROLAP Versus MOLAP
  OLAP IMPLEMENTATION CONSIDERATIONS
    Data Design and Preparation
    Administration and Performance
    OLAP Platforms
    OLAP Tools and Products
    Implementation Steps
    Examples of Typical Implementations
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

16 DATA WAREHOUSING AND THE WEB
  CHAPTER OBJECTIVES
  WEB-ENABLED DATA WAREHOUSE
    Why the Web?
    Convergence of Technologies
    Adapting the Data Warehouse for the Web
    The Web as a Data Source
    Clickstream Analysis
  WEB-BASED INFORMATION DELIVERY
    Expanded Usage
    New Information Strategies
    Browser Technology for the Data Warehouse
    Security Issues
  OLAP AND THE WEB
    Enterprise OLAP
    Web-OLAP Approaches
    OLAP Engine Design
  BUILDING A WEB-ENABLED DATA WAREHOUSE
    Nature of the Data Webhouse
    Implementation Considerations
    Putting the Pieces Together
    Web Processing Model
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

17 DATA MINING BASICS
  CHAPTER OBJECTIVES
  WHAT IS DATA MINING?
    Data Mining Defined
    The Knowledge Discovery Process
    OLAP Versus Data Mining
    Some Aspects of Data Mining
    Data Mining and the Data Warehouse
  MAJOR DATA MINING TECHNIQUES
    Cluster Detection
    Decision Trees
    Memory-Based Reasoning
    Link Analysis
    Neural Networks
    Genetic Algorithms
    Moving into Data Mining
  DATA MINING APPLICATIONS
    Benefits of Data Mining
    Applications in CRM (Customer Relationship Management)
    Applications in the Retail Industry
    Applications in the Telecommunications Industry
    Applications in Biotechnology
    Applications in Banking and Finance
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

PART 6 IMPLEMENTATION AND MAINTENANCE

18 THE PHYSICAL DESIGN PROCESS
  CHAPTER OBJECTIVES
  PHYSICAL DESIGN STEPS
    Develop Standards
    Create Aggregates Plan
    Determine the Data Partitioning Scheme
    Establish Clustering Options
    Prepare an Indexing Strategy
    Assign Storage Structures
    Complete Physical Model
  PHYSICAL DESIGN CONSIDERATIONS
    Physical Design Objectives
    From Logical Model to Physical Model
    Physical Model Components
    Significance of Standards
  PHYSICAL STORAGE
    Storage Area Data Structures
    Optimizing Storage
    Using RAID Technology
    Estimating Storage Sizes
  INDEXING THE DATA WAREHOUSE
    Indexing Overview
    B-Tree Index
    Bitmapped Index
    Clustered Indexes
    Indexing the Fact Table
    Indexing the Dimension Tables
  PERFORMANCE ENHANCEMENT TECHNIQUES
    Data Partitioning
    Data Clustering
    Parallel Processing
    Summary Levels
    Referential Integrity Checks
    Initialization Parameters
    Data Arrays
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

19 DATA WAREHOUSE DEPLOYMENT
  CHAPTER OBJECTIVES
  DATA WAREHOUSE TESTING
    Front-End
    ETL Testing
  MAJOR DEPLOYMENT ACTIVITIES
    Complete User Acceptance
    Perform Initial Loads
    Get User Desktops Ready
    Complete Initial User Training
    Institute Initial User Support
    Deploy in Stages
  CONSIDERATIONS FOR A PILOT
    When is a Pilot Data Mart Useful?
    Types of Pilot Projects
    Choosing the Pilot
    Expanding and Integrating the Pilot
  SECURITY
    Security Policy
    Managing User Privileges
    Password Considerations
    Security Tools
  BACKUP AND RECOVERY
    Why Back Up the Data Warehouse?
    Backup Strategy
    Setting up a Practical Schedule
    Recovery
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

20 GROWTH AND MAINTENANCE
  CHAPTER OBJECTIVES
  MONITORING THE DATA WAREHOUSE
    Collection of Statistics
    Using Statistics for Growth Planning
    Using Statistics for Fine-Tuning
    Publishing Trends for Users
  USER TRAINING AND SUPPORT
    User Training Content
    Preparing the Training Program
    Delivering the Training Program
    User Support
  MANAGING THE DATA WAREHOUSE
    Platform Upgrades
    Managing Data Growth
    Storage Management
    ETL Management
    Data Model Revisions
    Information Delivery Enhancements
    Ongoing Fine-Tuning
  CHAPTER SUMMARY
  REVIEW QUESTIONS
  EXERCISES

ANSWERS TO SELECTED EXERCISES
APPENDIX A: PROJECT LIFE CYCLE STEPS AND CHECKLISTS
APPENDIX B: CRITICAL FACTORS FOR SUCCESS
APPENDIX C: GUIDELINES FOR EVALUATING VENDOR SOLUTIONS
APPENDIX D: HIGHLIGHTS OF VENDORS AND PRODUCTS
APPENDIX E: REAL-WORLD EXAMPLES OF BEST PRACTICES
REFERENCES
GLOSSARY
INDEX

PREFACE

THIS BOOK IS FOR YOU

Are you an information technology professional watching, with great interest, the massive unfolding and spreading of the data warehouse movement during the past decade? Are you contemplating a move into this fast-growing area of opportunity? Are you a systems analyst, programmer, data analyst, database administrator, project leader, or software engineer eager to grasp the fundamentals of data warehousing? Do you wonder how many different books you may have to study to learn the underlying principles and the current practices? Are you lost in the maze of the literature and products on the subject? Do you wish for a single publication on data warehousing, clearly and specifically designed for IT professionals? Do you need a textbook that helps you learn the fundamentals in sufficient depth? If you answered “yes” to any of the above, this book is written specially for you.

This is the one definitive book on data warehousing clearly intended for IT professionals. The organization and presentation of the book are specially tuned for IT professionals. This book does not presume to target anyone and everyone remotely interested in the subject for some reason or another, but is written to address the specific needs of IT professionals like you. It does not tend to emphasize certain aspects and neglect other critical ones. The book takes you over the entire spectrum of data warehousing.

As a veteran IT professional with wide and intensive industry experience, as a successful database and data warehousing consultant for many years, and as one who teaches data warehousing fundamentals in the college classroom and at public seminars, I have come to appreciate the precise needs of IT professionals. In every chapter I have incorporated these requirements of the IT community.

THE SCENARIO

Why have companies rushed into data warehousing? Why is there a tremendous surge in interest? Data warehousing is no longer a purely novel idea just for research and experimentation. It has become a mainstream phenomenon. True, the data warehouse is not in every doctor’s office yet, but neither is it confined to only high-end businesses. More than half of all U.S. companies and a large percentage of worldwide businesses have made a commitment to data warehousing.

In every industry across the board, from retail chain stores to financial institutions, from manufacturing enterprises to government departments, and from airline companies to utility businesses, data warehousing has revolutionized the way people perform business analysis and make strategic decisions. Every company that has a data warehouse is realizing the enormous benefits translated into positive results at the bottom line. These companies, now incorporating Web-based technologies, are enhancing the potential for greater and easier delivery of vital information.

Over the past decade, a large number of vendors have flooded the market with numerous data warehousing products. Vendor solutions and products run the gamut of data warehousing and business intelligence: data modeling, data acquisition, data quality, data analysis, metadata, information delivery, and so on. The market is large, mature, and continues to grow.

CHANGED ROLE OF IT

In this scenario, information technology departments of all progressive companies have perceived a radical change in their roles. IT is no longer required to create every report and present every screen for providing information to the end-users. IT is now charged with the building of information delivery systems and letting the end-users themselves retrieve information in innovative ways for analysis and decision making. Data warehousing and business intelligence environments are proving to be just that type of successful information delivery system.

IT professionals responsible for building data warehouses had to revise their mindsets about building applications. They had to understand that a data warehouse is not a one-size-fits-all proposition. First, they had to get a clear understanding about data extraction from source systems, data transformations, data staging, data warehouse architecture, infrastructure, and the various methods of information delivery. In short, IT professionals, like you, must get a strong grip on the fundamentals of data warehousing.

WHAT THIS BOOK CAN DO FOR YOU

The book is comprehensive and detailed. You will be able to study every significant topic in planning, requirements, architecture, infrastructure, design, data preparation, information delivery, deployment, and maintenance. The book is specially designed for IT professionals; you will be able to follow the presentation easily because it is built upon the foundation of your background as an IT professional, your knowledge, and the technical terminology familiar to you. It is organized logically, beginning with an overview of concepts, moving on to planning and requirements, then to architecture and infrastructure, on to data design, then to information delivery, and concluding with deployment and maintenance. This progression is typical of what you are most familiar with in your IT experience and day-to-day work.

The book provides an interactive learning experience. It is not just a one-way lecture. You participate through the review questions and exercises at the end of each chapter. For each chapter, the objectives at the beginning set the theme and the summary at the end highlights the topics covered. You can relate each concept and technique presented in the book to the data warehousing industry and marketplace. You will benefit from the substantial number of industry examples. Although intended as a first course on the fundamentals, this book provides sufficient coverage of each topic so that you can comfortably proceed to the next step of specialization for specific roles in a data warehouse project.

Featuring all the significant topics in appropriate measure, this book is eminently suitable as a textbook for serious self-study, a college course, or a seminar on the essentials. It provides an opportunity for you to become a data warehouse expert.

ENHANCEMENTS IN THIS SECOND EDITION

This greatly enhanced edition captures the developments and changes in the data warehousing landscape during the past nearly ten years. The underlying purposes and principles of data warehousing have remained the same. However, we notice definitive changes in the details, some finer aspects, and in product innovations. Although this edition succeeds in incorporating all the significant revisions, I have been careful not to disturb the overall logical arrangement and sequencing of the chapters.

The term “business intelligence” has gained a lot more currency. Many practitioners now consider data warehousing to refer to populating the warehouse with data, and business intelligence to refer to using the warehouse data. Data warehousing has made inroads into areas such as Customer Relationship Management, Enterprise Application Integration, Enterprise Information Integration, Business Activity Monitoring, and so on. The size of corporate data warehouses has been rising higher and higher. Some progressive businesses have reaped enormous benefits from data warehouses that are almost in the 500 terabyte range (five times the size of the U.S. Library of Congress archive). The benefits from data warehouses are no longer limited to a selected core of executives, managers, and analysts. Pervasive data warehousing has become the operative principle, providing access and usage to staff at multiple levels. Information delivery through traditional reports and queries is being replaced by interactive dashboards and scorecards.

More specifically, among topics on recent trends and changes, this enhanced edition includes the following:

- Evolution of business intelligence
- Real-time business intelligence
- Data warehouse appliances
- Data warehouse: architectural types
- Data visualization enhancements
- Enterprise application integration (EAI)
- Enterprise information integration (EII)
- Agile data warehouse development
- Data warehousing and KM (knowledge management)
- Data warehousing and ERP (enterprise resource planning)
- Data warehousing and CRM (customer relationship management)
- Improved requirements gathering methods
- Business activity monitoring (BAM)
- Interactive information delivery through dashboards and scorecards
- Additional STAR schema examples
- Master data management
- Examples of typical OLAP (online analytical processing) implementations
- Data mining applications
- Web clickstream analysis
- Highlights of vendors and products
- Real-world examples of best practices

ACKNOWLEDGMENTS

I wish to acknowledge my indebtedness and to express my gratitude to the authors listed in the reference section at the end of the book. Their insights and observations have helped me cover every topic adequately.

I must also express my appreciation to my students and professional colleagues. My interactions with them have enabled me to shape this textbook according to the needs of IT professionals.

My special thanks are due to the wonderful staff and editors at Wiley, my publishers, who have worked with me and supported me for more than a decade in the publication and promotion of my books.

PAULRAJ PONNIAH, PH.D.
Milltown, New Jersey
October 2009

PART 1 OVERVIEW AND CONCEPTS

CHAPTER 1 THE COMPELLING NEED FOR DATA WAREHOUSING

CHAPTER OBJECTIVES

- Understand the desperate need for strategic information
- Recognize the information crisis at every ent

