| |
| |
Foreword | |
| |
| |
Acknowledgments | |
| |
| |
Introduction | |
| |
| |
Audience | |
| |
| |
Organization | |
| |
| |
What's on the Companion Web Site? | |
| |
| |
An Invitation | |
| |
| |
Data Warehousing | |
| |
| |
What Is a Data Warehouse? | |
| |
| |
A Data Warehouse Is Not a Project... | |
| |
| |
... Rather, Data Warehousing Is a Process | |
| |
| |
A Data Warehouse Is Not a Product... | |
| |
| |
... Rather, Data Warehousing Is a New Way of Thinking about Data | |
| |
| |
A Data Warehouse Is Not a Place... | |
| |
| |
... Rather, Data Warehousing Is a Supplement to Traditional Data Processing | |
| |
| |
Inmon's Four Characteristics of a Data Warehouse | |
| |
| |
Subject-Oriented | |
| |
| |
Integrated | |
| |
| |
Nonvolatile | |
| |
| |
Time Variant | |
| |
| |
What Is an Operational Data Store? | |
| |
| |
What Is a Data Mart? | |
| |
| |
What Is OLAP? | |
| |
| |
What Is Data Mining? | |
| |
| |
Critical Success Factors in Data Warehousing | |
| |
| |
Summary | |
| |
| |
Hardware Architectures for Oracle Data Warehousing | |
| |
| |
Computer Architectures | |
| |
| |
Requirements Definition | |
| |
| |
Evaluation Criteria | |
| |
| |
Uniprocessor Systems | |
| |
| |
Symmetric Multiprocessing (SMP) Systems | |
| |
| |
Nonuniform Memory Access (NUMA) Systems | |
| |
| |
Clustered Systems | |
| |
| |
Massively Parallel Processing (MPP) Systems | |
| |
| |
Summary: Computer Architectures | |
| |
| |
Data Storage Architectures | |
| |
| |
Software | |
| |
| |
Hardware | |
| |
| |
Summary | |
| |
| |
Oracle Server Software Architecture and Features | |
| |
| |
Oracle Server: External Architecture | |
| |
| |
Oracle Database versus Oracle Instance | |
| |
| |
File Architecture | |
| |
| |
Memory Architecture | |
| |
| |
Process Architecture | |
| |
| |
Oracle Server: Internal Architecture | |
| |
| |
Tablespaces | |
| |
| |
Segments | |
| |
| |
Extents | |
| |
| |
Database Blocks | |
| |
| |
Views | |
| |
| |
Materialized Views | |
| |
| |
The Oracle Data Dictionary | |
| |
| |
Metadata | |
| |
| |
Security Features | |
| |
| |
Schemas and Users | |
| |
| |
System Privileges | |
| |
| |
Object Privileges | |
| |
| |
Roles | |
| |
| |
Auditing | |
| |
| |
Virtual Private Databases | |
| |
| |
Oracle Optimizer | |
| |
| |
I/O Operations | |
| |
| |
Sort Operations | |
| |
| |
Backup and Recovery Features | |
| |
| |
Oracle Server Options | |
| |
| |
Summary | |
| |
| |
Designing the Oracle Data Warehouse | |
| |
| |
Users Need to Understand the Warehouse Structure and Operation | |
| |
| |
Metadata | |
| |
| |
Recording Warehouse Processing as Metadata | |
| |
| |
What Metadata Does Oracle Provide? | |
| |
| |
Delivering Metadata | |
| |
| |
Introduction to a DW Case Study | |
| |
| |
Warehouse Design Process | |
| |
| |
Logical Warehouse Schema Design | |
| |
| |
Detail and Summary Schema | |
| |
| |
Star Schemas | |
| |
| |
External and Reference Data | |
| |
| |
Natural Keys versus Artificial Keys | |
| |
| |
Physical Design | |
| |
| |
Tablespace Design | |
| |
| |
Partitioning Data | |
| |
| |
Indexing | |
| |
| |
Materialized Views | |
| |
| |
Index-Organized Tables | |
| |
| |
Temporary Tables | |
| |
| |
Other Structures | |
| |
| |
Database Constraints and Triggers | |
| |
| |
Generating Unique Key Values | |
| |
| |
Designing for High Availability | |
| |
| |
What Is High Availability? | |
| |
| |
Types of Interruptions | |
| |
| |
Design Techniques for High Availability | |
| |
| |
Balancing Design Objectives | |
| |
| |
Summary | |
| |
| |
Building the Oracle Data Warehouse | |
| |
| |
ETT Tools for Building the Warehouse | |
| |
| |
Using SQL*Plus to Create the Data Warehouse | |
| |
| |
Keeping Logs of Your Actions | |
| |
| |
SQL*Plus Formatting Commands | |
| |
| |
Passing Values to a Script at Run-Time | |
| |
| |
Writing Scripts for Execution in Batch | |
| |
| |
Using SQL to Generate SQL | |
| |
| |
Creating an Oracle Database for the Data Warehouse | |
| |
| |
Estimating the Size of the Database | |
| |
| |
Creating Tablespaces | |
| |
| |
Creating the Schema | |
| |
| |
Creating Tables | |
| |
| |
Creating Index-Organized Tables | |
| |
| |
Creating Temporary Tables | |
| |
| |
Building Indexes for the Data Warehouse | |
| |
| |
Oracle Database Clusters | |
| |
| |
Analyzing Schema Objects | |
| |
| |
Using Views in the Data Warehouse | |
| |
| |
Using Views and DECODE for Complex Analyses | |
| |
| |
Creating Materialized Views in the Data Warehouse | |
| |
| |
Views, Materialized Views, or Summary Tables? | |
| |
| |
Testing the Data Warehouse | |
| |
| |
Summary | |
| |
| |
Populating the Oracle Data Warehouse | |
| |
| |
Dealing with Dirty Data | |
| |
| |
Dealing with Missing Data | |
| |
| |
Reconciling with ETT Tools | |
| |
| |
Reconciling during Extract | |
| |
| |
Reconciling during Load | |
| |
| |
Loading Data | |
| |
| |
Timing of Loads | |
| |
| |
Using SQL*Loader | |
| |
| |
Using Custom Load Programs | |
| |
| |
Direct Load APIs | |
| |
| |
Using Oracle's Gateway Products | |
| |
| |
Extracting Data from Oracle Sources | |
| |
| |
Post-Processing of Loaded Data | |
| |
| |
Validating the Loaded Data | |
| |
| |
Index Builds and Rebuilds | |
| |
| |
"Publishing" Data | |
| |
| |
Purging and Archiving Warehouse Data | |
| |
| |
Techniques for Archiving Data | |
| |
| |
Issues with Archiving | |
| |
| |
Techniques for Purging Data | |
| |
| |
Summary | |
| |
| |
Post-Load Processing in the Data Warehouse | |
| |
| |
Tactical versus Strategic | |
| |
| |
Operational Data Stores and Data Warehouses | |
| |
| |
Post-Load Processing | |
| |
| |
Summarization and Aggregation | |
| |
| |
Filtering | |
| |
| |
Merging and Denormalization | |
| |
| |
Oracle8i Materialized Views | |
| |
| |
Snapshots and Materialized Views | |
| |
| |
Guidelines for Creating Materialized Views | |
| |
| |
Refresh | |
| |
| |
Query Rewrite | |
| |
| |
Life before Materialized Views | |
| |
| |
Parallel CREATE TABLE ... AS SELECT (pCTAS) | |
| |
| |
Parallel DML | |
| |
| |
Procedural Code for Complex Logic | |
| |
| |
Summary | |
| |
| |
Administering and Monitoring the Oracle Data Warehouse | |
| |
| |
Who Are We? Why Are We Here? | |
| |
| |
Business Requirements | |
| |
| |
System Requirements | |
| |
| |
Administration Roles | |
| |
| |
Data Warehouse Administrator (DWA) | |
| |
| |
Database Administrator (DBA) | |
| |
| |
System Administrator | |
| |
| |
Security Administrator | |
| |
| |
Network Administrator | |
| |
| |
Help Desk and Problem Resolution | |
| |
| |
Data Warehouse Security Issues | |
| |
| |
Effective Security | |
| |
| |
Unobtrusive Security | |
| |
| |
Simple Security | |
| |
| |
More Sophisticated Security Tools and Techniques | |
| |
| |
Data Warehouse Configuration Issues | |
| |
| |
Optimal Flexible Architecture (OFA) | |
| |
| |
Build/Rebuild Documentation and Scripts | |
| |
| |
Identifying All Inbound Data and Its Sources | |
| |
| |
Job Scheduling | |
| |
| |
Identifying All Outbound Data and Its Destinations | |
| |
| |
Capacity Planning | |
| |
| |
Development, Test, and Training Environments | |
| |
| |
Backup and Recovery | |
| |
| |
Enabling and Optimizing Auditing | |
| |
| |
Configuration Rules of Thumb | |
| |
| |
Locally Managed Tablespaces | |
| |
| |
SYSTEM Tablespace | |
| |
| |
TEMP Tablespace | |
| |
| |
Rollback Segments and the RBS Tablespace | |
| |
| |
DB_BLOCK_SIZE | |
| |
| |
Setting "init.ora" Initialization Parameters | |
| |
| |
Redo Logfiles | |
| |
| |
Monitoring Rules of Thumb | |
| |
| |
Monitoring the "Worst" SQL Statements | |
| |
| |
Monitoring the Top Consumers in the Operating System | |
| |
| |
Monitoring Configurable Resources | |
| |
| |
Monitoring the AUDIT Session Statistics | |
| |
| |
Monitoring Rollback Segment Contention | |
| |
| |
Monitoring Latch Contention and Locking | |
| |
| |
Monitoring the Number of Extents | |
| |
| |
Monitoring Freespace | |
| |
| |
Monitoring Tablespace Fragmentation | |
| |
| |
Monitoring Invalid Compiled Objects | |
| |
| |
Monitoring "Stale" Optimizer Statistics | |
| |
| |
Summary | |
| |
| |
Data Warehouse Performance Tuning | |
| |
| |
Sizing Your System | |
| |
| |
Setting Up the I/O Subsystem | |
| |
| |
Pretuning the Data Warehouse | |
| |
| |
Materialized Views | |
| |
| |
Usage Tracking | |
| |
| |
Optimizing the Load Phase | |
| |
| |
SQL*Loader Direct Path versus "Conventional SQL" | |
| |
| |
Parallel Loads | |
| |
| |
Drop and Re-Create Indexes for Large-Volume Loads | |
| |
| |
Constraints in the Data Warehouse | |
| |
| |
Pipelining Instead of "Staging" Flat Files | |
| |
| |
Range Partitioning to publish New Data | |
| |
| |
Range Partitioning to archive or purge Old Data | |
| |
| |
Tuning the SQL | |
| |
| |
Understanding SQL Tracing | |
| |
| |
Formatting a SQL Tuning Report with TKPROF | |
| |
| |
Looking for Problems | |
| |
| |
Reactive Tuning | |
| |
| |
Summary | |
| |
| |
Parallel Execution in the Oracle Data Warehouse | |
| |
| |
What Is Oracle Parallel Execution? | |
| |
| |
When Are Parallel Operations Useful? | |
| |
| |
Basic Parallel Queries | |
| |
| |
Dynamic Partitioning | |
| |
| |
Performing the Partitioning | |
| |
| |
Recruiting Parallel Query Slaves | |
| |
| |
Assigning Subranges | |
| |
| |
Receiving and Collating the Results | |
| |
| |
Parallel Direct-Path SQL*Loader | |
| |
| |
Parallel Create Index (pCI) | |
| |
| |
Parallel Recovery | |
| |
| |
Parallel Create Table ... As Select (pCTAS) | |
| |
| |
Extent Trimming in Oracle | |
| |
| |
Parallel Indexed Queries on Partitioned Tables | |
| |
| |
Parallel DML | |
| |
| |
Parallel Insert Append | |
| |
| |
Parallel Insert Noappend | |
| |
| |
Parallel Update and Delete | |
| |
| |
Parallel DDL | |
| |
| |
Parallel Alter Index ... Rebuild | |
| |
| |
Parallel Move Partition | |
| |
| |
Parallel Split Partition | |
| |
| |
Parallel Merge Partition | |
| |
| |
Parallel Alter Table ... Move | |
| |
| |
Parallel Alter Table ... Move Online | |
| |
| |
Parallel Alter Index ... Rebuild Partition | |
| |
| |
Parallel Alter Index ... Rebuild Subpartition | |
| |
| |
Parallel Alter Index ... Rebuild Online | |
| |
| |
Parallel Statistics Gathering | |
| |
| |
Summary | |
| |
| |
Warehousing with Oracle Parallel Server | |
| |
| |
What Is Oracle Parallel Server? | |
| |
| |
Parallel Server versus Distributed Databases | |
| |
| |
Again, What Is Parallel Server? | |
| |
| |
File Cabinets and File Clerks | |
| |
| |
Extending the Analogy to Include Parallel Server | |
| |
| |
Why Use Parallel Server? | |
| |
| |
Benefit: Performance and Capacity Scaleup | |
| |
| |
Benefit: Higher Availability | |
| |
| |
Benefit: Performance Speedup | |
| |
| |
Cost: Greater Management Complexity | |
| |
| |
Cost: Not Transparent to Some Applications | |
| |
| |
Load Segregation Using Data Routing | |
| |
| |
Randomly Segregating Load | |
| |
| |
More on PCM Locks | |
| |
| |
Administering Parallel Server | |
| |
| |
Installing and Linking in the Parallel Server Option (Required) | |
| |
| |
Separate Threads of Redo Log Files (Required) | |
| |
| |
Optimizing Parallel Server for Data Warehouses | |
| |
| |
Initialization Parameters | |
| |
| |
Detecting Contention Specific to Parallel Server | |
| |
| |
Reducing Enqueue Processing by Disabling "Table Locks" | |
| |
| |
Reducing Enqueue Processing with Locally Managed Tablespaces | |
| |
| |
Optimizing Redo Log Files in Parallel Server | |
| |
| |
Optimizing Rollback Segments in Parallel Server | |
| |
| |
Optimizing Sorting in Parallel Server | |
| |
| |
Optimizing Data Loading in Parallel Server | |
| |
| |
Summary | |
| |
| |
Distributing the Oracle Data Warehouse | |
| |
| |
Reasons to Consider a Distributed Data Warehouse | |
| |
| |
Increased Availability | |
| |
| |
Cost | |
| |
| |
Distributed Database Terminology | |
| |
| |
Remote Query | |
| |
| |
Distributed Query | |
| |
| |
Remote Transaction | |
| |
| |
Distributed Transaction | |
| |
| |
Replication | |
| |
| |
Oracle Distributed Database Technology | |
| |
| |
Database Links | |
| |
| |
Synonyms | |
| |
| |
Views | |
| |
| |
Gateways or "Heterogenous Services" | |
| |
| |
Oracle Replication Facilities | |
| |
| |
Alternative Distributed Architectures | |
| |
| |
Data Marts | |
| |
| |
Distributed Data Warehouse | |
| |
| |
Replicated Data Warehouse | |
| |
| |
Summary | |
| |
| |
Analytical Processing in the Oracle Data Warehouse | |
| |
| |
Overview and Agenda | |
| |
| |
Analytical Functions in Oracle SQL | |
| |
| |
Rollup Extension to Group By | |
| |
| |
Cube Extension to Group By | |
| |
| |
Grouping Function | |
| |
| |
Analytical Functions | |
| |
| |
Browsing and Ad Hoc Reporting | |
| |
| |
Oracle Discoverer | |
| |
| |
Online Analytical Processing | |
| |
| |
The Origins of OLAP | |
| |
| |
The Express Data Model | |
| |
| |
Dimensions | |
| |
| |
Attributes | |
| |
| |
Measures | |
| |
| |
The History of Oracle Express | |
| |
| |
The Technology of Oracle Express | |
| |
| |
Data Storage and Access | |
| |
| |
Offset Addressing | |
| |
| |
Persistent Selections | |
| |
| |
Inversions | |
| |
| |
XCA Communications | |
| |
| |
Application Architectures | |
| |
| |
Data Distribution Architectures | |
| |
| |
Load and Storage Strategy | |
| |
| |
Implementing an OLAP Solution | |
| |
| |
Requirements Definition | |
| |
| |
Analysis | |
| |
| |
Design and Build | |
| |
| |
Data Loading and Integration with External Systems | |
| |
| |
Loading Data | |
| |
| |
Aggregation/Rollup | |
| |
| |
Performance Issues and Tuning | |
| |
| |
Gathering Information to Help Tune | |
| |
| |
Express NT Server Configuration Manager Settings | |
| |
| |
Front-End Alternatives | |
| |
| |
The Oracle Express Applications | |
| |
| |
Relational Access Manager/Administrator | |
| |
| |
Oracle Express Objects and Express Analyzer | |
| |
| |
Other Development Tools | |
| |
| |
Express and the Web | |
| |
| |
Summary | |
| |
| |
Index | |