Foreword
Acknowledgements
Biography
Introduction

Introduction to Data Integration

The Importance of Data Integration
The natural complexity of data interfaces
The rise of purchased vendor packages
Key enablement of big data and virtualization

What Is Data Integration?
Data in motion
Integrating into a common format: transforming data
Migrating data from one system to another
Moving data around the organization
Pulling information from unstructured data
Moving process to data

Types and Complexity of Data Integration
The differences and similarities in managing data in motion and persistent data
Batch data integration
Real-time data integration
Big data integration
Data virtualization

The Process of Data Integration Development
The data integration development life cycle
Inclusion of business knowledge and expertise

Batch Data Integration

Introduction to Batch Data Integration
What is batch data integration?
Batch data integration life cycle

Extract, Transform, and Load
What is ETL?
Profiling
Extract
Staging
Access layers
Transform
Simple mapping
Lookups
Aggregation and normalization
Calculation
Load

Data Warehousing
What is data warehousing?
Layers in an enterprise data warehouse architecture
Operational application layer
External data
Data staging areas coming into a data warehouse
Data warehouse data structure
Staging from data warehouse to data mart or business intelligence
Business intelligence layer
Types of data to load in a data warehouse
Master data in a data warehouse
Balance and snapshot data in a data warehouse
Transactional data in a data warehouse
Events
Reconciliation
Interview with an expert: Krish Krishnan on data warehousing and data integration

Data Conversion
What is data conversion?
Data conversion life cycle
Data conversion analysis
Best practice data loading
Improving source data quality
Mapping to target
Configuration data
Testing and dependencies
Private data
Proving
Environments

Data Archiving
What is data archiving?
Selecting data to archive
Can the archived data be retrieved?
Conforming data structures in the archiving environment
Flexible data structures
Interview with an expert: John Anderson on data archiving and data integration

Batch Data Integration Architecture and Metadata
What is batch data integration architecture?
Profiling tool
Modeling tool
Metadata repository
Data movement
Transformation
Scheduling
Interview with an expert: Adrienne Tannenbaum on metadata and data integration

Real Time Data Integration

Introduction to Real-Time Data Integration
Why real-time data integration?
Why two sets of technologies?

Data Integration Patterns
Interaction patterns
Loose coupling
Hub and spoke
Synchronous and asynchronous interaction
Request and reply
Publish and subscribe
Two-phase commit
Integrating interaction types

Core Real-Time Data Integration Technologies
Confusing terminology
Enterprise service bus (ESB)
Interview with an expert: David S. Linthicum on ESB and data integration
Service-oriented architecture (SOA)
Extensible markup language (XML)
Interview with an expert: M. David Allen on XML and data integration
Data replication and change data capture
Enterprise application integration (EAI)
Enterprise information integration (EII)

Data Integration Modeling
Canonical modeling
Interview with an expert: Dagna Gaythorpe on canonical modeling and data integration
Message modeling

Master Data Management
Introduction to master data management
Reasons for a master data management solution
Purchased packages and master data
Reference data
Masters and slaves
External data
Master data management functionality
Types of master data management solutions: registry and data hub

Data Warehousing with Real-Time Updates
Corporate information factory
Operational data store
Master data moving to the data warehouse
Interview with an expert: Krish Krishnan on real-time data warehousing updates

Real-Time Data Integration Architecture and Metadata
What is real-time data integration metadata?
Modeling
Profiling
Metadata repository
Enterprise service bus: data transformation and orchestration
Technical mediation
Business content
Data movement and middleware
External interaction

Big, Cloud, Virtual Data

Introduction to Big Data Integration
Data integration and unstructured data
Big data, cloud data, and data virtualization

Cloud Architecture and Data Integration
Why is data integration important in the cloud?
Public cloud
Cloud security
Cloud latency
Cloud redundancy

Data Virtualization
A technology whose time has come
Business uses of data virtualization
Business intelligence solutions
Integrating different types of data
Quickly add or prototype adding data to a data warehouse
Present physically disparate data together
Leverage various data and models triggering transactions
Data virtualization architecture
Sources and adapters
Mappings and models and views
Transformation and presentation

Big Data Integration
What is big data?
Big data dimension: volume
Massive parallel processing: moving process to data
Hadoop and MapReduce
Integrating with external data
Visualization
Big data dimension: variety
Types of data
Integrating different types of data
Interview with an expert: William McKnight on Hadoop and data integration
Big data dimension: velocity
Streaming data
Sensor and GPS data
Social media data
Traditional big data use cases
More big data use cases
Health care
Logistics
National security
Leveraging the power of big data: real-time decision support
Triggering action
Speed of data retrieval from memory versus disk
From data analytics to models, from streaming data to decisions
Big data architecture
Operational systems and data sources
Intermediate data hubs
Business intelligence tools
Data virtualization server
Batch and real-time data integration tools
Analytic sandbox
Risk response systems/recommendation engines
Interview with an expert: John Haddad on Big Data and data integration

Conclusion to Managing Data in Motion
Data integration architecture
Why data integration architecture?
Data integration life cycle and expertise
Security and privacy
Data integration engines
Operational continuity
ETL engine
Enterprise service bus
Data virtualization server
Data movement
Data integration hubs
Master data
Data warehouse and operational data store
Enterprise content management
Data archive
Metadata management
Data discovery
Data profiling
Data modeling
Data flow modeling
Metadata repository
The end

References
Index