《微軟數(shù)據(jù)倉庫介紹》由會(huì)員分享,可在線閱讀,更多相關(guān)《微軟數(shù)據(jù)倉庫介紹(16頁珍藏版)》請(qǐng)?jiān)谘b配圖網(wǎng)上搜索。
1、,Slide Title,Body Text,Second level,Third level,Fourth level,Fifth level,Module,1,Introduction to,D,ata Warehousing,Module Overview,數(shù)據(jù)倉庫概述,考慮數(shù)據(jù)倉庫解決方案,Lesson 1,:,數(shù)據(jù)倉庫概述,商業(yè)難題,什么是數(shù)據(jù)倉庫?,數(shù)據(jù)倉庫架構(gòu),數(shù)據(jù)倉庫解決方案組件,數(shù)據(jù)倉庫項(xiàng)目,數(shù)據(jù)倉庫項(xiàng)目角色,SQL,Server,作為數(shù)據(jù)倉庫平臺(tái),業(yè)務(wù)難題,關(guān)鍵業(yè)務(wù)數(shù)據(jù)分布在多個(gè)業(yè)務(wù)系統(tǒng),找到業(yè)務(wù)決策的信息是耗時(shí)的和容易出錯(cuò)的,基本的業(yè)務(wù)問題很難回答,?,What Is,a
2、,Data Warehouse?,一,個(gè)集中存放用于報(bào)表和數(shù)據(jù)的信息容器,通常,一個(gè)數(shù)據(jù)倉庫,:,包含大量的歷史數(shù)據(jù),優(yōu)化了數(shù)據(jù)查詢,(,而,不是插入和更新,),定期加載新的業(yè)務(wù)數(shù)據(jù),為企業(yè)商務(wù)智能解決方案提供依據(jù),Data Warehouse,Architectures,Centralized Data Warehouse,Departmental Data Mart,Hub and,Spoke,Components of a Data,Warehousing,Solution,從業(yè)務(wù)系統(tǒng)和其他數(shù)據(jù)源抽取數(shù)據(jù)加載,數(shù)據(jù)通常最終加載到數(shù)據(jù)倉庫,數(shù)據(jù)清洗和重復(fù)數(shù)據(jù)的刪除,確保數(shù)據(jù)倉庫中數(shù)據(jù)的質(zhì)量
3、,MDM,提供確切的業(yè)務(wù)數(shù)據(jù)實(shí)體,Data Warehouse,Staging Database,ETL Load Process,ETL Staging Process,Master,D,ata Management,10,1,10,00110,Data Cleansing,Reporting and Analysis,Data Sources,Data,Warehousing Projects,首先確定數(shù)據(jù)倉庫需要解決的業(yè)務(wù)問題,確定回答這些問題所需的數(shù)據(jù),識(shí)別所需數(shù)據(jù)的數(shù)據(jù)源,評(píng)估關(guān)鍵業(yè)務(wù)目標(biāo)價(jià)值可行性,從現(xiàn)在的數(shù)據(jù)回答每個(gè)問題,對(duì)大量數(shù)據(jù)的項(xiàng)目,使用增量更新比較有效,:,把項(xiàng)目分解為多
4、個(gè)子項(xiàng)目,每,個(gè)子項(xiàng)目處理一個(gè)特定的主題,Data,Warehousing Project,Roles,Project manager,Solution architect,Data modeler,Database,administrator,Infrastructure,specialist,ETL,developer,Business users/analyst,Testers,Data stewards,SQL Server As a Data Warehousing Platform,SQL Server,Analysis Services,SQL Server Database E
5、ngine,Microsoft SQL Server Integration Services,SQL Server Master,D,ata Services,10,1,10,00110,SQL Server Data Quality Services,Microsoft SQL Azure,and the Windows Azure,M,arketplace,Microsoft SharePoint Server,Microsoft PowerPivot Technologies,Microsoft Excel,Data Mining Add-In,PowerPivot Add-In,MD
6、S Add-In,Power View,SQL Server,Reporting Services,Reports,KPIs,and Dashboards,Interactive data visualizations,Interactive data analysis,Data Warehousing,Business Intelligence,Lesson,2,:Considerations for a,Data Warehouse Solution,Data Warehouse Database and Storage,Data Sources,Extract,Transform,and
7、 Load Processes,Data Quality,and Master,Data,Management,Data Warehouse Database,and Storage,考慮數(shù)據(jù)倉庫包括,:,Database schema,Logical:typically denormalized,for optimal read performance,Physical:,often partitioned for performance and management,Hardware,Query processing and memory,Storage,Network,High,avai
8、lability,and,disaster recovery,Hardware redundancy,Backup strategy,Security,Server access,Data permissions,Data Sources,數(shù)據(jù)源連接類型,憑證和權(quán)限,數(shù)據(jù)格式,數(shù)據(jù)采集窗口,Extract,Transform,and Load,Processes,臨時(shí)表:,存放,臨時(shí)數(shù)據(jù),所需的轉(zhuǎn)換,:,提取數(shù)據(jù)時(shí)所需的數(shù)據(jù)轉(zhuǎn)換和清洗,增量,ETL:,數(shù)據(jù)的變化加載,Data Quality,and Master Data Management,Data quality:,Cleansing
9、data:,Validating data values,Ensuring data consistency,Identifying missing values,Deduplicating data,Master data management:,Ensuring consistent business entity definitions across multiple systems,Applying business rules to ensure data validity,10,1,10,00110,Module Review and Takeaways,Why might you consider including a staging area in your ETL solution?,What options might you consider for performing data transformations in an ETL solution?,Why would you assign the data steward role to a business user rather than a database technology specialist?,