The learn power bi we need data to work with. The Contoso Retail DW database from Microsoft is freely available, so we’ll use that one. You could just install a copy to a local desktop version of SQL Express, but instructions on deploying to Azure SQL Server for free can be found here
A description of the dataset from Microsoft is as follows:
‘The Contoso BI Demo dataset is used to demonstrate DW/BI functionalities across the entire Microsoft Office product family. This dataset includes C-level, sales/marketing, IT, and common finance scenarios for the retail industry and support map integration. In addition, this dataset offers large volumes of transactions from OLTP and well-structured aggregations from OLAP, along with reference and dimension data’.
So although, when you work in business, the data warehouse will be different. In a Retail business, it will include various similarities that that of the Contoso data warehouse. Indeed there will still be similarities across other types of organisations, such as Business to Business (B2B) and Charitable organisations. They will all for example have a customer table and a sales/revenue table. Customers will have similar characteristics, such as their first order date, and last order date, and demographics, such as age, gender, and location. Field names might be different, but there will be similarities. Some organizations will have additional datasets bespoke to their industry. Online gambling companies, for example, will have tables for bonus amounts, although this might be compared to rewards given by retail companies.
Another example is VIP/loyalty levels which are also used in many industries. Supermarkets have loyalty programs, Charities, and B2B organizations will identify high-level donors or high-value customers. Gambling Companies will have high-value losing customers.
So let’s have a look at the Contoso Retail Datawarehouse and see what’s in it:
Dimension Tables
dbo.DimAccount
SELECT TOP 100 * FROM dbo.DimAccount
Inspecting the information we can see a list of accounts. This will likely have been pulled from an accounting (Finance Department).
dbo.DimChannel
SELECT TOP 100 * FROM dbo.DimChannel
This table is just a list of 4 channels: Store, Online, Catalog, and Reseller. We would expect to use this information when querying transaction information to add what channel the transactions are from. We would expect to join using the ChannelKey field.
dbo.DimCurrency
SELECT TOP 100 * FROM dbo.DimCurrency
Again a similar table to DimChannel but for currency. We would expect a sales table to include the currency key to which we can join on to add currency to the query.
dbo.DimCustomer
SELECT TOP 100 * FROM dbo.DimCustomer
The most important table (in addition to sales) in the data warehouse. The big list of businesses. customers. We would expect to join this to a sales table, to report on what customers are buying.
Fields in this table include gender, dob, email address, address, phone numbers, and first purchase date. All sensitive personal information, so you need to think about how you work with this data and who has access to it.
dbo.DimDate
SELECT TOP 100 * FROM dbo.DimDate
An extremely useful and important table, that you would expect to be working with a lot in your reporting. By joining the table to sales orders, you can then not just the relevant Calendar Year/Quarter/Month, but the Financial Year, quarter, month, and week, which are often specific to the organization.
dbo.DimEmployee
SELECT TOP 100 * FROM dbo.DimEmployee
A useful table of employee information likely used for a variety of report requirements, not just related to pay, but maybe related to sales.
dbo.DimEntity
SELECT TOP 100 * FROM dbo.DimEntity
A table of global region and country sites, which are relevant to multinational organizations, used for Global group reporting.
dbo. DimGeography
SELECT TOP 100 * FROM dbo.DimGeography
A table with countries and regions, is probably used to classify countries by Global region in some way.
dbo.DimMachine
SELECT TOP 100 * FROM dbo.DimMachine
Appears to relate to certain types of equipment in stores.
dbo.DimOutage
SELECT TOP 100 * FROM dbo.DimOutage
A table of computer outages, possibly related to DimMachine
dbo.DimProduct
SELECT TOP 100 * FROM dbo.DimProduct
A key table, listing all the products presumably for sales, with a product key to join it to the sales table.
dbo.DimProductCategory
SELECT TOP 100 * FROM dbo.DimProductCategory
A table one would expect to join to DimProduct to allow for category level reporting.
dbo.DimProductSubCategory
SELECT TOP 100 * FROM dbo.DimProductSubCategory
Similar to dbo.DimProductCategory but presumably the subcategory level
dbo.DimPromotion
SELECT TOP 100 * FROM dbo.DimPromotion
A table of promotions, we can join it to the sales table and add marketing information to sales
dbo.DimSalesTerritory
SELECT TOP 100 * FROM dbo.DimSalesTerritory
A table of territories, presumably relevant to the sales team, which may also join to dbo.DimEmployee
dbo.DimScenario
SELECT TOP 100 * FROM dbo.DimScenario
A Finance table with keys for Actual, Budget, and Forecast. There should be sales information we can join to, to determine what type of data it is (Actual, Budget, and forecast). Top-level reporting for example will want to show actual revenue against budget and forecast.
dbo.DimStore
SELECT TOP 100 * FROM dbo.DimStore
A table of store names and addresses.
Fact Tables
dbo.FactExchangeRate
SELECT TOP 100 * FROM dbo.FactExchangeRate
A table of exchange rates. With a multi-national organization, the business will want to be able to report in Global sales (perhaps in dollars), so this table will be key. One would expect this to be updated on a regular basis, but it might be just once a quarter.
dbo.FactInventory
SELECT TOP 100 * FROM dbo.FactInventory
An essential table keeping a count of all products in stock.
A table relating to the costs of machines (presumably check-outs)
dbo.FactITSLA
SELECT TOP 100 * FROM dbo.FactITSLA
A table with information about the uptime of machines
dbo.FactOnlineSales
SELECT TOP 100 * FROM dbo.FactOnlineSales
A transaction table of online sales
dbo.FactSales
SELECT TOP 100 * FROM dbo.FactSales
The most important table of all (along with the customers). The transaction table, possibly including the Online sales data. More investigation is needed.
FactSalesQuota
SELECT TOP 100 * FROM dbo.FactSalesQuota
The sales quotas by store, with scenarios set next to them (Actual, Budget, Forecast), are presumably used for reporting on what stores are hitting their targets.
dbo.FactStrategyPlan
SELECT TOP 100 * FROM dbo.FactStrategyPlan
An empty table
So that’s a brief look at all the tables. As you can see we’ve covered data relevant to C-Suite (targets, store performance, global/regional sales), Sales (transaction/region/customer, employee/quotas), Marketing (Promotions), Finance (Accounts, Sales Quotas, exchange rates, costs).