Category: Power BI

  • Testing Dual mode in Power BI and BigQuery



In this test, we will look at the performance impact of using dual mode on a Power BI dimension table with a cloud database and a shared dataset, so we avoid the resource limitations experienced in the first dual mode test on a laptop.

    Test 1
In the first test, we will use a single dimension table and a single fact table set to direct query mode.

    I will use a public dataset from Google BigQuery.
Get Data > Google BigQuery. I enter the project ID and am prompted for my Google login.

    BigQuery connection from Power BI

    I can then navigate to the dataset I want to work with.

Navigate to BigQuery dataset
Table Selection

    We choose a table from the New York taxi trips dataset, which has over 16 million rows.

    SQL row count.

    This is configured with a Direct Query connection.

    Selecting table import mode


    I will then add some simple measures for testing as follows:

    Measures
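
The measures are just simple aggregations along these lines (a rough sketch; the fare_amount, tip_amount, and trip_distance column names are assumptions based on the public tlc_green_trips schema):

// Simple test measures over the direct query fact table
Trip Count = COUNTROWS(tlc_green_trips_2016)
Total Fare = SUM(tlc_green_trips_2016[fare_amount])
Total Tips = SUM(tlc_green_trips_2016[tip_amount])
Avg Trip Distance = AVERAGE(tlc_green_trips_2016[trip_distance])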

Next, we’ll add the taxi_zone_geom table as a dimension, initially in Direct Query mode. It has 263 rows.

    I find there are duplicates in my dimension, which is a pain, so I need a new view in BigQuery to fix this.
    I will create a simple view as follows (replace myproject_id with your own):

CREATE OR REPLACE VIEW `customer-myprojectid.pbitesting.DimZone` AS
SELECT DISTINCT
  Zone_id,
  Zone_name
FROM
  `bigquery-public-data.new_york_taxi_trips.taxi_zone_geom`

    I can then connect Power BI to BigQuery using the different ProjectID and select my de-duped dimension table.

Table Selection DimZone


    DimZone Distinct Values


    Next, I change the zone_id and pickup_location_id fields to whole numbers to save space.

I will then connect my DimZone table to my tlc_green_trips_2016 table using zone_id and pickup_location_id.

    Model relationships

I then change my DimZone dimension table to dual mode.

    Selecting Dual Mode



Then I publish my dataset to the Power BI service so we can connect to it from a new report.

    Get Data > Power BI semantic models >

    Selecting the semantic dataset

    When I try to connect, I get the following BigQuery authentication error:

    DataSource.Error: ADBC: The service bigquery has thrown an exception. HttpStatusCode is Unauthorized. Request is missing the required authentication credentials. Expected OAuth 2 access token, login cookie, or other valid authentication credential

So I need to add the credentials for accessing BigQuery in the dataset settings, and then sign in with my Google account.

    Configuring dataset credentials

    My settings then look like this:

    dataset credentials for BigQuery

    I can then return to my Power BI Desktop report and connect to my semantic model.
    I can now view the tables in my shared dataset.

    Dataset tables



I can then build a simple matrix with my measures from my tlc_green_trips_2016 fact table and the Zone_name from my dimension table, as well as adding a Zone_name slicer to my model.
    It takes about 40 seconds to load.

    Matrix for taxi data

Initial testing results are interesting. Even though caching is switched off in the dataset, we still see caching when testing with the Performance Analyzer. Apparently, this is due to session-level caching, which can’t be switched off.

Checking the report query settings in Power BI

We didn’t see the session-level caching in the previous test, so we will have to make different slicer selections each time for the dual mode vs. import mode test. Also, we can’t do the full table test, as this is also affected by session-level caching.

In the first test, we will use one dimension table and a fact table in direct query mode. We will test the dimension table in dual mode and then in import mode, and see the impact on performance.

    Dual Mode Test 1 Results: 1 Dimension Table and 1 Fact Table in Direct Query Mode

    Dual mode results test 1

After excluding outliers, which look more like anomalies caused by resource limitations than part of the norm, we can see that import mode is trending faster than dual mode, and the T-test shows a significant difference between the 2 sets of observations.



    Test 2

    In the next test, we will test dual mode and 2 different fact tables. One is in import mode and the other is in direct query mode. This is where we should see the performance improvement of using dual mode.
    We start by adding an additional fact table to the model, but this time in import mode.

    Adding another table

The tlc_green_trips_2017 table has 12.7M rows.
Then I rebuild the model: the 2016 table is in direct query mode, the DimZone table is in dual mode, and the 2017 table is in import mode. I then add some additional measures for each table to use in the matrix test.

    Model 2 with dual mode


My new test matrix now has measures from both fact tables: 2016 (direct query) and 2017 (import).

    New Matrix design

Test Results:
The trend appears to favor import mode again, which was kind of disappointing, as I wanted to see a similar improvement to the one I saw using SQL Server locally.

Running a Test using Python in Power BI
We can use the Python visual to create some Python graphs and a T-test.
I use AI to generate the code for me and add it into the Python script editor.
I then add the Zone, mode, and load time fields to the values well of the Visualizations pane.
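
A rough sketch of the kind of script this produces is below. Power BI passes the fields in the values well to the script as a pandas DataFrame named dataset; the 'Mode' and 'Load Time' column names are assumptions and need to match the actual field names.

import matplotlib.pyplot as plt
from scipy import stats

# Power BI supplies the fields from the values well as the 'dataset' DataFrame
df = dataset.dropna(subset=['Mode', 'Load Time'])

# Split the load-time observations by storage mode (labels are assumptions)
dual_times = df.loc[df['Mode'] == 'Dual', 'Load Time']
import_times = df.loc[df['Mode'] == 'Import', 'Load Time']

# Two-sample t-test comparing the two sets of observations
t_stat, p_value = stats.ttest_ind(dual_times, import_times, equal_var=False)

# Box plot of load time by mode, with the t-test result in the title
ax = df.boxplot(column='Load Time', by='Mode')
plt.suptitle('')
ax.set_title(f'Load time by mode (t = {t_stat:.2f}, p = {p_value:.4f})')
ax.set_ylabel('Load time (seconds)')
plt.show()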

The results are shown below, above the box plot, with a p-value of 0.0001, or a 1 in 10,000 chance of this result occurring by chance. It seems that import mode is faster again. Perhaps dual mode doesn’t work as well with BigQuery; it seems testing is the only way of finding out. Models may be faster in some circumstances than in others.


  • Testing Dual Mode Impact using the performance analyzer


Dual mode can potentially speed up queries in certain situations by utilizing the cache.
    Microsoft’s description is here


In the following example, we start with the V_FactOnlineSales view. It is a large table with over 12M rows and 19 columns. We could import it into our dataset, but that would increase the refresh time of our dataset and use valuable memory in our model that we may want to conserve.


    To get an idea of the size of the table in our model, we can use the Vertipaq analyzer in DAX Studio.
    I have imported 3 tables into our data model so far.


    1. V_FactOnlineSales in import mode
    2. V_DimProduct in import mode
3. FactOnlineSales table in direct query mode (which is the same size as V_FactOnlineSales)

The total size of our dataset is now 233MB, and we can see that the imported V_FactOnlineSales table is taking up nearly all that space. When we have added another 50 tables to our model, we might decide to switch the V_FactOnlineSales table to direct query mode to reduce refresh time and free up memory. You will also notice that the FactOnlineSales table in direct query mode uses no memory.

    Vertipaq analyzer dataset summary
Vertipaq analyzer table sizes


After deleting the imported V_FactOnlineSales table, we can add it again, but in direct query mode, so it does not take up any space in our dataset. We have a one-to-many relationship between V_DimProduct and V_FactOnlineSales, with a single cross-filter direction, so only V_DimProduct can filter V_FactOnlineSales and not the other way around. They are joined by the ProductKey.

    one to many relationship between dimension and fact table.

    We’ll start by creating some simple measures from the FactOnlineSales table as follows:

    Sales = SUM(V_FactOnlineSales[SalesAmount])
    Cost = SUM(V_FactOnlineSales[TotalCost])
    Profit = [Sales]-[Cost]
    Profit Margin = DIVIDE([Profit], [Sales], 0)

We can organise them in the home table of V_FactOnlineSales and give them a display folder to keep them together.

We can then add our measures to a matrix visual, along with a slicer on product name, using the measures and the product name field from the DimProduct table. We can then use the performance analyzer and refresh the visuals. It takes about 5 seconds for the matrix to load. If I clear the query results and refresh the performance analyzer, I get a nice benchmark of about 5 seconds to work with. The performance analyzer is designed not to cache the queries, so this seems reliable to work with.

    Power BI matrix showing product sales and profit.
    Performance analyzer results.

    If I select a single product from the slicer, it takes about 1 second to load the matrix.

Performance analyzer results sliced

    Now, if I remove a product from my slicer, it takes longer (about 6 seconds, consistently).

Performance analyzer results with one product removed

    Next, I could move on to testing the DimProduct table in dual mode. Note I had to recreate the table in Direct Query mode and switch it to dual mode, as I wasn’t allowed to switch from import mode to dual mode.

    A comparison of the tests between the two modes is in the table below.
I found I had quite consistent results, although when the full table was selected, I ran into performance problems on my laptop. This was more of an issue in dual mode, which uses more memory and some additional CPU (it creates an import-mode copy of the table in memory to execute the query). Therefore, I will need to do a cloud-based test to get a more reliable picture; in this test, I’m using Power BI locally with SQL Server on an aging laptop.

The most obvious difference I found between the 2 modes was for individual selections in the slicer, with dual mode performing at over twice the speed of import mode, which is consistent with what should be expected from dual mode.

    Matrix loading speed using the performance analyzer

Product Slicer Selection | DimProduct: Import, FactOnlineSales: Direct Query | DimProduct: Dual Mode, FactOnlineSales: Direct Query
Select All | 5-6 seconds | 6-8 seconds (CPU maxing out)
One product selected | 1.5 seconds | 0.5 seconds
Exclude one product | 6 seconds | 1.5 seconds

    To summarize, performance improvements were apparent for slicing with a dimension using dual mode, but a cloud-based test is required, in particular, to see the speed of the full unfiltered matrix refresh.

  • Power BI Report Builder Report Page formatting basics


Many people use Power BI but never try Report Builder, yet it is a handy tool for certain projects. If you have large volumes of detailed data that would take up too much space in your data model, or you find that direct queries are too slow, paginated reports can provide an alternative.

    In this example, we’ll grab some data from Kaggle. I want a big customer file that I can work with. In real life, you’re most likely to be importing this from a Data Warehouse, but the only difference here is the connection and import; all the methods of work will be the same once you’ve imported the data into your dataset.
    Here is the file I will use, entitled Bank Customer Segmentation 1M+ rows.
    https://www.kaggle.com/datasets/shivamb/bank-customer-segmentation

The file we have is from a fictitious bank that someone has randomly generated data for.
On inspection in Excel, the file is 1,048,569 rows.

    report builder Transactions example


    You can download the report builder for free from here:

    https://www.microsoft.com/en-us/download/details.aspx?id=10594

    Open the report builder after you’ve installed it and select a blank report to get started:

    report builder creation

Now the first thing we want to do is get our data, so right-click on the Datasets folder and select ‘Get Data’, and you can log in to your Power BI account here. If you don’t have one, there is a free trial. I’ll let you figure that bit out.

Once you’ve logged in, you’ll get the data source screen where you can select your data source. There is a whole host of data connectors to choose from, including data warehouses such as Google BigQuery, SQL Server, and Snowflake, as well as other tools such as Google Analytics. In this case, because I’m lazy, I will just use the CSV file that I downloaded, but it doesn’t matter for learning Report Builder.

    report builder data source

    On the next screen, you need to enter your data connection information, but in this example, there is the option of uploading a file at the top below connection settings.

The main difference here is that with a connection to a data warehouse or similar source, once you publish your report to the Power BI service, the report will refresh with the latest data when the report is reloaded, whereas I will need to upload my file again.

    report builder data source connection

    Once my file is uploaded, I get a preview of the data, and I can change a few things, like the delimiter, which is set to comma-delimited by default.

    importing data


Next, I click on Transform Data, which takes me into a scaled-down version of Power Query compared to the version included with Power BI Desktop.

    report builder power query



    Here you can do things like add custom columns and format your data.
    Click Create to import your dataset from Power Query into the Report Builder design window.

    Your first dataset


    So let’s start by building a simple paginated report. I will use a simple table: Insert a table, and then I will drag my fields into the table.

    report builder adding a table

    Design and formatting the paginated report

    First, we need to turn on the properties window (View > Properties), where we can easily edit things like fonts and background colours.

I got our mutual friend to create a logo for me and supply the background colours in RGB format.

For the logo, I will add it to the header of the report (Insert > Header). I add a rectangle and move the image into the rectangle, so I can control its properties. The design is now coming to life (Roar).

    Report builder adding a header

    I can preview the report to test how it works by selecting Home > Run:
    It takes a while to load as it loads the whole dataset, not page by page.

Below is a screenshot of the report in preview mode. There are a million rows, so it breaks the data across pages by default: it paginates the report. The logo in the header displays across pages, but the table headers do not, so I need to fix that.

    Report builder preview report

    Fixing Row Headers in place with Power BI Report Builder

    To fix the row headers in place, we need to do the following:
    Select the table (Tablix) and select advanced mode in the bottom right corner. The row and column groups will display at the bottom of the screen.

    manage rows in report builder


    In the row groups box (bottom left), I can control the positioning of the rows during pagination and scrolling.
    I set the following for the Title row and the headers row:

FixedData = True. This will fix the row in position when I scroll.
KeepWithGroup = After. This stops the row from breaking away from the row below.
RepeatOnNewPage = True. This repeats the row on each page.

    I set these on both the title row and the headers row (click on static for each, and the properties will show on the right). To turn on properties, click View> Properties from the main ribbon.

    Now, when I scroll, the header and title are fixed at the top of the page and also stay in place when I navigate pages.

    fixing the top row

    Now we have a basic report working, let’s publish it to the service. Remember, if we had a connection to a data warehouse table, the report would grab the latest data for us, but for now, we’re lazy and just using a static file that won’t change.

To do so, select File > Publish and select the workspace you want to publish to:

    Once I’ve published my report, I can view it on the service and give others access:
The report takes a while to load the 1M-plus rows. Once it’s loaded, you can see some of the features available, such as export to Excel, which is a popular option. You can also set up a subscription so the report will run on a schedule and be emailed to you. This can be a useful option for some reports.

    exporting to excel
    subscribing to a report
  • Testing Power BI DAX Measures Speed with DAX Studio

A measure goes through the following stages: DAX expression > DAX query > Vertipaq query.

    Test 1: Testing the Impact of USERELATIONSHIP() in a DAX Measure

Does using USERELATIONSHIP() with an inactive relationship slow down a DAX measure? I’m not sure, so I thought I would test it out.

To test this, I will create a measure using the existing relationship between my date table and my FactOnlineSales table, and then create a copy of the date table joined to the FactOnlineSales table in the same way, but with the relationship set to inactive.

I use DAX Studio to create 2 measures for calculating total sales, and then I throw in some filters for year and continent as follows:

    DEFINE
        MEASURE '_Measures'[test1] =
            CALCULATE (
                SUMX (
                    'FactOnlineSales',[Sales Amount]
                ),
                
                'DimDate'[CalendarYear] = 2009,
                'DimGeography'[ContinentName] = "Europe"
                ) 
    
        MEASURE '_Measures'[test2] =
            CALCULATE (
                SUMX (
                    'FactOnlineSales',[Sales Amount]
                ),
                'DimDate (2)'[CalendarYear] = 2009,
              	'DimGeography'[ContinentName] = "Europe",
                USERELATIONSHIP('DimDate (2)'[Datekey], FactOnlineSales[DateKey])
            )
    EVALUATE
    ROW (
        "test", [test2] 
    )

Below is a screenshot of my DAX Studio settings. As you can see, I’ve got the ‘Clear cache on Run’ button selected. This will help ensure each test is equal. I also have the ‘Server Timings’ button selected. Note: be careful using ‘Clear cache on Run’ on a connection to a production model, as it will clear the cache (hence slowing down reports).

For the original sales measure that has an active relationship to the date table, I get a total time of 2,197 ms, broken down into 624 ms for the FE (formula engine) and 1,573 ms for the SE (storage engine, i.e. the Vertipaq engine). The FE turns the DAX expression into a DAX query and creates a query plan that it sends to the SE to retrieve the results of the query. The FE also caches the results of queries (this is why I have turned on clearing the cache when the DAX query is run; otherwise, subsequent queries would return results very quickly).

    Comparing the 2 measures
Here we run the query on each measure 5 times to get an average, and as you can see, there was very little difference in performance. Indeed, the first 3 tests on the measure with the active relationship took longer (but this may have been because my laptop was busy doing something). My results are below. I think they are pretty inconclusive, but they are interesting nonetheless, and the method would be interesting to test with a larger model.


In the Server Timings you can view the Vertipaq query that is executed by Analysis Services:

    Each step of the execution is below:
The first step selects the data from FactOnlineSales with the 2 filters, returning 1.4M rows of data (133 KB).
The second step does the sum of the sales amount and returns one row. The third step returns the start and end time, as well as other information (if you mouse over the query).

The Par. field stands for Parallelism, which represents the number of parallel threads used by the storage engine to retrieve the result at each step.

  • Analysing the Power BI Data Model to Improve Refresh Rates with DAX Studio


    I’ll start with some references:

    Definitive Guide to DAX Book by Marco Russo and Alberto Ferrari (the definitive experts).
    The Guy in the Cube video on model refresh slow:
    Chat GPT (I don’t know who that is!)
My own testing and experience, and a demo Power BI model using data from the Contoso Retail data warehouse

    First things: Connecting DAX Studio to your data model in the Power BI service
    1. Get the XMLA entry point URL from Power BI by selecting workspace settings > License info:
    You will need to log in with your Power BI login to connect.


    2. You can connect to your model in the service with DAX studio as follows (you may need to download and install it first). You can open it either independently or using the external tools within Power BI.



    File and Model Size
    Typical models use data imported into physical storage from a data warehouse and potentially other sources, such as Google Analytics. The Vertipaq engine stores the model in virtual memory. My Contoso Retail Power BI file is 193 MB (that’s not very big). It is compressed to reduce storage requirements.

    Model Size in DAX Studio (Advanced > View Metrics > Summary)
    When I look at the model summary from the advanced Metrics in DAX studio, the model loaded into virtual memory by the Vertipaq engine is 253 MB. Also of interest are the 37 tables and 366 columns. My model only has 8 tables and 1 calculated table, so it will be interesting later to understand what these other tables are.



    So why is this different?

    In-memory Compression: Power BI compresses data in the .pbix file to reduce storage, so the file size on disk is often smaller than its in-memory model size. However, once loaded, Power BI expands it for efficient query performance. There is still compression (next section), it’s just not as much as in the stored file.

    Metadata Overhead: The model size shown in DAX Studio includes metadata, indexes, dictionaries, and structures needed for DAX calculations and relationships that Power BI uses while executing queries, which aren’t directly reflected in the file’s storage size.

    Cache and Temp Data: DAX Studio may include caches or other temporary data generated during analysis, which increases the apparent size of the model in memory.

    Unused Columns or Tables: Any tables or columns loaded but not used may also contribute to the model’s size in DAX Studio, while Power BI might not fully load them in the saved file.

    Note: According to the model settings, the model is not using caching (which is off by default), so I can rule out one of the four possibilities.

    Power BI License requirements based on model size

    The Power BI Pro license allows for models up to 1GB in size, 10 GB of native storage, and 8 refreshes a day
    The Power BI Premium license allows for models up to 100GB in size and 100 TB of native storage and up to 48 refreshes a day.

    Note: Beyond just loading the raw model, Power BI needs additional memory overhead for processing, caching, and temporary storage during data refreshes and calculations. This means that even if your model fits within the 1 GB limit, you’re likely to experience slow performance or even errors during complex operations, especially if near that cap.

    Semantic Model Refresh Speed (Vertipaq engine: imported tables)

We can easily access the model refresh times in the Power BI workspace (select ‘Refresh history’ from the three-dots menu next to the model entry). This gives us our total model refresh benchmarks, which we can average (removing outliers at known busy times as required).

    So what are the stages of the model refresh and which ones can we troubleshoot and optimize?

1. Data import time: this one we can measure. Adam from the Guy in the Cube video gives us a simple suggestion for benchmarking our tables’ load times: simply run a SELECT * on each table in the data warehouse and note the time (repeat the process at different times of the day, perhaps removing outliers at busy times, and average). I guess there is also network latency to consider with large volumes of data, but let’s just do: total refresh time – table SELECT * time.

Here is an example from my Contoso data warehouse tables:
1. DimCurrency: Time: 0, Rows: 28
2. DimCustomer: Time: 0, Rows: 18,869
3. DimDate: Time: 0, Rows: 648
4. DimGeography: Time: 0, Rows: 274
5. DimProduct: Time: 0, Rows: 2,467
6. DimProductCategory: Time: 0, Rows: 8
7. DimProductSubcategory: Time: 0, Rows: 48
8. DimStore: Time: 0, Rows: 306
9. FactOnlineSales: Time: 2 mins 49 secs, Rows: 12.6M

As you can see, only the FactOnlineSales table took any real time to SELECT.
But then you might say: hey, why are you using tables in your models and not views? If a new column appears, the model refresh will fail.
You’re right, in which case you can run the tests on your views.

Now, I don’t have a gateway set up to connect to my laptop today, so I need to set up something in the cloud to perform the test. But let’s say the model refreshed in 4 minutes and your SELECT * benchmarks totaled 2 minutes and 49 seconds; then you can say that for 1 minute and 11 seconds the Power BI service is doing something else.

So what is happening once the data has been imported? Well, that’s when we need to get into the Vertipaq engine. Note that the engine is proprietary and so has secrets that we will never learn, but what we do know about the steps it performs is as follows:

Vertipaq engine steps
1. Loads the tables into columns
2. Checks each column for the best sort order
3. Applies compression to create the smallest, fastest model
4. Loads the model into memory

    Note: the Vertipaq engine only deals with imported data. Direct Queries are processed by the Direct Query engine.

    DAX formula engine
1. Creates calculated columns and tables (in memory), which are not compressed as well as columns processed by the Vertipaq engine
    2. Creates relationships between columns

    So we need to learn more about what these two engines do.

    The Vertipaq Engine
    1. Stores imported data into a columnar database (for faster data retrieval).

2. Sorts the columns by the best order for compression techniques (such as RLE)

3. Compresses the columns using various (proprietary) techniques, such as:

a) Value encoding: e.g. deducting a constant value from each column entry to create a value that needs fewer bits and hence less space.

b) Hash encoding (dictionary encoding): creating a table, similar to a normalization technique, where an ID and the value are stored in a dictionary in memory (e.g. product sizes 0: small, 1: medium, and 2: large). The 0, 1, and 2 then replace the sizes small, medium, and large, resulting in fewer bits being stored in the table.

c) Run Length Encoding (RLE): when values are repeated in a column, e.g. entries 1-2000 are all Q1, this can be stored in a table as Q1: 2000 (the first 2,000 rows are Q1). Entries 2001 to 3500 are Q2, which can be stored as Q2: 1500, and so on.
Using this technique, the Vertipaq engine can compute results very quickly. Sorting is therefore very important for RLE-type compression. The engine does do sorting, but you can potentially help it by sorting the data by the columns with the lowest cardinality first.
Factors impacting compression are therefore as follows:
1. Cardinality: the more unique values, the less compression can be done, hence increasing the size of the column in memory.
2. Number of repetitive values: the more there are, the better the RLE compression possible.
3. The number of rows is important, but cardinality and repetitive values matter more for compression.
4. The data type: e.g. automatic date/time should be turned off, as it creates a larger entry than the date field alone. The date-only field is smaller, and if removing the time results in much lower cardinality, the column can be compressed much further using either RLE or hash encoding.

Primary keys, such as integer user IDs, will most likely use value encoding.
Numbers such as sales amounts that don’t have massive outliers will probably also be value encoded; otherwise, dictionary encoding may be used.

To determine the best sort order, a sample of the first x number of rows (I think it’s a million) is taken, so it’s important that those first rows are of the best quality. You can therefore help the Vertipaq engine by ensuring your initial data is representative of the whole dataset; otherwise the engine may have to re-encode data, e.g. if it finds outliers and its value encoding doesn’t work (hence slowing down the process).

    We can look at the type of compression used, by using DAX Studio (Advanced > View Metrics)


    Firstly, you’ll see an overview of the tables. Each column in a table is encoded differently, so you’ll see the encoding type as ‘Many’.
    You need to drill down to get to the column level (or you could just select the column tab on the left-hand side).

    Now, let’s have a look at the FactOnlineSales table.

    The table is sorted by default by the column size. This is the size in Bytes that the column is consuming in memory.

The first column, OnlineSalesKey, has 12,627,608 rows and a cardinality of 12,627,608, and as the name suggests it is a unique key, hence it has the same cardinality (unique values) as rows.
It is also the largest column, consuming 84MB of memory and 31.77% of the database.
It is encoded as VALUE type.

Rows: 12,627,608
Cardinality: 12,627,608
Col Size: 84,156,128 (Data + Dictionary + Hier Size)
Data: 33,645,560
Dictionary: 128
Hier Size: 50,510,440
% Table: 33.07%

Unique keys are generally required fields in most tables, so it’s not usually somewhere you can save space by optimizing (unless you can reduce the size of the key somehow).

    By applying the same analysis you can determine which columns and tables are having the biggest impact on your model size and hence slowing your model refresh time.

  • How to Build a Survey Results Report with Power BI

    Power BI can be useful for displaying and storing survey results for your organization. In this example we build some data for the results of a fictitious Airline survey, using questions taken from a real online airport survey.


    The data was built in Excel with a sheet for questions and a sheet for answers. You can download the Excel file here.

    It’s easy to publish the report as a PDF, so you can share it with users who don’t have access to Power BI. Here is the airport survey report as PDF

    The questions are a simple list with a Question ID and section.


    The answers on the second sheet are formatted so that each row represents all the answers for each respondent.



    Steps for Building the Airline Survey Report.

    Step 1
The data is imported into our Power BI Desktop file using Power Query.

    The only change to the question table is a simple column rename.


    The answer table is transformed so that each line of the table contains the respondent ID, the question, and the Value (the answer). This prepares the data so it can be joined to the question table and easily visualized.
    The applied steps are as follows:

As you can see, the first step after the import is simply to promote the headers, so the column names are the actual column names.
The second step changes the column types to the correct types.
The key third step is to unpivot the data, so the table is transformed as below, with a separate row for each respondent’s answer. This makes the report easy to build; a sketch of these applied steps in M is shown below.
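
As a rough sketch, the applied steps for the Answers query look something like this in the Advanced Editor (the file path and column names are assumptions):

let
    Source = Excel.Workbook(File.Contents("C:\Data\AirlineSurvey.xlsx"), null, true),
    AnswersSheet = Source{[Item = "Answers", Kind = "Sheet"]}[Data],
    // Step 1: promote the first row to column headers
    #"Promoted Headers" = Table.PromoteHeaders(AnswersSheet, [PromoteAllScalars = true]),
    // Step 2: set the column types (only the Respondent column is shown here)
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers", {{"Respondent", Int64.Type}}),
    // Step 3: unpivot every column except Respondent, giving one row per respondent's answer
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Respondent"}, "Question", "Value")
in
    #"Unpivoted Columns"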


    The report only requires one measure, which is just a unique count of the respondent ID

    Answer Count = DISTINCTCOUNT(Answers[Respondent])


    Basic Graphs can be added.

    And Likert Scale style graphs using a 100% Stacked bar graph visual.

It’s a simple but useful report.



  • How to Build a KPI Report with Power BI


The Power BI KPI demo report is a demonstration of how a typical KPI report might look. The report uses data from the fictitious Microsoft Contoso Retail Data Warehouse, which is stored in a SQL Server database and which I’ve renamed for demo purposes.

    The report is currently divided into 5 pages.

1. The first page is a typical KPI report, leading with the financial KPIs, then breaking down to store type, and then some online KPIs. As the database is quite old (going back to 2009), it doesn’t have a lot of person-level information, but it does for online sales (hence online KPIs are included here). The KPIs are presented here in comparison to the previous year’s performance, but targets would also typically be included.
  The month selected allows the user to go back to any point in time in the year (or years, if more data were included).



2. The second page covers sales MTD, QTD, and YTD vs. the previous year. There are graphs showing the YoY breakdown by store type and a table allowing more drill-down into individual store performance YoY. This interactive tool allows users to quickly pinpoint some of the reasons for higher or lower YoY performance. The measure calculations can be found lower down this page.



3. The third page of the report goes into more detail on stores, broken down by country and city, and includes a sales per square foot calculation. Again, this helps the user explore the data to identify the best and worst-performing stores in more detail.


4. The fourth page shows sales by product category and product, allowing you to drill down from the top category levels. There are then graphs for monthly and yearly performance.



5. The fifth page focuses on activity. This can be developed further to show active customers by activity segment (e.g., new, active, reactivated, lapsed) once activity segmentation is added to the model.


      The Data Model:




      Time Intelligence DAX measures
      The DAX measures for MTD, QTD and YTD and comparisons to the previous year are below:
    //Month to Date Sales
    Sales MTD = CALCULATE(SUM(FactSales[SalesAmount]) , DATESMTD('Date'[Date]))

    //Quarter to Date Sales
    Sales QTD = CALCULATE(SUM(FactSales[SalesAmount]) , DATESQTD('Date'[Date]))

    //Year to Date Sales
    Sales YTD = CALCULATE(SUM(FactSales[SalesAmount]) , DATESYTD('Date'[Date]))

    //Last Year Month to Date Sales
    Sales LY MTD = CALCULATE(SUM(FactSales[SalesAmount]) , SAMEPERIODLASTYEAR(DATESMTD('Date'[Date])))

    //Last Year Quarter to Date Sales
    Sales LY QTD = CALCULATE(SUM(FactSales[SalesAmount]) , SAMEPERIODLASTYEAR(DATESQTD('Date'[Date])))

    //Last Year Year to Date Sales
    Sales LY YTD = CALCULATE(SUM(FactSales[SalesAmount]) , SAMEPERIODLASTYEAR(DATESYTD('Date'[Date])))

    //Variance Month to Date vs. Last Year Sales
    Sales VAR MTD vs. Last Year = [Sales MTD]-[Sales LY MTD]

    //Variance Quarter to Date vs. Last Year Sales
    Sales VAR QTD vs. Last Year = [Sales QTD]-[Sales LY QTD]

    //Variance Year to Date vs. Last Year Sales
    Sales VAR YTD vs. Last Year = [Sales YTD]-[Sales LY YTD]

    //Variance % Month to Date vs. Last Year Sales
    Sales VAR% MTD vs. Last Year = ([Sales MTD]-[Sales LY MTD])/[Sales LY MTD]

    //Variance % Quarter to Date vs. Last Year Sales
    Sales VAR% QTD vs. Last Year = ([Sales QTD]-[Sales LY QTD])/[Sales LY QTD]

    //Variance % Year to Date vs. Last Year Sales
    Sales VAR% YTD vs. Last Year = ([Sales YTD]-[Sales LY YTD])/[Sales LY YTD]



  • How to Create a Power BI Top Customers Report

    Most businesses will have sales and marketing departments as well as top-level C-Suite/director levels. They are all going to be interested in a bunch of similar reports from organization to organization.

    One of the most important reports, that sales and marketing teams are going to be interested in is:
    Who are our best customers?

Top customers often contribute a significant amount of revenue to any organization. The 80/20 rule, where 80% of revenue comes from 20% of customers, is quite common. Identifying top customers allows teams to segment their audience. Top customers can be given special treatment, such as dedicated account managers and special offers.

In the model from the Contoso database we use here, the DimCustomer table is joined to the FactOnlineSales table, and the FactOnlineSales table is joined to the Date table. We use the FactOnlineSales table rather than the FactSales table, as it includes a CustomerKey we can join to the DimCustomer table to pull out customer data at the person level (the FactSales table appears to include store-level data only).

    To start creating the report, we can drag in the SalesAmount field from the FactOnlineSales table and the FirstName, LastName, and CustomerKey from the DimCustomer table as well as DateFirstPurchased. These are some of the fields a sales team is going to be interested in and they will likely ask for more to be added as time goes by.

Adding the Year field as a slicer from the date table, which is already joined to the FactOnlineSales table by the unique date key, allows us to filter the sales amount by year (and we could add month, for example, as required).
    We filter the report at the person level by adding the CustomerType field in the Filters panel:

    You will notice here I just dragged in the SalesAmount field from the FactOnlineSales table, but that’s a bit lazy and we should create a measure for the field, like this:

    OnlineSales = SUM(FactOnlineSales[SalesAmount])

    The sales amount filter is useful, but teams are likely to be interested in several different sales calculations, which we can calculate using additional measures. For example, we might want to see the current year-to-date sales and the lifetime sales of a customer in the same report as follows:

    Calculating Lifetime Sales with DAX

    OnlineSales Lifetime = CALCULATE(SUM(FactOnlineSales[SalesAmount]), REMOVEFILTERS('Date'))

Using REMOVEFILTERS('Date') removes all filters applied from the Date table, such as the year being used in the drop-down in this report, so the SalesAmount will be calculated for the customer’s lifetime. This now makes the report more interesting. The year filter still applies to the OnlineSales measure, but does not apply to the OnlineSales Lifetime measure, as below:

    Comparing year-to-date vs. previous year-to-date is another common request.
    To calculate year-to-date sales amount, we use the CALCULATE() function again, but this time combined with the DATESYTD() function:

    Calculating Year-to-Date and Previous Year-to-Date Sales with DAX

    To calculate Year to date sales we can use the DATESYTD function with CALCULATE:

    OnlineSales YTD = CALCULATE(SUM(FactOnlineSales[SalesAmount]), DATESYTD('Date'[Date]))

and to calculate OnlineSales Prev YTD, we can wrap the DATESYTD() function with the SAMEPERIODLASTYEAR() function as follows:

    OnlineSales Prev YTD = CALCULATE(SUM(FactOnlineSales[SalesAmount]), SAMEPERIODLASTYEAR(DATESYTD('Date'[Date])))

Adding these measures to our report table, we should now have something looking like this:

    Here we now have a top customer report for 2009, we can see which customers have brought in the highest revenue this year vs. last year to date, so we are comparing equal periods. We can also see customers’ lifetime value. In this example, Jon Yang is our best online customer, with $61K in online sales. This is equal to his lifetime sales. As you will notice, their first Date of Purchase is 22nd July 2001, which is a long time ago, but the database doesn’t have a record of this transaction as 2009 YTD sales equal lifetime sales. Presumably, this data was not recorded historically, which is often the case with older databases. The data you work with is often not perfect and we need to be aware of these specifics.

    Now we have this year’s YTD measure and the previous year’s YTD measure, it is easy for us to calculate the variance % (or difference), between the two years.

    Online Sales Var YTD% = ([OnlineSales YTD]-[OnlineSales Prev YTD])/[OnlineSales Prev YTD]

We have an infinity value for the first customer, as they had no sales in the previous year. We can tidy that up using the DIVIDE() function, which automatically handles errors and infinity.

    Online Sales Var YTD% = DIVIDE(([OnlineSales YTD]-[OnlineSales Prev YTD]),[OnlineSales Prev YTD])


Let’s now add the month as an additional filter. As you can see, the YTD and Prev YTD measures are recalculated to show the online sales amount up until the end of the month selected. That’s useful and pretty powerful.

    In a real-world scenario, sales teams are going to want the most up-to-date information, so we want to look to add previous days’ sales, week-to-date sales, and month-to-date sales. The issue here is that our data is old, so we need to do some juggling first to bring the Contoso sales data into the future to use for experimentation.
    The process of adding a future date is outlined here.

    DAX Previous Day Sales Calculation

    Once we have the future date in our FactOnlineSales table as in the link above and join it to the date table, our report is brought into the future and we can create a measure calculating yesterday’s sales amount:

    OnlineSales Yesterday = CALCULATE(SUM(FactOnlineSales[SalesAmount]), PREVIOUSDAY('Date'[Date]))

    As you can see I can now filter on the current year, which is 2023, and remove the month filter.
    I can now see the top customers for yesterday if I sort by the OnlineSalesYesterday field.
    This is now giving the sales teams up-to-date, daily reporting, so they can keep their finger on the pulse of the business:

Now that we’re using more ‘current’ data, we can see we need Week-to-Date and Month-to-Date measures.

    Calculating Week-to-Date Sales can be found here

  • How to find Max Date with List.Max() in Power Query to Create a Future Date

    In the Contoso Retail database, we have old historical data which only goes up until the end of 2009. For our demo reports, we’d like to use more up-to-date data, so we want to add a date field to the FactOnlineSales (and FactSales) tables, so we can mimic more current reports.

    To add what we’ll call a ‘future date’ field, we want to find out the number of days from the present day to the last date in the FactOnlineSales table, then we can recalculate all the old dates bringing them forward by this difference.

    Getting the Max Date in Power Query

    There doesn’t appear to be a direct M code function for max dates, but there is a List.Max() function which works with lists, so we need to create a list from the date field in the FactOnlineSales table. The steps are as follows:

1. Create a list from the date field (right-click on the DateKey field and select ‘Add as New Query’).

2. This will create a list of dates from the column – here I renamed it to FactOnlineSalesDates.

    3. We can now reference this list in the FactOnlineSales table using the List.Max() function.
    So we need to create a new custom column in the FactOnlineSales table as follows:
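
In the Custom Column dialog, the formula is simply:

= List.Max(FactOnlineSalesDates)

which generates a step along these lines in the Advanced Editor (a rough sketch; the step names are assumptions):

#"Added MaxDate" = Table.AddColumn(#"Changed Type", "MaxDate", each List.Max(FactOnlineSalesDates))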

    We then just need to transform it into a Date type:

That will give us a column containing the max date (the last date) in the FactOnlineSales table. The MaxDate is 31/12/2009. We can now use that date in our calculations. Sure, we could have just looked it up, but if the data is dynamic, it’s good to be able to calculate it, so we are sure it is correct.

4. Next, we can add another custom column containing today’s date and transform it to a Date type as well. We use the Date.From() function on the result of the DateTime.LocalNow() function:

5. Next, add another custom column to calculate the difference between today and the max date in the FactOnlineSales table. We can use the Duration.Days() function to calculate the difference in days.

    This gives us the number we are looking for.

Now all we need to do is add another custom column to transform the dates into the present day. We can call this ‘DateFuture’ and transform it to a Date type as well. A sketch of steps 4 to 6 as generated M code is below.
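
As a rough sketch, steps 4 to 6 end up as custom-column steps along these lines (the step and column names are assumptions, and DateKey is assumed to be the original date column):

// Step 4: today's date
#"Added Today" = Table.AddColumn(#"Added MaxDate", "Today", each Date.From(DateTime.LocalNow()), type date),
// Step 5: difference in days between today and the last date in the table
#"Added DayDiff" = Table.AddColumn(#"Added Today", "DayDiff", each Duration.Days([Today] - [MaxDate]), Int64.Type),
// Step 6: shift every historical date forward by that difference
#"Added DateFuture" = Table.AddColumn(#"Added DayDiff", "DateFuture", each Date.AddDays([DateKey], [DayDiff]), type date)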


    We’ve added 4 additional columns here to a big table. To limit the impact on performance, we should remove the 3 columns we don’t need, just leaving us with the DateFuture column.

    Now close & apply the data from Power Query to load it into the model.
    In the model view, we can now edit the relationship between the Date table and the FactOnlineSales table, so they are joined using the DateFuture key, which will bring all the data into the future, when we use filters from the Date table.

If we now look at the FactOnlineSales table, we can see the latest date is today (at the time of writing).


In practice, a data warehouse is updated overnight, so the latest complete day’s worth of data should be the day before, so we want to tweak our calculation.

If we go back into Power Query, we can subtract a day from today’s date using the Date.AddDays() function, so this will feed into the future date calculation:
Date.AddDays(Date.From(DateTime.LocalNow()), -1)


Now we see the data is up to yesterday.

So, this is useful if we want to mimic a real-life scenario, where data is up until yesterday relative to today, but it would be confusing for financial reports if our future date kept changing, so it makes sense to have a fixed date as well.

So, to fix the today’s date field, we can do the following, using the #date function:
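
A minimal sketch, assuming we simply hard-code the Today column with the #date(year, month, day) function instead of DateTime.LocalNow() (the date shown is just an illustration):

#"Added Today" = Table.AddColumn(#"Added MaxDate", "Today", each #date(2023, 12, 31), type date)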



  • How to Build a Date Table in Power BI

    Probably the most essential table you need in your Power BI model is a date or calendar table.
You join your date table to any other table containing dates, such as your main fact sales table.

    The advantage of using a calendar table is that it contains associated fields such as Year, Month, Start of Month, Start of Quarter, and relevant Financial Periods that will be bespoke to the business.

    The date table should be stored in a data flow so everyone can use it. Here we create one using DAX, which could be exported and imported as a table in the Power Query service. You could also create a date table with Power Query directly. Creating one in DAX gives us practice with Time Intelligence functions.


    The steps to create a date table are as follows:
    1. Open Table View and select ‘New Table’
2. Start by creating the first column, which will be our date key. We can use the CALENDAR() function to auto-create a list of dates in the table. The date range should run from the minimum date relevant to your data up to any dates you need in the future. You could use TODAY() for the end date to include dates only up to today, but sometimes, for example with forecasting, it can be useful to have future dates.
Of course, a big table may slow performance, so just bear that in mind.

    Date = 
    VAR DateStart = DATE(2007, 01, 01)
    VAR DateEnd = DATE(2030, 12, 31)
    RETURN
    CALENDAR(DateStart, DateEnd)

3. Once you have your first date column set up, which will be the key for joining your other tables, you can then start adding any other useful columns.

    Here are some for starters:

    Year = YEAR('Date'[Date])
    Month = MONTH('Date'[Date])
    Day = DAY('Date'[Date])
    Quarter = QUARTER('Date'[Date])
    StartOfYear = STARTOFYEAR('Date'[Date])
    StartOfQuarter = STARTOFQUARTER('Date'[Date])
    StartOfMonth = STARTOFMONTH('Date'[Date])
    DayOfWeek = WEEKDAY('Date'[Date], 2)


Be careful with the day-of-week number. It’s always best to double-check against another calendar.
The default return type numbers Sunday as 1; to make Monday day 1, use WEEKDAY('Date'[Date], 2).

    Once you have the day number setup, which can be useful in itself, you can then create a column for the day name:

    DayName = 
    
    SWITCH('Date'[DayOfWeek],
    1, "Mon",
    2, "Tue",
    3, "Wed",
    4, "Thu",
    5, "Fri",
    6, "Sat",
    7, "Sun")

Once we’ve finished creating our date table, we can then join it within the model to the relevant tables, such as the Contoso FactSales table, which has a date key. Note that the Date field in the Date table is unique.


  • How to create Customer Age Bands in Power BI using DAX

    An interesting dimension to report on for demographics is the age of customers. Understanding what type of customers buy what types of products can help inform our customer insight that can be used for targeting marketing offers more effectively.

    In the DimCustomer table of the Contoso Retail Sales DW, we have a field called: Birthdate, but of course, this is unique for all the customers. What we want to do is group these ages into buckets or age bands, so we can create a more informative report. We can then use the grouping in graphs or as filters as in the graph below:


    To create the categorization, I created a new calculated column in the DimCustomer table with the DAX code following the steps outlined below. Note the exact categories you use are up to you or the business. There are no definitive rules on what age bands you should follow, but by studying other reports, you will get the gist of the most common methods.

The first best-practice step is to create variables.
The _dob variable is assigned the [BirthDate] field.
The _now variable is assigned today’s date, using the TODAY() function.
The _agedays variable is then assigned the difference in days between the date of birth and today’s date, which is essentially the person’s age in days.
The _ageyears variable divides the age in days by 365 to get the age in years.
The section after the RETURN statement then returns the age band category, based on the IF logic.
It’s important to put the statements in the correct order, starting from the lowest band upwards. Note that there are some blank BirthDates, which is why I have used IF(_ageyears = BLANK(), "None", ...).
I only knew there were blank dates by visually inspecting the BirthDate column. If I hadn’t handled them, the blanks would have been picked up as "<18", so it’s important to sense-check as much as possible.

    DAX for Calculated Column (note Age Range is the column name):
Age Range =
VAR _dob = DimCustomer[BirthDate]
VAR _now = TODAY()
VAR _agedays = DATEDIFF(_dob, _now, DAY)
VAR _ageyears = _agedays/365

RETURN
IF(_ageyears = BLANK(), "None",
IF(_ageyears<18, "<18",
IF(_ageyears<30, "18 to 29",
IF(_ageyears<40, "30 to 39",
IF(_ageyears<50, "40 to 49",
IF(_ageyears<60, "50 to 59",
IF(_ageyears<70, "60 to 69",
IF(_ageyears<80, "70 to 79",
IF(_ageyears>=80, "80 plus")))))))))


  • How to Add a last refreshed date in Power BI

It’s very useful for users, administrators, and developers to have the last refresh date added to Power BI reports, as this helps confirm how up to date the data is. Sometimes, when a Power BI file is refreshed, you will notice the data doesn’t actually refresh and there is no error, which can be confusing.

    This is how to add the refresh date:

    1. Open Power Query and add a blank query

    2. In the formula bar of the blank query enter = DateTime.LocalNow(). This function gets the current time. Click on the source on the right-hand side to activate the function.

3. You should now have a Transform tab where you can convert the blank query to a table: select To Table.


4. You can then change the data type to Date/Time.


5. Finally, rename the query and its column to something meaningful, like Last Refresh Date.

6. Then Close & Apply to return to the report view.

7. Add a card to your page and add the field from the Last Refresh Date table you just created. Adjust the size and you should get something like this.

8. The date and time of the last refresh should now update each time you refresh the report (as the Last Refresh Date table is refreshed).
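
Putting the steps together, the finished query looks something like this in the Advanced Editor (a minimal sketch; the exact ‘To Table’ step the UI generates may differ slightly):

let
    // Current date/time captured each time the report refreshes
    Source = DateTime.LocalNow(),
    // Convert the single value into a one-column, one-row table
    #"Converted to Table" = #table(1, {{Source}}),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table", {{"Column1", "Last Refresh Date"}}),
    // Set the column to Date/Time
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns", {{"Last Refresh Date", type datetime}})
in
    #"Changed Type"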