Category: Uncategorized

  • Power BI Report Builder Report Page formatting basics

    While many people use Power BI, but never try report builder, but report builder is a handy tool for certain projects. If you have large volumes of detailed data that would take up too much space in your data model, or you find that direct queries are too slow, paginated reports can provide an alternative.

    In this example, we’ll grab some data from Kaggle. I want a big customer file that I can work with. In real life, you’re most likely to be importing this from a Data Warehouse, but the only difference here is the connection and import; all the methods of work will be the same once you’ve imported the data into your dataset.
    Here is the file I will use, entitled Bank Customer Segmentation 1M+ rows.
    https://www.kaggle.com/datasets/shivamb/bank-customer-segmentation

    The file we have is from a fictitious bank; someone has randomly generated a date for.
    On inspection in Excel, the file is 1,048,569 rows

    report builder Transactions example


    You can download the report builder for free from here:

    https://www.microsoft.com/en-us/download/details.aspx?id=10594

    Open the report builder after you’ve installed it and select a blank report to get started:

    report builder creation

    No the first thing we want to do is get our data, so right click on the Datasets folder and select ‘Get Data’, and you can log in your Power BI account here. If you don’t have one, there is a free trial. I’ll let you figure that bit out.

    Once you’ve logged in, you’ll get the data source screen where you can select your data source. There are a whole host of data connector files here to choose from, including data warehouses such as Google BigQuery, SQL Server, Snowflake, as well as other tools such as Google Analytics, and in this case, because I’m lazy, I will just use the CSV file that I downloaded, but it doesn’t matter for learning report builder.

    report builder data source

    On the next screen, you need to enter your data connection information, but in this example, there is the option of uploading a file at the top below connection settings.

    The main difference here is that with a connection to a data warehouse or whatever, once you publish your report to the Power BI service, the report will refresh with the latest data when the report is reloaded, whereas I will need to upload my file again.

    report builder data source connection

    Once my file is uploaded, I get a preview of the data, and I can change a few things, like the delimiter, which is set to comma-delimited by default.

    importing data


    Next, I click on Transform Data, which takes me down to a scaled-down version of Power Query compared to the version included with Power BI Desktop.

    report builder power query



    Here you can do things like add custom columns and format your data.
    Click Create to import your dataset from Power Query into the Report Builder design window.

    Your first dataset


    So let’s start by building a simple paginated report. I will use a simple table: Insert a table, and then I will drag my fields into the table.

    report builder adding a table

    Design and formatting the paginated report

    First, we need to turn on the properties window (View > Properties), where we can easily edit things like fonts and background colours.

    I got our mutual friend to create a logo for me and supplied the background colours in RGB format,

    For the logo, I will add it to the head of the report (Insert Header). I add a rectangle and move the image into the rectangle, so I can control its properties. The design is now coming to life (Roar).

    Report builder adding a header

    I can preview the report to test how it works by selecting Home > Run:
    It takes a while to load as it loads the whole dataset, not page by page.

    Below is the screenshot of the report in preview mode. There are a million rows, so it breaks the page across pages by default: It paginates the report. The logo in the header displays across pages, but the headers do not, so I need to fix that.

    Report builder preview report

    Fixing Row Headers in place with Power BI Report Builder

    To fix the row headers in place, we need to do the following:
    Select the table (Tablix) and select advanced mode in the bottom right corner. The row and column groups will display at the bottom of the screen.

    manage rows in report builder


    In the row groups box (bottom left), I can control the positioning of the rows during pagination and scrolling.
    I set the following for the Title row and the headers row:

    FixedData = True. This will fix the row in position when I scroll.
    KeepWithGroup = After. This is to stop the row from breaking from the row below.
    RepeatOnNewPage = True. This I to repeat the row on each page.

    I set these on both the title row and the headers row (click on static for each, and the properties will show on the right). To turn on properties, click View> Properties from the main ribbon.

    Now, when I scroll, the header and title are fixed at the top of the page and also stay in place when I navigate pages.

    fixing the top row

    Now we have a basic report working, let’s publish it to the service. Remember, if we had a connection to a data warehouse table, the report would grab the latest data for us, but for now, we’re lazy and just using a static file that won’t change.

    To do so, Select File Publish and select the workspace you want to publish to:

    Once I’ve published my report, I can view it on the service and give others access:
    The report takes a while to load the 1 M-plus rows. Once it’s loaded, you can see some of the features available, such as export to Excel, which is a popular option. So can also set up a subscription to the report so the report will run and email it to you. This can be a useful option for some reports.

    exporting to excel
    subscribing to a report
  • SQL Window Functions

    Summary


    1. Aggregate Window Functions
    These functions perform calculations across a set of table rows that are somehow related to the current row.

      SUM() OVER(…) – Running total or sum per partition
      AVG() OVER(…) – Average per partition
      COUNT() OVER(…) – Count rows in partition
      MIN() OVER(…) – Minimum value in partition
      MAX() OVER(…) – Maximum value in partition

      2. Ranking Window Functions
      These assign a rank or row number to each row within a partition.

        ROW_NUMBER() OVER(…) – Unique sequential number per row in partition
        RANK() OVER(…) – Rank with gaps for ties
        DENSE_RANK() OVER(…) – Rank without gaps for ties
        NTILE(n) OVER(…) – Divides partition into n buckets and assigns a bucket number

        3. Value Navigation (Analytic) Functions
        These functions return values from other rows in the window frame relative to the current row.

          LAG(expression, offset, default) OVER(…) – Value from a previous row
          LEAD(expression, offset, default) OVER(…) – Value from a following row
          FIRST_VALUE(expression) OVER(…) – First value in the window frame
          LAST_VALUE(expression) OVER(…) – Last value in the window frame
          NTH_VALUE(expression, n) OVER(…) – The nth value in the window frame


          Examples


          1. Aggregate Window Functions

          These perform calculations over a window of rows but return a value for each row.

          NameSalesDate
          Alice1002025-01-01
          Alice1502025-01-03
          Bob2002025-01-01
          Alice502025-01-05
          Bob3002025-01-04


          SUM() OVER()

          SELECT
            Name,
            Sales,
            SUM(Sales) OVER (PARTITION BY Name ORDER BY Date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal
          FROM SalesData;


          The window function SUM(Sales) OVER (...) calculates a running total of sales for each Name.

          The window is partitioned by Name (so calculations are done separately for each person).

          The rows are ordered by Date within each partition.

          The frame ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW means the sum includes all rows from the first date up to the current row’s date.

          Output:

          NameSalesRunningTotal
          Alice100100
          Alice150250
          Alice50300
          Bob200200
          Bob300500


          AVG() OVER()

          Customer_idsales
          101200
          102150
          101300
          103400
          102250
          101100
          SELECT 
          customer_id, 
          sales, 
          AVG(sales) OVER (PARTITION BY customer_id) AS AvgSales 
          FROM orders;
          • The window function AVG(sales) OVER (PARTITION BY customer_id) calculates the average sales for each customer_id.
          • The average is computed over all rows with the same customer_id.
          • The result is shown on every row corresponding to that customer.
          customer_idsalesAvgSales
          101200200
          101300200
          101100200
          102150200
          102250200
          103400400


          COUNT(*) OVER

          departmentemployee_id
          Sales101
          Sales102
          HR201
          HR202
          HR203
          IT301
          SELECT
            department,
            employee_id,
            COUNT(*) OVER (PARTITION BY department) AS DeptCount
          FROM employees;
          • The window function COUNT(*) OVER (PARTITION BY department) counts the total number of employees in each department.
          • This count is repeated on every row for that department.
          • No ordering is required here because the count is the same for all rows in the partition.

          departmentemployee_idDeptCount
          Sales1012
          Sales1022
          HR2013
          HR2023
          HR2033
          IT3011


          2. Ranking Functions

          Assign ranks or row numbers within partitions.

          ROW_NUMBER() — Assigns unique row numbers

          employee_iddepartmentsalary
          101Sales5000
          102Sales7000
          103Sales6000
          201HR4500
          202HR4800
          301IT8000

          SELECT
            employee_id,
            salary,
            ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS RowNum
          FROM employees;

          employee_idsalaryRowNum
          10270001
          10360002
          10150003
          20248001
          20145002
          30180001

          The window function ROW_NUMBER() assigns a unique sequential number to each row within the partition defined by department.

          Rows are ordered by salary in descending order within each department.

          The highest salary in each department gets RowNum = 1, the next highest gets 2, and so on.

          RANK() — Assigns rank, with gaps for ties

          employee_iddepartmentsalary
          101Sales7000
          102Sales7000
          103Sales6000
          201HR4800
          202HR4500
          301IT8000
          302IT8000
          303IT7500
          SELECT
            employee_id,
            salary,
            RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS Rank
          FROM employees;

          The RANK() function assigns a rank to each employee within their department, ordering by salary descending.

          Employees with the same salary get the same rank.

          The next rank after a tie skips the appropriate number of positions (i.e., gaps in ranking).

          employee_idsalaryRank
          10170001
          10270001
          10360003
          20148001
          20245002
          30180001
          30280001
          30375003


          DENSE_RANK() — Like RANK but no gaps

          employee_iddepartmentsalary
          101Sales7000
          102Sales7000
          103Sales6000
          201HR4800
          202HR4500
          301IT8000
          302IT8000
          303IT7500
          SELECT
            employee_id,
            salary,
            DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS DenseRank
          FROM employees;
          • DENSE_RANK() assigns ranks within each department, ordered by salary descending.
          • Employees with the same salary get the same rank.
          • Unlike RANK()DENSE_RANK() does not skip ranks after ties. The next distinct value gets the next consecutive rank.
          employee_idsalaryDenseRank
          10170001
          10270001
          10360002
          20148001
          20245002
          30180001
          30280001
          30375002



          3. Value Navigation Functions

          Access values from other rows relative to the current row.

          LAG() — Value from previous row

          employee_iddepartmentsalary
          101Sales5000
          102Sales6000
          103Sales7000
          201HR4500
          202HR4800
          301IT8000
          SELECT
            employee_id,
            salary,
            LAG(salary, 1) OVER (PARTITION BY department ORDER BY salary) AS PrevSalary
          FROM employees;

          LAG(salary, 1) returns the salary value from the previous row within the same department, based on the ascending order of salary.

          For the first row in each department, there is no previous salary, so the result is NULL.

          employee_idsalaryPrevSalary
          1015000NULL
          10260005000
          10370006000
          2014500NULL
          20248004500
          3018000NULL



          LEAD() — Value from next row

          employee_iddepartmentsalary
          101Sales5000
          102Sales6000
          103Sales7000
          201HR4500
          202HR4800
          301IT8000

          SELECT
            employee_id,
            salary,
            LEAD(salary, 1) OVER (PARTITION BY department ORDER BY salary) AS NextSalary
          FROM employees;


          LEAD(salary, 1) returns the salary value from the next row within the same department, based on the ascending order of salary.

          For the last row in each department, there is no next salary, so the result is NULL


          5000
          6000
          10260007000
          1037000NULL
          20145004800
          2024800NULL
          3018000NULL


          LAST_VALUE() — Last value in the window frame

          employee_iddepartmentsalary
          101Sales5000
          102Sales7000
          103Sales6000
          201HR4500
          202HR4800
          301IT8000

          SELECT
            employee_id,
            salary,
            LAST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS MaxSalary
          FROM employees;
          
          • LAST_VALUE(salary) returns the last salary value in the ordered partition.
          • The window frame ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ensures the entire partition is considered (not just up to the current row).
          • When ordered by salary ascending, the last value is the highest salary in the department.
          employee_idsalaryMaxSalary
          10150007000
          10360007000
          10270007000
          20145004800
          20248004800
          30180008000