Microsoft Power BI is a business intelligence tool that offers both cloud-based and desktop versions for various kinds of reporting and dashboard creation. Microsoft Power BI boosts many features like automatic data source updates.
Data source configuration in Power BI usually requires the entire data to be wiped out to update it will new ones. This method usually takes a tremendous amount of time if you are working with a big dataset, which is why the use of incremental refresh has come into play which provides you will the ability to refresh your data incrementally.
To follow along with this article, you should have a basic understanding of Power BI Desktop, Power BI Service, and how to create a workspace in Power BI services.
Power BI incremental refresh supports both the Pro, Premium Per user licenses, Premium per capacity, and Power BI Embedded. The hybrid tables, however, need a PPU or Power BI Premium capacity license. Only Power BI Premium, Premium per user, and Power BI Embedded datasets are supported for obtaining real-time data when using these different licenses.
Power BI services also support incremental refresh by using Dataflows and Datamarts.
Data Source Supported
Incremental refresh works best with relational data that supports “query folding” such as SQL databases and other structured data sources/warehouses.
Date Column – The data source must have a field or column containing the “date datatype.” The incremental refresh policies are added to the column in form of filters.
Power BI Gateway
The Power BI Gateway is software used to access data located in an on-premises network. For the on-premises data source, the gateway serves as a gatekeeper. The need for Gateway arises if the data used for incremental refresh is from an on-premises data source.
What is Incremental Refresh
Microsoft Power BI provides users with two options for loading data, either the full load or incremental refresh. The full load retrieves the entire dataset and erases any previous data each time this action is performed, thereby consuming a lot of resources.
The Power BI incremental refresh feature allows users to refresh data incrementally rather than conducting a full data refresh every time. This can speed up data refreshment considerably and cut down on the volume of information that must be transmitted between Power BI and the data source.
Partitioning in Incremental Refresh
This process splits the data into multiple partitions, thereby speeding up the data refreshes, and cutting down on the volume of information that must be transmitted between Power BI and the data source.
Benefits of Power BI Incremental Refresh
- Performance Improvement: Incremental refresh can drastically cut down on the quantity of data that needs to be exchanged between Power BI and the data source, which can speed up performance and cut down on the time it takes to update your data.
- Reduce Data Storage: You can minimize the amount of data stored in your Power BI data model, which can save you money on storage by just updating the data that has changed.
- User Flexibility: This feature allows the use of schedule refresh with can be widely used by most organizations.
Limitation of Power BI Incremental Refresh
The major limitation that can be found in the Power BI incremental refresh is reports published to the Power BI services cannot be downloaded because the dataset is now partitioned.
Setting up Power BI Incremental Refresh
The following steps listed below will guide us in setting incremental refresh on Power BI Desktop.
Step 1: Get Data Source
Incremental refresh works best with data that support “query folding,” for this, we will be connecting to the SQL Server Management Studio.
For this article, we will be using the sales data from the SQL Server Management Studio.
On your Power BI Desktop, click on Get data and select SQL Server. Provide the appropriate credentials like Server name and database (Optional) then connect.
In your Power Query Editor ensure all datatype is in the right format, then “Close and Apply”
Step 2: Create a Simple Chart
In your Power BI design view, create a cluster column chart. This will be needed to indicate the trend and get the minimum and maximum dates from the date column.
Step 3: Set Parameter Range
To use the incremental refresh, you need to create a Power Query parameter. The parameter will contain two values, one for the minimum date which is the RangeStart, and the other for the maximum date which is the RangeEnd. Click on Transform Data on your Power BI Desktop, this should take you to the Power Query Editor environment.
In your Power Query Editor, click on “Manage Parameter” and then select “New Parameter.”
Now, we need to create two parameters of keywords RangeStart and RangeEnd, the keywords are case-sensitive and must follow this format. Also, the datatype to be used for both keywords in the parameter must be a Date/Time or Timestamp datatype by default the 12:00:00 AM appears.
Step 4: Filter Data Based on Parameter
Now that the parameter is set up, next we need to filter the date column on which we want to base the parameter on. Click on the filter button, then select Date/Time filters, and navigate to Custom Filters. This should open another window for you.
In the new window, you are expected to fill in the following information based on the parameter we just created.
Step 5: Incremental Setup Policy
On the Power Query Editor Home tab, click on “Close & Apply,” this will close all transformations done and take you back to the Power BI Desktop window.
In your Power BI desktop navigate to the data pane and right-click on the table you want to apply the incremental policy on.
In the new window, you are expected to ON the incremental refresh setting. In the policy settings, you are to provide the number of rows of data you want to store which will be your archive. The incremental refresh interval will bring the data that is present in the last period be it days, months, quarters, or years.
Let me explain further, I have decided to keep data for the past 4 years and refresh it incrementally once a month. This indicates that only the most recent month’s worth of data will be accessed gradually whenever the data model is updated via the Power BI service.
Choose Optional Settings:
While setting the parameter you are also provided with multiple options for configuration.
- DirectQuery (Premium Only): Use to get the most up-to-date information, including the most recent data changes made at the data source during the previous refresh period in real time.
- Refresh Completed Days: Use to only update full days, select rows for the entire day are not refreshed if the refresh function determines that day is not complete. When you choose to Get the most recent data in real-time with DirectQuery (Premium only), this option is immediately turned on.
- Detect Data Changes: This is used to define a date/time column that will only refresh on days when the data has changed. At the data source, there must be a date/time column, typically for auditing purposes.
Step 6: Publish Report to Power BI Services
This is the last step, at this point we need to publish the report to a workspace in the Power BI service preferably a “premium per user workspace”. To create a Power BI workspace, check out our previous blog on how to create Workspaces in Power BI service.
Since I am using SQL Server on-premises in this example, I have also configured the Data Gateway so that the Power BI service can update the dataset.
Validate Incremental Refresh
In the workspace, you published your report click on “Settings” . This will open a new tab, select the Premium tab, and change the radio button from Pro to Premium, which will automatically provide you with the Workspace Connection URL.
Click on “Refresh now” to create the necessary partitions.
Note: Your connection Gateway at this moment should be running, due to the fact the data source used is gotten from an on-premises source.
Connect to Analysis Services
In your SQL Server Management Studio, click on the Connect dropdown at the top left corner, and select “Analysis Services.”
In the new window you are expected to provide the following:
- Server name: This is the “Workspace connection” link, gotten from the Power BI services.
- Authentication: Change it from the default Windows authentication to Azure Active Directory – Universal with MFA.
- Username: This should be the email address used for your workspace.
Now with this connection, you can check for partition of your dataset.
In this article, we have learned all about Microsoft Power BI incremental refresh. Benefits of using incremental refresh and how to set it up in your Power BI desktop. We also talked about publishing your Report on incremental policies for the Power BI service and how to connect your workspace to the Analysis Services.