Editor’s note: This article has been updated for accuracy and comprehensiveness in December 2023.
Is Power BI ETL even a thing? It may come as a surprise if you know Power BI mainly as a data visualization tool. But data preparation is part of business intelligence, and Power BI is not disconnected from it. As you will see later, there’s more to Power BI than meets the eye.
So, are there data preparation techniques or methods in Power BI? Read on to find out.
Is Power BI an ETL Tool?
Power BI is a business intelligence tool. While it’s not a dedicated ETL tool, it includes ETL capabilities, because data preparation is part of business intelligence.
To clarify this further, let’s revisit what ETL means. ETL stands for Extract, Transform, Load. It’s a form of data pipeline used to integrate various data sources. Without it, your analytical reports and dashboards become stale because the data behind them is outdated. ETL keeps the data fresh so your reports stay current.
Meanwhile, here’s how Microsoft defines Power BI: Power BI is a collection of software services, apps, and connectors that work together to turn your unrelated sources of data into coherent, visually immersive, and interactive insights. – Microsoft Docs
The “visually immersive and interactive insights” part is the output we all see. But note that Power BI is also “a collection of software services, apps, and connectors”. And you guessed it: ETL is one of them.
So, Power BI has an ETL tool plus more. The ETL part makes the data coherent. Meanwhile, the data modeling and visualization part makes it a very appealing tool for data analysis. Moving forward, let’s discuss its capabilities and harness ETL using Power BI.
ETL Process and Capabilities in Power BI
When you connect to a data source in Microsoft Power BI, you pick the tables you need for your report or dashboard. These can contain product sales or any other data. Underneath is Power Query – the ETL engine. If you’ve never performed any transformations, you may not have noticed it.
So, here are the Power BI ETL capabilities:
- Able to connect to various data sources and extract data from them.
- A visual editor that allows hundreds of data transformations, like sorting, grouping, column splitting, and more.
- The M language. Any transformation you do visually is written in M. You can also edit the code using the Advanced Editor.
- AI insights through Azure Machine Learning and Cognitive Services.
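To illustrate the M language point above, here’s a minimal, hypothetical script of the kind Power Query generates behind the visual editor. The server, database, table, and column names are made up for this example:

```m
let
    // Extract: connect to a (hypothetical) SQL Server database
    Source = Sql.Database("myserver", "SalesDB"),
    Sales = Source{[Schema = "dbo", Item = "Sales"]}[Data],

    // Transform: keep only completed orders, then sort by date
    Completed = Table.SelectRows(Sales, each [status] = "Completed"),
    Sorted = Table.Sort(Completed, {{"order_date", Order.Ascending}})
in
    Sorted
```

Every step in this script corresponds to a step you could perform with clicks in the visual editor, and you can see or edit it in the Advanced Editor.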
Power Query is also available in other Microsoft products besides Power BI, including Excel and SQL Server Analysis Services. So, the Power Query skills you gain in one product carry over to the others.
But how does it work? The diagram below shows a typical ETL process. But in the case of Power Query, the destination is limited to where it is supported. In our case, this is Power BI.
Let’s discuss ETL tools Power BI has to offer through examples.
Using Power Query for ETL
Let’s start with the Power BI ETL tool using Power Query. Whether you’re using Power BI Desktop or the Power BI Service, you’ll be using Power Query. Also, Power BI Desktop is free, so it’s a good starting point for learning.
To see this in action, let’s use Power Query ETL for a data warehouse in Power BI. For more details about the data structure of this data warehouse, please refer to this article.
STEP 1: Open Power BI and Connect to the Data Source
For this example, we’re using SQL Server Database as the data source.
Enter the connection settings for the data source. You may be asked for a user ID and password upon clicking OK.
STEP 2: Select the Tables You Need
Then, another window will appear to ask you to pick the tables you need. Select tables from the data source.
The screenshot shows the tables within the data warehouse. Check all the dimension and fact tables. Then, select Transform Data. At this point, we’re done with the extraction part of the ETL.
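Behind the scenes, the extraction steps above generate M code along these lines — a sketch only; the server and database names are placeholders:

```m
let
    // Connect to the SQL Server data source entered in Step 1
    Source = Sql.Database("myserver", "FireInsuranceDW"),
    // Navigate to one of the tables checked in Step 2
    FactFireInsuranceSales = Source{[Schema = "dbo", Item = "FactFireInsuranceSales"]}[Data]
in
    FactFireInsuranceSales
```

Power Query creates one such query per table you check, which is why each table appears as a separate query in the editor’s left pane.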
STEP 3: Transform the Data Based on Your Requirements
There are many data transformations you can do in Power Query. But in this example, let’s assume you need to group the data.
CREATE A NEW QUERY BY REFERENCING AN EXISTING TABLE
First, make a reference to the FactFireInsuranceSales table. See below how to do it.
Right-click the FactFireInsuranceSales table. Then, select Reference. Another query called FactFireInsuranceSales (2) will appear. Rename it to MonthlyProductSales by selecting it and pressing F2.
GROUP THE DATA WITH AN AGGREGATION
Referring to the screenshot, here’s how it is done:
- Click Group By.
- Then, click Advanced. We need to group using 2 columns. So, Basic won’t do.
- Click Add grouping. And when a dropdown list appears, select product_id.
- Then, define the aggregation. Enter Total Premium in New column name. Then, select Sum as the Operation. And last, select the premium column to sum.
- Finally, click OK to create the grouping.
The final result of the transformations we did is seen in the next screenshot.
All the steps needed appear in the right pane under Applied Steps. You can select any of them to modify a step. Meanwhile, all the steps generated the M language code seen below.
```m
let
    Source = FactFireInsuranceSales,
    #"Expanded dimDate" = Table.ExpandRecordColumn(Source, "dimDate", {"year_month_number"}, {"dimDate.year_month_number"}),
    #"Grouped Rows" = Table.Group(#"Expanded dimDate", {"dimDate.year_month_number", "product_id"}, {{"Total Premium", each List.Sum([premium]), type number}})
in
    #"Grouped Rows"
```
Once you’re done, click Close & Apply. And your data will be imported to Power BI Desktop as the destination of this ETL setup.
ETL with Power BI Dataflows
ETL with Power BI Dataflows takes your ETL process to the next level. Read on to learn more.
What Are Power BI Dataflows?
Power BI Dataflows are collections of entities, or tables, created and managed in workspaces in the Power BI service – much like tables in a database. You update them using refresh schedules. Under the hood, a Dataflow runs Power Query and loads the result to Azure Data Lake Storage Gen2.
Before we begin, note that you need a Power BI Service account with a Premium license.
5 Easy Ways to Do ETL with Power BI Dataflows
Creating Dataflows is similar to what we did with Power BI Desktop, because both rely on Power Query. But first, log in to your Power BI Service. Then, open your workspace and click New. Then, select Dataflow.
Then, you will see 4 options to create a Dataflow, as shown in the screenshot below.
These 4 options allow you to use different methods in creating Dataflows. These are explained next.
#1. CREATE DATAFLOWS USING NEW TABLES
Refer to the above screenshot and click Add new tables to use this first method. This allows you to connect to a data source and select the tables you need.
Use cases: any table or entity in a data source on which you wish to perform ETL. This can be sales records, customer information, and many more.
So, the next step is to connect to your data source. We’re going to use the same SQL Server database in this example. First, you select the SQL Server data source for your Dataflow.
Second, you configure connection settings for an SQL Server data source in Power BI Dataflow.
Then, you choose the tables you need. See an example screenshot below.
Once you check all the tables you need, click Transform data. You can now perform similar transformations, like grouping data. Finally, don’t forget to name and save your new Dataflow.
#2. CREATE DATAFLOWS USING LINKED TABLES
The second method involves creating a link to an existing table. This allows you to reuse an existing table.
Use cases: Any table you can reuse in different Dataflows. One example is reusing a Date dimension table in a data warehouse.
To start with this method, refer to the four options for creating a new Dataflow, which we described above. Then, click Add linked tables. You will be taken to a page that lists all the Dataflows and tables in a tree hierarchy. Select the table you wish to link. Then, click Transform Data. See the screenshot below and, finally, click Save and Close.
#3. CREATE DATAFLOWS USING COMPUTED TABLES
The third method involves referencing a table and performing in-storage computations. Since it’s just a reference to a table, the calculations don’t involve the external data source. The result is a new table called a computed table.
Tables and linked tables can be edited, and you can start creating computed tables from there. To do that, right-click a table and check Enable load. Then, create a Reference to it. Check this out in the screenshot below.
Then, rename the resulting computed table and do any transformations as needed.
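In M terms, a computed table’s query is nothing more than a reference to another table plus whatever transformations you add. Here’s a minimal sketch, reusing the fact table and columns from the earlier example:

```m
let
    // Reference the source table already loaded in this Dataflow
    Source = FactFireInsuranceSales,
    // An in-storage computation: total premium per product
    Grouped = Table.Group(Source, {"product_id"}, {{"Total Premium", each List.Sum([premium]), type number}})
in
    Grouped
```

Because `Source` points at a table already stored in the Dataflow, this aggregation runs against the stored data rather than hitting the external SQL Server again.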
#4. CREATE DATAFLOWS USING IMPORT/EXPORT
The fourth method is using the Import/Export model. This allows you to export an existing Dataflow to a JSON file. Then, use the Import model to create a Dataflow in another workspace using the exported file.
Now, refer again to the four options for creating a new Dataflow, which we described above. First, export an existing Dataflow using the Export model. Then, to import it, go to another workspace, create a new Dataflow, and choose Import model. Finally, browse for the JSON file you exported and click OK.
#5. CREATE DATAFLOWS USING CDM FOLDER
The fifth method of creating a Dataflow is through the use of a CDM folder. This allows you to reference a table that has been written by another application in the Common Data Model (CDM) format. You need to provide the complete path to the CDM format file stored in Azure Data Lake Storage Gen2. Note that you also need permission to access that folder.
To do that, refer to the four options above. Then, create a new Dataflow and click Create and attach. Then, see what you need to input next and, finally, click Create and attach.
Consuming a Dataflow in Power BI
You can use the Dataflows you created in Power BI Desktop. See how this is done in the screenshot below.
Selecting Power BI Dataflows requires you to sign in to the Power BI Service. Then, choose the tables you need in your Power BI report.
There’s so much more about Dataflows as an ETL tool in Power BI than we can discuss in this article. Visit these valuable links for more information:
- Benefits of using Power BI Dataflows and their use-cases.
- Configuring Incremental Refresh for faster loading.
- Limitations and considerations in using Power BI Dataflows.
- Power BI ETL best practices using Dataflows.
Skyvia ETL Tool for Power BI
Skyvia is a cloud data platform suitable for multiple data-related operations: data warehousing, ETL and Reverse ETL, workflow automation, etc. With Skyvia, you can easily load data into a data warehouse and then analyze it with Power BI. What’s more, Skyvia is code-free: all setup is done in a visual wizard with drag-and-drop operations.
Technical Deep Dive into Skyvia ETL
To replicate data in a data warehouse, use Skyvia Replication.
- Log into Skyvia or create an account.
- Go to +New->Replication in the top menu.
- Select the app of your choice as a source.
- Select a preferred data warehouse as a destination. Skyvia supports Amazon RedShift, Google BigQuery, Azure Synapse Analytics, Snowflake, and other popular data warehouse platforms.
- Click Add task to set up the data replication parameters.
- Click Schedule to plan the replication procedure on a regular basis.
- Click Create to save the replication scenario.
Power BI offers native integration with major data warehouses. So, you can transfer the recently uploaded data from the data warehouse to Power BI.
- In Power BI, select Get Data and type BigQuery in the search bar.
- Select Google BigQuery on the list and click Connect.
- Enter your BigQuery credentials. See more detailed instructions on how to connect to BigQuery here.
- Once the connection is established, the Navigator window with available data on the server appears. Select the needed data and click Load to load it in Power BI Desktop.
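For reference, the BigQuery connection steps above correspond to M code along these lines. This is a sketch only; the project, dataset, and table names are placeholders:

```m
let
    // Connect to Google BigQuery (prompts for credentials on first use)
    Source = GoogleBigQuery.Database(),
    // Navigate to a (hypothetical) project, dataset, and table
    Project = Source{[Name = "my-project"]}[Data],
    Dataset = Project{[Name = "sales_dataset"]}[Data],
    SalesTable = Dataset{[Name = "sales"]}[Data]
in
    SalesTable
```

As with the SQL Server example earlier, the Navigator window writes this navigation code for you; you only see it if you open the Advanced Editor.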
As we’ve just seen, Skyvia handles data warehousing scenarios with ELT well. It also has all the necessary functionality for ETL and Reverse ETL implementation:
- Import case suitable for ETL and Reverse ETL.
- Data Flow, Control Flow for complex data transformations across compound data pipelines.
NOTE: Skyvia’s Integration tool also offers functionality for data export into CSV files and synchronization between data sources.
Skyvia vs. Other ETL Tools
No doubt, Skyvia is a powerful solution for data warehousing, and it also handles other data integration scenarios. However, it’s not unique on the market; there are many similar solutions from competitors. So, let’s look at other popular data integration solutions and see how they differ from or resemble our platform. The best way to start is with a comparison table of several critical parameters.
| | Skyvia | Informatica | Stitch | Hevo |
|---|---|---|---|---|
| Focus | ETL, ELT, Reverse ETL, data sync, workflow automation | ETL | ETL, ELT | ETL, Reverse ETL, ELT |
| Skill level | No-code and visual wizard | No-code, low-code | No-code, low-code | No-code, low-code |
| Sources | 180+ | 90+ | 140+ | 150+ |
| The ability for customers to add other sources | With REST API | With Connector Toolkit Developer | With Singer, Import API, or Incoming Webhooks | With REST API |
| G2 customer satisfaction rate | 4.8/5 | 4.4/5 | 4.5/5 | 4.3/5 |
| Compliance with safety certificates | HIPAA, GDPR, PCI DSS, ISO 27001 and SOC 2 (by Azure) | SOC 1, SOC 2, SOC 3, HIPAA/HITECH, GDPR, Privacy Shield | SOC 2 Type II, HIPAA BAA, ISO/IEC 27001, GDPR, and CCPA | SOC 2, HIPAA, GDPR, CCPA |
| Pricing | Volume- and feature-based. Has a free plan. | Consumption-based. | Volume- and feature-based. Has a 14-day trial. | Consumption-based pricing based on Hevo events. Full-feature 14-day free trial. Has a free plan. |
Compared to its major competitors, Skyvia supports the largest number of data integration scenarios and sources. It also has the highest rankings for usability on the G2 review platform.
In the sense of skill level requirements, all tools are very similar as each of them offers no-code and low-code interfaces. Each service also strongly complies with modern security protocols and certificates.
As for pricing, Skyvia, as well as Hevo, offers a free plan with limited monthly data volume and features. The pricing plans of all four tools depend on data volume and the feature set used, whether volume-based or consumption-based.
Case Studies
To move from theory to practice, let’s look at several real-life cases from Skyvia’s users.
Data warehousing
The first story shows how the telco company TechneValue successfully implemented the ELT scenario within its organization. The company’s main objective was to discover the ‘white spots’ of its network, as the quality of service had degraded during COVID-19. Gathering data from multiple devices and detecting which network components caused delays was crucial to keeping customer satisfaction high.
TechneValue discovered Skyvia’s integration capabilities and decided to use its potential fully. Skyvia extracts data from modems and devices connected to the network and loads it into the data warehouse. Then, analysts use this data to discover the root causes of network problems and make predictions for the future.
See more details about this case here.
ETL
Another example is the ETL implementation at Exclaimer. The company faced the challenging task of establishing Salesforce as a central source of truth serving the Sales, Finance, and Marketing departments.
With Skyvia, Exclaimer managed to transfer data from its internal billing and subscription systems into Salesforce, applying the necessary data transformations along the way.
See more details about this case here.
Conclusion
You’ve learned a great deal today about using ETL in Power BI. You’ve also learned about a more flexible option – using Skyvia ETL to power your reports in Power BI.
Do you have any questions and comments? Then, write them in the Comments section below.