Power BI ETL: A Comprehensive Guide

Editor’s note: This article has been updated for accuracy and comprehensiveness in December 2023.

Is Power BI ETL even a thing? It may be a surprise if you know Power BI mainly as a data visualization tool. But part of business intelligence is data preparation. And Power BI is not disconnected from that. As you will see later, there’s more to Power BI than meets the eye.

So, are there data preparation techniques or methods in Power BI? Read on to find out. Here’s a quick overview of what we’re going to cover:

Table of contents

Important information

  1. Is Power BI an ETL Tool?
  2. ETL Processes and Capabilities in Power BI
  3. Skyvia ETL Tool for Power BI

Additional information

  1. Using Power Query for ETL
  2. ETL with Power BI Dataflows
  3. Conclusion

Is Power BI an ETL Tool?

Power BI is a business intelligence tool. It’s kind of an ETL tool because ETL is part of business intelligence.

To clarify this further, let’s revisit what ETL means. ETL is Extract, Transform, Load. It’s a form of a data pipeline to integrate various data sources. Without it, your analytical reports and dashboards look old because of outdated data. ETL helps update them so your reports are current.

Meanwhile, here’s how Microsoft defines Power BI: Power BI is a collection of software services, apps, and connectors that work together to turn your unrelated sources of data into coherent, visually immersive, and interactive insights. – Microsoft Docs

The “visually immersive and interactive insights” part is the output we all see. But note that it’s “a collection of services, apps, and connectors”. And you guessed it right. ETL is one of them.

So, Power BI has an ETL tool plus more. The ETL part makes the data coherent. Meanwhile, the data modeling and visualization part makes it a very appealing tool for data analysis. Moving forward, let’s discuss its capabilities and harness ETL using Power BI.

Discover best pricing

ETL Process and Capabilities in Power BI

When you connect to a data source in Microsoft Power BI, you pick the tables you need for your report or dashboard. It can be product sales or whatever. Underneath that is Power Query – the ETL engine. If you didn’t perform any transformations, you may not have seen it.

So, here are the Power BI ETL capabilities:

  • Able to connect to various data sources and extract data from them.
  • A visual editor that allows hundreds of data transformations, like sorting, grouping, column splitting, and more.
  • The M language. Any transformation you do visually is written in M. You can also edit the code using the Advanced Editor.
  • Perform AI insights through Azure Machine Learning and Cognitive Services.

Power Query is also available in other Microsoft products aside from Power BI. These include Excel, SQL Server Analysis Services, and more. So, you can reuse the Power Query skills you gain from one product to another.

But how does it work? The diagram below shows a typical ETL process. But in the case of Power Query, the destination is limited to where it is supported. In our case, this is Power BI.

high level data pipeline architecture

Let’s discuss ETL tools Power BI has to offer through examples.

Using Power Query for ETL

Let’s start with the Power BI ETL tool using Power Query. Whether you’re using the Power BI Desktop or Power BI Service, you’ll be using Power Query. Also, using Power Query Desktop is free. So, learning it from here is a good starting point.

To see this in action, let’s use Power Query ETL for a data warehouse in Power BI. For more details about the data structure of this data warehouse, please refer to this article.

STEP 1: Open Power BI and Connect to the Data Source

For this example, we’re using SQL Server Database as the data source.

Open Power BI and Connect to the Data Source

Enter the connection settings for the data source. You may be asked for a user ID and password upon clicking OK.

Enter the connection settings for MSSQL

STEP 2: Select the Tables You Need

Then, another window will appear to ask you to pick the tables you need. Select tables from the data source.

Select the Tables You Need from MSSQL

The screenshot shows the tables within the data warehouse. Mark checked all the dimensions and fact tables. Then, select Transform Data. At this point, we’re done with the extraction part of the ETL.

STEP 3: Transform the Data Based on Your Requirements

There are many data transformations you can do in Power Query. But in this example, let’s assume you need to group the data.

CREATE A NEW QUERY BY REFERENCING AN EXISTING TABLE

First, make a reference to the FactFireInsuranceSales table. See below how to do it.

Power BI ETL: make a reference to the FactFireInsuranceSales table

Right-click the FactFireInsuranceSales table. Then, select Reference. Another query called FactInsuranceSales(2) will appear. Then, rename this to MonthlyProductSales by selecting it and pressing F2.

GROUP THE DATA WITH AN AGGREGATION

Power BI ETL: group the data by aggregation

Referring to the screenshot, here’s how it is done:

  1. Click Group By.
  2. Then, click Advanced. We need to group using 2 columns. So, Basic won’t do.
  3. Click Add grouping. And when a dropdown list appears, select product_id.
  4. Then, define the aggregation. Enter Total Premium in New column name. Then, select the Operation. And last, select the premium column to sum.
  5. Finally, click OK to create the grouping.

The final result of the transformations we did is seen in the next screenshot.

Power BI ETL: The final result of the transformations

All the steps needed appear in the right pane under Applied Steps. You can select any of them to modify a step. Meanwhile, all the steps generated the M language code seen below.

let
    Source = FactFireInsuranceSales,
    #"Expanded dimDate" = Table.ExpandRecordColumn(Source, "dimDate", {"year_month_number"}, {"dimDate.year_month_number"}),
    #"Grouped Rows" = Table.Group(#"Expanded dimDate", {"dimDate.year_month_number", "product_id"}, {{"Total Premium", each List.Sum([premium]), type number}})
in
    #"Grouped Rows"

Once you’re done, click Close & Apply. And your data will be imported to Power BI Desktop as the destination of this ETL setup.

ETL with Power BI Dataflows

ETL with Power BI Dataflows will take your ETL process to the next level. Read on to know more.

What Are Power BI Dataflows?

Power BI Dataflows are collections of entities or tables created and managed in workspaces in the Power BI service. This is much like a table in your database. And then, you update them using refresh schedules. It works by using Power Query and loads the result to Azure Data Lake Storage Gen2.

Before we begin, you need a Power BI Service with a premium license.

5 Easy Ways to Do ETL with Power BI Dataflows

Creating Dataflows is similar to the one we did with Power BI Desktop because of ETL Power Query. But first, log in to your Power BI Service. Then, open your workspace and click New. Then, select Dataflow.

Power BI ETL: Creating Dataflows

Then, you will see 4 options to create a Dataflow as shown below on the screenshot.

Power BI ETL: 4 options to create a Dataflow

These 4 options allow you to use different methods in creating Dataflows. These are explained next.

#1. CREATE DATAFLOWS USING NEW TABLES

Refer to the above screenshot and click Add new tables to use this first method. This allows you to connect to a data source and select the tables you need.

Use cases: Any table or entity in a data source you wish to perform ETL. This can be sales records, customer information, and many more.

So, the next step is to connect to your data source. We’re going to use the same SQL Server database in this example. First, you select the SQL Server data source for your Dataflow.

Power BI ETL: connect to your data source in the Data Flow

Second, you configure connection settings for an SQL Server data source in Power BI Dataflow.

configure connection settings for a SQL Server data source in Power BI Dataflow

Then, you choose the tables you need. See an example screenshot below.

Power BI ETL: choose the tables you need​ in the Data Flow

Once you checkmark all the tables you need, click Transform data. You can now perform similar functions like grouping data. Finally, don’t forget to name and save your new Dataflow.

#2. CREATE DATAFLOWS USING LINKED TABLES

The second method involves creating a link to an existing table. This allows you to reuse an existing table.

Use cases: Any table you can reuse in different Dataflows. One example is reusing a Date dimension table in a data warehouse.

To start with this method, refer to four options in creating a new Dataflow, which we described above. Then, click Add link tables. You will be taken to a page that lists all the Dataflows and tables in a tree hierarchy. Select the table you wish to link. Then, click Transform Data. See the screenshot below and, finally, click Save and Close.

Power BI ETL: CREATE DATAFLOWS USING LINKED TABLES​

#3. CREATE DATAFLOWS USING COMPUTED TABLES

The third method involves referencing a table and performing in-storage computations. Since it’s just a reference to a table, calculations will not involve the external data source. The result is a new table called a computed table.

Tables and linked tables can be edited. And you can start creating computed tables from here. To do that, right-click a table and mark Enable load as checked. Then, create a Reference to it. Check this out in the screenshot below.

Power BI ETL: CREATE DATAFLOWS USING COMPUTED TABLES​

Then, rename the resulting computed table and do any transformations as needed.

#4. CREATE DATAFLOWS USING IMPORT/EXPORT

The fourth method is using the Import/Export model. This allows you to export an existing Dataflow to a JSON file. Then, use the Import model to create a Dataflow in another workspace using the exported file.

Power BI ETL: CREATE DATAFLOWS USING IMPORT/EXPORT​

Now, refer again to the four options for creating a new Dataflow, which we described above. To import the model we just exported, go to another workspace, and create a new Dataflow. And then choose the Import model. Then, browse for the JSON file you exported and click OK.

#5. CREATE DATAFLOWS USING CDM FOLDER

The fifth method of creating a Dataflow is through the use of a CDM folder. This allows you to reference a table that has been written by another application in the Common Data Model (CDM) format. You need to provide the complete path to the CDM format file stored in Azure Data Lake Storage Gen 2. Note that you also need to have permission to access that folder.

To do that, refer to the four options above. Then, create a new Dataflow and click Create and attach. Then, see what you need to input next and, finally, click Create and attach.

Power BI ETL: CREATE DATAFLOWS USING CDM FOLDER​

Consuming a Dataflow in Power BI

You can use the Dataflows you created in Power BI Desktop. See how this is done in the screenshot below.

Consuming a Dataflow in Power BI

Selecting Power BI dataflows requires you to sign in to Power BI Service. Then, you need to choose the tables you need in your Power BI report.

There’s so much more about Dataflows as an ETL tool in Power BI than we can discuss in this article. Visit these valuable links for more information:

Skyvia ETL Tool for Power BI

Skyvia is a cloud data platform suitable for multiple data-related operations: data warehousing, ETL and Reverse ETL, workflow automation, etc. You can easily load data into a data warehouse by using Skyvia, and it will then be analyzed with Power BI. What’s more, Skyvia is code-free, which means that any setup is done in the visual wizard through drag-and-drop operations.

Replication by Skyvia

Technical Deep Dive into Skyvia ETL

To replicate data in a data warehouse, use Skyvia Replication.

  1. Log into Skyvia or create an account.
  2. Go to +New->Replication in the top menu.
  3. Select the app of your choice as a source.
  4. Select a preferred data warehouse as a destination. Skyvia supports Amazon RedShift, Google BigQuery, Azure Synapse Analytics, Snowflake, and other popular data warehouse platforms.
  5. Click Add task to set up the data replication parameters.
Skyvia Replication scenario
  1. Click Schedule to plan the replication procedure on a regular basis.
  2. Click Create to save the replication scenario.

Power BI offers native integration with major data warehouses. So, you can transfer the recently uploaded data from the data warehouse to Power BI.

  1. In Power BI, select Get Data and type BigQuery in the search bar.
Get data in Power BI
  1. Select Google BigQuery on the list and click Connect.
  2. Enter your BigQuery credentials. See more detailed instructions on how to connect to BigQuery here.
  3. Once the connection is established, the Navigator window with available data on the server appears. Select the needed data and click Load to load it in Power BI Desktop.
Power BI menu

As we’ve just observed, Skyvia perfectly carries out data warehousing scenarios with ELT. It also has all the necessary functionality for ETL and Reverse ETL implementation:

  • Import case suitable for ETL and Reverse ETL.
  • Data Flow, Control Flow for complex data transformations across compound data pipelines.

NOTE: Skyvia’s Integration tool also offers functionality for data export into CSV files and synchronization between data sources.

Skyvia vs. Other ETL Tools

No doubt, Skyvia is a powerful solution for data warehousing, but it works with different data integration scenarios. However, it’s not unique in its genre in the market; there are pretty many similar solutions by competitors. So, let’s have a look at other popular data integration solutions, and see how they are different or similar to our platform. The best way to start is by providing a comparison table with several critical parameters.

SkyviaInformaticaStitchHevo
FocusETL, ELT, Reverse ETL, data sync, workflow automationETLETL, ELTETL, Reverse ETL, ELT
Skill levelNo-code and visual wizardNo-code, low-codeNo-code, low-codeNo-code, low-code
Sources180+90+140+150+
The ability for customers to add other sourcesWith REST APIWith Connector Toolkit DeveloperWith Singer, Import API, or Incoming WebhooksWith REST API
G2 customer satisfaction rate4.8/54.4/54.5/54.3/5
Compliance with safety certificatesHIPAA, GDPR, PCI DSS.
ISO 27001 and SOC 2 (by Azure)
SOC 1, SOC 2, SOC 3, HIPAA / HITECH, GDPR, Privacy ShieldSOC 2 Type II, HIPAA BAA, ISO/IEC 27001, GDPR, and CCPARSOC 2, HIPAA, GDPR, CCPA
PricingVolume- and feature-based.
Has a free plan.
Consumption-based.Volume- and feature-based.
Has a 14-day trial.
Consumption-based pricing based on Hevo events.
Full-feature 14-day free trial.
Has a free plan.

Compared to its major competitors, Skyvia supports the largest number of data integration scenarios and sources. It also has the highest rankings for usability on the G2 review platform.

In the sense of skill level requirements, all tools are very similar as each of them offers no-code and low-code interfaces. Each service also strongly complies with modern security protocols and certificates.

As for pricing, Skyvia, as well as Hevo, offers a free plan with limited monthly data volume and features. Pricing plans of all tools rely on a consumption-based model considering data size and feature set.

Start Skyvia Now

Case Studies

To switch from theoretical dogmas to practice, let’s show several real-life cases of Skyvia’s users.

Data warehousing

The first story depicts how the telco company TechneValue successfully realized the ELT scenario within its organizational environment. The company’s main objective was to discover the ‘white spots’ of their network as the quality of service has degraded during COVID-19. Gathering data from multiple devices and detecting the components of the network caused delays was crucial to keeping customer satisfaction at high levels.

TechneValue discovered Skyvia’s integration talents and features and decided to use its potential fully. Skyvia extracts data from modems and devices connected to the network and loads it into the data warehouse. Then, analysts use this data to discover the roots of network problems and make predictions for the future.

See more details about this case here.

ETL

Another example is the mechanism of ETL implementation in Exclaimer. The company got the challenging task of establishing Salesforce as a central source of truth serving the Sales, Finance, and Marketing departments.

With Skyvia, Exclaimer managed to transfer data from the internal billing and subscription systems into Salesforce with the previous data transformations adopted.

See more details about this case here.

Conclusion

You’ve learned a great deal today in using ETL in Power BI. You’ve also learned a more flexible option of using Skyvia ETL to power your reports in Power BI.

Do you have any questions and comments? Then, write them in the Comments section below.

Edwin Sanchez
Edwin Sanchez
Software developer and project manager with a total of 20+ years of software development. His most recent technology preferences include C#, SQL Server BI Stack, Power BI, and Sharepoint. Edwin combines his technical knowledge with his most recent content writing skills to help new breed of technology enthusiasts.

TOPICS

BY CONNECTORS

Skyvia podcast