How to Integrate MySQL and Google BigQuery: A Comprehensive Guide

If there were ever a classic among data integrations, MySQL to BigQuery would surely take the lead. And it’s no surprise: combining MySQL’s structured storage with BigQuery’s powerful analytics won’t give a business clairvoyance, but it does provide the data-driven edge needed to stay ahead of the competition.

In this guide, we’ll break down the key capabilities of MySQL and BigQuery and compare multiple integration methods, both manual and automated. Read on to find out which one is the best fit for your needs!

Table of Contents

  1. MySQL Explained: An Overview and Key Benefits
  2. Google BigQuery: Key Features and Benefits
  3. What Can You Migrate from MySQL to BigQuery?
  4. Methods to Connect MySQL to BigQuery
  5. Methods’ Comparison
  6. Challenges in Integrating MySQL with BigQuery
  7. Why Use Skyvia to Sync MySQL to BigQuery?
  8. Conclusion

MySQL Explained: An Overview and Key Benefits


MySQL is an open-source relational database management system owned by Oracle. It is fast, reliable, and free to use, which adds to its popularity. 

With SQL for data management and support for ACID-compliant transactions, MySQL is equally suited to transactional workloads and structured records. It’s scalable and easily integrated with other systems, which makes it a popular choice for web services, e-commerce platforms, SaaS solutions, and business applications.

Key benefits

  • High performance & speed: optimized for fast read and write operations, making it ideal for online transaction processing workloads.
  • Scalability: supports vertical and horizontal scaling to handle increasing traffic efficiently.
  • Ensures data integrity: ACID compliance makes MySQL second to none for financial systems where accuracy is critical.
  • Multi-platform & cloud compatibility: runs on Windows, Linux, macOS, and major cloud platforms like AWS, Google Cloud, and Azure.
  • Easy data warehouse (DWH) integration: MySQL is a common component of ETL (Extract, Transform, Load) pipelines, where it is used to store structured transactional data before moving it to a DWH for analytics.

Google BigQuery: Key Features and Benefits


BigQuery is Google’s contribution to the data warehouse family – a serverless, scalable, and fully managed cloud solution that is part of the Google Cloud Platform (GCP). 

Unlike traditional databases, BigQuery follows a decoupled architecture, where storage and computing are independent. This allows each resource to scale separately, resulting in better performance: since storage doesn’t impact query processing, BigQuery can handle massive datasets without bottlenecks.

As a cloud data warehouse, BigQuery shares most core features with its counterparts from AWS (Amazon Redshift) and Azure (Azure Synapse Analytics), including:

  • Handling massive datasets through distributed computing;
  • Support of SQL-based querying;
  • Cloud-native scalability;
  • Integration with data pipelines;
  • Enterprise-grade security. 

With that said, BigQuery boasts several unique features that set it apart from competitors:

  • Serverless architecture: It was the first major cloud DWH to be fully serverless. No infrastructure management – Google handles everything automatically, including patches, updates, and auto-scaling. The pricing is based on a pay-per-query model, eliminating the need to maintain always-on servers.   
  • Real-time querying: Thanks to built-in streaming ingestion (the BigQuery Storage Write API and legacy streaming inserts), BigQuery can ingest and analyze streaming data, providing instant insights and live dashboards. Unlike many competitors, real-time querying works out of the box – no additional services or external tools are required.
  • ML support: With BigQuery ML, Google’s native ML feature, users can train and deploy machine learning models directly inside BigQuery with simple SQL commands – no need for complex ML frameworks like TensorFlow or PyTorch.
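As a sketch of what streaming ingestion looks like in practice, the snippet below uses the google-cloud-bigquery client’s streaming insert method (`insert_rows_json`). The table name and row fields are illustrative assumptions, and running the commented call requires real GCP credentials.

```python
# Minimal sketch of streaming rows into BigQuery (table and fields are assumed).
# Requires: pip install google-cloud-bigquery, plus GCP credentials for real use.
from datetime import datetime, timezone


def to_row(event_id: int, amount: float, ts: datetime) -> dict:
    """Serialize one event into a JSON-compatible row for a streaming insert."""
    return {
        "event_id": event_id,
        "amount": amount,
        # BigQuery TIMESTAMP columns accept ISO 8601 strings.
        "created_at": ts.isoformat(),
    }


def stream_rows(table_id: str, rows: list[dict]) -> list:
    """Send rows via the streaming insert API; returns per-row errors (empty on success)."""
    from google.cloud import bigquery  # imported lazily so to_row stays usable without GCP
    client = bigquery.Client()
    return client.insert_rows_json(table_id, rows)


if __name__ == "__main__":
    rows = [to_row(1, 9.99, datetime.now(timezone.utc))]
    # errors = stream_rows("my-project.analytics.events", rows)  # needs credentials
    print(rows[0])
```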

What Can You Migrate from MySQL to BigQuery?

Not all MySQL components can be directly transferred to BigQuery. When planning such an integration, it is important to evaluate each component of your database and plan the migration strategy accordingly. Below is a breakdown:​

| MySQL component | Transferability | Comment |
| --- | --- | --- |
| Data rows | Yes | |
| Data types | Yes | Some data types require transformation; see Challenges in Integrating MySQL with BigQuery. |
| Table schemas | Yes | |
| Indexes | No | MySQL indexes are not applicable in BigQuery. |
| Stored procedures | No | Cannot be directly migrated; require re-implementation in BigQuery’s SQL syntax. |
| User-defined functions | No | Must be rewritten in BigQuery’s SQL syntax. |
| Triggers | No | BigQuery does not support triggers. |
| Views | Yes | |
| Transactional logic | No | Needs to be redesigned within BigQuery. |
| User permissions | No | Appropriate access controls should be set up in BigQuery. |

Methods to Connect MySQL to BigQuery

Connecting MySQL with BigQuery requires an ETL pipeline – an automated process that extracts data from MySQL, transforms it, and loads it into BigQuery for analytics. This integration is commonly practiced by businesses that seek to improve operational performance and gain advanced analytical insights. Its benefits include:

  • Optimized performance for large datasets. Relational DBs are built for fast lookups and transactions, but they may stall under heavy analytical queries. BigQuery, on the other hand, is optimized for scanning and aggregating massive datasets, making it a top choice for both real-time and batch analytics.
  • Consistency of transactional and analytical operations. Running analytics on a live transactional database inevitably impacts app performance for end users. Keeping analytical workloads separate in a DWH ensures that transactional operations remain uninterrupted, maintaining smooth business processes.

Below, we’ll explore the most effective methods for integrating MySQL with BigQuery.

Method 1: Connect MySQL to BigQuery Manually

Dump & Load

This is a traditional approach to data transfers. It involves exporting records from MySQL with mysqldump, uploading them to Google Cloud Storage (GCS), and then importing them into BigQuery. Although labor-intensive, this method is straightforward and reliable, best suited for one-time data transfers or infrequent updates. Below is a step-by-step guide:

  1. Set up the Google Cloud Platform (GCP) environment:
  • Create a Google Cloud account.
  • Enable the BigQuery service: within your GCP project, ensure that the BigQuery API is enabled.
  2. Install the Google Cloud SDK, the toolkit for interacting with GCP services.
  • After installation, run gcloud init to configure the SDK with your GCP account and set the default project.
  3. Export data from MySQL:
  • Use the mysqldump command with the --tab option, which exports a table as a schema file (.sql) plus a tab-separated data file (.txt). BigQuery cannot import a plain .sql dump, so export the data in a delimited format:
mysqldump -u [username] -p -h [hostname] --tab=[output_dir] [database] [table_name]

Replace the placeholders in [ ] brackets with your MySQL credentials, an output directory the MySQL server can write to, and the table you wish to export. You will be prompted for the password (avoid passing it on the command line). Alternatively, use a SELECT ... INTO OUTFILE query to produce a CSV file.

  4. Upload data to GCS:
  • In the GCP Console, create a new bucket in GCS to hold your files.
  • Upload your files to the newly created bucket using the gsutil command-line tool:
gsutil cp [table_name].txt gs://[your_bucket_name]/
  5. Create a new dataset in BigQuery:
  • Using the bq utility, run the command:
bq mk [dataset_name]
  6. Import data into BigQuery:
  • Define the table schema: BigQuery requires a schema definition for the data. Create a JSON file (schema.json) that describes the structure of your table. For example:
[
  {"name": "column1", "type": "STRING"},
  {"name": "column2", "type": "INTEGER"},
  {"name": "column3", "type": "FLOAT"}
]

Adjust the column names and types to match your records.

  • Use the bq tool to load the data from GCS into BigQuery:
bq load --source_format=CSV --field_delimiter=tab [dataset_name].[table_name] gs://[your_bucket_name]/[table_name].txt schema.json

Ensure that source_format and field_delimiter match the format of your file (for a comma-separated file, omit the field_delimiter flag).
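The manual steps above lend themselves to scripting. The Python sketch below only builds the export, upload, and load command lines and shells them out in order; every name in it (user, database, table, bucket, dataset) is a placeholder, and the MySQL password is expected in the MYSQL_PWD environment variable rather than on the command line.

```python
import subprocess


def build_commands(user: str, db: str, table: str, bucket: str, dataset: str) -> list[list[str]]:
    """Build the export/upload/load command lines from the manual steps above."""
    # mysqldump --tab writes /tmp/<table>.txt (tab-separated data) plus <table>.sql;
    # the password is read from the MYSQL_PWD environment variable.
    dump = ["mysqldump", "-u", user, "--tab=/tmp", db, table]
    upload = ["gsutil", "cp", f"/tmp/{table}.txt", f"gs://{bucket}/"]
    load = ["bq", "load", "--source_format=CSV", "--field_delimiter=tab",
            f"{dataset}.{table}", f"gs://{bucket}/{table}.txt", "schema.json"]
    return [dump, upload, load]


def run_pipeline(user: str, db: str, table: str, bucket: str, dataset: str) -> None:
    """Run the three steps in order, stopping at the first failure."""
    for cmd in build_commands(user, db, table, bucket, dataset):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    for cmd in build_commands("root", "shop", "orders", "my-bucket", "analytics"):
        print(" ".join(cmd))
```

Scheduling this script with cron turns the one-time dump-and-load method into a crude recurring pipeline, though the automated methods below remain more robust.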

Using Google Cloud Data Fusion

Cloud Data Fusion is a fully managed service for building cloud-native ETL pipelines. With a convenient GUI and drag-and-drop wizard for creating dataflows, it is beginner-friendly and accessible to all kinds of users. 

Steps to integrate MySQL with BigQuery:

  1. Set up a Cloud Data Fusion instance:
  • In the Google Cloud Console, create a new Data Fusion instance.
  • Grant the instance the necessary permissions to access both MySQL and BigQuery.
  2. Configure MySQL as a source:
  • Within Data Fusion, establish a connection to your MySQL database by providing the required credentials and connection details.
  • Test the connection to verify accessibility.
  3. Design the ETL pipeline:
  • Use the visual interface to create a new pipeline.
  • Add a MySQL source component to read data from the desired tables.
  • Incorporate any necessary transformation logic to align formats and schemas.
  • Add a BigQuery sink component to load the transformed data into the target dataset.
  4. Run and monitor:
  • Deploy the pipeline and initiate the transfer process.
  • Check the progress and set up alerts for any errors or issues.

Using Python Scripts for Data Replication

For those comfortable with programming, developing Python scripts is an excellent choice to customize the data replication process. This method allows for tailored transformations and scheduling to meet specific business needs.​

Steps to Integrate MySQL with BigQuery:

  1. Set up the development environment:
  • Ensure Python is installed on your system.
  • Install the necessary libraries, such as mysql-connector-python for MySQL interactions and google-cloud-bigquery for BigQuery operations.
  2. Extract records from MySQL:
  • Establish a connection to the MySQL database using the MySQL connector.
  • Execute SQL queries to retrieve data from the desired tables.
  3. Transform data using pandas or Spark:
  • Load the extracted records into a DataFrame for transformation.
  • Perform the necessary cleaning and formatting to ensure compatibility with BigQuery’s schema.
  4. Load data into BigQuery:
  • Authenticate with the Google Cloud API using service account credentials.
  • Use the BigQuery client library to load the transformed data into the target BigQuery table.
  5. Automate the process:
  • Schedule the Python script to run at desired intervals using tools like cron (Linux) or Task Scheduler (Windows).
  • Alternatively, deploy the script with Google Cloud Functions and trigger it via Cloud Scheduler for a serverless automation solution.
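Putting these steps together, here is a minimal sketch of such a replication script. Table, column, and flag-naming conventions are illustrative assumptions; the transform step uses plain Python (pandas would work equally well), and the extract and load functions import their drivers lazily so the transform logic stays usable without a database at hand.

```python
from datetime import date, datetime


def transform(rows: list[dict]) -> list[dict]:
    """Make MySQL rows BigQuery-friendly: dates -> ISO strings, is_* flags -> bool."""
    out = []
    for row in rows:
        clean = {}
        for key, value in row.items():
            if isinstance(value, (datetime, date)):
                clean[key] = value.isoformat()      # BigQuery accepts ISO 8601 strings
            elif key.startswith("is_"):             # assumed TINYINT(1) flag convention
                clean[key] = bool(value)
            else:
                clean[key] = value
        out.append(clean)
    return out


def extract(query: str, **mysql_kwargs) -> list[dict]:
    """Fetch rows from MySQL as dictionaries."""
    import mysql.connector                          # pip install mysql-connector-python
    conn = mysql.connector.connect(**mysql_kwargs)
    try:
        cur = conn.cursor(dictionary=True)
        cur.execute(query)
        return cur.fetchall()
    finally:
        conn.close()


def load(table_id: str, rows: list[dict]) -> None:
    """Load transformed rows into a BigQuery table via a load job."""
    from google.cloud import bigquery               # pip install google-cloud-bigquery
    client = bigquery.Client()
    client.load_table_from_json(rows, table_id).result()  # block until the job finishes


if __name__ == "__main__":
    sample = [{"id": 1, "is_paid": 1, "created": date(2024, 1, 1)}]
    print(transform(sample))
```

Wrapping the three calls in a `main()` and invoking it from cron or Cloud Scheduler completes the automation step described above.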

Method 2: Automate MySQL to BigQuery Integration with Skyvia

While the manual methods discussed above provide greater control over exported data, they can be resource-intensive and require technical expertise. In contrast, the next method – integrating MySQL and BigQuery with Skyvia – is fast, straightforward, and fully automated.

Step 1. Create Connections

First, let’s establish a connection to MySQL. 

  1. Sign in to Skyvia, or, if you don’t have an account yet, create it for free. 
  2. Click +Create New, select Connection, and choose MySQL.

Note: Skyvia supports two connection methods for MySQL: direct and via an agent. Use a direct connection if your MySQL server is accessible over the internet. If it is on a local computer or network, you’ll need to install the Skyvia Agent application to make a secure connection. In this case, we’ll create a direct connection.

  3. Provide your MySQL credentials: server address, port, user ID, password, and database name.
  4. Click Create Connection.

Connecting Skyvia to BigQuery involves the same steps as with MySQL:

  1. In Skyvia, go to Connections and click +Create New.
  2. On the Select Connector page, choose BigQuery.
  3. Select the preferred authentication method. 

Note: Skyvia supports two authentication methods for Google BigQuery: OAuth authentication (User Account) and Service Account authentication. When using OAuth authentication, you sign in with your Google account without sharing your credentials with Skyvia. Instead, Skyvia uses OAuth 2.0 to generate a secure token, which is bound to the connection. For Service Account authentication, you need to provide the Private Key JSON. 

  4. Enter your BigQuery credentials: the Project ID and Dataset ID to connect to. You can retrieve these in the Google API Console.
  5. Specify a Google Cloud Storage bucket (optional unless you plan to use bulk import and replication operations).
  6. Click Create Connection.

Step 2. Create an Import Integration 

Once both connections are ready, let’s implement a scenario of moving MySQL records to BigQuery. 

  1. In the top menu, click +Create New and select Import.
  2. Set the corresponding source type. For the MySQL to BigQuery import, choose Data Source database or cloud app. Select the MySQL connection as the Source and the BigQuery connection as the Target.
  3. Click Add new to create an import task.
  4. On the Source Definition page, select a table to import data from.

Note: Each table must be imported as a separate task. You can add as many tasks as needed.

  5. On the Target Definition page, select an object to import your data to and one of the available operations.
  6. Click Next step and proceed to mapping.

Step 3. Mapping

On the Mapping Definition page, map target columns (on the right) to source columns (on the left). To map a target column, click it in the table, select the corresponding source column, and choose the desired mapping type.


Note: If the source and target columns have different data types, use Expression Mapping to transform them and make them fit the target data structure. 

Click Save task and run the integration. 

Step 4. Monitor

You can run the import task on schedule or manually. To automate integration, click Schedule and configure the timing. You can track its progress in the Monitor or Logs tabs. If errors occur, click the run results to review the failed records.


When the run is completed, a copy of your MySQL data will be available in BigQuery.    

Methods’ Comparison Table

| Criteria | Dump & Load | Cloud Data Fusion | Python scripts | Skyvia |
| --- | --- | --- | --- | --- |
| Ease of use | Manual, requires SQL knowledge | User-friendly visual interface | Requires programming skills | No-code, user-friendly |
| Speed | Slow (manual process, batch mode) | Fast (optimized pipeline execution) | Varies (depends on script efficiency) | Fast (optimized cloud processing) |
| Cost | Low (only storage & query costs) | Higher (GCP service charges apply) | Medium (GCP usage + development time) | Affordable (subscription-based pricing) |
| Automation | No | Yes | Partial (requires custom scheduling) | Fully automated & scheduled |
| Best for | One-time transfers, small datasets | Enterprises needing scalable ETL | Custom transformations & developers | Automated, regular integrations |
| Pros | Simple, no extra tools required | Real-time integration support; fully managed; scalable; centralized management of data pipelines | Highly customizable; supports complex logic | No-code setup; automation; support of ETL, reverse ETL, and ELT pipelines |
| Cons | Time-intensive; lack of automation | Requires GCP knowledge; limited custom transformations; less control over infrastructure | Requires coding & maintenance | Requires a paid subscription for advanced options |

Challenges in Integrating MySQL with BigQuery

While MySQL to BigQuery integration offers clear advantages in terms of analytics and scalability, the process itself comes with certain challenges and limitations that users should be aware of.

  1. Schema differences & data type mismatches. MySQL and BigQuery handle data types differently, which can lead to schema mismatches during migrations. Some MySQL data types have no direct equivalent in BigQuery and require manual mapping.
| MySQL Data Type | BigQuery Equivalent | Potential Issue |
| --- | --- | --- |
| TINYINT(1) | BOOLEAN | Needs conversion |
| DATETIME | TIMESTAMP | Time zones may cause inconsistencies |
| DECIMAL | NUMERIC | Precision differences |
| TEXT/BLOB | STRING | Potential size limits |
| ENUM/SET | STRING | May require transformation |

  2. Performance bottlenecks with large datasets. Since MySQL is optimized for fast transactions, extracting large volumes of data from it can result in slowdowns and degraded performance.
  3. Real-time syncing complexity. MySQL does not natively support real-time streaming to BigQuery. Batch processing is common, but real-time analytics requires additional tools.
  4. ETL pipeline complexity & maintenance. ETL workflows require ongoing monitoring and updates. Schema changes in MySQL, such as adding or dropping columns or changing data types, can easily break them.
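The type mapping above can be encoded once and reused when generating a BigQuery schema from a MySQL table definition. The sketch below is a hedged illustration: the extra entries beyond the table (INT, VARCHAR, and so on) and the fallback to STRING are assumptions to review per migration, not fixed rules.

```python
# Translate MySQL column types into BigQuery types, following the mapping table above.
MYSQL_TO_BQ = {
    "tinyint(1)": "BOOLEAN",
    "datetime": "TIMESTAMP",
    "decimal": "NUMERIC",
    "text": "STRING",
    "blob": "STRING",
    "enum": "STRING",
    "set": "STRING",
    # Common additions beyond the table (assumed, verify for your data):
    "int": "INT64",
    "bigint": "INT64",
    "varchar": "STRING",
    "float": "FLOAT64",
    "double": "FLOAT64",
}


def bq_type(mysql_type: str) -> str:
    """Map a MySQL column type (e.g. 'VARCHAR(255)') to a BigQuery type."""
    t = mysql_type.lower()
    if t in MYSQL_TO_BQ:                    # exact matches first, e.g. tinyint(1)
        return MYSQL_TO_BQ[t]
    base = t.split("(")[0]                  # strip length/precision: varchar(255) -> varchar
    return MYSQL_TO_BQ.get(base, "STRING")  # assumed fallback; review unmapped types manually


def bq_schema(columns: dict) -> list[dict]:
    """Build a schema.json-style list from {column_name: mysql_type}."""
    return [{"name": name, "type": bq_type(t)} for name, t in columns.items()]
```

The output of `bq_schema` matches the schema.json format used in the dump-and-load method earlier in this guide.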

Why Use Skyvia to Sync MySQL to BigQuery?

Skyvia is a cloud data integration platform whose diverse functionality makes it a one-stop solution for all data operations. The benefits of using Skyvia include:

  • Automation & scalability. Skyvia automates the entire ETL pipeline, handling large-scale routine tasks so you don’t have to.
  • Data transformation. The platform’s advanced mapping options allow you to transform and restructure data before loading, resolving schema mismatches between MySQL and BigQuery.
  • Versatile integration options. It goes beyond basic import. With Skyvia, you can replicate MySQL objects into BigQuery, perform complex transformations using conditional logic, query data directly, and even set up automated backups.
  • Ease of use. The platform’s no-code interface makes data integration accessible to all users, from business analysts to engineers. And if you need more control, the Query product lets you either use a drag-and-drop visual designer or write complex SQL queries manually.

Watch this video tutorial to find out more about how to integrate MySQL with BigQuery using Skyvia.

Conclusion

Although both MySQL and BigQuery are used for storing data, they are optimized for different needs. Their integration creates a powerful solution – combining efficient transaction processing with scalable analytics. 

There are multiple integration options available, and choosing the right one depends on your technical expertise and business needs:

  • If you prefer coding and CLI tools, a manual process may be the best fit.
  • For those needing a no-code interface, Skyvia is the ideal choice. 

But whatever method you choose, integrating MySQL with BigQuery is a strategic move that allows you to maximize data potential and drive better business decisions.

FAQ for MySQL and Google BigQuery


How is BigQuery different from MySQL?

BigQuery is a serverless, highly scalable DWH optimized for big data analytics. Unlike MySQL, it supports real-time streaming, fast SQL-based querying on large datasets, built-in ML models, and pay-per-query pricing for cost efficiency.

Can I connect Google Cloud SQL for MySQL to BigQuery?

Yes! Google Cloud SQL (managed MySQL) can be directly integrated with BigQuery using Cloud SQL federated queries or ETL tools like Dataflow or Cloud Data Fusion, enabling real-time or batch data transfers.
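As a quick illustration of the federated-query route, the sketch below assembles a BigQuery EXTERNAL_QUERY statement; the connection ID and table names are placeholders, and the commented client call shows where the statement would actually run.

```python
def federated_query(connection_id: str, mysql_sql: str) -> str:
    """Build a BigQuery EXTERNAL_QUERY statement against a Cloud SQL (MySQL) connection."""
    escaped = mysql_sql.replace('"', '\\"')  # keep the inner query a valid string literal
    return f'SELECT * FROM EXTERNAL_QUERY("{connection_id}", "{escaped}");'


if __name__ == "__main__":
    sql = federated_query("my-project.us.mysql-conn", "SELECT id, total FROM orders")
    print(sql)
    # from google.cloud import bigquery
    # rows = bigquery.Client().query(sql).result()  # needs a configured Cloud SQL connection
```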

What is the easiest way to migrate data from MySQL to BigQuery?

The easiest way is to use a fully automated tool like Skyvia, Hevo, or Fivetran. These tools handle all steps of the ETL process – extraction, transformation, and loading – ensuring seamless transfer with minimal effort.

Which ETL tools support MySQL to BigQuery integration?

Popular ETL tools for MySQL to BigQuery include Skyvia, Hevo Data, Fivetran, Stitch, Google Cloud Data Fusion, and Airbyte. These tools automate data extraction, transformation, and loading, reducing manual effort.

Anastasiia Kulyk
With years of experience in technical writing, Anastasiia specializes in data integration, ETL/ELT processes, and cloud technologies. Passionate about making complex concepts accessible, she crafts insightful content that helps businesses navigate the ever-evolving world of data management.
