Skio delivers your data into a BigQuery dataset inside Skio's GCP project (largedata-380204) with read-only access. This is intentional — you don't pay storage costs, and Skio maintains the source of truth. If you want to join Skio data with your own sources (Shopify, Klaviyo, ad platforms, etc.), build custom dashboards, or run unrestricted SQL, you'll need to copy the dataset into a project you own.
There are two supported ways to do this:
| Method | Best for | Min. refresh | Effort |
|---|---|---|---|
| Method A: BigQuery Data Transfer Service (recommended) | Full dataset replication, auto-picks up new tables, minimal maintenance | Every 24 hours (12 hours for some same-region setups) | ~5 minutes, no SQL |
| Method B: Scheduled Query | Refreshes more frequently than every 12 hours, specific tables only, or transformations on copy | Every 15 minutes | ~15 minutes, requires SQL |
Skio refreshes your dataset every ~4 hours. If you need your copy to stay close to that cadence, use Method B. If daily is fine, Method A is simpler.
Before you start
You should have received the following from your Skio point of contact. If not, reach out before proceeding.
- Source project ID: `largedata-380204`
- Source dataset name: `skio_<your_merchant_slug>` (e.g., `skio_acme_brand`)
- Read access confirmed for either a user/group email you own or a service account in your GCP project
Note: BigQuery doesn't show shared datasets in the Explorer until you star the project. See the Troubleshooting section below for the steps.
Source dataset region
Skio's source dataset is in us-west1. Your destination dataset region affects cost and schedule options:
- Same-region copy (destination also in `us-west1`) — cheapest, no egress fees, supports the fastest schedules
- Cross-region copy (e.g., `US` multi-region, `EU`) — fully supported, but Google charges network egress and some schedule minimums apply

Unless you have a strong reason to choose another region, create your destination dataset in `us-west1`.
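If you want to confirm the source region yourself, the `bq` CLI can print a dataset's metadata, including its `location` field. This assumes you have the Google Cloud SDK installed and are authenticated as the whitelisted identity; `skio_acme_brand` below is the example merchant slug from above, so substitute your own dataset name:

```shell
# Show metadata for the shared Skio dataset, including its location.
# skio_acme_brand is the example slug; use your actual dataset name.
bq show --format=prettyjson largedata-380204:skio_acme_brand
```

The `location` field in the JSON output should read `us-west1`.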
Understanding permissions
BigQuery's cross-project permission model has two sides. Both must be in place before a copy job will succeed.
| What | Where it's granted | Who grants it |
|---|---|---|
| BigQuery Data Viewer on the Skio dataset | Skio's project (`largedata-380204`) | Skio — already set up when you were onboarded |
| BigQuery Job User on your GCP project | Your GCP project | You |
The copy job runs from your project, so the identity running it needs BigQuery Job User in your project. It reads from Skio's project, where Skio has already granted Data Viewer access.
How to grant BigQuery Job User in your project
In your GCP console, go to IAM & Admin > IAM.
Click Grant Access.
Add the user email or service account that will run the transfer.
Assign the BigQuery Job User role (or BigQuery User, which includes Job User).
Click Save.
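The same grant can be made from the command line. A minimal sketch with `gcloud`, where `my-project` and the service account name `skio-replication` are placeholders for your own values:

```shell
# Grant BigQuery Job User to the identity that will run the copy job.
# "my-project" and "skio-replication" are placeholder names.
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:skio-replication@my-project.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"
```

For a user account instead of a service account, use `--member="user:you@example.com"`.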
If you're using a service account, make sure it's the same one Skio has whitelisted on the dataset. If not, send your Skio contact the service account email (<name>@<project-id>.iam.gserviceaccount.com) and they'll add it.
Method A: BigQuery Data Transfer Service
Use this method if daily refresh is acceptable, you want a one-time setup that automatically picks up any new tables Skio adds, and you don't want to write SQL.
Step 1: Create your destination dataset
Open BigQuery Studio in your GCP console.
Click the three-dot menu next to your project name and select Create dataset.
Fill in the following:
- Dataset ID — e.g., `skio_replica` or `skio_data`
- Location type → Region → `us-west1`
Click Create dataset.
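If you prefer the CLI, the same dataset can be created with `bq mk`. `my-project` is a placeholder for your GCP project ID:

```shell
# Create the destination dataset in the same region as Skio's source.
# "my-project" is a placeholder for your GCP project ID.
bq --location=us-west1 mk --dataset \
  --description="Replica of the Skio BigQuery dataset" \
  my-project:skio_replica
```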
Step 2: Create the data transfer
In the BigQuery left sidebar, click Data transfers.
Click + Create Transfer.
Configure the source:
- Source type: `Dataset Copy`
- Display name: `Skio Replication` (or any descriptive label)
- Repeat frequency: `Daily` — or `Custom` → `every 12 hours` if your setup supports it
- Start time: Any time works. If you want the freshest daily snapshot, schedule it for early morning in your timezone.
Configure the data source:
- Source project: `largedata-380204`
- Source dataset: `skio_<your_merchant_slug>`
- Destination dataset: the dataset you created in Step 1
Check Overwrite destination tables — this keeps your copy in sync rather than appending duplicate rows.
Click Save.
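The clicks above can also be scripted. A sketch using `bq mk --transfer_config` with the Data Transfer Service's dataset-copy data source (`cross_region_copy`, which despite the name also handles same-region copies); `my-project` and `skio_acme_brand` are placeholders:

```shell
# Create a daily dataset-copy transfer from Skio's project into yours.
# Placeholders: my-project, skio_acme_brand (use your own values).
bq mk --transfer_config \
  --project_id=my-project \
  --data_source=cross_region_copy \
  --target_dataset=skio_replica \
  --display_name="Skio Replication" \
  --schedule="every 24 hours" \
  --params='{
    "source_project_id": "largedata-380204",
    "source_dataset_id": "skio_acme_brand",
    "overwrite_destination_table": "true"
  }'
```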
Step 3: Verify
The first run kicks off immediately (or on your schedule). Go to Data transfers, click your transfer, and check the run history. Once complete, confirm it worked:
SELECT COUNT(*) FROM `<your-project>.skio_replica.Subscription`;
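To spot-check more than one table at once, BigQuery's `__TABLES__` metadata view lists row counts and modification times for every table in a dataset. `my-project` is a placeholder:

```shell
# List every table in the replica with its row count and last-modified time.
# "my-project" is a placeholder for your GCP project ID.
bq query --use_legacy_sql=false \
  'SELECT table_id, row_count,
          TIMESTAMP_MILLIS(last_modified_time) AS last_modified
   FROM `my-project.skio_replica.__TABLES__`
   ORDER BY table_id'
```

Compare the row counts against the source dataset if anything looks off.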
Limitations
Minimum schedule is 24 hours for most configurations. Some same-region setups support 12-hour schedules, but BigQuery may block more frequent options. Use Method B if you need more frequent refreshes.
Schema changes on existing tables may require you to delete and recreate the destination table the first time. Checking Overwrite destination tables handles most cases automatically.
Method B: Scheduled Query
Use this method if you need refreshes more frequently than every 12 hours, you only want specific tables copied, or you want to transform the data (filter rows, rename columns, join sources) on the way in.
Step 1: Create your destination dataset
Same as Method A, Step 1. Create a dataset in your project — this guide uses skio_replica in us-west1.
Step 2: Write the query
In BigQuery Studio, open a new SQL editor tab and paste a statement like the one below. Replace skio_<your_merchant_slug> and the table list with your actual values.
-- Replicate Skio tables into your own dataset.
-- CREATE OR REPLACE rebuilds each table on every run,
-- keeping your copy in sync with Skio's source.
CREATE OR REPLACE TABLE `<your-project>.skio_replica.Site` AS (
SELECT * FROM `largedata-380204.skio_<your_merchant_slug>.Site`
);
CREATE OR REPLACE TABLE `<your-project>.skio_replica.Subscription` AS (
SELECT * FROM `largedata-380204.skio_<your_merchant_slug>.Subscription`
);
CREATE OR REPLACE TABLE `<your-project>.skio_replica.SubscriptionLineItem` AS (
SELECT * FROM `largedata-380204.skio_<your_merchant_slug>.SubscriptionLineItem`
);
-- Add one CREATE OR REPLACE block per table you want to copy.
-- Ask Skio for the full table list if needed.
Run the query manually once to confirm it works before scheduling it. If you hit a permission error, re-check the permissions section above.
Step 3: Schedule the query
With your query in the editor, click Schedule (top right).
Click Create new scheduled query.
Fill in:
- Name: `Skio Replication`
- Repeat frequency: `Hours` → every `4` hours (matches Skio's refresh cadence). You can go as low as every 15 minutes, but there's no data benefit below 4 hours.
- Destination for query results: leave blank — the `CREATE OR REPLACE TABLE` statements handle writes explicitly.
- Advanced options → Service account: (recommended) use a service account in your project that has BigQuery Data Viewer on the Skio dataset and BigQuery Job User in your project. This keeps the schedule working even if your user account changes.
Click Save.
Step 4: Verify
Go to Scheduled queries in the left sidebar, click your query, and check the run history. Force a manual run to confirm it works end-to-end.
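Scheduled queries are managed by the Data Transfer Service under the hood, so they can also be listed from the CLI. `my-project` is a placeholder; use the location that matches your dataset:

```shell
# List scheduled queries (they appear as transfer configs).
# "my-project" is a placeholder for your GCP project ID.
bq ls --transfer_config \
  --transfer_location=us-west1 \
  --project_id=my-project
```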
Limitations
- You have to maintain the SQL. If Skio adds a new table and you want it copied, you'll need to add a new `CREATE OR REPLACE TABLE` block. Ask Skio for the current table list periodically, or use Method A if you want automatic table discovery.
- Full table rewrites are fine at typical Skio data volumes, but can increase compute costs on very large tables. If you're watching BigQuery costs, your Skio contact can advise on incremental merge patterns using `_PARTITIONTIME` or `updatedAt`.
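As a purely illustrative sketch of the incremental pattern (not Skio's official recommendation), a delete-and-reinsert refresh of recently changed rows might look like the multi-statement query below. It assumes the table has `id` and `updatedAt` columns and that rows are never hard-deleted at the source; confirm the real column names and semantics with your Skio contact before using anything like this. `my-project` and `skio_acme_brand` are placeholders:

```shell
# Hypothetical incremental refresh: re-copy only rows touched in the
# last day. Assumes "id" and "updatedAt" columns exist (verify with
# Skio) and that source rows are never deleted outright.
bq query --use_legacy_sql=false '
DELETE FROM `my-project.skio_replica.Subscription`
WHERE id IN (
  SELECT id FROM `largedata-380204.skio_acme_brand.Subscription`
  WHERE updatedAt >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
);
INSERT INTO `my-project.skio_replica.Subscription`
SELECT * FROM `largedata-380204.skio_acme_brand.Subscription`
WHERE updatedAt >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY);'
```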
Which method should I use?
You want a copy of all the data, refreshed daily, with minimal setup → Method A
You need data refreshed more frequently than every 12 hours → Method B
You want to filter or transform data on the way in → Method B
You want new tables Skio adds to show up automatically → Method A
When in doubt, start with Method A. You can always switch to Method B later.
Troubleshooting
"Access Denied: Dataset largedata-380204:skio_xxx. Permission bigquery.datasets.get denied"
The identity running the job doesn't have Data Viewer on Skio's dataset. Confirm with Skio that the exact email or service account address is on the whitelist. If you added it recently, permissions sync can take up to a few hours — retry before escalating.
"User does not have permission to query table largedata-380204:skio_xxx.YYY"
Usually the same root cause as above. Also confirm whether Skio has you listed as a user vs. a group — Google treats these as distinct principal types, and a mismatch will cause the permission to silently fail.
"Permission bigquery.jobs.create denied in project"
The identity running the job is missing BigQuery Job User on your project. See the permissions section above.
The dataset doesn't appear in my BigQuery Explorer
Log in as the authorized user and go to console.cloud.google.com/bigquery?project=largedata-380204. Click the star next to largedata-380204. The project will now appear in your Explorer sidebar.
My Dataset Copy transfer won't let me schedule more frequently than every 12 or 24 hours
This is a BigQuery Data Transfer Service limitation. Switch to Method B if you need a tighter cadence.
My copy is stale — Skio's data has rows that mine doesn't
Check your transfer or scheduled query run history for failures. A failed run leaves the previous snapshot in place. Common causes: revoked permissions, a schema change on a Skio table, or paused GCP billing.
FAQ
How often does Skio refresh the source dataset?
Every ~4 hours, with about 2 hours of latency end-to-end. Data from 6:00 PM, for example, is typically queryable around 8:00 PM.
Will copying all the data cost a lot?
Storage in BigQuery is cheap — fractions of a cent per GB per month. The main variable cost is query compute (~$6.25 per TB scanned on on-demand pricing). A daily full-dataset copy is typically negligible unless your dataset is very large.
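You can estimate a query's scan cost before running it: `bq query --dry_run` validates the statement and reports the bytes it would process without executing it or incurring charges. `skio_acme_brand` is the example slug; use your own dataset name:

```shell
# Dry run: reports how many bytes the query would scan, at no cost.
# skio_acme_brand is the example slug; use your actual dataset name.
bq query --use_legacy_sql=false --dry_run \
  'SELECT * FROM `largedata-380204.skio_acme_brand.Subscription`'
```

At on-demand pricing (~$6.25 per TB), divide the reported bytes by 10^12 and multiply by 6.25 for a rough per-run cost.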
Can I query Skio's dataset directly without copying?
BigQuery supports cross-project queries when permissions are configured correctly. In practice, most BI tools and warehouse integrations assume everything lives in one project, which is why copying is recommended. If direct querying would work for your use case, reach out to your Skio contact.
Can Skio write data directly into my GCP project?
Not currently. Skio owns the source storage to maintain data integrity and cost control. The copy methods above are the supported integration paths.
What if I use Fivetran, Airbyte, or Hightouch?
These tools typically need write access to the dataset, which Skio doesn't grant by default. The supported paths are Method A and Method B. If you have a specific ETL tool in mind, flag it with your Skio contact and they can advise on compatibility.