/

/

Shopify SQL Queries: Analyze Your Store Data Like a Pro

Content

Shopify SQL Queries: Analyze Your Store Data Like a Pro

Shopify SQL Queries: Analyze Your Store Data Like a Pro

Shopify SQL Queries: Analyze Your Store Data Like a Pro

Shopify's built-in analytics show you the basics: total sales, top products, returning customers. But when your store scales, you need answers that dashboards can't give you.

Which marketing channel produces the highest lifetime value customers? What's your real refund rate by product variant? Which discount codes actually drive profit vs just revenue?

The answer is SQL. Once your Shopify data lives in a database, these queries unlock insights that no dashboard provides.

Getting Your Shopify Data into a SQL Database

Shopify is a SaaS platform, so you don't get direct database access. But there are several ways to get your data into a queryable format:

  • Shopify Data Export: CSV exports from Admin → Analytics → Reports

  • ETL Tools: Stitch, Fivetran, or Airbyte sync Shopify data to your warehouse automatically

  • Shopify API: Pull data programmatically into MySQL, PostgreSQL, or BigQuery

  • ShopifyQL Notebooks: Available on Shopify Plus for direct querying

  • Third-party apps: Panoply, Coupler.io, or Supermetrics

Once synced, your Shopify data typically lands in these tables:

Table

What It Contains

orders

Order ID, customer, totals, dates, status

order_line_items

Products within each order

products

Product catalog with titles, types, vendors

variants

Product variants (size, color, SKU)

customers

Customer profiles, emails, addresses

transactions

Payment details

refunds

Refund records

discount_codes

Coupon usage

inventory_levels

Stock quantities by location

Revenue & Sales Queries

1. Daily Revenue with Order Count

SELECT 
    DATE(created_at) as order_date,
    COUNT(*) as total_orders,
    SUM(total_price) as revenue,
    AVG(total_price) as avg_order_value,
    SUM(total_discounts) as total_discounts_given
FROM orders
WHERE financial_status = 'paid'
  AND cancelled_at IS NULL
  AND created_at >= DATE_SUB(CURRENT_DATE, INTERVAL 30 DAY)
GROUP BY DATE(created_at)
ORDER BY order_date DESC;

2. Revenue by Traffic Source

SELECT 
    COALESCE(referring_site, 'Direct') as traffic_source,
    COUNT(*) as orders,
    SUM(total_price) as revenue,
    AVG(total_price) as avg_order_value,
    COUNT(DISTINCT customer_id) as unique_customers
FROM orders
WHERE financial_status = 'paid'
  AND cancelled_at IS NULL
  AND created_at >= DATE_SUB(CURRENT_DATE, INTERVAL 90 DAY)
GROUP BY referring_site
ORDER BY revenue DESC
LIMIT 15;

3. Revenue by Product Type

SELECT 
    p.product_type,
    COUNT(DISTINCT o.id) as orders,
    SUM(li.quantity) as units_sold,
    SUM(li.price * li.quantity) as revenue,
    AVG(li.price) as avg_price
FROM order_line_items li
JOIN orders o ON li.order_id = o.id
JOIN products p ON li.product_id = p.id
WHERE o.financial_status = 'paid'
  AND o.cancelled_at IS NULL
  AND o.created_at >= DATE_SUB(CURRENT_DATE, INTERVAL 30 DAY)
GROUP BY p.product_type
ORDER BY revenue DESC;

Customer Queries

4. Customer Lifetime Value (Top 20)

SELECT 
    c.email,
    c.first_name,
    c.last_name,
    c.city,
    c.country,
    COUNT(o.id) as total_orders,
    SUM(o.total_price) as lifetime_value,
    MIN(o.created_at) as first_order,
    MAX(o.created_at) as last_order,
    DATEDIFF(MAX(o.created_at), MIN(o.created_at)) as customer_age_days
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.financial_status = 'paid'
  AND o.cancelled_at IS NULL
GROUP BY c.id
ORDER BY lifetime_value DESC
LIMIT 20;

5. Customer Cohort: First Purchase Month

SELECT 
    DATE_FORMAT(first_order, '%Y-%m') as cohort_month,
    COUNT(DISTINCT customer_id) as new_customers,
    SUM(first_order_value) as cohort_revenue,
    AVG(first_order_value) as avg_first_order
FROM (
    SELECT 
        customer_id,
        MIN(created_at) as first_order,
        (SELECT total_price FROM orders o2 
         WHERE o2.customer_id = o.customer_id 
         ORDER BY created_at LIMIT 1) as first_order_value
    FROM orders o
    WHERE financial_status = 'paid' AND cancelled_at IS NULL
    GROUP BY customer_id
) cohorts
GROUP BY cohort_month
ORDER BY cohort_month DESC;

6. Customers at Risk of Churning

SELECT 
    c.email,
    c.first_name,
    COUNT(o.id) as past_orders,
    SUM(o.total_price) as total_spent,
    MAX(o.created_at) as last_order,
    DATEDIFF(CURRENT_DATE, MAX(o.created_at)) as days_since_last_order
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.financial_status = 'paid'
GROUP BY c.id
HAVING COUNT(o.id) >= 2
  AND days_since_last_order > 90
ORDER BY total_spent DESC;

Product Performance Queries

7. Best-Selling Products with Variant Breakdown

SELECT 
    p.title as product,
    v.title as variant,
    v.sku,
    SUM(li.quantity) as units_sold,
    SUM(li.price * li.quantity) as revenue,
    AVG(li.price) as avg_selling_price
FROM order_line_items li
JOIN products p ON li.product_id = p.id
JOIN variants v ON li.variant_id = v.id
JOIN orders o ON li.order_id = o.id
WHERE o.financial_status = 'paid'
  AND o.cancelled_at IS NULL
  AND o.created_at >= DATE_SUB(CURRENT_DATE, INTERVAL 30 DAY)
GROUP BY p.id, v.id
ORDER BY units_sold DESC
LIMIT 20;

8. Product Refund Rate

SELECT 
    p.title,
    COUNT(DISTINCT o.id) as total_orders,
    COUNT(DISTINCT r.id) as refunded_orders,
    ROUND(COUNT(DISTINCT r.id) * 100.0 / NULLIF(COUNT(DISTINCT o.id), 0), 1) as refund_rate_pct,
    SUM(COALESCE(rli.subtotal, 0)) as refund_amount
FROM products p
JOIN order_line_items li ON p.id = li.product_id
JOIN orders o ON li.order_id = o.id
LEFT JOIN refund_line_items rli ON li.id = rli.line_item_id
LEFT JOIN refunds r ON rli.refund_id = r.id
WHERE o.created_at >= DATE_SUB(CURRENT_DATE, INTERVAL 90 DAY)
GROUP BY p.id
HAVING total_orders >= 10
ORDER BY refund_rate_pct DESC;

Discount & Marketing Queries

9. Discount Code Performance

SELECT 
    dc.code,
    COUNT(DISTINCT o.id) as times_used,
    SUM(o.total_price) as total_revenue,
    SUM(o.total_discounts) as total_discount_given,
    AVG(o.total_price) as avg_order_value,
    ROUND(SUM(o.total_discounts) * 100.0 / NULLIF(SUM(o.total_price + o.total_discounts), 0), 1) as discount_pct
FROM discount_codes dc
JOIN orders o ON dc.order_id = o.id
WHERE o.financial_status = 'paid'
  AND o.cancelled_at IS NULL
GROUP BY dc.code
ORDER BY times_used DESC;

10. Repeat Purchase Rate

SELECT 
    total_customers,
    repeat_customers,
    ROUND(repeat_customers * 100.0 / total_customers, 1) as repeat_rate_pct,
    single_purchase,
    ROUND(single_purchase * 100.0 / total_customers, 1) as one_time_pct
FROM (
    SELECT 
        COUNT(DISTINCT customer_id) as total_customers,
        SUM(CASE WHEN order_count > 1 THEN 1 ELSE 0 END) as repeat_customers,
        SUM(CASE WHEN order_count = 1 THEN 1 ELSE 0 END) as single_purchase
    FROM (
        SELECT customer_id, COUNT(*) as order_count
        FROM orders
        WHERE financial_status = 'paid' AND cancelled_at IS NULL
        GROUP BY customer_id
    ) per_customer
) summary;

Skip the SQL Learning Curve

Every query above took expertise to write. The JOINs, date functions, CASE statements, and subqueries are complex even for experienced developers.

With AI2SQL, you describe what you want in plain English:

  • "Show me revenue by country for the last 3 months" → done

  • "Which customers bought more than 3 times but haven't ordered in 60 days?" → done

  • "Compare discount code performance this quarter" → done

Try it free at ai2sql.io and turn your Shopify data into actionable insights.

Share this

More Articles

More Articles

More Articles