SQL ETL Testing Interview Questions

What Is ETL Testing? (Definition + Real-World Example)

ETL Testing is the process of validating data during Extract, Transform, and Load operations using SQL queries to ensure data accuracy, completeness, consistency, and performance in a Data Warehouse (DW).

Real-world example

In a banking project, daily transaction data is extracted from OLTP systems, transformed using business rules (currency conversion, deduplication, SCD handling), and loaded into fact tables.
An ETL tester uses SQL ETL testing interview questions–level queries to:

  • Compare source and target record counts
  • Validate transformations and aggregations
  • Ensure reports match DW data

This is why SQL skills are mandatory for ETL testing roles.


Data Warehouse Flow: Source → Staging → Transform → Load → Reporting

  1. Source – OLTP DBs, flat files, APIs
  2. Staging – Raw data landing area
  3. Transformation – Business rules, joins, aggregations
  4. Load – Fact & Dimension tables
  5. Reporting – BI dashboards and analytics

SQL ETL Testing Interview Questions & Answers

(Basic → Advanced | SQL-Driven & Scenario-Based)


🔹 Basic SQL ETL Testing Questions (1–15)

  1. What is SQL ETL testing?
    Using SQL queries to validate ETL data accuracy and transformations.
  2. Why is SQL important for ETL testers?
    SQL validates record counts, joins, aggregations, and data quality.
  3. What is Source-to-Target (S2T) mapping?
    Document defining how source columns map to target columns.
  4. How do you validate record count using SQL?
    Compare COUNT(*) between source and target tables.
  5. What is staging area testing?
    Ensuring staging data matches source exactly.
  6. What are audit fields?
    load_date, batch_id, created_by, updated_date.
  7. What is data reconciliation?
    Matching totals and counts between source and target.
  8. Difference between fact and dimension tables?
    Facts store measures; dimensions store descriptive attributes.
  9. What is full load vs incremental load?
    Full reloads all data; incremental loads only delta records.
  10. What is reject data testing?
    Validating rejected rows and error reasons.
  11. What is data quality testing?
    Accuracy, completeness, consistency checks.
  12. What is mapping validation?
    Ensuring SQL logic matches S2T rules.
  13. What is end-to-end ETL testing?
    Source → DW → Report validation.
  14. What is data lineage?
    Tracking data from source to report.
  15. What SQL clauses are most used in ETL testing?
    JOIN, GROUP BY, HAVING, window functions.

🔹 Intermediate SQL ETL Testing Questions (16–35)

  1. Scenario: Source has 1M rows, target has 980K. What SQL do you run?
    Count, reject, and filter validation queries.
  2. How do you test NULL handling using SQL?
    WHERE column IS NULL checks on mandatory fields.
  3. What is SCD Type 1?
    Overwrites old values without history.
  4. What is SCD Type 2?
    Maintains history with effective dates and flags.
  5. How do you validate SCD2 using SQL?
    Check multiple rows per business key with current_flag.
  6. What is surrogate key testing?
    Validate uniqueness and non-null values.
  7. What is late-arriving dimension scenario?
    Fact arrives before dimension; validate dummy key.
  8. Scenario: Duplicate rows in target. What SQL do you use?
    GROUP BY HAVING COUNT(*) > 1.
  9. How do you test aggregation logic?
    Compare source aggregates with target aggregates.
  10. What is hashing used for in ETL?
    Change detection and deduplication.
  11. How do you test CDC using SQL?
    Validate only changed records are loaded.
  12. Scenario: Negative amount in report. What SQL do you run?
    Filter and business rule validation queries.
  13. What is referential integrity testing?
    Fact foreign keys must exist in dimension tables.
  14. How do you test data type mismatches?
    Validate precision, scale, and truncation.
  15. How do you validate date transformations?
    Compare source and target date formats and time zones.
  16. What is soft delete testing?
    Validate delete_flag values.
  17. What is schema drift?
    Source schema changes impacting ETL.
  18. How do you validate BI reports using SQL?
    Compare report queries with DW tables.
  19. What is partition testing?
    Validate correct partition placement.
  20. What is incremental load validation?
    Ensure only delta data is loaded.

🔹 Advanced SQL ETL Testing Scenarios (36–55)

  1. Scenario: ETL job misses SLA. How do you analyze SQL performance?
    Check execution plans and indexes.
  2. How do you test performance tuning using SQL?
    Use EXPLAIN or EXPLAIN PLAN.
  3. Scenario: Many-to-many join inflates data. How to detect?
    Join cardinality checks.
  4. How do you validate window functions?
    Compare ranked results with business rules.
  5. Scenario: Late data affects aggregates. What SQL do you run?
    Recalculate impacted partitions.
  6. How do you test audit/control tables?
    Validate batch IDs and counts.
  7. Scenario: Report mismatch with DW. First SQL check?
    Validate GROUP BY logic.
  8. How do you test ETL rollback using SQL?
    Ensure partial loads are removed.
  9. Scenario: Parallel jobs cause deadlocks. What to analyze?
    Locking and transaction isolation.
  10. How do you test historical reloads?
    Check duplication and SCD logic.
  11. Scenario: Source file arrives late. What SQL validation?
    Validate missing data loads.
  12. How do you validate checksum/hash totals?
    Compare source vs target hashes.
  13. Scenario: Currency conversion mismatch. What SQL test?
    Validate exchange rate joins.
  14. How do you test cloud ETL pipelines with SQL?
    Validate scalability and performance.
  15. Scenario: NULLs after LEFT JOIN. Why?
    Missing dimension records.
  16. How do you test archival logic?
    Validate data movement to history tables.
  17. Scenario: Incorrect SCD expiry date. Root cause?
    Effective date logic error.
  18. How do you validate metadata tables?
    Check load timestamps and row counts.
  19. Scenario: Unexpected data spike. What SQL test?
    Source anomaly detection.
  20. How do you perform real-time SQL ETL testing?
    Source → Target → Report reconciliation.

Real SQL Query Examples for ETL Validation

Sample Tables

  • src_orders(order_id, cust_id, amount, order_dt)
  • dim_customer(cust_sk, cust_id, current_flag)
  • fact_sales(order_id, cust_sk, amount, load_dt)

1️⃣ Record Count Validation

SELECT COUNT(*) FROM src_orders;

SELECT COUNT(*) FROM fact_sales;

2️⃣ JOIN Validation

SELECT COUNT(*) AS missing_dim

FROM fact_sales f

LEFT JOIN dim_customer d

ON f.cust_sk = d.cust_sk

WHERE d.cust_sk IS NULL;

3️⃣ GROUP BY Aggregation

SELECT cust_id, SUM(amount)

FROM src_orders

GROUP BY cust_id;

4️⃣ Window Function – Deduplication

SELECT *

FROM (

  SELECT order_id,

         ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY order_dt DESC) rn

  FROM src_orders

) t

WHERE rn = 1;

5️⃣ Performance Tuning Validation

EXPLAIN

SELECT *

FROM fact_sales

WHERE load_dt >= CURRENT_DATE – 1;


ETL Tools Commonly Asked in SQL ETL Interviews

  • Informatica
  • Microsoft SQL Server Integration Services
  • Ab Initio
  • Pentaho
  • Talend

ETL Defect Examples + Test Case Sample

Defect Example

Issue: Duplicate records in fact table

  • Root Cause: Incorrect join logic
  • Fix: Correct join keys and add deduplication

Sample Test Case

  • Test Case: Validate SCD2 update
  • Expected Result:
    • Old record expired
    • New record inserted with current_flag = ‘Y’

SQL ETL Testing – Quick Revision Sheet

  • Validate counts, sums, NULLs, duplicates
  • Check S2T mapping carefully
  • Test SCD1, SCD2, audit fields, hashing
  • Use JOIN, GROUP BY, window functions
  • Always reconcile source vs target vs reports

FAQs (Snippet-Friendly)

Q1. What are SQL ETL testing interview questions?
They focus on validating ETL logic using SQL queries.

Q2. Which SQL is mandatory for ETL testing?
JOINs, GROUP BY, window functions, and performance tuning queries.

Q3. How do you test SCD2 using SQL?
By validating multiple versions, effective dates, and current flags.

Leave a Comment

Your email address will not be published. Required fields are marked *