ETL Testing Interview Questions for 4 Years Experienced – Real-Time, Scenario-Driven Guide

1. What is ETL Testing? (Definition + Example)

ETL Testing validates data during Extract, Transform, Load processes to ensure accuracy, completeness, consistency, history handling, and performance when data is moved from source systems to a data warehouse (DW) and reporting layer.

Real-Time Example (4-Year Experience)

  • Source: Orders & Customers from OLTP (Oracle/MySQL)
  • Transform: Currency conversion, deduplication, SCD handling, hashing
  • Target: Fact_Orders, Dim_Customer
  • Testing Focus: S2T validation, incremental loads, reconciliation, SLA performance

At 4 years, interviewers expect confident SQL validation, DW concepts, and scenario troubleshooting.


2. Data Warehouse (DW) Flow

Source → Staging → Transform → Load → Reporting

Layer Responsibilities

  1. Source: OLTP DBs, flat files, APIs
  2. Staging: Raw snapshot (no business rules)
  3. Transform: Cleansing, business logic, SCDs
  4. Load: Facts & dimensions
  5. Reporting: BI dashboards (Power BI/Tableau)

3. ETL Testing Interview Questions & Best Answers (Basic → Advanced)

Below are frequently asked ETL testing interview questions for candidates with 4 years of experience, structured to match a real interview flow.


A. Core ETL & DW Questions

1. Why is ETL testing important?

It ensures trusted analytics by validating data movement, transformations, history, and performance.

2. ETL testing vs database testing?

ETL testing validates data flow and transformations; DB testing validates schema/constraints/CRUD.

3. What is a staging table?

A temporary store for extracted raw data before transformations.

4. What is Source-to-Target (S2T) mapping?

A document defining column mappings, data types, and transformation rules from source to target.


B. Data Warehouse Concepts

5. What is a fact table?

Stores measurable metrics (amount, quantity).

6. What is a dimension table?

Stores descriptive attributes (customer, product, time).

7. Star vs Snowflake schema?

Star keeps dimensions denormalized, so queries need fewer joins and are usually faster; Snowflake normalizes dimensions into sub-tables, which saves storage but adds joins.


C. SCD, Audit & History (Must-Know)

8. Explain SCD Type 1 and Type 2.

  • SCD1: Overwrites old values (no history)
  • SCD2: Preserves history using effective_date/expiry_date/active_flag

9. How do you test SCD Type 2?

Validate old record expiry, new record insert, and single active record.
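
The SCD2 checks above can be sketched as two queries. This is a minimal illustration using Python's built-in sqlite3 module; the dim_customer layout, the sample rows, and the 'Y'/'N' flag values are assumptions made for the example, not a prescribed design.

```python
import sqlite3

# Hypothetical Dim_Customer rows; column names follow the SCD2 fields named above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    cust_key INTEGER PRIMARY KEY,
    cust_id INTEGER,
    city TEXT,
    effective_date TEXT,
    expiry_date TEXT,
    active_flag TEXT
);
INSERT INTO dim_customer VALUES
    (1, 101, 'Pune',   '2023-01-01', '2024-03-31', 'N'),
    (2, 101, 'Mumbai', '2024-04-01', NULL,         'Y'),
    (3, 102, 'Delhi',  '2023-06-01', NULL,         'Y'),
    (4, 103, 'Goa',    '2023-01-01', NULL,         'Y'),
    (5, 103, 'Kochi',  '2024-02-01', NULL,         'Y');
""")

# Check 1: every business key must have exactly one active row.
bad_active = conn.execute("""
    SELECT cust_id
    FROM dim_customer
    GROUP BY cust_id
    HAVING SUM(CASE WHEN active_flag = 'Y' THEN 1 ELSE 0 END) <> 1
    ORDER BY cust_id
""").fetchall()

# Check 2: rows flagged inactive must carry an expiry_date.
unexpired_history = conn.execute("""
    SELECT cust_key
    FROM dim_customer
    WHERE active_flag = 'N' AND expiry_date IS NULL
""").fetchall()

print(bad_active)         # [(103,)]  customer 103 has two active rows
print(unexpired_history)  # []
```

Customer 103 is flagged because the old row was never expired when the new one was inserted, which is the typical SCD2 defect.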

10. What are audit fields?

Metadata columns that record when and how a row was loaded, for example load_date, batch_id, created_ts, updated_ts, and record_source.

11. What is hashing in ETL?

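A hash (for example MD5 or SHA-256) is computed over a row's tracked columns; comparing the source and target hash values detects changed rows efficiently during incremental loads.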
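
A minimal sketch of hash-based change detection, using Python's hashlib. The row_hash helper, the "|" delimiter, the tracked column list, and the sample rows are all hypothetical choices for illustration; real ETL tools usually compute the hash inside the database or the mapping itself.

```python
import hashlib


def row_hash(row, columns):
    """Hash the tracked columns of a row; NULLs (None) become empty strings."""
    joined = "|".join("" if row[c] is None else str(row[c]) for c in columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


tracked = ["name", "city"]  # assumed set of columns that drive change detection
source = {
    101: {"name": "Asha", "city": "Mumbai"},
    102: {"name": "Ravi", "city": "Delhi"},
}
target = {
    101: {"name": "Asha", "city": "Pune"},
    102: {"name": "Ravi", "city": "Delhi"},
}

# A key has changed when its source and target hashes differ.
changed = sorted(
    key
    for key in source
    if key in target
    and row_hash(source[key], tracked) != row_hash(target[key], tracked)
)
print(changed)  # [101]
```

Comparing one hash per row avoids a column-by-column comparison, which is why the technique scales well to wide tables.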


4. Real SQL Query Examples for ETL Validation

Sample Data

Source_Orders
| order_id | cust_id | amount | currency |

Target_Fact_Orders
| order_key | cust_key | amount_usd | load_date |


JOIN – Missing Records

-- Source rows with no match in the target.
-- Assumes order_key carries the source order_id.
SELECT s.order_id
FROM source_orders s
LEFT JOIN target_fact_orders t
  ON s.order_id = t.order_key
WHERE t.order_key IS NULL;


GROUP BY – Aggregation Check

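-- Per-customer totals on the source side; run the matching aggregate
-- on target_fact_orders (SUM(amount_usd) by cust_key) and compare.
SELECT cust_id, SUM(amount) AS total_amount
FROM source_orders
GROUP BY cust_id;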
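
The aggregation check becomes a reconciliation once the source totals are compared with the target. The sketch below runs such a comparison with Python's sqlite3 module. The fx_rates table, its rates, the sample rows, and the assumption that cust_key equals cust_id are all hypothetical; in a real warehouse the key would be resolved through Dim_Customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_orders (order_id INTEGER, cust_id INTEGER, amount REAL, currency TEXT);
CREATE TABLE fx_rates (currency TEXT, rate_to_usd REAL);  -- hypothetical rate table
CREATE TABLE target_fact_orders (order_key INTEGER, cust_key INTEGER, amount_usd REAL, load_date TEXT);

INSERT INTO source_orders VALUES
    (1, 101, 100.0, 'USD'),
    (2, 101,  50.0, 'EUR'),
    (3, 102,  80.0, 'USD');
INSERT INTO fx_rates VALUES ('USD', 1.0), ('EUR', 1.1);
INSERT INTO target_fact_orders VALUES
    (1, 101, 100.0, '2024-05-02'),
    (2, 101,  55.0, '2024-05-02'),
    (3, 102,  75.0, '2024-05-02');
""")

# Compare converted source totals with target totals per customer.
# A small tolerance absorbs floating-point noise from the conversion.
mismatches = conn.execute("""
    WITH src AS (
        SELECT s.cust_id, SUM(s.amount * r.rate_to_usd) AS src_usd
        FROM source_orders s
        JOIN fx_rates r ON r.currency = s.currency
        GROUP BY s.cust_id
    ),
    tgt AS (
        SELECT cust_key, SUM(amount_usd) AS tgt_usd
        FROM target_fact_orders
        GROUP BY cust_key
    )
    SELECT src.cust_id, ROUND(src.src_usd, 2), ROUND(tgt.tgt_usd, 2)
    FROM src
    LEFT JOIN tgt ON tgt.cust_key = src.cust_id
    WHERE tgt.tgt_usd IS NULL OR ABS(src.src_usd - tgt.tgt_usd) > 0.005
    ORDER BY src.cust_id
""").fetchall()

print(mismatches)  # [(102, 80.0, 75.0)]
```

Only customer 102 is reported, because the target total of 75.0 does not match the source total of 80.0.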


Window Function – Duplicate Detection

SELECT *
FROM (
  SELECT order_key,
         ROW_NUMBER() OVER (PARTITION BY order_key ORDER BY load_date DESC) rn
  FROM target_fact_orders
) x
WHERE rn > 1;


Performance Tuning – Explain Plan

EXPLAIN PLAN FOR
SELECT * FROM target_fact_orders
WHERE load_date >= SYSDATE - 1;


5. Scenario-Based ETL Testing Questions

12. The record counts do not match. How do you debug?

Check extraction filters, rejected rows, joins, and transformation conditions.
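
One of these checks, rows dropped by a join, can be sketched as follows with Python's sqlite3 module. The table layouts and sample rows are assumptions for illustration: an inner join to the customer dimension silently drops an order whose customer is missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_orders (order_id INTEGER, cust_id INTEGER);
CREATE TABLE dim_customer (cust_key INTEGER, cust_id INTEGER);
INSERT INTO source_orders VALUES (1, 101), (2, 102), (3, 999);
INSERT INTO dim_customer VALUES (1, 101), (2, 102);
""")

# Count before and after the dimension lookup.
source_count = conn.execute("SELECT COUNT(*) FROM source_orders").fetchone()[0]
joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM source_orders s
    JOIN dim_customer d ON d.cust_id = s.cust_id
""").fetchone()[0]

# List the rows the inner join drops (no matching customer).
dropped = conn.execute("""
    SELECT s.order_id
    FROM source_orders s
    LEFT JOIN dim_customer d ON d.cust_id = s.cust_id
    WHERE d.cust_id IS NULL
""").fetchall()

print(source_count, joined_count)  # 3 2
print(dropped)                     # [(3,)]
```

The difference between the two counts is explained row by row, which turns a count mismatch into a concrete defect or a confirmed business rule.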

13. How do you handle NULL values?

Use defaults (NVL/COALESCE), reject rows, or allow per business rules.
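
A small sketch of the default and reject options, run through Python's sqlite3 module. The stg_orders table and both rules (a NULL currency defaults to 'USD', a NULL cust_id rejects the row) are assumed here purely for illustration; the real rules come from the S2T mapping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stg_orders (order_id INTEGER, cust_id INTEGER, currency TEXT);
INSERT INTO stg_orders VALUES (1, 101, 'EUR'), (2, 102, NULL), (3, NULL, 'USD');
""")

# Assumed rule 1: default a NULL currency to 'USD' with COALESCE.
# Assumed rule 2: rows with a NULL cust_id are not loaded.
loaded = conn.execute("""
    SELECT order_id, cust_id, COALESCE(currency, 'USD') AS currency
    FROM stg_orders
    WHERE cust_id IS NOT NULL
    ORDER BY order_id
""").fetchall()

# Rows that should land in the reject table instead of the target.
rejected = conn.execute("""
    SELECT order_id FROM stg_orders WHERE cust_id IS NULL
""").fetchall()

print(loaded)    # [(1, 101, 'EUR'), (2, 102, 'USD')]
print(rejected)  # [(3,)]
```

COALESCE is standard SQL; NVL is the Oracle-specific equivalent for two arguments.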

14. How do you test incremental loads?

Validate delta via last_run_date/batch_id and compare counts & hashes.
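
The delta check can be sketched like this with Python's sqlite3 module. The updated_ts and batch_id columns, the batch numbers, and the last_run_date value are assumptions for the example; the idea is that every source row changed since the last run should appear in the current batch, and nothing else should.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_orders (order_id INTEGER, amount REAL, updated_ts TEXT);
CREATE TABLE target_fact_orders (order_key INTEGER, amount_usd REAL, batch_id INTEGER);
INSERT INTO source_orders VALUES
    (1, 100.0, '2024-05-01 09:00:00'),
    (2, 250.0, '2024-05-02 10:30:00'),
    (3,  75.0, '2024-05-02 11:45:00');
-- Batch 41 is the earlier load; batch 42 is the incremental run under test.
INSERT INTO target_fact_orders VALUES (1, 100.0, 41), (2, 250.0, 42);
""")

last_run_date = "2024-05-02 00:00:00"  # assumed watermark from the job control table
current_batch = 42

# Expected delta: source rows changed after the last run.
delta_ids = {
    row[0]
    for row in conn.execute(
        "SELECT order_id FROM source_orders WHERE updated_ts > ?", (last_run_date,)
    )
}
# Actual delta: rows the current batch wrote to the target.
loaded_ids = {
    row[0]
    for row in conn.execute(
        "SELECT order_key FROM target_fact_orders WHERE batch_id = ?", (current_batch,)
    )
}

missing_from_target = sorted(delta_ids - loaded_ids)
unexpected_in_batch = sorted(loaded_ids - delta_ids)

print(missing_from_target)  # [3]
print(unexpected_in_batch)  # []
```

Order 3 changed after the watermark but was not loaded by batch 42, so it is reported as missing.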

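15. An ETL job is running slowly. What steps do you take?

Check indexes, confirm partition pruning, tune the SQL, and review parallelism settings.

16. How is late-arriving data handled?

Apply dedicated logic that inserts or updates the affected fact and dimension rows with the correct effective dates.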


6. ETL Architecture & Mapping Validation

Mapping Validation Checklist

✔ Column mapping
✔ Data types & lengths
✔ Transformation rules
✔ Mandatory fields
✔ Business logic alignment
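
The first two checklist items (column mapping and data types) can be partly automated. The sketch below compares a hypothetical S2T mapping, held as a Python dict, against the target table's actual definition using SQLite's PRAGMA table_info; other databases expose the same details through their own catalog views. The declared types and the deliberate TEXT defect are assumptions for the example.

```python
import sqlite3

# Hypothetical S2T mapping: target column -> expected data type.
s2t_mapping = {
    "order_key": "INTEGER",
    "cust_key": "INTEGER",
    "amount_usd": "REAL",
    "load_date": "TEXT",
}

conn = sqlite3.connect(":memory:")
# amount_usd is deliberately declared as TEXT to simulate a mapping defect.
conn.execute("""
    CREATE TABLE target_fact_orders (
        order_key INTEGER,
        cust_key INTEGER,
        amount_usd TEXT,
        load_date TEXT
    )
""")

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
actual = {
    row[1]: row[2]
    for row in conn.execute("PRAGMA table_info(target_fact_orders)")
}

missing_columns = sorted(set(s2t_mapping) - set(actual))
type_mismatches = {
    col: (expected, actual[col])
    for col, expected in s2t_mapping.items()
    if col in actual and actual[col] != expected
}

print(missing_columns)  # []
print(type_mismatches)  # {'amount_usd': ('REAL', 'TEXT')}
```

Transformation rules and business logic still need data-level queries; a metadata comparison like this only covers the structural part of the checklist.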


7. ETL Tools – Interview Knowledge (4 Years)

  • Informatica – Enterprise ETL with rich transformations
  • Microsoft SSIS – SQL Server-native ETL
  • Ab Initio – High-performance data processing
  • Pentaho – Open-source ETL/BI
  • Talend – Cloud & on-prem integration

8. ETL Defect Examples

| Defect Type | Example |
| Data Mismatch | Incorrect transformation |
| Duplicates | Missing dedup logic |
| History Issue | SCD2 not applied |
| Load Failure | Job aborted |
| Performance | SLA breach |

9. Sample ETL Test Case (4-Year Level)

Test Case: Incremental Load with Hashing

  • Validate delta extraction
  • Compare source/target hash values
  • Ensure audit fields populated correctly

10. Quick Revision Sheet (4 Years Experience)

✔ ETL & DW architecture
✔ S2T mapping
✔ Advanced SQL (JOIN/GROUP BY/Window)
✔ SCD1 & SCD2
✔ Incremental loads
✔ Performance tuning
✔ Defect lifecycle


11. FAQs – ETL Testing Interview (4 Years)

Q1. What SQL depth is expected?
Advanced joins, aggregations, window functions, and tuning basics.

Q2. Manual or automated ETL testing?
Primarily SQL-driven manual testing; scripting/automation is a plus.

Q3. What differentiates a strong 4-year ETL tester?
SQL mastery, DW understanding, and confident scenario handling.
