ETL Testing Basic Interview Questions – Complete Beginner to Intermediate Guide

1. What is ETL Testing? (Definition + Example)

ETL Testing is the process of validating data that is Extracted from source systems, Transformed using business rules, and Loaded into a target system such as a Data Warehouse (DW).

The goal of ETL testing is to ensure:

  • Data accuracy
  • Data completeness
  • Correct transformation logic
  • High performance and reliability

Simple ETL Example

  • Source: Sales table from OLTP system
  • Transform: Currency conversion, trimming spaces, deduplication
  • Target: Fact_Sales table in data warehouse

ETL testing verifies whether transformed data matches business expectations.


2. Data Warehouse (DW) Flow

Source → Staging → Transform → Load → Reporting

DW Layer Explanation

  1. Source Layer: ERP, CRM, flat files, APIs
  2. Staging Layer: Raw extracted data
  3. Transformation Layer: Business rules applied
  4. Load Layer: Fact and Dimension tables
  5. Reporting Layer: BI dashboards and reports

3. Why ETL Testing Is Important for Interviews

Interviewers expect candidates to understand:

  • Basic ETL concepts
  • Data warehouse architecture
  • SQL validation skills
  • Real-time data issues
  • ETL defects and scenarios

This guide on etl testing basic interview questions is structured exactly in that order.


4. ETL Testing Basic Interview Questions & Answers


A. Basic ETL Interview Questions (Freshers)

1. What does ETL stand for?

ETL stands for Extract, Transform, Load.

2. What is ETL testing?

ETL testing validates the correctness, completeness, and performance of data during ETL processes.

3. What is a data warehouse?

A data warehouse is a centralized repository that stores historical, integrated data for reporting and analytics.

4. What are source systems?

Operational systems such as OLTP databases, files, APIs, or applications from which data is extracted.

5. What is a staging table?

A temporary table used to store raw data before transformation.


B. Data Warehouse Interview Questions

6. What are fact tables?

Fact tables store quantitative business metrics (sales amount, quantity).

7. What are dimension tables?

Dimension tables store descriptive data (customer, product, time).

8. What is star schema?

A schema with a central fact table connected to multiple dimension tables.

9. Difference between OLTP and OLAP?

OLTP is transaction-oriented; OLAP is analytics-oriented.


C. ETL QA Questions – Intermediate Level

10. What validations are done in ETL testing?

  • Record count validation
  • Data type validation
  • Transformation logic validation
  • Null and default value checks
  • Duplicate data checks

11. What is Source-to-Target (S2T) mapping?

S2T mapping defines how source columns map to target columns with transformation rules.

12. What is data reconciliation?

Comparing source and target data to ensure consistency.


D. SCD & Data History Questions

13. What is SCD Type 1?

Slowly Changing Dimension Type 1 overwrites old data without keeping history.

14. What is SCD Type 2?

SCD Type 2 maintains historical records using versioning, effective dates, and active flags.

15. What are audit fields?

Fields used for tracking ETL loads, such as:

  • load_date
  • batch_id
  • created_timestamp
  • updated_timestamp

5. Real SQL Query Examples for ETL Testing

Sample Tables

Source_Customers

cust_idnamecountry

Target_Dim_Customers
| cust_key | name | country | active_flag |


JOIN – Missing Records Validation

SELECT s.cust_id

FROM source_customers s

LEFT JOIN target_dim_customers t

ON s.cust_id = t.cust_key

WHERE t.cust_key IS NULL;


GROUP BY – Aggregation Validation

SELECT country, COUNT(*) 

FROM source_customers

GROUP BY country;

Compare the result with target dimension counts.


Window Function – Duplicate Check

SELECT *

FROM (

  SELECT cust_key,

         ROW_NUMBER() OVER (PARTITION BY cust_key ORDER BY load_date DESC) rn

  FROM target_dim_customers

) x

WHERE rn > 1;


Performance Tuning – Explain Plan

EXPLAIN PLAN FOR

SELECT * FROM target_dim_customers WHERE active_flag = ‘Y’;


6. Scenario-Based ETL Testing Questions

16. What if record count mismatches occur?

Check filters, rejected records, joins, and transformation logic.

17. How do you handle NULL values in ETL?

  • Replace with default values
  • Reject records
  • Allow nulls based on business rules

18. How do you test incremental loads?

Validate delta data using last_run_date and compare with previous loads.

19. ETL job is slow – how do you troubleshoot?

  • Check indexes
  • Review SQL execution plan
  • Optimize joins
  • Partition tables

7. ETL Architecture & Mapping Validation

Mapping Validation Checklist

✔ Column mapping
✔ Data types & length
✔ Transformation logic
✔ Mandatory fields
✔ Business rules


8. ETL Tools – Interview Knowledge

Popular ETL Tools

  • Informatica
  • Microsoft SSIS
  • Ab Initio
  • Pentaho
  • Talend

9. ETL Defect Examples

Defect TypeExample
Data MismatchWrong transformation logic
Duplicate RecordsMissing dedup logic
Null IssuesMandatory field missing
Load FailureJob aborted
PerformanceSLA breach

10. Sample ETL Test Case

Test Case: SCD Type 2 Validation

  • Input: Customer address change
  • Expected: Old record expired, new record inserted
  • Validation: active_flag, effective_date

11. Quick Revision Sheet – ETL Basics

✔ ETL process
✔ DW architecture
✔ SCD types
✔ Record count checks
✔ SQL joins & aggregations
✔ ETL defects


12. FAQs – ETL Testing Basic Interview Questions

Q1. Is ETL testing manual or automated?

Mostly manual with SQL; automation can be done using scripts.

Q2. How much SQL is required for ETL testing?

Strong SQL with joins, group by, subqueries, and window functions.

Q3. Can freshers learn ETL testing?

Yes, with SQL and data warehouse basics.

Leave a Comment

Your email address will not be published. Required fields are marked *