1. What is ETL Testing? (Definition + Example)
ETL Testing is the process of validating data that is Extracted from source systems, Transformed using business rules, and Loaded into a target system such as a Data Warehouse (DW).
The goal of ETL testing is to ensure:
- Data accuracy
- Data completeness
- Correct transformation logic
- High performance and reliability
Simple ETL Example
- Source: Sales table from OLTP system
- Transform: Currency conversion, trimming spaces, deduplication
- Target: Fact_Sales table in data warehouse
ETL testing verifies whether transformed data matches business expectations.
2. Data Warehouse (DW) Flow
Source → Staging → Transform → Load → Reporting
DW Layer Explanation
- Source Layer: ERP, CRM, flat files, APIs
- Staging Layer: Raw extracted data
- Transformation Layer: Business rules applied
- Load Layer: Fact and Dimension tables
- Reporting Layer: BI dashboards and reports
3. Why ETL Testing Is Important for Interviews
Interviewers expect candidates to understand:
- Basic ETL concepts
- Data warehouse architecture
- SQL validation skills
- Real-time data issues
- ETL defects and scenarios
This guide on etl testing basic interview questions is structured exactly in that order.
4. ETL Testing Basic Interview Questions & Answers
A. Basic ETL Interview Questions (Freshers)
1. What does ETL stand for?
ETL stands for Extract, Transform, Load.
2. What is ETL testing?
ETL testing validates the correctness, completeness, and performance of data during ETL processes.
3. What is a data warehouse?
A data warehouse is a centralized repository that stores historical, integrated data for reporting and analytics.
4. What are source systems?
Operational systems such as OLTP databases, files, APIs, or applications from which data is extracted.
5. What is a staging table?
A temporary table used to store raw data before transformation.
B. Data Warehouse Interview Questions
6. What are fact tables?
Fact tables store quantitative business metrics (sales amount, quantity).
7. What are dimension tables?
Dimension tables store descriptive data (customer, product, time).
8. What is star schema?
A schema with a central fact table connected to multiple dimension tables.
9. Difference between OLTP and OLAP?
OLTP is transaction-oriented; OLAP is analytics-oriented.
C. ETL QA Questions – Intermediate Level
10. What validations are done in ETL testing?
- Record count validation
- Data type validation
- Transformation logic validation
- Null and default value checks
- Duplicate data checks
11. What is Source-to-Target (S2T) mapping?
S2T mapping defines how source columns map to target columns with transformation rules.
12. What is data reconciliation?
Comparing source and target data to ensure consistency.
D. SCD & Data History Questions
13. What is SCD Type 1?
Slowly Changing Dimension Type 1 overwrites old data without keeping history.
14. What is SCD Type 2?
SCD Type 2 maintains historical records using versioning, effective dates, and active flags.
15. What are audit fields?
Fields used for tracking ETL loads, such as:
- load_date
- batch_id
- created_timestamp
- updated_timestamp
5. Real SQL Query Examples for ETL Testing
Sample Tables
Source_Customers
| cust_id | name | country |
Target_Dim_Customers
| cust_key | name | country | active_flag |
JOIN – Missing Records Validation
SELECT s.cust_id
FROM source_customers s
LEFT JOIN target_dim_customers t
ON s.cust_id = t.cust_key
WHERE t.cust_key IS NULL;
GROUP BY – Aggregation Validation
SELECT country, COUNT(*)
FROM source_customers
GROUP BY country;
Compare the result with target dimension counts.
Window Function – Duplicate Check
SELECT *
FROM (
SELECT cust_key,
ROW_NUMBER() OVER (PARTITION BY cust_key ORDER BY load_date DESC) rn
FROM target_dim_customers
) x
WHERE rn > 1;
Performance Tuning – Explain Plan
EXPLAIN PLAN FOR
SELECT * FROM target_dim_customers WHERE active_flag = ‘Y’;
6. Scenario-Based ETL Testing Questions
16. What if record count mismatches occur?
Check filters, rejected records, joins, and transformation logic.
17. How do you handle NULL values in ETL?
- Replace with default values
- Reject records
- Allow nulls based on business rules
18. How do you test incremental loads?
Validate delta data using last_run_date and compare with previous loads.
19. ETL job is slow – how do you troubleshoot?
- Check indexes
- Review SQL execution plan
- Optimize joins
- Partition tables
7. ETL Architecture & Mapping Validation
Mapping Validation Checklist
✔ Column mapping
✔ Data types & length
✔ Transformation logic
✔ Mandatory fields
✔ Business rules
8. ETL Tools – Interview Knowledge
Popular ETL Tools
- Informatica
- Microsoft SSIS
- Ab Initio
- Pentaho
- Talend
9. ETL Defect Examples
| Defect Type | Example |
| Data Mismatch | Wrong transformation logic |
| Duplicate Records | Missing dedup logic |
| Null Issues | Mandatory field missing |
| Load Failure | Job aborted |
| Performance | SLA breach |
10. Sample ETL Test Case
Test Case: SCD Type 2 Validation
- Input: Customer address change
- Expected: Old record expired, new record inserted
- Validation: active_flag, effective_date
11. Quick Revision Sheet – ETL Basics
✔ ETL process
✔ DW architecture
✔ SCD types
✔ Record count checks
✔ SQL joins & aggregations
✔ ETL defects
12. FAQs – ETL Testing Basic Interview Questions
Q1. Is ETL testing manual or automated?
Mostly manual with SQL; automation can be done using scripts.
Q2. How much SQL is required for ETL testing?
Strong SQL with joins, group by, subqueries, and window functions.
Q3. Can freshers learn ETL testing?
Yes, with SQL and data warehouse basics.
