1. Introduction
ETL testing interview questions Infosys are a major part of interviews for Data Warehouse Testing, ETL QA, BI Testing, and Data Validation roles at Infosys. Infosys handles large-scale banking, insurance, retail, telecom, and healthcare data platforms, where data accuracy and performance are business-critical.
In Infosys interviews, candidates are evaluated on:
- Strong ETL and Data Warehouse fundamentals
- Ability to write SQL queries for validation
- Clear understanding of Source-to-Target (S2T) mapping
- Handling real-time ETL defects
- Knowledge of SCD1, SCD2, audit fields, hashing, incremental loads
- Performance and SLA awareness
This article is written as a complete Infosys-oriented ETL interview preparation guide, suitable for freshers, 3–5 years experienced, and senior ETL testers.
2. What is ETL Testing? (Definition + Infosys-Style Example)
ETL Testing is the process of validating that data extracted from source systems is correctly transformed according to business rules and loaded into the target data warehouse, ensuring accuracy, completeness, history, and performance.
Real-Time Infosys Project Example
- Source: Banking transactions from OLTP systems
- Transform:
- Deduplication
- Currency conversion
- Daily and monthly aggregation
- SCD2 handling for customer dimension
- Deduplication
- Target: Enterprise Data Warehouse (EDW)
- Reporting: Power BI / Tableau dashboards
At Infosys, ETL testers are expected to validate data + business impact, not just table counts.
Typical ETL Architecture in Infosys Projects
- Source Systems – OLTP DBs, flat files, APIs
- Staging Layer – Raw extracted data
- Transformation Layer – Business rules, cleansing, enrichment
- Target Layer (DW / Data Mart) – Fact & Dimension tables
- Reporting Layer – BI tools & analytics
👉 Interview Tip: Infosys interviewers often ask how you handle reconciliation, restartability, and audit validation.
4. ETL Testing Interview Questions Infosys (Basic → Advanced)
A. Basic ETL Testing Interview Questions (Infosys)
Q1. What is ETL testing?
ETL testing validates extraction, transformation, and loading of data to ensure accuracy and completeness.
Q2. Why is ETL testing important in Infosys projects?
Because Infosys handles large enterprise clients where incorrect data can lead to financial loss and regulatory issues.
Q3. What is a data warehouse?
A centralized repository that stores historical and integrated data for reporting and analytics.
Q4. What is a staging table?
A temporary table used to store raw extracted data before transformation.
B. Source-to-Target (S2T) Mapping Questions
Q5. What is S2T mapping?
A document defining how source fields map to target fields along with transformation logic.
Q6. How do you validate S2T mapping in Infosys projects?
By writing SQL queries to compare source, staging, and target data after transformation.
Q7. What challenges do you face during S2T validation?
- Complex joins
- Derived columns
- Conditional transformations
- Lookup mismatches
5. SQL Query Examples (Very Important for Infosys)
Record Count Validation
SELECT COUNT(*) FROM src_orders;
SELECT COUNT(*) FROM fact_orders;
Q8. What does this validation ensure?
No data loss during ETL load.
Data Validation Using JOIN
SELECT s.order_id,
s.amount AS src_amount,
t.amount AS tgt_amount
FROM src_orders s
JOIN fact_orders t
ON s.order_id = t.order_id
WHERE s.amount <> t.amount;
Q9. Why is this query important?
It detects transformation or loading issues.
Finding Missing Records
SELECT s.order_id
FROM src_orders s
LEFT JOIN fact_orders t
ON s.order_id = t.order_id
WHERE t.order_id IS NULL;
Q10. Which join is most used for missing record checks?
LEFT JOIN / RIGHT JOIN.
GROUP BY & Aggregation Validation
SELECT region, SUM(sales_amount)
FROM fact_sales
GROUP BY region;
Q11. What are you validating here?
Correct aggregation logic in fact tables.
Window Function Example
SELECT customer_id,
SUM(amount) OVER (PARTITION BY customer_id) AS total_spend
FROM fact_orders;
Q12. Why are window functions important in ETL testing?
They validate running totals and partition-level calculations.
Performance Tuning SQL
EXPLAIN ANALYZE
SELECT *
FROM fact_orders
WHERE order_date >= ‘2025-01-01’;
Q13. Why is performance testing important in Infosys ETL projects?
To ensure SLA compliance for large enterprise clients.
6. Slowly Changing Dimension (SCD) Questions (Frequently Asked)
Q14. What is SCD Type 1?
Overwrites old data without maintaining history.
Q15. What is SCD Type 2?
Maintains history using:
- Start date
- End date
- Active flag
SCD2 Validation SQL
SELECT customer_id, start_date, end_date, is_active
FROM dim_customer
WHERE customer_id = 101;
Q16. Common SCD2 defects seen in Infosys projects?
- Multiple active records
- Old record not expired
- Incorrect effective dates
7. Scenario-Based ETL Testing Questions (Infosys Style)
Scenario 1: Record Count Mismatch
Possible Causes:
- Filter condition mismatch
- Wrong join type
- Duplicate source data
Scenario 2: Null Values in Target Table
SELECT *
FROM dim_customer
WHERE email IS NULL;
Action: Check default value or reject logic.
Scenario 3: ETL Job Misses SLA
Resolution Steps:
- Analyze execution plan
- Optimize SQL
- Partition large tables
- Tune parallelism
8. ETL Tools Asked in Infosys Interviews
Infosys focuses more on concepts and experience than tool syntax.
Commonly asked tools:
- Informatica
- Microsoft SSIS
- Ab Initio
- Talend
- Pentaho
9. ETL Defect Examples + Test Case Sample
Common ETL Defects in Infosys Projects
| Defect Type | Example |
| Data loss | Missing records |
| Transformation error | Wrong revenue calculation |
| Duplicate data | Incorrect join |
| SCD defect | Multiple active records |
| Performance issue | Job misses SLA |
Sample ETL Test Case
| Field | Value |
| Test Case ID | INFY_ETL_TC_01 |
| Scenario | Validate SCD Type 2 |
| Source | src_customer |
| Target | dim_customer |
| Expected | Only one active record |
10. Advanced ETL Interview Questions (Infosys)
Q17. What are audit fields?
Fields like created_date, updated_date, batch_id used for tracking ETL loads.
Q18. What is hashing in ETL testing?
Using checksum/hash values to compare large datasets efficiently.
Q19. How do you test incremental loads?
By validating records using last_updated_date or watermark columns.
11. Quick Revision Sheet (Infosys ETL Interviews)
- ETL = Extract + Transform + Load
- Validate count + data + transformation
- SQL is mandatory (JOIN, GROUP BY, window functions)
- SCD2 questions are very common
- Performance & SLA awareness is critical
12. FAQs – ETL Testing Interview Questions Infosys
Q1. Does Infosys ask tool-specific ETL questions?
Mostly conceptual, with practical examples from your experience.
Q2. Is SQL mandatory for Infosys ETL interviews?
Yes, strong SQL is non-negotiable.
Q3. What experience level questions are asked?
Depends on role—freshers to 5+ years get scenario-based questions.
Q4. Is ETL testing manual or automated in Infosys?
Primarily SQL-driven manual testing with partial automation.
