1. Introduction
Infosys ETL testing interview questions are a core part of interviews for Data Warehouse Testing, ETL QA, BI Testing, and Data Validation roles at Infosys. Infosys works with large enterprise clients in banking, insurance, retail, telecom, and healthcare, where data accuracy, performance, and compliance are business-critical.
Interviewers at Infosys typically evaluate:
- Strong ETL and Data Warehouse fundamentals
- Hands-on SQL skills for data validation
- Clear understanding of Source-to-Target (S2T) mapping
- Ability to handle real-world ETL defects
- Knowledge of SCD1, SCD2, audit fields, hashing, incremental loads
- Awareness of performance tuning and SLA impact
This article is a complete, interview-oriented guide for Infosys ETL testing roles, useful for freshers, testers with 3–5 years of experience, and senior testers.
2. What is ETL Testing? (Definition + Infosys-Style Example)
ETL Testing is the process of validating that data extracted from source systems is correctly transformed according to business rules and loaded into the target data warehouse, ensuring accuracy, completeness, history, and performance.
Real-World Infosys Project Example
- Source: Banking transactions from OLTP systems
- Transform:
- Deduplication based on business keys
- Currency conversion
- Daily and monthly aggregation
- SCD2 handling for customer dimension
- Target: Enterprise Data Warehouse (EDW)
- Reporting: Power BI / Tableau dashboards
At Infosys, ETL testers are expected to validate data correctness + business impact, not just record counts.
3. Typical ETL Architecture in Infosys Projects
- Source Systems – OLTP databases, flat files, APIs
- Staging Layer – Raw extracted data (no business rules)
- Transformation Layer – Cleansing, enrichment, business logic
- Target Layer (DW/Data Mart) – Fact & Dimension tables
- Reporting Layer – BI tools & analytics
Interview Tip: Be ready to explain reconciliation, audit checks, restartability, and failure handling.
4. Infosys ETL Testing Interview Questions & Answers (Basic → Advanced)
A. Basic ETL Testing Interview Questions (Infosys)
Q1. What is ETL testing?
ETL testing validates extraction, transformation, and loading of data to ensure accuracy, completeness, and performance.
Q2. Why is ETL testing critical in Infosys projects?
Because Infosys handles large enterprise and regulatory-driven clients, where incorrect data can cause financial loss and compliance issues.
Q3. What is a data warehouse?
A centralized repository storing historical and integrated data for reporting and analytics.
Q4. What is a staging table?
A temporary table used to store raw extracted data before transformations are applied.
B. Data Warehouse & S2T Mapping Questions
Q5. What is Source-to-Target (S2T) mapping?
A document defining how source columns map to target columns, including transformation rules, default values, and rejection logic.
Q6. How do you validate S2T mapping in Infosys projects?
By writing SQL queries comparing source, staging, and target data after applying transformation logic.
Q7. What challenges do you face during S2T validation?
- Complex joins across multiple sources
- Conditional and derived columns
- Lookup mismatches
- Null and default value handling
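The S2T validation described in Q6 can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module so that the SQL is runnable; the tables, columns, and the two mapping rules (upper-cased full name, 'UNK' as the default country) are assumptions made for the example, not rules from any real mapping document.

```python
import sqlite3

# In-memory database with hypothetical source and target tables
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src_customer (cust_id INTEGER, first_name TEXT, last_name TEXT, country TEXT);
CREATE TABLE dim_customer (cust_id INTEGER, full_name TEXT, country_code TEXT);
INSERT INTO src_customer VALUES
    (1, 'Asha', 'Rao', 'IN'), (2, 'John', 'Doe', NULL), (3, 'Mei', 'Lin', 'SG');
INSERT INTO dim_customer VALUES
    (1, 'ASHA RAO', 'IN'), (2, 'JOHN DOE', 'UNK'), (3, 'Mei Lin', 'SG');
""")

# Re-apply the assumed mapping rules to the source and compare with the target:
#   full_name    = UPPER(first_name || ' ' || last_name)
#   country_code = COALESCE(country, 'UNK')
s2t_mismatches = conn.execute("""
SELECT s.cust_id, t.full_name, t.country_code
FROM src_customer s
JOIN dim_customer t ON t.cust_id = s.cust_id
WHERE t.full_name <> UPPER(s.first_name || ' ' || s.last_name)
   OR t.country_code <> COALESCE(s.country, 'UNK')
ORDER BY s.cust_id
""").fetchall()

print(s2t_mismatches)  # customer 3 was loaded without the upper-case rule
```

The idea is to express each transformation rule once in SQL against the source and let the database report every row where the target disagrees, rather than spot-checking individual records.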
5. SQL Query Examples (Very Important for Infosys)
Record Count Validation
SELECT COUNT(*) FROM src_orders;
SELECT COUNT(*) FROM fact_orders;
✔ A first-level check for data loss during ETL. Matching counts alone do not prove that the loaded values are correct.
Data Validation Using JOIN
SELECT s.order_id,
s.amount AS src_amount,
t.amount AS tgt_amount
FROM src_orders s
JOIN fact_orders t
ON s.order_id = t.order_id
WHERE s.amount <> t.amount;
✔ Detects transformation or load defects.
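One caveat with the comparison above: `<>` evaluates to NULL when either side is NULL, so a value that was dropped to NULL during the load is silently skipped. The sketch below, using Python's sqlite3 module and made-up sample rows, contrasts the plain predicate with a NULL-aware one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src_orders  (order_id INTEGER, amount REAL);
CREATE TABLE fact_orders (order_id INTEGER, amount REAL);
INSERT INTO src_orders  VALUES (1, 100.0), (2, 50.0), (3, NULL);
INSERT INTO fact_orders VALUES (1, 100.0), (2, NULL), (3, NULL);
""")

base_query = """
SELECT s.order_id, s.amount, t.amount
FROM src_orders s
JOIN fact_orders t ON s.order_id = t.order_id
WHERE {predicate}
ORDER BY s.order_id
"""

# Plain inequality: order 2 (50.0 vs NULL) is not reported
plain = conn.execute(
    base_query.format(predicate="s.amount <> t.amount")
).fetchall()

# NULL-aware: also flag rows where exactly one side is NULL
null_safe = conn.execute(
    base_query.format(
        predicate="s.amount <> t.amount "
                  "OR (s.amount IS NULL) <> (t.amount IS NULL)"
    )
).fetchall()

print(plain)
print(null_safe)
```

Order 3 is NULL on both sides and is treated as a match here; whether that is acceptable depends on the mapping rules for the column.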
Finding Missing Records
SELECT s.order_id
FROM src_orders s
LEFT JOIN fact_orders t
ON s.order_id = t.order_id
WHERE t.order_id IS NULL;
✔ Identifies records missing in target.
GROUP BY & Aggregation Validation
SELECT region, SUM(sales_amount)
FROM fact_sales
GROUP BY region;
✔ Validates aggregation logic in fact tables.
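To turn the aggregation query into an actual check, the same totals can be computed on the source side and compared. The sketch below assumes a hypothetical `src_sales` table at the same grain as `fact_sales` and uses Python's sqlite3 module with sample rows invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src_sales  (region TEXT, sales_amount INTEGER);
CREATE TABLE fact_sales (region TEXT, sales_amount INTEGER);
INSERT INTO src_sales  VALUES ('North', 100), ('North', 50), ('South', 70);
-- The South row was loaded twice in the target
INSERT INTO fact_sales VALUES ('North', 100), ('North', 50), ('South', 70), ('South', 70);
""")

# Aggregate each side by region, then report regions whose totals differ
agg_mismatches = conn.execute("""
SELECT s.region, s.src_total, t.tgt_total
FROM (SELECT region, SUM(sales_amount) AS src_total
      FROM src_sales GROUP BY region) s
JOIN (SELECT region, SUM(sales_amount) AS tgt_total
      FROM fact_sales GROUP BY region) t
  ON t.region = s.region
WHERE s.src_total <> t.tgt_total
ORDER BY s.region
""").fetchall()

print(agg_mismatches)
```

Because this uses an inner join, a region that exists on only one side would not appear; a separate missing-record check is needed for that case.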
Window Function Example
SELECT customer_id,
SUM(amount) OVER (PARTITION BY customer_id) AS total_spend
FROM fact_orders;
✔ Used to validate partition-level totals. Adding ORDER BY inside the OVER clause turns this into a running total.
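The query above returns one total per customer, repeated on every row. A running total additionally needs an ORDER BY inside the window. The sketch below shows both side by side using Python's sqlite3 module; it assumes an SQLite build with window-function support (3.25 or later), and the `order_date` column and sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_orders (order_id INTEGER, customer_id INTEGER,
                          order_date TEXT, amount INTEGER);
INSERT INTO fact_orders VALUES
    (1, 10, '2025-01-01', 100),
    (2, 10, '2025-01-02', 50),
    (3, 20, '2025-01-01', 70);
""")

window_rows = conn.execute("""
SELECT customer_id,
       order_id,
       -- No ORDER BY: the frame is the whole partition
       SUM(amount) OVER (PARTITION BY customer_id) AS total_spend,
       -- With ORDER BY: the frame runs from the first row to the current row
       SUM(amount) OVER (PARTITION BY customer_id
                         ORDER BY order_date, order_id) AS running_total
FROM fact_orders
ORDER BY customer_id, order_date, order_id
""").fetchall()

for row in window_rows:
    print(row)
```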
Performance Tuning SQL
EXPLAIN ANALYZE
SELECT *
FROM fact_orders
WHERE order_date >= '2025-01-01';
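✔ Helps analyze slow queries and SLA risks. EXPLAIN ANALYZE is PostgreSQL/MySQL syntax and actually executes the query; other databases expose execution plans through different commands.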
6. Slowly Changing Dimension (SCD) Questions (Frequently Asked)
Q8. What is SCD Type 1?
Overwrites old data; history is not maintained.
Q9. What is SCD Type 2?
Maintains history by inserting a new row for each change, tracked using:
- Start date
- End date
- Active flag
SCD2 Validation SQL
SELECT customer_id, start_date, end_date, is_active
FROM dim_customer
WHERE customer_id = 101;
Q10. What are common SCD2 defects seen in Infosys projects?
- Multiple active records
- Old record not expired
- Incorrect effective dates
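The first two defects in this list can be detected with simple queries. The sketch below uses Python's sqlite3 module; the `customer_sk` surrogate key, the sample rows, and the convention that the current row has a NULL `end_date` are assumptions for the example (some projects use a high date such as 9999-12-31 instead).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_sk INTEGER, customer_id INTEGER, city TEXT,
                           start_date TEXT, end_date TEXT, is_active INTEGER);
INSERT INTO dim_customer VALUES
    (1, 101, 'Pune',    '2024-01-01', '2024-06-30', 0),  -- correctly expired
    (2, 101, 'Mumbai',  '2024-07-01', NULL,         1),  -- correct current row
    (3, 102, 'Delhi',   '2024-01-01', NULL,         1),  -- should have been expired
    (4, 102, 'Noida',   '2024-05-01', NULL,         1),
    (5, 103, 'Chennai', '2024-01-01', NULL,         0);  -- inactive but not end-dated
""")

# Defect 1: more than one active record per business key
multi_active = conn.execute("""
SELECT customer_id, COUNT(*) AS active_rows
FROM dim_customer
WHERE is_active = 1
GROUP BY customer_id
HAVING COUNT(*) > 1
""").fetchall()

# Defect 2: active flag and end_date disagree with each other
flag_date_conflicts = conn.execute("""
SELECT customer_sk, customer_id
FROM dim_customer
WHERE (is_active = 1 AND end_date IS NOT NULL)
   OR (is_active = 0 AND end_date IS NULL)
ORDER BY customer_sk
""").fetchall()

print(multi_active)
print(flag_date_conflicts)
```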
7. Scenario-Based ETL Testing Interview Questions (Infosys Style)
Scenario 1: Record Count Mismatch
Possible Causes:
- Filter condition mismatch
- Wrong join type
- Duplicate source data
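As a rough illustration of how the duplicate-source cause might be confirmed, the sketch below compares counts and then looks for repeated business keys in the source. It uses Python's sqlite3 module with invented sample data, and assumes the ETL deduplicates on `order_id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src_orders  (order_id INTEGER, amount INTEGER);
CREATE TABLE fact_orders (order_id INTEGER, amount INTEGER);
INSERT INTO src_orders  VALUES (1, 100), (2, 50), (2, 50), (3, 75);
INSERT INTO fact_orders VALUES (1, 100), (2, 50), (3, 75);
""")

src_count = conn.execute("SELECT COUNT(*) FROM src_orders").fetchone()[0]
tgt_count = conn.execute("SELECT COUNT(*) FROM fact_orders").fetchone()[0]

# Business keys that appear more than once in the source
duplicate_keys = conn.execute("""
SELECT order_id, COUNT(*) AS occurrences
FROM src_orders
GROUP BY order_id
HAVING COUNT(*) > 1
""").fetchall()

# Rows the deduplication step is expected to drop
explained_by_duplicates = sum(n - 1 for _, n in duplicate_keys)

print(src_count, tgt_count, duplicate_keys, explained_by_duplicates)
```

If the count difference equals the number of surplus duplicate rows, the mismatch is explained by deduplication; any remaining gap points to a filter or join problem.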
Scenario 2: Null Values in Target
SELECT *
FROM dim_customer
WHERE email IS NULL;
Action: Verify default value handling or reject logic.
Scenario 3: ETL Job Missing SLA
Resolution Steps:
- Analyze execution plan
- Optimize SQL
- Partition large tables
- Tune parallelism
8. ETL Tools Asked in Infosys Interviews
Infosys focuses more on conceptual understanding and experience than tool syntax.
Commonly asked tools:
- Informatica
- Microsoft SSIS
- Ab Initio
- Talend
- Pentaho
9. ETL Defect Examples + Test Case Sample
Common ETL Defects in Infosys Projects
| Defect Type | Example |
| Data loss | Missing records |
| Transformation error | Wrong revenue calculation |
| Duplicate data | Duplicate rows from an incorrect join |
| SCD defect | Multiple active records |
| Performance issue | Job misses SLA |
Sample ETL Test Case
| Field | Value |
| Test Case ID | INFY_ETL_TC_01 |
| Scenario | Validate SCD Type 2 |
| Source | src_customer |
| Target | dim_customer |
| Expected | Only one active record per customer |
10. Advanced Infosys ETL Interview Questions
Q11. What are audit fields?
Fields like created_date, updated_date, batch_id, source_system used for traceability.
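A basic audit-field check might look like the sketch below: every loaded row should carry its audit values, and `updated_date` should never precede `created_date`. The table layout and sample rows are assumptions for the example, run through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_id INTEGER, created_date TEXT, updated_date TEXT,
                           batch_id INTEGER, source_system TEXT);
INSERT INTO dim_customer VALUES
    (1, '2025-01-01', '2025-01-01', 501,  'CRM'),
    (2, NULL,         '2025-01-02', 501,  'CRM'),  -- missing created_date
    (3, '2025-01-03', '2025-01-02', 502,  'CRM'),  -- updated before created
    (4, '2025-01-03', '2025-01-03', NULL, 'CRM');  -- missing batch_id
""")

# ISO-formatted date strings compare correctly as text
audit_violations = [row[0] for row in conn.execute("""
SELECT customer_id
FROM dim_customer
WHERE created_date IS NULL
   OR batch_id IS NULL
   OR source_system IS NULL
   OR updated_date < created_date
ORDER BY customer_id
""")]

print(audit_violations)
```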
Q12. What is hashing in ETL testing?
Using checksum/hash values to compare large datasets efficiently.
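A minimal sketch of the idea, assuming SHA-256 over the concatenated non-key columns: each row is reduced to one hash per business key, and only the hashes are compared. The `row_hash` and `hashes_by_key` helpers, the tables, and the sample data are hypothetical; in practice the hash is often computed inside the database or ETL tool rather than in Python.

```python
import hashlib
import sqlite3

def row_hash(values):
    """Hash a row's column values; NULL is rendered explicitly so it differs from ''."""
    joined = "|".join("<NULL>" if v is None else str(v) for v in values)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src_customer (cust_id INTEGER, name TEXT, city TEXT);
CREATE TABLE dim_customer (cust_id INTEGER, name TEXT, city TEXT);
INSERT INTO src_customer VALUES (1, 'Asha', 'Pune'), (2, 'John', 'Delhi'), (3, 'Mei', NULL);
INSERT INTO dim_customer VALUES (1, 'Asha', 'Pune'), (2, 'John', 'Dehli'), (3, 'Mei', NULL);
""")

def hashes_by_key(table):
    # Table names come from this script, not user input, so formatting is safe here
    rows = conn.execute(f"SELECT cust_id, name, city FROM {table}")
    return {row[0]: row_hash(row[1:]) for row in rows}

src_hashes = hashes_by_key("src_customer")
tgt_hashes = hashes_by_key("dim_customer")

# Keys present on both sides whose row content differs
changed_keys = sorted(
    key for key in src_hashes
    if key in tgt_hashes and src_hashes[key] != tgt_hashes[key]
)
print(changed_keys)
```

The benefit is that wide tables can be compared on a single value per row; once a differing key is found, a column-level query is still needed to see which field is wrong.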
Q13. How do you test incremental loads?
By validating delta records using last_updated_date or watermark columns.
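The delta check can be sketched as follows: take the watermark of the previous successful load, select source rows changed after it, and confirm each one reached the target with the current value. The watermark value, tables, and rows are invented for the example, which uses Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src_orders  (order_id INTEGER, amount INTEGER, last_updated_date TEXT);
CREATE TABLE fact_orders (order_id INTEGER, amount INTEGER);
INSERT INTO src_orders VALUES
    (1, 100, '2025-01-05'),   -- before the watermark, not part of the delta
    (2, 60,  '2025-01-11'),   -- updated after the watermark
    (3, 75,  '2025-01-12'),   -- new, loaded correctly
    (4, 20,  '2025-01-12');   -- new, never loaded
INSERT INTO fact_orders VALUES (1, 100), (2, 50), (3, 75);
""")

# Assumed watermark: timestamp of the last successful load
watermark = "2025-01-10"

# Delta rows that are missing from the target or still hold a stale value
delta_defects = conn.execute("""
SELECT s.order_id, s.amount AS src_amount, t.amount AS tgt_amount
FROM src_orders s
LEFT JOIN fact_orders t ON t.order_id = s.order_id
WHERE s.last_updated_date > ?
  AND (t.order_id IS NULL OR t.amount <> s.amount)
ORDER BY s.order_id
""", (watermark,)).fetchall()

print(delta_defects)
```

Order 2 is reported because its update was not applied, and order 4 because it was never inserted.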
11. Quick Revision Sheet (Infosys ETL Interviews)
- ETL = Extract + Transform + Load
- Always validate count + data + transformation
- Strong SQL is mandatory (JOIN, GROUP BY, window functions)
- SCD2 questions are very common
- Performance & SLA awareness is critical
12. FAQs – Infosys ETL Testing Interview Questions
Q1. Does Infosys ask tool-specific ETL questions?
Mostly conceptual, based on real project experience.
Q2. Is SQL mandatory for Infosys ETL interviews?
Yes, strong SQL skills are non-negotiable.
Q3. Are scenario-based questions common?
Yes, especially for roles requiring 3+ years of experience.
Q4. Is ETL testing manual or automated in Infosys?
Primarily SQL-driven manual testing with partial automation.
