1. Introduction
Capgemini ETL testing interview questions come up frequently for ETL QA, Data Warehouse Testing, BI Testing, and Data Validation roles. Capgemini delivers large-scale data solutions for banking, insurance, retail, healthcare, and telecom clients, where data accuracy, transformation logic, and SLA compliance are critical.
In Capgemini ETL interviews, candidates are evaluated on:
- Strong ETL & Data Warehouse fundamentals
- Hands-on SQL validation skills
- Clear understanding of Source-to-Target (S2T) mapping
- Ability to explain real-world ETL defects from past projects
- Knowledge of SCD1, SCD2, audit fields, hashing
- Awareness of performance tuning and batch processing
This article is a Capgemini-focused ETL interview preparation guide, suitable for freshers, testers with 2–5 years of experience, and senior ETL testers.
2. What is ETL Testing? (Definition + Capgemini-Style Example)
ETL Testing is the process of validating that data is:
- Extracted correctly from source systems
- Transformed accurately based on business rules
- Loaded completely into the target data warehouse
Real-World Capgemini-Style Project Example
- Source: Core banking transaction tables
- Transform:
  - Deduplication using business keys
  - Currency conversion
  - Daily and monthly aggregation
  - SCD Type 2 handling for customer dimension
- Target: Enterprise Data Warehouse (EDW)
- Reporting: Power BI / Tableau dashboards
At Capgemini, ETL testers validate data correctness and business impact, not just whether the ETL job succeeded.
3. Data Warehouse Flow – Source → Staging → Transform → Load → Reporting
Typical ETL Architecture in Capgemini Projects
- Source Systems – OLTP databases, flat files, APIs
- Staging Layer – Raw extracted data
- Transformation Layer – Cleansing, enrichment, business logic
- Target Layer (DW/Data Mart) – Fact & Dimension tables
- Reporting Layer – BI tools
👉 Interview Tip: Be prepared to explain reconciliation, audit validation, restartability, and failure handling.
4. ETL Testing Interview Questions Capgemini (Basic → Advanced)
A. Basic ETL Testing Interview Questions
Q1. What is ETL testing?
ETL testing validates extraction, transformation, and loading of data to ensure accuracy, completeness, and performance.
Q2. Why is ETL testing important in Capgemini projects?
Because Capgemini works with enterprise clients, where incorrect data can cause financial loss and compliance issues.
Q3. What is a data warehouse?
A centralized repository storing historical and integrated data for analytics and reporting.
Q4. What is a staging table?
A temporary table used to store raw extracted data before transformations.
B. Data Warehouse & S2T Mapping Questions
Q5. What is Source-to-Target (S2T) mapping?
A document defining how source columns map to target columns, including transformation rules, default values, and reject logic.
Q6. How do you validate S2T mapping in Capgemini ETL projects?
By writing SQL queries comparing source, staging, and target data after applying transformation logic.
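One way to do this, sketched here under stated assumptions, is to re-derive the expected target rows from the source using the mapping rule and subtract the rows actually loaded with EXCEPT. The example runs against an in-memory SQLite database from Python so that it is self-contained. The fx_rates table, the amount_usd column, and the conversion rule are hypothetical stand-ins for whatever the S2T document specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables and rates, for illustration only
    CREATE TABLE src_orders  (order_id INTEGER, amount REAL, currency TEXT);
    CREATE TABLE fx_rates    (currency TEXT, rate_to_usd REAL);
    CREATE TABLE fact_orders (order_id INTEGER, amount_usd REAL);

    INSERT INTO src_orders VALUES (1, 100.0, 'USD'), (2, 200.0, 'EUR'), (3, 50.0, 'EUR');
    INSERT INTO fx_rates   VALUES ('USD', 1.0), ('EUR', 1.5);
    -- Order 3 was loaded without currency conversion: a deliberate defect
    INSERT INTO fact_orders VALUES (1, 100.0), (2, 300.0), (3, 50.0);
""")

# Expected rows (source + mapping rule) minus actual rows (target).
# Anything returned is a row the target does not hold in the expected form.
expected_minus_actual = conn.execute("""
    SELECT s.order_id, ROUND(s.amount * r.rate_to_usd, 2) AS amount_usd
    FROM src_orders s
    JOIN fx_rates r ON r.currency = s.currency
    EXCEPT
    SELECT order_id, ROUND(amount_usd, 2)
    FROM fact_orders
    ORDER BY 1
""").fetchall()

print(expected_minus_actual)
```

Running the same query with the two halves swapped (target EXCEPT expected) would show rows present in the target that the mapping rule does not explain.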
Q7. What challenges do you face during S2T validation?
- Complex joins across multiple sources
- Derived and conditional columns
- Lookup mismatches
- Null and default value handling
5. SQL Query Examples (Very Important for Capgemini)
Record Count Validation
SELECT COUNT(*) FROM src_orders;
SELECT COUNT(*) FROM fact_orders;
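✔ A first check against data loss during the ETL load. Matching counts alone do not prove the values are correct, and the counts are directly comparable only when the load is one-to-one with no filtering or aggregation.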
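Counts are often paired with a control total, such as the sum of an amount column, so that a load with the right number of rows but wrong values is still caught. The following is a minimal sketch of that idea using an in-memory SQLite database; the sample rows are invented, and the amount column on src_orders and fact_orders is assumed from the other queries in this section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders  (order_id INTEGER, amount INTEGER);
    CREATE TABLE fact_orders (order_id INTEGER, amount INTEGER);
    INSERT INTO src_orders  VALUES (1, 100), (2, 250), (3, 75);
    -- Order 2 was loaded as 205 instead of 250: counts match, totals do not
    INSERT INTO fact_orders VALUES (1, 100), (2, 205), (3, 75);
""")

# One round trip returns both the row counts and the control totals
src_count, src_total, tgt_count, tgt_total = conn.execute("""
    SELECT (SELECT COUNT(*)    FROM src_orders),
           (SELECT SUM(amount) FROM src_orders),
           (SELECT COUNT(*)    FROM fact_orders),
           (SELECT SUM(amount) FROM fact_orders)
""").fetchone()

count_ok = src_count == tgt_count
total_ok = src_total == tgt_total
print(f"count match: {count_ok}, control total match: {total_ok}")
```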
Data Validation Using JOIN
SELECT s.order_id,
       s.amount AS src_amount,
       t.amount AS tgt_amount
FROM src_orders s
JOIN fact_orders t
  ON s.order_id = t.order_id
WHERE s.amount <> t.amount;
✔ Identifies transformation or loading defects. Rows where either amount is NULL are not returned by <>, so NULL handling needs a separate check.
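For wide tables, comparing every column in the WHERE clause becomes unwieldy, and a row hash is one common alternative. The sketch below is an illustration only: SQLite has no built-in MD5 function, so a hypothetical row_hash helper is registered from Python, and the src_customer and dim_customer columns are invented. In a real project the database's own hash function and the columns listed in the mapping would be used instead.

```python
import hashlib
import sqlite3

def row_hash(*values):
    # NULLs get an explicit token so that NULL and '' hash differently;
    # the '|' delimiter reduces accidental matches between adjacent columns.
    joined = "|".join("<NULL>" if v is None else str(v) for v in values)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

conn = sqlite3.connect(":memory:")
# -1 lets the SQL function accept any number of arguments
conn.create_function("row_hash", -1, row_hash)

conn.executescript("""
    -- Hypothetical columns, for illustration only
    CREATE TABLE src_customer (customer_id INTEGER, name TEXT, city TEXT, email TEXT);
    CREATE TABLE dim_customer (customer_id INTEGER, name TEXT, city TEXT, email TEXT);
    INSERT INTO src_customer VALUES (1, 'Asha', 'Pune',  'a@x.com'), (2, 'Ravi', 'Delhi',  NULL);
    INSERT INTO dim_customer VALUES (1, 'Asha', 'Pune',  'a@x.com'), (2, 'Ravi', 'Mumbai', NULL);
""")

# A single hash comparison replaces one predicate per column
mismatched_ids = [row[0] for row in conn.execute("""
    SELECT s.customer_id
    FROM src_customer s
    JOIN dim_customer t ON t.customer_id = s.customer_id
    WHERE row_hash(s.name, s.city, s.email) <> row_hash(t.name, t.city, t.email)
    ORDER BY s.customer_id
""")]

print(mismatched_ids)
```

The hash only says that a row differs; a column-level query is still needed afterwards to find which attribute changed.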
Finding Missing Records
SELECT s.order_id
FROM src_orders s
LEFT JOIN fact_orders t
ON s.order_id = t.order_id
WHERE t.order_id IS NULL;
✔ Detects records missing in target.
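The same pattern can be run in the opposite direction to find rows that exist in the target but have no source record. A minimal sketch with invented sample rows, using an in-memory SQLite database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders  (order_id INTEGER);
    CREATE TABLE fact_orders (order_id INTEGER);
    INSERT INTO src_orders  VALUES (1), (2), (3);
    -- Order 1 never arrived; order 4 has no source record
    INSERT INTO fact_orders VALUES (2), (3), (4);
""")

# Source rows with no match in the target
missing_in_target = [r[0] for r in conn.execute("""
    SELECT s.order_id
    FROM src_orders s
    LEFT JOIN fact_orders t ON s.order_id = t.order_id
    WHERE t.order_id IS NULL
    ORDER BY s.order_id
""")]

# Target rows with no match in the source (the join is simply reversed)
unexpected_in_target = [r[0] for r in conn.execute("""
    SELECT t.order_id
    FROM fact_orders t
    LEFT JOIN src_orders s ON s.order_id = t.order_id
    WHERE s.order_id IS NULL
    ORDER BY t.order_id
""")]

print(missing_in_target, unexpected_in_target)
```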
GROUP BY & Aggregation Validation
SELECT region, SUM(sales_amount)
FROM fact_sales
GROUP BY region;
✔ Validates aggregation logic.
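An aggregate on its own only shows what the target holds; to validate it, the result has to be compared with something. One option, sketched below, is to recompute the totals from the detail rows and compare them with a pre-aggregated table. The agg_sales_region table and its total_sales column are hypothetical names, and the data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_sales (region TEXT, sales_amount INTEGER);
    -- Hypothetical summary table produced by the ETL aggregation step
    CREATE TABLE agg_sales_region (region TEXT, total_sales INTEGER);

    INSERT INTO fact_sales VALUES ('North', 100), ('North', 50), ('South', 70);
    -- South should be 70; the stored 60 is a deliberate defect
    INSERT INTO agg_sales_region VALUES ('North', 150), ('South', 60);
""")

# Recompute per-region totals and keep only regions that are missing
# from the summary table or carry a different total.
aggregation_mismatches = conn.execute("""
    SELECT f.region, f.expected_total, a.total_sales
    FROM (SELECT region, SUM(sales_amount) AS expected_total
          FROM fact_sales
          GROUP BY region) f
    LEFT JOIN agg_sales_region a ON a.region = f.region
    WHERE a.total_sales IS NULL
       OR a.total_sales <> f.expected_total
    ORDER BY f.region
""").fetchall()

print(aggregation_mismatches)
```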
Window Function Example
SELECT customer_id,
       SUM(amount) OVER (PARTITION BY customer_id) AS total_spend
FROM fact_orders;
✔ Used for validating partition-level calculations such as total spend per customer. For a running total, add an ORDER BY inside the OVER clause.
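The difference between a partition total and a running total is a frequent follow-up question. The sketch below puts the two side by side on invented data in an in-memory SQLite database (window functions need SQLite 3.25 or later); the order_date column is an assumption for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_orders (order_id INTEGER, customer_id INTEGER,
                              order_date TEXT, amount INTEGER);
    INSERT INTO fact_orders VALUES
        (1, 101, '2025-01-01', 10),
        (2, 101, '2025-01-02', 20),
        (3, 102, '2025-01-01', 5);
""")

# Without ORDER BY the window covers the whole partition (same value on every row).
# With ORDER BY the default frame ends at the current row, giving a running total.
window_rows = conn.execute("""
    SELECT order_id,
           SUM(amount) OVER (PARTITION BY customer_id) AS total_spend,
           SUM(amount) OVER (PARTITION BY customer_id
                             ORDER BY order_date, order_id) AS running_total
    FROM fact_orders
    ORDER BY order_id
""").fetchall()

for row in window_rows:
    print(row)
```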
Performance Tuning SQL
EXPLAIN ANALYZE
SELECT *
FROM fact_orders
WHERE order_date >= '2025-01-01';
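✔ Helps analyze slow queries and SLA risks. EXPLAIN ANALYZE is PostgreSQL-style syntax and executes the query to report actual timings; other databases expose execution plans through their own commands.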
6. Slowly Changing Dimension (SCD) Questions
Q8. What is SCD Type 1?
Overwrites old data; history is not maintained.
Q9. What is SCD Type 2?
Maintains history using:
- Start date
- End date
- Active flag
SCD2 Validation SQL
SELECT customer_id, start_date, end_date, is_active
FROM dim_customer
WHERE customer_id = 101;
Q10. Common SCD2 defects in Capgemini projects?
- Multiple active records
- Old record not expired
- Incorrect effective dates
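Each of the three defects above can be turned into a query that should return zero rows on a healthy dimension. The following is a sketch under assumptions: the customer_sk surrogate key and city attribute are invented, and the current record is assumed to carry a NULL end_date (some projects use a high date such as 9999-12-31 instead, in which case the predicates change accordingly).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_sk INTEGER, customer_id INTEGER, city TEXT,
                               start_date TEXT, end_date TEXT, is_active INTEGER);
    INSERT INTO dim_customer VALUES
        -- 101: clean history
        (1, 101, 'Pune',    '2024-01-01', '2024-06-30', 0),
        (2, 101, 'Mumbai',  '2024-07-01', NULL,         1),
        -- 102: old record never expired, so two rows are active
        (3, 102, 'Delhi',   '2024-01-01', NULL,         1),
        (4, 102, 'Noida',   '2024-05-01', NULL,         1),
        -- 103: end_date earlier than start_date
        (5, 103, 'Chennai', '2024-03-01', '2024-02-01', 0),
        (6, 103, 'Kochi',   '2024-04-01', NULL,         1);
""")

def ids(sql):
    # Helper: first column of every result row as a plain list
    return [row[0] for row in conn.execute(sql)]

# Defect 1: more than one active record per business key
multiple_active = ids("""
    SELECT customer_id FROM dim_customer
    WHERE is_active = 1
    GROUP BY customer_id
    HAVING COUNT(*) > 1
""")

# Defect 2: an open-ended row that is not the latest version for its customer
not_expired = ids("""
    SELECT d.customer_sk FROM dim_customer d
    WHERE d.end_date IS NULL
      AND d.start_date < (SELECT MAX(x.start_date) FROM dim_customer x
                          WHERE x.customer_id = d.customer_id)
""")

# Defect 3: effective dates in the wrong order
bad_dates = ids("""
    SELECT customer_sk FROM dim_customer
    WHERE end_date IS NOT NULL AND end_date < start_date
""")

print(multiple_active, not_expired, bad_dates)
```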
7. Scenario-Based ETL Testing Interview Questions
Scenario 1: Record Count Mismatch
Possible Causes:
- Filter condition mismatch
- Wrong join type
- Duplicate source data
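Two of these causes can be checked directly with SQL. The sketch below looks for duplicate business keys in the source and for source rows that an inner join to a dimension would silently drop. The customer_id column on src_orders and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders   (order_id INTEGER, customer_id INTEGER);
    CREATE TABLE dim_customer (customer_id INTEGER);
    -- Order 2 arrives twice; order 3 references a customer missing from the dimension
    INSERT INTO src_orders   VALUES (1, 101), (2, 102), (2, 102), (3, 999);
    INSERT INTO dim_customer VALUES (101), (102);
""")

# Duplicate source data: business keys that occur more than once
duplicate_keys = conn.execute("""
    SELECT order_id, COUNT(*) AS occurrences
    FROM src_orders
    GROUP BY order_id
    HAVING COUNT(*) > 1
""").fetchall()

# Wrong join type: rows an INNER JOIN to the dimension would discard
dropped_by_inner_join = [r[0] for r in conn.execute("""
    SELECT s.order_id
    FROM src_orders s
    LEFT JOIN dim_customer c ON c.customer_id = s.customer_id
    WHERE c.customer_id IS NULL
    ORDER BY s.order_id
""")]

print(duplicate_keys, dropped_by_inner_join)
```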
Scenario 2: Null Values in Target
SELECT *
FROM dim_customer
WHERE email IS NULL;
Action: Verify default value handling or reject logic.
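A NULL in the target can mean two different things: the source value was already missing and the default was never applied, or the source had a value and the ETL lost it. Joining back to the source separates the two cases. The sketch below assumes a hypothetical src_customer table with an email column and uses invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (customer_id INTEGER, email TEXT);
    CREATE TABLE dim_customer (customer_id INTEGER, email TEXT);
    INSERT INTO src_customer VALUES (1, 'a@x.com'), (2, NULL), (3, 'c@x.com');
    INSERT INTO dim_customer VALUES (1, 'a@x.com'), (2, NULL), (3, NULL);
""")

# For every NULL email in the target, look at what the source held
null_findings = conn.execute("""
    SELECT t.customer_id,
           s.email AS src_email,
           CASE WHEN s.email IS NULL THEN 'default not applied'
                ELSE 'value lost in ETL'
           END AS finding
    FROM dim_customer t
    JOIN src_customer s ON s.customer_id = t.customer_id
    WHERE t.email IS NULL
    ORDER BY t.customer_id
""").fetchall()

for row in null_findings:
    print(row)
```

Whether the first case is a defect at all depends on the S2T mapping: if the mapping allows NULL emails, only the second case needs to be raised.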
Scenario 3: ETL Job Misses SLA
Resolution Steps:
- Analyze execution plan
- Optimize SQL
- Partition large tables
- Tune parallelism
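The first two steps can be illustrated by comparing the execution plan of a query before and after adding an index. This sketch uses SQLite's EXPLAIN QUERY PLAN as a stand-in for the plan command of the project's database; the index name and data are invented, and the exact plan text varies by database and version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_orders (order_id INTEGER, order_date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO fact_orders VALUES (?, ?, ?)",
    [(i, f"2025-01-{(i % 28) + 1:02d}", i * 10) for i in range(1, 201)],
)

query = "SELECT * FROM fact_orders WHERE order_date >= '2025-01-15'"

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable plan step
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(query)  # no index available: a full table scan is expected

# Hypothetical index on the filter column
conn.execute("CREATE INDEX idx_fact_orders_date ON fact_orders (order_date)")
plan_after = plan(query)   # the planner can now search the index instead

print("before:", plan_before)
print("after: ", plan_after)
```

In an interview answer, the point to make is the method rather than the tool: capture the plan, change one thing, and capture the plan again to confirm the change had the intended effect.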
8. ETL Tools Asked in Capgemini Interviews
Capgemini focuses more on conceptual clarity and real project experience than tool syntax.
Common ETL tools:
- Informatica
- Microsoft SSIS
- Ab Initio
- Talend
- Pentaho
9. ETL Defect Examples + Test Case Sample
Common ETL Defects in Capgemini Projects
| Defect Type | Example |
| --- | --- |
| Data loss | Missing records |
| Transformation error | Wrong calculation |
| Duplicate data | Incorrect join |
| SCD defect | Multiple active records |
| Performance issue | Job misses SLA |
Sample ETL Test Case
| Field | Value |
| --- | --- |
| Test Case ID | CG_ETL_TC_01 |
| Scenario | Validate SCD Type 2 |
| Source | src_customer |
| Target | dim_customer |
| Expected | Only one active record |
10. Quick Revision Sheet (Capgemini ETL Interviews)
- ETL = Extract + Transform + Load
- Always validate count + data + transformation
- Strong SQL is mandatory (JOIN, GROUP BY, window functions)
- SCD2 questions are very common
- Performance & SLA awareness is critical
11. FAQs – ETL Testing Interview Questions Capgemini
Q1. Does Capgemini ask tool-specific ETL questions?
Mostly conceptual, based on real project experience.
Q2. Is SQL mandatory for Capgemini ETL interviews?
Yes, strong SQL skills are non-negotiable.
Q3. Are scenario-based questions common?
Yes, especially for candidates with 2+ years of experience.
Q4. Is ETL testing manual or automated at Capgemini?
Primarily SQL-driven manual testing with partial automation.
