Informatica ETL Testing Interview Questions – Complete SQL & Real-Time Guide

1. What is ETL Testing? (Definition + Example)

ETL Testing is the process of validating whether data is correctly Extracted, Transformed, and Loaded from source systems into a data warehouse, following defined business rules and mappings.

Real-Time Example (Informatica Project)

  • Source: Customer & Orders tables (Oracle / DB2)
  • Transformation:
    • Deduplication
    • Currency conversion
    • Slowly Changing Dimensions (SCD1 / SCD2)
  • Target: Fact and Dimension tables in EDW
  • Reporting: BI dashboards (Cognos / Tableau)

Goal: Ensure accurate, complete, and reconciled data reaches reports.

2. Data Warehouse Flow: Source → Staging → Transform → Load → Reporting

DW Layer Responsibilities

Layer | Purpose
Source | OLTP DBs, flat files, APIs
Staging | Raw extracted data
Transformation | Business rules, joins, SCD
Target (DW) | Fact & dimension tables
Reporting | Analytics & dashboards

3. Informatica ETL Architecture & S2T Mapping Validation

Informatica ETL Architecture

  • Source systems
  • Staging schema
  • Informatica PowerCenter mappings
  • Target data warehouse
  • BI/reporting layer

Source-to-Target (S2T) Mapping Includes

  • Source & target column mapping
  • Data type and length
  • Transformation logic
  • Default values
  • Audit fields (load_date, batch_id)
  • SCD rules

4. Informatica ETL Testing Interview Questions (Basic → Advanced)

Below are 80 Informatica ETL testing interview questions frequently asked in real interviews.


A. Basic Informatica ETL Testing Interview Questions (1–20)

  1. What is ETL testing?
    Validates extraction, transformation, and loading of data.
  2. Why is ETL testing important in Informatica projects?
    Ensures reliable reporting and regulatory compliance.
  3. What is Informatica PowerCenter?
    Enterprise ETL tool for data integration.
  4. What is a mapping in Informatica?
    Defines data flow from source to target.
  5. What is a staging table?
    Temporary storage for raw extracted data.
  6. What is S2T mapping?
    Document mapping source fields to target fields.
  7. Difference between ETL and ELT?
    ETL transforms before load; ELT after load.
  8. What is data reconciliation?
    Comparing source and target data.
  9. What is a surrogate key?
    System-generated unique identifier with no business meaning.
  10. Difference between fact and dimension tables?
    Fact = measures, Dimension = descriptive attributes.
  11. What is a full load?
    Loads the entire dataset on every run.
  12. What is an incremental load?
    Loads only new or changed records since the last run.
  13. What are audit columns?
    Columns that record load metadata, such as load_date, batch_id, updated_ts.
  14. What is data profiling?
    Analyzing source data quality (nulls, ranges, patterns, duplicates).
  15. What is truncation testing?
    Ensuring no data is cut off because a target column is shorter than the source.
  16. What is referential integrity?
    Fact keys must exist in dimension tables.
  17. What is CDC?
    Change Data Capture: identifying rows inserted, updated, or deleted in the source since the last load.
  18. What is a reject table?
    Stores records that failed validation or loading.
  19. What is a lookup transformation?
    Fetches related reference data.
  20. What is data lineage?
    Tracing data from source to report.

B. SQL-Based ETL Testing Questions (21–45)

Record Count Validation

SELECT COUNT(*) FROM src_orders;

SELECT COUNT(*) FROM tgt_fact_orders;
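
The count queries above can be wrapped in a small check that compares every layer at once. The following is a minimal sketch in Python using the standard-library sqlite3 module so that it runs without a database server; the stg_orders table and all row data are assumptions made up for the demo, not part of any real project.

```python
import sqlite3

# Hypothetical tables standing in for the source, staging, and target layers.
LAYERS = ["src_orders", "stg_orders", "tgt_fact_orders"]

def layer_counts(conn, tables):
    """Return {table_name: row_count} for each table."""
    return {t: conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in tables}

def build_demo_db():
    """Create an in-memory database with made-up rows in each layer."""
    conn = sqlite3.connect(":memory:")
    for t in LAYERS:
        conn.execute(f"CREATE TABLE {t} (order_id INTEGER)")
    conn.executemany("INSERT INTO src_orders VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO stg_orders VALUES (?)", [(1,), (2,), (3,)])
    # The target is deliberately one row short to show a failing check.
    conn.executemany("INSERT INTO tgt_fact_orders VALUES (?)", [(1,), (2,)])
    return conn

conn = build_demo_db()
counts = layer_counts(conn, LAYERS)
counts_match = len(set(counts.values())) == 1
print(counts, "MATCH" if counts_match else "MISMATCH")
```

Strict equality only applies when the mapping neither filters nor aggregates rows; otherwise the expected target count has to account for rejected and filtered records.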

  21. How do you validate record count?
    Compare source, staging, and target counts, allowing for rows that were intentionally filtered or rejected.
  22. How to find duplicate records?

SELECT order_id, COUNT(*)
FROM stg_orders
GROUP BY order_id
HAVING COUNT(*) > 1;

  23. How do you validate JOIN logic?
    Re-create the join in SQL and compare its output with the target.

SELECT o.order_id, c.customer_name
FROM orders o
JOIN customers c
  ON o.customer_id = c.customer_id;

  24. How do you validate aggregation logic?

SELECT customer_id, SUM(order_amount) AS total_amount
FROM fact_orders
GROUP BY customer_id;
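
An aggregate query is only half of the check; the recomputed totals still have to be compared with what the ETL job loaded. Here is a hedged sketch in Python with sqlite3, assuming a hypothetical summary table named agg_customer_sales and invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_orders (order_id INTEGER, customer_id INTEGER, order_amount REAL);
    -- agg_customer_sales is a hypothetical summary table loaded by the ETL job.
    CREATE TABLE agg_customer_sales (customer_id INTEGER, total_amount REAL);
    INSERT INTO fact_orders VALUES (1, 10, 100.0), (2, 10, 50.0), (3, 20, 75.0);
    -- The total for customer 20 is wrong on purpose.
    INSERT INTO agg_customer_sales VALUES (10, 150.0), (20, 70.0);
""")

# Recompute the expected totals and keep only rows that are missing
# from the summary table or differ by more than half a cent.
MISMATCH_SQL = """
    SELECT e.customer_id, e.expected_total, a.total_amount
    FROM (SELECT customer_id, SUM(order_amount) AS expected_total
          FROM fact_orders
          GROUP BY customer_id) e
    LEFT JOIN agg_customer_sales a ON a.customer_id = e.customer_id
    WHERE a.customer_id IS NULL
       OR ABS(a.total_amount - e.expected_total) > 0.005
    ORDER BY e.customer_id
"""
mismatches = conn.execute(MISMATCH_SQL).fetchall()
print(mismatches)
```

An empty result means every recomputed total agrees with the loaded summary; here the query reports customer 20.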

  25. How do you identify missing records?

SELECT s.id
FROM source_table s
LEFT JOIN target_table t
  ON s.id = t.id
WHERE t.id IS NULL;
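
The same anti-join pattern is worth running in both directions, since a target can also contain rows that never existed in the source. A minimal runnable sketch in Python with sqlite3 follows; the ids are invented sample data and the anti_join helper is a hypothetical name introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_table (id INTEGER PRIMARY KEY);
    CREATE TABLE target_table (id INTEGER PRIMARY KEY);
    INSERT INTO source_table VALUES (1), (2), (3), (4);
    INSERT INTO target_table VALUES (1), (2), (4), (9);
""")

def anti_join(conn, left, right):
    """Return ids present in `left` but absent from `right`."""
    sql = f"""
        SELECT l.id
        FROM {left} l
        LEFT JOIN {right} r ON l.id = r.id
        WHERE r.id IS NULL
        ORDER BY l.id
    """
    return [row[0] for row in conn.execute(sql)]

missing_in_target = anti_join(conn, "source_table", "target_table")
unexpected_in_target = anti_join(conn, "target_table", "source_table")
print(missing_in_target, unexpected_in_target)
```

Both lists should be empty after a clean full load; in this demo id 3 was dropped and id 9 appeared from nowhere.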

  26. What is GROUP BY used for in ETL testing?
    Validating totals and summaries.
  27. How to validate null handling?

SELECT COUNT(*) FROM dim_customer WHERE email IS NULL;

  28. What is Slowly Changing Dimension (SCD)?
    Technique to manage changes to dimension attributes over time.
  29. Difference between SCD Type 1 and Type 2?
    Type 1 overwrites, Type 2 preserves history.
  30. SCD2 validation query
    Several rows per customer are expected in SCD2, so check for more than one active row.

SELECT customer_id, COUNT(*)
FROM dim_customer
WHERE current_flag = 'Y'
GROUP BY customer_id
HAVING COUNT(*) > 1;
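
An SCD2 dimension legitimately holds several rows per customer, so a useful check counts current rows per business key and flags any key that has zero or more than one. The sketch below uses Python with sqlite3; the column layout of dim_customer (customer_sk, city, current_flag) and the rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (
        customer_sk INTEGER PRIMARY KEY,
        customer_id INTEGER,
        city TEXT,
        current_flag TEXT
    );
    INSERT INTO dim_customer VALUES
        (1, 100, 'Pune',   'N'),
        (2, 100, 'Mumbai', 'Y'),   -- valid: history plus one current row
        (3, 200, 'Delhi',  'Y'),
        (4, 200, 'Noida',  'Y'),   -- defect: two current rows
        (5, 300, 'Goa',    'N');   -- defect: no current row
""")

# Count current rows per business key and keep keys where the count is not 1.
BAD_CURRENT_SQL = """
    SELECT customer_id,
           SUM(CASE WHEN current_flag = 'Y' THEN 1 ELSE 0 END) AS current_rows
    FROM dim_customer
    GROUP BY customer_id
    HAVING SUM(CASE WHEN current_flag = 'Y' THEN 1 ELSE 0 END) <> 1
    ORDER BY customer_id
"""
violations = conn.execute(BAD_CURRENT_SQL).fetchall()
print(violations)
```

Customer 100 passes because its history has exactly one current row; customers 200 and 300 are reported.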

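  31. How do you validate the current active record?
    Filter on current_flag = 'Y' and confirm exactly one such row per business key.
  32. What is hashing in Informatica ETL?
    Used to detect data changes by comparing a hash of the incoming row with the stored hash.
  33. How do you validate derived columns?

SELECT amount * tax_rate AS expected_tax FROM stg_sales;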
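
Recomputing the derived value is the first step; the second is comparing it with what landed in the target. A rough sketch in Python with sqlite3 is shown here, assuming a hypothetical fact_sales table with a tax_amount column and a sale_id key, none of which appear in the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_sales (sale_id INTEGER, amount REAL, tax_rate REAL);
    -- fact_sales and tax_amount are hypothetical target names.
    CREATE TABLE fact_sales (sale_id INTEGER, tax_amount REAL);
    INSERT INTO stg_sales VALUES (1, 100.0, 0.10), (2, 200.0, 0.05);
    -- Sale 2 should carry 10.0 of tax; 12.0 is wrong on purpose.
    INSERT INTO fact_sales VALUES (1, 10.0), (2, 12.0);
""")

# Recompute the tax from staging and report rows where the target disagrees.
DERIVED_SQL = """
    SELECT s.sale_id,
           ROUND(s.amount * s.tax_rate, 2) AS expected_tax,
           f.tax_amount AS actual_tax
    FROM stg_sales s
    JOIN fact_sales f ON f.sale_id = s.sale_id
    WHERE ABS(s.amount * s.tax_rate - f.tax_amount) > 0.005
    ORDER BY s.sale_id
"""
mismatches = conn.execute(DERIVED_SQL).fetchall()
print(mismatches)
```

The half-cent tolerance avoids false alarms from floating-point arithmetic; a project using exact decimal types could compare for equality instead.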

  34. How do you validate date transformations?

SELECT * FROM fact_orders
WHERE order_date > CURRENT_DATE;

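  35. What is lookup cache testing?
    Verifying that values returned from the lookup cache match the lookup source.
  36. What is a control table?
    Stores batch status and load counts.
  37. What is a watermark column?
    A timestamp or sequence column that marks how far the last incremental load progressed.
  38. What is a late arriving dimension?
    The fact arrives before its dimension record.
  39. What is a late arriving fact?
    The fact arrives after the reporting window has closed.
  40. Difference between truncate and delete?
    TRUNCATE removes all rows and is faster; in Oracle it is DDL and cannot be rolled back. DELETE removes rows one by one, accepts a WHERE clause, and can be rolled back.
  41. How do you validate decimal precision?

SELECT CAST(amount AS DECIMAL(10,2)) FROM stg_sales;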
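
A CAST on its own only shows how the value would look at the target precision. To flag rows that would actually lose digits, one option is to compare each amount with its rounded form, as in this small Python and sqlite3 sketch (the sale_id column and the sample amounts are made up).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_sales (sale_id INTEGER, amount REAL);
    INSERT INTO stg_sales VALUES (1, 19.99), (2, 10.125), (3, 5.0);
""")

# Rows whose amount carries more than two decimal places would be
# altered when loaded into a DECIMAL(10,2) target column.
PRECISION_SQL = """
    SELECT sale_id, amount
    FROM stg_sales
    WHERE ABS(amount - ROUND(amount, 2)) > 1e-9
    ORDER BY sale_id
"""
precision_loss = conn.execute(PRECISION_SQL).fetchall()
print(precision_loss)
```

Only sale 2 is reported, because 10.125 has a third decimal digit that a two-decimal column cannot hold.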

  42. What is metadata testing?
    Validating schema, data types, lengths, and constraints.
  43. What is a factless fact table?
    Tracks events without measures.
  44. What is idempotent ETL?
    Re-running the same load produces the same result, with no duplicates.
  45. What is data balancing?
    Matching totals across systems.

C. Advanced Informatica & Performance Questions (46–80)

Window Function Example

SELECT customer_id,
       ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY updated_ts DESC) AS rn
FROM dim_customer;
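
Numbering the rows is usually followed by a filter on rn = 1 to keep the latest version of each customer. The sketch below shows that step in Python with sqlite3; it assumes an SQLite build with window-function support (3.25 or later), and the email column and rows are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_id INTEGER, email TEXT, updated_ts TEXT);
    INSERT INTO dim_customer VALUES
        (1, 'old@example.com',  '2024-01-01 09:00:00'),
        (1, 'new@example.com',  '2024-03-01 09:00:00'),
        (2, 'solo@example.com', '2024-02-01 09:00:00');
""")

# Rank each customer's rows from newest to oldest, then keep the newest.
LATEST_SQL = """
    SELECT customer_id, email
    FROM (
        SELECT customer_id, email,
               ROW_NUMBER() OVER (PARTITION BY customer_id
                                  ORDER BY updated_ts DESC) AS rn
        FROM dim_customer
    ) ranked
    WHERE rn = 1
    ORDER BY customer_id
"""
latest_rows = conn.execute(LATEST_SQL).fetchall()
print(latest_rows)
```

Comparing this result with the rows the mapping actually kept is one way to test deduplication logic.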

  46. Why are window functions used in ETL testing?
    For deduplication & SCD logic.
  47. What is ETL performance testing?
    Measuring load time and throughput.
  48. How do you tune slow Informatica jobs?
    Indexing, partitioning, pushdown optimization.
  49. What is pushdown optimization?
    Executing transformation logic in the database instead of the Informatica server.
  50. What is partitioning?
    Splitting data for parallel processing.
  51. How do you validate data freshness?

SELECT MAX(load_date) FROM fact_sales;
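
The latest load date only becomes a test once it is compared with an agreed threshold. A minimal Python and sqlite3 sketch follows; the is_fresh helper and its max_age_days parameter are hypothetical names, and the one-day default is an assumed SLA rather than a standard value.

```python
import sqlite3
from datetime import date, timedelta

def is_fresh(conn, as_of, max_age_days=1):
    """True if the newest load_date in fact_sales is within max_age_days of as_of."""
    latest = conn.execute("SELECT MAX(load_date) FROM fact_sales").fetchone()[0]
    if latest is None:  # empty table: nothing was ever loaded
        return False
    return as_of - date.fromisoformat(latest) <= timedelta(days=max_age_days)

conn = sqlite3.connect(":memory:")
# load_date is stored as ISO text (YYYY-MM-DD) in this demo.
conn.execute("CREATE TABLE fact_sales (sale_id INTEGER, load_date TEXT)")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?)",
    [(1, "2024-05-01"), (2, "2024-05-02")],
)

print(is_fresh(conn, date(2024, 5, 3)))
print(is_fresh(conn, date(2024, 5, 10)))
```

Passing the reference date in explicitly keeps the check repeatable, instead of depending on the clock at the moment the test runs.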

  52. What is ETL regression testing?
    Ensuring changes don't break existing logic.
  53. How do you test error handling?
    Validate reject tables & logs.
  54. What is data skew?
    Uneven data distribution across partitions.
  55. What is bulk load?
    High-volume data loading.
  56. How do you validate historical accuracy?
    Check effective_date ranges for gaps and overlaps.
  57. What is schema evolution testing?
    Validating that source schema changes are handled correctly.
  58. What is data latency?
    Delay from source to DW.
  59. How do you validate negative scenarios?
    Load invalid and boundary data and confirm it is rejected or handled as specified.
  60. What is a reconciliation report?
    Counts, sums, rejects summary.
  61. What is ETL restartability?
    Resuming a job after failure without duplicating or losing data.
  62. What is data anonymization testing?
    Validating that PII is masked.
  63. How do you validate surrogate key uniqueness?

SELECT sk, COUNT(*) FROM dim_customer GROUP BY sk HAVING COUNT(*) > 1;

  64. What is parallel processing?
    Running sessions concurrently.
  65. What is audit trail testing?
    Validate batch_id and timestamps.
  66. What is data archival testing?
    Verifying that old data is moved to the archive correctly.
  67. What is transformation logic testing?
    Validate business rules.
  68. What is end-to-end ETL testing?
    Source → report validation.
  69. Difference between OLTP and OLAP?
    Transactions vs analytics.
  70. What is data drift?
    Unexpected change in data patterns or distributions.
  71. What is reject analysis?
    Finding the root cause of rejected records.
  72. How do you validate currency conversion?

SELECT * FROM stg_sales
WHERE ABS(local_amt * rate - usd_amt) > 0.01;
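
Because converted amounts are rounded, an exact equality test tends to raise false alarms, so a small tolerance is common. The following Python and sqlite3 sketch shows one way to list only the rows that are genuinely off; the sale_id column, the rate of 0.012, and the half-cent tolerance are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_sales (sale_id INTEGER, local_amt REAL, rate REAL, usd_amt REAL);
    INSERT INTO stg_sales VALUES
        (1, 100.0, 0.012, 1.20),
        (2, 250.0, 0.012, 3.00),
        (3,  80.0, 0.012, 1.10);   -- should be 0.96; wrong on purpose
""")

# Report rows where the stored USD amount differs from local_amt * rate
# by more than half a cent.
FX_SQL = """
    SELECT sale_id,
           ROUND(local_amt * rate, 2) AS expected_usd,
           usd_amt AS actual_usd
    FROM stg_sales
    WHERE ABS(local_amt * rate - usd_amt) > 0.005
    ORDER BY sale_id
"""
fx_mismatches = conn.execute(FX_SQL).fetchall()
print(fx_mismatches)
```

In a real project the rate would normally come from a rate table keyed by currency and date rather than sitting on the staging row.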

  73. What is data mart testing?
    Validating subject-specific marts.
  74. What is the most important Informatica testing skill?
    SQL + mapping understanding.
  75. What is a session in Informatica?
    A task that executes a mapping's logic.
  76. What is workflow testing?
    Validate job sequencing and dependencies.
  77. What is session log analysis?
    Identify errors and bottlenecks.
  78. What is a recovery strategy?
    Restart from the last checkpoint.
  79. What is lookup override testing?
    Validate overridden SQL logic.
  80. What is the biggest challenge in Informatica ETL testing?
    Complex transformations + large data volumes.

5. ETL Tools Used in Informatica Testing Projects

  • Informatica
  • Microsoft SSIS
  • Ab Initio
  • Pentaho
  • Talend

6. ETL Defect Examples + Sample Test Case

Defect: Duplicate customer records in Dimension table
Root Cause: Missing hash comparison
Fix: Compute an MD5 hash over the tracked attributes for each business key and insert a new row only when the hash changes
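
The hash comparison behind this kind of fix can be illustrated outside Informatica. The sketch below uses Python's hashlib; the row_hash helper, the TRACKED column list, and the sample customer rows are all hypothetical, and a real mapping would compute the hash in an expression and compare it with the value stored in the dimension.

```python
import hashlib

NULL_MARKER = "<NULL>"  # keeps a missing value distinct from an empty string

def row_hash(row, columns):
    """MD5 over the chosen columns, joined with '|' so ('ab', 'c') != ('a', 'bc')."""
    parts = [NULL_MARKER if row[c] is None else str(row[c]) for c in columns]
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()

# Hypothetical attributes whose change should create a new dimension version.
TRACKED = ["name", "city"]

existing = {"customer_id": 100, "name": "Asha", "city": "Pune"}
unchanged = {"customer_id": 100, "name": "Asha", "city": "Pune"}
changed = {"customer_id": 100, "name": "Asha", "city": "Mumbai"}

# A differing hash means an attribute changed and a new version is needed.
needs_new_version = row_hash(changed, TRACKED) != row_hash(existing, TRACKED)
# An identical hash means the incoming row is a re-send and must not be inserted.
is_duplicate_load = row_hash(unchanged, TRACKED) == row_hash(existing, TRACKED)
print(needs_new_version, is_duplicate_load)
```

Skipping the insert when the hashes match is what prevents the duplicate dimension rows described in the defect.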

Sample Test Case

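Field | Value
Scenario | SCD2 duplicate check
SQL | GROUP BY customer_id HAVING COUNT(*) > 1, on rows with current_flag = 'Y'
Expected | One active record per customer (the query returns no rows)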

7. Quick Revision Sheet (Interview Ready)

  • Validate record counts, sums, duplicates
  • Understand SCD1 vs SCD2
  • Practice JOIN, GROUP BY, window functions
  • Focus on performance tuning & logs

8. FAQs

Q1. Is SQL mandatory for Informatica ETL testing?
Yes, SQL is the primary validation skill.

Q2. Which SCD type is most asked in interviews?
SCD Type 2.

Q3. What is the key focus in Informatica testing interviews?
Mapping logic + SQL validation.
