SAP Knowledge Base Article - Public

3699603 - Replication Flow error due to incompatible schema in source csv file

Symptom

While configuring a replication flow in SAP Datasphere to replicate files from an SFTP server, one or more of the following issues occur:

Issue 1: Error message: "Source setup failed due to the following error: incompatible schema: column 'XXX' not found in file 'XXX.csv'."

Issue 2: Replication flow fails when attempting to replicate data from a folder containing multiple CSV files with different schemas.

Issue 3: Error message: "one or more replications have duplicate source object names: /object."

Environment

  • Replication Flows

Reproducing the Issue

  1. Configure a replication flow in SAP Datasphere.
  2. Select a source object from an SFTP server.

Cause

Configuration error. The specific cause of each issue is described in the Resolution section.

Resolution

Issue 1: 

This error occurs in either of the following cases:

  • The column names in the source CSV header do not match the Datasphere table definition.
  • The source file schema changed, that is, the CSV file structure was modified without updating the target mapping.

To resolve the issue, recreate the mapping.

Please check the SAP Help:
Cloud Storage Provider Sources for Replication Flows | SAP Help Portal
If you define mappings and make schema changes afterwards (except for defining columns as primary key columns), the existing mappings get deleted when you update the schema. 
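
Before recreating the mapping, it can help to confirm which column the error refers to. The following is a minimal Python sketch, not an SAP tool: it compares the header row of a CSV file with a list of expected column names. The function name `missing_columns`, the sample data, and the comma delimiter are assumptions made for illustration, and only column names are compared, not data types.

```python
import csv
import io


def missing_columns(csv_text, expected_columns, delimiter=","):
    """Return the expected column names that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [col for col in expected_columns if col not in present]


# Hypothetical file content and expected columns, for illustration only.
sample = "ID,NAME,AMOUNT\n1,Alice,10\n"
print(missing_columns(sample, ["ID", "NAME", "CURRENCY"]))  # ['CURRENCY']
```

Any column reported by such a check would have to be added to the file or removed from the target definition before the mapping is recreated.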

Issue 2:

SAP Datasphere Replication Flows require all CSV files in a source container to have identical schemas.
This is a product limitation.

Issue 3:

For SFTP sources, adding a source object behaves in the same way as for object store sources (ADL Gen2, GCP, AWS): the entire folder is replicated rather than a single file.
The CSV file that you select is used only to infer the schema, so the dataset name always corresponds to the selected container, not to the selected file. The error occurs because the same dataset cannot be added more than once within one replication flow, for example when two files from the same folder are selected as separate source objects.

For more information, please refer to the documentation below:
Cloud Storage Provider Sources for Replication Flows | SAP Help Portal 
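
The duplicate-name behavior described for Issue 3 can be illustrated with a short Python sketch. It assumes, for illustration only, that the container is simply the parent folder of the selected file; the function name `duplicate_containers` and the sample paths are hypothetical and are not part of SAP Datasphere.

```python
import posixpath
from collections import Counter


def duplicate_containers(selected_files):
    """Return the containers (parent folders) that are selected more than once."""
    # Assumption: the dataset name equals the parent folder of the selected file.
    containers = [posixpath.dirname(path) for path in selected_files]
    counts = Counter(containers)
    return sorted(name for name, count in counts.items() if count > 1)


# Two files from the same folder resolve to the same dataset name.
print(duplicate_containers(["/object/a.csv", "/object/b.csv", "/other/c.csv"]))  # ['/object']
```

In this sketch, both files under /object map to the same dataset, which corresponds to the situation reported in the error message.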

As a workaround for Issue 2 and Issue 3, group the files by schema and create a separate replication flow for each schema variant.

For example, suppose you have four CSV files: A, B, C, and D. If A and B share one schema, and C and D share another, group them as follows:

  •   Place A and B in one folder and select that folder while adding the source object.
  •   Place C and D in another folder and add it as a separate source object.

As a result, two tables are created on the target side, one per folder, each with the schema of its group of CSV files.
You can then create an individual replication flow for each folder.
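
When many files are involved, the grouping step can be prepared with a small script. The following Python sketch groups CSV files by their header row. It is an illustration under stated assumptions: the function name `group_by_header` and the sample file contents are hypothetical, the files are passed in as text, and only column names and their order are compared, not data types.

```python
import csv
import io
from collections import defaultdict


def group_by_header(files):
    """Group file names by header row.

    files: dict mapping a file name to its CSV text.
    Returns a dict mapping each header (as a tuple) to a sorted list of file names.
    """
    groups = defaultdict(list)
    for name, text in files.items():
        header = next(csv.reader(io.StringIO(text)), [])
        groups[tuple(col.strip() for col in header)].append(name)
    return {header: sorted(names) for header, names in groups.items()}


# Hypothetical contents for the four files from the example above.
files = {
    "A.csv": "ID,NAME\n1,x\n",
    "B.csv": "ID,NAME\n2,y\n",
    "C.csv": "ORDER,QTY\n10,5\n",
    "D.csv": "ORDER,QTY\n11,7\n",
}
for header, names in group_by_header(files).items():
    print(header, names)
```

Each resulting group corresponds to one folder, and therefore to one source object, in the workaround described above.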

See Also

Cloud Storage Provider Sources for Replication Flows | SAP Help Portal

Keywords

replication flow error, incompatible schema, csv file schema mismatch, sap datasphere, source object error, duplicate source object names, sftp server, schema definition, replication flow setup, data replication, sap datasphere replication flow , KBA , DS-DI-RF , Replication Flows , Problem

Product

SAP Datasphere all versions