Symptom
• Incorrect field length shows when creating a Data Flow based on Excel data source in SAP Datasphere
• The field Data Type is observed as String(2) when it should be String(3)
• The value with 3 strings starts in row 1190. If the column with the noticed value is cut and added into the 2nd row, the data type value is correctly defined. It appears that the system is not checking all fields
Environment
SAP Datasphere
Reproducing the Issue
- Logon to Datasphere tenant
- Go to Data Builder
- Create a Data Flow from SFTP data source
- Select a Excel file that uploaded to the SFTP server
- Check the Data Type of some Field (e.g named columnA) in Columns section
=> The Data Type should be String(3) but it shows as String(2) which is incorrect
Cause
- The observed behavior is due to the design of the product.
- According to the SAP Datasphere | Help Guide (Step 5), data sources like Excel and CSV do not contain metadata definition, hence, the system attempts to predict the metadata of such datasets by parsing the first 1000 records.
- In this case, the row where the value of columnA contains more than 2 strings falls outside the range of 1000 rows. Consequently, the correct metadata definition could not be predicted.
Resolution
- Workaround 1: Adjust the column definition accordingly. For CSV, JSON, or Excel files, the schema is calculated based on a subset of the data. Therefore, the suggested data can be adjusted or verified as needed to encompass all the data.
- Workaround 2: Manually change the Data Type of the field.
Keywords
SAP Datasphere, Data Flow, Field Length, Data Type String, SFTP, Upload, Metadata, CSV, JSON, Excel, Schema, Adjust Column Definition. , KBA , CA-DI-EMB , Data Intelligence Embedded in Data Warehouse Cloud , DS-DI-DF , Data Flows , Problem