Introduction to data transformation
Transforming data is the process of shaping your data into a usable format. Instead of using different individual tools to modify your data, you can do it in the Automation module thanks to its capability of data transformation and use its outcome at once.
The process of transforming data consists of two steps:
- Creating the transformation rule that defines the operations on the file structure based on a sample file, which reflects the source file you want to transform. It can be a fragment of the actual file.
- Using the transformation rule in workflows by means of the Data Transformation node to transform the actual file.
Purpose of transforming data
Imports and exports
The main purpose of data transformation is to structure the data to meet the requirements for importing (to Synerise) and exporting files to external tools.
After the data transformation rule is published, you can use it in the Data Transformation node while preparing a workflow that imports data to Synerise (such as transactions, events, profiles) or exports it further.
If your file meets the Synerise import requirements, you can omit the data transformation part.
Accepted file formats
You can prepare transformation rules for the files (a file can contain up to 500 rows) of the following formats:
.CSV
.JSON
.JSONL
.XML
How to prepare a transformation rule of the target file sample?
The transformation rules are created based on the sample file which must reflect the target file that will be the subject of the transformation in workflows. Transformation rules are meant to show you on a sample file, which reflects the source file you want to transform. It can be a fragment of the actual file.
The example on the screen is taken from the import transactions to Synerise use case.
Transformation rules are built by joining data operators and transformation nodes together. Such a rule always starts with the Input data node which requires you to upload a sample file. The final element of the transformation is the Output data node which contains the preview of the final effect of the transformation and allows you to change the type of data in each column.
Between the Input Data and Output Data nodes, you can place the nodes that define the operations (for example adding columns, editing values, and so on) on the sample file in the order (from the left) you want it to be done.
As a result, you end up with a transformation rule which you can reuse in workflows by means of the Data Transformation node and use it every time when you need to modify files with a similar structure in a regular workflow (for example, when importing data to Synerise).
What transformations are available?
You can perform the following operations on the file:
- Add columns
- Rename columns
- Merge columns
- Filter in/out columns
- Filter in/out rows
- Edit values in cells
- Use regular expressions to search values in cells and replace them with any value
- Change the data type of the values in columns
- Use Jinjava while editing values in columns
- Handle errors in rows which may occur during transformation