Use the Splitter transform to split the data coming from its input transform into multiple outputs. You specify the conditions on which the data is split, and an output connector is created for each condition. Each output connector can be linked to a different transform. The data is filtered by the condition defined for each connector and passed to the connected transform.
To work with the Splitter transform in a data flow, follow the steps below:
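The splitting behavior described above can be pictured with a small Python sketch. This is only an illustration and not Diyotta code: the function `split_rows`, the condition names, and the sample rows are all hypothetical. It also assumes that each condition is applied to the input independently, so a row that satisfies more than one condition reaches more than one output.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def split_rows(
    rows: List[Row],
    conditions: Dict[str, Callable[[Row], bool]],
) -> Dict[str, List[Row]]:
    """Return one filtered list of rows per named condition.

    Hypothetical helper: each condition acts as an independent filter
    on the same input, mirroring one output connector per condition.
    """
    return {
        name: [row for row in rows if predicate(row)]
        for name, predicate in conditions.items()
    }


# Hypothetical input data standing in for the upstream transform's output.
rows = [
    {"order_id": 1, "region": "EAST", "amount": 120.0},
    {"order_id": 2, "region": "WEST", "amount": 40.0},
    {"order_id": 3, "region": "EAST", "amount": 15.0},
]

# Hypothetical conditions; each one corresponds to one output connector.
conditions = {
    "EAST_ORDERS": lambda r: r["region"] == "EAST",
    "LARGE_ORDERS": lambda r: r["amount"] >= 100,
}

outputs = split_rows(rows, conditions)
for name, matched in outputs.items():
    print(name, [r["order_id"] for r in matched])
```

In this sketch, order 1 satisfies both conditions and therefore appears in both outputs, while order 2 satisfies neither and is passed to no output.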
Selecting the Splitter transform to be added to data flow
In the data flow canvas, go to the Data Flow pane and open the Transforms menu. Here, you can either select the Splitter transform or drag and drop it onto the canvas. Then link the required input transform to the Splitter transform.
Configuring Splitter transform
General Tab: Provide the basic details for the Splitter transform.
1. The Name field is auto-populated with the transform name and is editable.
2. In the Description text box, optionally provide a description.
3. By default, Diyotta does not create temporary tables during execution for the transforms in the data flow. If a temporary table needs to be created for a transform during execution, select the Persist Data checkbox. The temporary table is dropped once the data flow executes successfully.
Conditions tab: Provide the conditions based on which the input data of the Splitter transform should be split. Each condition specified is added as a filter on the SQL generated for the transform connected as input.
Below are the different options for adding a new condition under the Conditions tab.
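As a rough picture of what "added as a filter on the SQL" can mean, the Python sketch below wraps an input query in one filtered query per condition. The exact SQL that Diyotta generates is not shown in this document, so the query shape, the `src` alias, the function `build_split_queries`, and the table and column names are all assumptions made for illustration. The actual script for a transform can be viewed on its Script tab.

```python
from typing import Dict


def build_split_queries(input_sql: str, conditions: Dict[str, str]) -> Dict[str, str]:
    """Build one filtered query per splitter condition.

    Hypothetical helper: the input SQL is wrapped as a subquery and each
    condition expression becomes its WHERE clause. This is an assumed
    shape, not the SQL Diyotta actually emits.
    """
    return {
        name: f"SELECT * FROM ({input_sql}) src WHERE {expression}"
        for name, expression in conditions.items()
    }


# Hypothetical SQL of the transform connected as input.
input_sql = "SELECT order_id, amount FROM orders"

# Hypothetical splitter conditions, keyed by condition name.
conditions = {
    "HIGH_VALUE": "amount >= 100",
    "LOW_VALUE": "amount < 100",
}

queries = build_split_queries(input_sql, conditions)
for name, sql in queries.items():
    print(f"{name}: {sql}")
```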
Option I: Adding new condition
- When there are no conditions, click Click Here to add a new condition.
- A new entry is added in the Conditions tab. The condition name can be modified by clicking the name and editing it.
- When conditions are already present, click + (Add) to add a new condition.
Option II: Pasting conditions copied from another transform
You can paste the conditions copied from another Splitter transform.
- To paste the copied condition, click the Paste icon.
- The following operations are allowed on the conditions: Add, Cut, Copy, Paste, Up, Down, and Delete.
- Multiple conditions can be selected from the list, and these operations can be applied to all of them at once.
To add or edit the filter expression for each condition in the list, follow the steps below.
- Click the icon beside the Splitter Condition field to open the expression editor.
- The Expression Editor appears, where you can add the required expression. To verify that there are no syntax errors, click Validate.
- Upon successful validation, a success message appears. Click OK.
The splitter condition can include attributes from the transform, Hive database functions, parameters, functions, reusable expressions, UDFs, and sequences.
- The attributes from the transform are listed when you select Transforms from the drop-down. Click an attribute in the list to add it to the editor.
- The list of functions can be seen by selecting Functions from the drop-down. The functions that can be used in the SQL are not limited to the list shown. All Hive database functions can be used in the SQL.
- The list of parameters can be viewed by selecting Parameters from the drop-down. Only the parameters that can be used in a data flow are displayed: Data Flow Parameter, Data Flow SQL Parameter, Project Parameter, and System Parameter. For more information, refer to Working with Data Flow Parameters, Working with Data Flow SQL Parameters, Working with Project Parameters, and Diyotta System Parameters.
- The lists of expressions, UDFs, and sequences are displayed under the corresponding headers in the drop-down.
Connecting Splitter transform to multiple subsequent transforms
After adding the conditions, you must link each condition to a transform that loads the split data. Each condition can be linked to a single transform, which receives the data that satisfies that condition.
Runtime Properties tab: The runtime properties are displayed only when the native data point type for the data flow is Hadoop or Spark.
To change the Splitter transform runtime properties, click the Runtime Properties tab.
By default, these properties are set to the recommended/default values from the data point, and the values can be overridden here. To work with runtime properties, refer to Editing Runtime Properties in Hadoop Data Point.
- To revert the changes to the default values, click Reset All to Default.
- To search for a specific property, enter the keyword in the search bar, and the grid displays the related properties.
Viewing the Script generated for the transform
The Script tab allows you to view the SQL generated for the transform. The script is generated based on the configuration of the transform.
- To view the generated script, navigate to the Script tab.