Joiner Transform is used to perform join operation between the transformations connected to the joiner transform. You can specify what attributes need to be used to join between the connected transforms and pick the attributes from these connected transforms to be output of the join operation.

To work with Joiner transform in data flow, follow below steps:

Note:

Selecting the Joiner transform to be added to data flow

In the data flow canvas move to Data Flow pane and navigate to Transforms menu. Here, you can either select or drag and drop the Joiner transform to canvas. Then, link the required transforms to the Joiner transform. The transform linked first as primary object is master and all the other transforms joined are considered as Detail.

Configuring joiner transform

General Tab: Provide the basic details for the joiner transform.

1. The Name field auto populates the transform name and its editable.

2. In the Description text-box, provide a description and is optional.

3. By default, Diyotta does not create temporary tables during execution for the transformation in the data flow. If the temporary table needs to be created for a transform during execution, then, enable the checkbox Persist Data. The temporary table created will be dropped once the data flow executes successfully.

Properties tab: The properties tab displays the linked transforms to the joiner. You can specify the join condition between the the linked transforms here.

1. Primary Object: Displays the transform linked to Master in joiner transform.

2. Joiner Type: Select the join type between the primary and the secondary object. The join types can be Inner Join, Left Outer Join, Right Outer Join and Full Outer Join.

3. Secondary Object: Lists the transforms linked to Detail in joiner transform. If there are multiple transforms under secondary object and you want to change the order in which secondary objects should be used in the generated SQL during execution, use the drop-down next to the secondary object name and select the appropriate transform.

4. Joiner Condition: Provide the condition to join between the primary and secondary object. You can also have join condition between two secondary objects.

  • To add a joiner condition, click Expression Editor icon beside the Joiner Condition field.
  • The Expression Editor window opens and you can add the condition here.
  • After applying the joiner condition, to verify that there are no syntax errors, click Validate. Upon successful validation success message appears. Once done, click Ok.

The joiner condition defined can include attributes from the linked transform, hive database functions, Parameters, Functions, Reusable expressions, UDFs and Sequences.

  • The attributes from the transform will be listed when selecting Transforms from the drop-down. Click on the attribute from the list and it will be added in the editor. These transforms are the once that are linked to the joiner transform starting from the source instance transform.
  • The list of functions can be seen by selecting Functions from the drop down. The functions that can be used in the SQL is not limited by the list shown. All the hive database functions can be used in the SQL.
  • The list of parameters can be seen by selecting Parameters from the drop-down. It will show those that can be used in data flow - Data Flow Parameter, Data Flow SQL Parameter, Project Parameter and System Parameter can be used in the SQL. For more information, refer Working with Data Flow Parameters, Working with Data Flow SQL Parameters, Working with Project Parameters, Diyotta System Parameters.
  • List of expressions, UDFs and sequences can be seen under corresponding header in drop-down.

5. Where clause: You can specify the filter condition to be applied as part of the SQL generated for the joiner transform.

  • To add a filter condition, click Expression Editor icon beside the Where Clause field.
  • The Expression Editor window opens and you can add the condition here.
  • After applying the where clause condition, to verify that there are no syntax errors, click Validate. Upon successful validation success message appears. Once done, click Ok.

The filter condition can include attributes from the transform, hive database functions, Parameters, Functions, Reusable expressions, UDFs, and Sequences.

  • The attributes from the transform are listed when selecting Transforms from the drop-down. Click on the attribute from the list and it will be added in the editor.
  • The list of functions can be seen by selecting Functions from the drop down. The functions that can be used in the SQL is not limited by the list shown. All the hive database functions can be used in the SQL.
  • The list of parameters are viewed by selecting Parameters from the drop-down. Displays only those that can be used in data flow - Data Flow Parameter, Data Flow SQL Parameter, Project Parameter and System Parameter can be used in the SQL. For more information, refer Working with Data Flow Parameters, Working with Data Flow SQL Parameters, Working with Project Parameters, Diyotta System Parameters.
  • List of expressions, UDFs, and sequences are displayed under corresponding header in drop-down.

Select Attributes tab: You can select attributes in the joiner transform which will be the attributes from the connected transform. 

Note: The fields displayed in the attribute are specific to database type of the data object. For information on fields of specific type of data object, refer Working with Data Object.

Derived Attributes tab: You can add attributes in the joiner transform which will be the attributes from the connected transform.

Note: The fields displayed in the attribute are specific to database type of the data object. For information on fields of specific type of data object, refer Working with Data Object.

Below are different options to add new attributes under Attributes tab.

Option I: Adding new attribute

  • When there are no attributes present, then click on Click Here to add new attribute.

  • New entry for derived attribute is generated.

  • When there attributes already present, then click on +(Add) to add new attribute.

Option II: Pasting attributes copied from another transform 

You can paste attributes copied from another joiner transform here.

  • When there are no attributes present, then click on the drop down on Click Here to see the paste option. Click on the Paste to paste the copied attributes.
  • The copied attributes are pasted as shown below:
  • When there are attributes already present, then click on Paste icon.

Option III: Adding attributes from the connected transforms 

You can add attribute from the transforms that are connected to the joiner. The attributes added will have expression editor to provide the expression to be associated with the attribute. 

  • When there are no attributes present, then click on Click Here next to select and add attributes from transform.

  • Then Select Attributes wizard appears and displays the list of attributes available in the transforms connected to joiner. Choose required attributes and click OK.

  • When there are attributes already present, then click on (+) Select option and select attributes from transform.

  • Then Select Attributes wizard appears and displays the list of corresponding attributes available in the data object. Choose required attributes and click OK.
Note: The attributes already selected from transform are Italicized.
  • The added attributes get listed in the Attributes tab and each attribute has an associated expression editor to add expression to it.

Note:

  • The fields in the Attributes tab are specific to the data object type associated with the transform. For more information about these fields, refer Working with Data Object and select the specific database data object same as native processing platform.
  • Following operations are allowed on the attributes: Add, Cut, Copy, Paste, Up, Down, Delete, and Apply template.
  • To search for a specific attribute, enter the keyword in the search bar, and the grid displays the search results.
  • From the list of attributes, multiple attributes can be selected and we can perform/apply these operations.

After adding attributes, you can add/edit associated expression in Expression Editor field. 

  • To add/edit the expression, click the icon beside the Expression Editor field.

  • The Expression Editor wizard appears and it allows you to add the required expression. To verify that there are no syntax errors, click Validate.

  • Upon successful validation, the success message appears and click OK.

The expression can include attributes from the transform, database functions, Parameters, Functions, Reusable expressions, UDFs and Sequences.

  • The attributes from the transform will be listed when selecting Transforms from the drop-down. Click on the attribute from the list and it will be added in the editor.
  • The list of functions can be seen by selecting Functions from the drop down. The functions that can be used in the SQL is not limited by the list shown. All the hive database functions can be used in the SQL.
  • The list of parameters can be seen by selecting Parameters from the drop-down. It will show those that can be used in data flow - Data Flow Parameter, Data Flow SQL Parameter, Project Parameter and System Parameter can be used in the SQL. For more information, refer Working with Data Flow Parameters, Working with Data Flow SQL Parameters, Working with Project Parameters, Diyotta System Parameters.
  • List of expressions, UDFs and sequences can be seen under corresponding header in drop-down.
  • The Expression field displays the added/edited condition as shown below:

You can use Apply Template option to apply the expression from one attribute to other attributes in the list. You can either apply expression from existing attribute or apply new condition to existing Attributes. As an example, here we are applying function lower( ) to multiple attributes.

  • From the list of attributes, select an attribute which has the expression to be applied, and click Apply template icon.
  • You can also select any attribute and change the expression to be applied in subsequent steps.

  • Apply Template wizard displays the derived attribute name along with expression. From the list of attributes, select the attributes to apply the expression.
  • In the expression field, you can edit the expression before applying other attributes. If edited, then you can valid this expression by clicking Validate. The success or failure message appears on the bottom of the window.

  • Click Apply, to apply the condition to the selected attributes. 
  • After applying the condition, validate it by clicking the Validate button at the bottom. The success or failure message appears on the bottom of the window. 
  • Click OK, to complete the changes.

Runtime Properties tab:

The runtime properties are displayed only when the native data point type for data flow is Hadoop or Spark.

To change the Joiner Transform runtime properties, click Runtime Properties tab.

By default these properties are set to recommended/default values from data point and the values can be overridden here. To work with runtime properties, refer Editing Runtime Properties in Hadoop Data Point.

Note: 
  • To revert the changes to the default values, click Reset All to Default.
  • To search for a specific property, enter the keyword in the search bar, and the grid displays the related properties.

Viewing the Script generated for the transform

The Script tab allows you to view the SQL generated for the transform. The script is generated based on the configuration of the transformation.

  • To view the generated script, navigate to Script tab.