The DFS Command job executes DFS commands against an HDFS location. This job is specific to Hadoop.
You must validate the syntax and validity of the commands used in a DFS Command job before execution. Any issue with a command causes the job to fail at runtime.
To create a DFS Command job, follow the steps below.
Step I: From the Jobs menu, drag and drop the DFS Command Job onto the canvas.
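Because a faulty command only surfaces as a runtime failure, it can help to check commands before entering them in the job. The following is a minimal Python sketch of such a pre-check, not part of the product. It assumes each command is written in the style of the Hadoop `hdfs dfs` shell with the sub-command first; the function name `check_dfs_command` and the short sub-command list are illustrative only and not exhaustive.

```python
import shlex

# A small, non-exhaustive set of Hadoop `hdfs dfs` sub-commands (illustrative).
KNOWN_SUBCOMMANDS = {
    "-ls", "-mkdir", "-put", "-get", "-cp", "-mv", "-rm",
    "-cat", "-copyFromLocal", "-copyToLocal", "-moveFromLocal",
}

def check_dfs_command(command):
    """Return a list of problems found in one command string (empty if none)."""
    try:
        tokens = shlex.split(command)
    except ValueError as exc:  # for example, an unbalanced quote
        return ["cannot parse command: {}".format(exc)]
    if not tokens:
        return ["command is empty"]
    if tokens[0] not in KNOWN_SUBCOMMANDS:
        return ["unknown sub-command: {}".format(tokens[0])]
    return []
```

A check like this only catches typing mistakes. It does not confirm that the paths exist or that the user has permission on them, so a command that passes can still fail at runtime.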
Step II: The window displays the list of Hadoop Data Points with the database/schema defined in them. Select the Data Point for which you want to execute the DFS command.
- If you have access to any global project, the window also displays the Project drop-down, which lists all the global projects. Choose a global project from the Project drop-down, and then select the global Hadoop Data Point as required.
- DFS Command is only for Hadoop Data Points.
- To create a Hadoop Data Point, refer to Working with Hadoop Data Point.
Step III: Provide the General details of the job.
On Canvas, select the DFS command job, and then under Properties, provide the General details.
Name - The Name field contains a default name, which you can edit.
Description - Optionally, enter a description of the job in the text box.
Disable task - Select Disable task if the job should not be executed as part of the Job Flow but you do not want to delete it.
Step IV: Optionally specify the retry attempts.
Under Properties, select the Properties tab to enable retry attempts for DFS command job execution.
Retry Enabled: Select this option to enable retry attempts for the DFS Command job.
- No. of Retry Attempts: Specify how many times to retry the DFS Command job if it fails to execute. By default, the number of retry attempts is 2.
- Retry Wait Time (in Seconds): Specify how long, in seconds, the job waits before the next retry. By default, the wait time is 60 seconds. If the job fails to execute, it is retried after the specified wait time.
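The interaction between the two retry settings can be sketched in Python as follows. This is an illustration of the described behavior, not the product's implementation. It assumes that the retry attempts are made in addition to the first execution, and the names `run_with_retry`, `job`, and `sleep` are hypothetical.

```python
import time

def run_with_retry(job, retry_attempts=2, wait_seconds=60, sleep=time.sleep):
    """Run `job` once, then retry up to `retry_attempts` times on failure."""
    last_error = None
    for attempt in range(1 + retry_attempts):
        if attempt > 0:
            # Wait before each retry, but not before the first execution.
            sleep(wait_seconds)
        try:
            return job()
        except Exception as exc:
            last_error = exc  # remember the failure and try again
    # Every attempt failed: report the last failure to the caller.
    raise last_error
```

Under these assumptions, the default settings give at most three executions, with 60 seconds between consecutive ones.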
Step V: Enter the DFS Command
1. Under Properties, select the Command tab, and then click the Expression Editor arrow under DFS Command.
2. The Expression Editor window opens, where you define the DFS command. From the drop-down, choose Functions.
3. Functions lists the supported DFS commands. These commands are executed on the selected Hadoop Data Point.
You can specify multiple DFS commands by separating them with commas. A DFS command can include hard-coded values, Parameters, Runtime Status, and Runtime Statistics. After you enter the expression, click Validate to verify that there are no syntax errors. If the expression is valid, a success message is displayed.
For reference, the example DFS command moves a file from the local file system to HDFS.
- Parameters can be Job Flow Parameters, Job Flow SQL Parameters, Project Parameters, or Diyotta System Parameters. For more information, refer to Working with Job Flow Parameters, Working with Job Flow SQL Parameters, Working with Project Parameters, and Working with Studio System Parameters.
- Runtime Job Status and Statistics - These consist of the Runtime Status and Runtime Statistics of a job. You can use them to get specific details about a job. For more information, refer to Working with runtime status and statistics.
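To make the comma-separated form concrete, the Python sketch below splits such a list into individual commands and shows the equivalent Hadoop command-line call for each. It is not the Expression Editor's syntax, which this section does not spell out. It assumes commands in `hdfs dfs` shell style and paths that contain no commas; the function names and the paths `/tmp/sales.csv` and `/data/in` are hypothetical.

```python
import shlex

def split_dfs_commands(expression):
    """Split a comma-separated list of DFS commands into argument lists."""
    # Assumption: no path or argument itself contains a comma.
    return [shlex.split(part) for part in expression.split(",") if part.strip()]

def to_cli(args):
    """Return the equivalent `hdfs dfs` command line as an argument list."""
    return ["hdfs", "dfs"] + list(args)

# Create a target directory, then move a local file into HDFS.
commands = split_dfs_commands(
    "-mkdir -p /data/in, -moveFromLocal /tmp/sales.csv /data/in"
)
for args in commands:
    print(" ".join(to_cli(args)))
```

The second command uses `-moveFromLocal`, which copies the local file to HDFS and then deletes the local copy, matching the move from local to HDFS mentioned above.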
Step VI: Optionally change the Data Point
1. Under Properties, click Data Points. You can change the Data Point, the database/schema, or both assigned to the DFS Command job if needed. Next to the Data Point name, click Change.
2. The window lists all the available Hadoop Data Points and databases. Choose the required Data Point.
If you have access to any global project, the window also displays the Project drop-down, which lists all the global projects. Choose a global project from the Project drop-down, and then select the global Hadoop Data Point as required.
- To save the Job Flow, on the Actions menu, click Save. For more information, refer to Saving Job Flow.
- To revert the changes before saving the Job Flow, on the Actions menu, click Revert. For more information, refer to Reverting changes in Job Flow.
- To execute an individual job in the Job Flow, on the Actions menu, click Run Job. For more information, refer to Executing individual job in Job Flow.
- To execute the Job Flow, on the Actions menu, click Run. For more information, refer to Executing Job Flow.
- After the job is created and the changes are saved, close or unlock the Job Flow so that other users can edit it. For more information, refer to Closing Job Flow and Unlocking Job Flow.