dicmd is the Purplecube command line utility. It provides the capability to access Purplecube from a Linux machine and perform operations in Purplecube without logging into the web user interface.

dicmd is installed as part of the controller installation. By default, this command can be run only by the user under which the Purplecube controller is installed. You can run dicmd as a different user as well by providing appropriate folder permissions. For more details, refer to Enabling Purplecube dicmd Command Execution from Application User.

This utility can also be installed externally on a separate Linux machine and configured to connect to and work with a remote Purplecube installation. For more details on installing and setting up dicmd externally, refer to Installing Purplecube dicmd Command.

To perform actions in Purplecube using dicmd, it is mandatory to pass the username and password of the Purplecube user as parameters. The actions performed in Purplecube using dicmd are logged under the username provided. Alternately, the credentials can be set one time in the profile file of the Linux user. If credentials are set in the profile file, the username and password parameters need not be used when executing dicmd. If the username and password parameters are provided even though they are defined in the profile file, the parameters override the values in the profile file. To define the credentials in the profile file, follow the steps below. Here, .bash_profile is used as the profile file for reference.

Step I: Open the profile file.

vi ~/.bash_profile

Step II: Add the username and password parameters to this file. The parameters below need to be added.

export DI_USER="<<Purplecube login user>>" (Example: "Administrator")

export DI_PASSWORD="<<password>>" (Example: "ysAxD_324")

The plain text password can be masked before adding it to the profile file or passing it through dicmd. This helps secure the password by not exposing it in plain text in the profile file or in the dicmd command parameters. To mask or encrypt the password, Purplecube provides a dicmd option called passwd. You provide the plain text as input to it and the result is a masked text; see the example below. When this masked text is provided as the password, Purplecube internally converts it to the actual plain password to authenticate and execute the command.

$ dicmd passwd -e 'P2wd_4321'

M2FThB8QQAsv8fTo9KxhnSmLsjXgI18POI0qCQDUYiE=
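
For reference, a profile file that uses the masked value could then look like the sketch below. The masked string here is the sample output above; substitute the output generated for your own password.

export DI_USER="Administrator"
export DI_PASSWORD="M2FThB8QQAsv8fTo9KxhnSmLsjXgI18POI0qCQDUYiE="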

For the username and password to be picked up during dicmd execution, ensure that you log out of and log back in to the server, and then restart the controller.

The user specified must have sufficient privileges to perform the operation.

Below are the various options available in dicmd to perform different actions in Purplecube.

  • version: This option is used to view the current version of the Purplecube Controller.

Syntax

dicmd version [-u Username -w UserPassword]

Arguments

-u Username (optional)

Username of the Purplecube user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)

Password of the Purplecube user. This parameter is required if username parameter is used in the command.

Privileges

The user requires "Studio Read" privilege for performing this action.

Example: 

$ dicmd version

4.1.0.3114.002

ExitCode:0

  • serverstatus: This option is used to view the status of the Purplecube Controller at any moment. If the Controller is up and running without any issues, the command exits without any messages. In case of issues, the command exits with an appropriate error message.

Syntax

dicmd serverstatus

Arguments

None

Privileges

Not restricted by user specific privileges.

Example: Output when server is working fine.

$ dicmd serverstatus

ExitCode:0
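
Because the command is silent on success, it lends itself to scripted health checks. Below is a minimal sketch, assuming the printed ExitCode is also returned as the shell exit status; the log file path and the idea of scheduling it via cron are illustrative, not part of dicmd.

#!/bin/bash
# Minimal Controller health check, intended to be run from cron.
# dicmd serverstatus prints nothing when the Controller is healthy.
if ! dicmd serverstatus
then
    echo "$(date '+%Y%m%d%H%M%S') Purplecube Controller check failed" >> /home/disupport/pc_health.log
fi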

  • agentstatus: This option is used to view the status of a specific agent registered with the Controller. Alternately, this option can provide the status of all agents registered with the Controller.

Syntax

dicmd agentstatus [-u Username -w UserPassword] [-a AgentName]

Arguments

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)  

Password of the user. This parameter is required if username parameter is used in the command.

-a AgentName (optional)

Name of the agent whose status is to be checked. Only one agent name can be specified with this argument. If this argument is not provided, this option prints the status of all the agents registered with the Controller.

Privileges

Not restricted by user specific privileges.

Example: Print the status of all the agents registered with the Controller.

$ dicmd agentstatus

-------------------------------------------------------------------
Agent Name      Status
-------------------------------------------------------------------
Default         Active
src_agnt_1      Not Active
tgt_agnt_1      Not Active

ExitCode:0
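
The tabular output can be post-processed with standard text tools. For instance, a minimal sketch that filters the listing above down to the inactive agents, assuming the two-column layout shown:

$ dicmd agentstatus | grep 'Not Active'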

  • status: This option is used to view the status of the last execution of data flows and job flows in a particular project or layer. You can also filter the output based on a specific status type. The output matches the status that can be seen in the Monitor module.

Syntax

dicmd status [-u Username -w UserPassword] [-c ACTIVE|FAILED|SUCCESS|ABORT|STOP|ALL] [-p ProjectName] [-l LayerName] [-d DataFlowName|-s JobFlowName]

Arguments

 -u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)  

Password of the user. This parameter is required if username parameter is used in the command.

-c ACTIVE|FAILED|SUCCESS|ABORT|STOP|ALL (optional) 

Restrict the output based on the execution status. Only one of these values can be specified with this argument. If this argument is not provided, the command considers all the statuses.

ACTIVE - Jobs with status as active
FAILED - Jobs with status as failed
SUCCESS - Jobs with status as succeeded
ABORT - Jobs with status as aborted
STOP - Jobs with status as stopped

-p ProjectName (optional)

Restrict the output to a specific project. Only one project can be specified with this argument. If this argument is not provided, the command considers all the projects to which the user has access.

-l LayerName (optional)

Restrict the output to a specific layer in a project. Only one layer can be specified with this argument. If this argument is provided, it is mandatory to also provide the project name. If this argument is not passed and a project name is specified, the command considers all the layers in the projects to which the user has access.

-d DataFlowName (optional) 

Restrict the output to a specific data flow in a layer in a project. Only one data flow can be specified with this argument. If this argument is provided, it is mandatory to provide the project name and the layer name to which it belongs. You can provide either the data flow name argument or the job flow name argument for this option. If neither this argument nor the job flow name argument is provided and a project name and/or layer name is specified, the command considers all the data flows in the project/layer specified.

-s JobFlowName (optional) 

Restrict the output to a specific job flow in a layer in a project. Only one job flow can be specified with this argument. If this argument is provided, it is mandatory to provide the project name and the layer name to which it belongs. You can provide either the data flow name argument or the job flow name argument for this option. If neither this argument nor the data flow name argument is provided and a project name and/or layer name is specified, the command considers all the job flows in the project/layer specified.

Privileges

The user requires "Monitor Read" privilege for performing this action.

Example: Print all the failed job flows and data flows executed in a layer.

$ dicmd status -c FAILED -p Project_1 -l Layer_1

RunId,FldrName,LayerName,Name,Status,StartTime,EndTime

7042,Project_1,Layer_1,ds_Kaf_DFS,F,20180605102925,20180605103029

7276,Project_1,Layer_1,DATA_FLOW_NAME_1,F,20180611022954,20180611023011

TotalCount:2

ExitCode:0
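
Since the output is comma separated with a header row and trailer lines, it can be fed to standard text tools. Below is a minimal sketch that extracts the RunId of each failed run, assuming the column layout shown above:

$ dicmd status -c FAILED -p Project_1 -l Layer_1 | awk -F, 'NF >= 7 && $1 ~ /^[0-9]+$/ { print $1 }'

7042
7276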

  • compile: This option is used to recompile the data flows or job flows in a project or layer.

Syntax

dicmd compile [-u Username -w UserPassword] -p ProjectName [-l LayerName] [-d DataFlowName|DataStreamName | -s JobFlowName]

Arguments

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)  

Password of the user. This parameter is required if username parameter is used in the command.

-p ProjectName

Compile all the data flows and job flows in a specific project. Only one project can be specified with this argument. It is mandatory to provide this argument.

-l LayerName (optional)

Compile all the data flows and job flows in a specific layer in a project. Only one layer can be specified with this argument. If this argument is provided, it is mandatory to also provide the project name. If this argument is not passed and a project name is specified, the command considers all the layers in the projects to which the user has access.

-d DataFlowName (optional)

Compile a specific data flow in a layer in a project. Only one data flow can be specified with this argument. If this argument is provided, it is mandatory to provide the project name and the layer name to which it belongs. You can provide either the data flow name argument or the job flow name argument for this option. If neither this argument nor the job flow name argument is provided and a project name and/or layer name is specified, the command considers all the data flows in the project/layer specified.

-s JobFlowName (optional)

Compile a specific job flow in a layer in a project. Only one job flow can be specified with this argument. If this argument is provided, it is mandatory to provide the project name and the layer name to which it belongs. You can provide either the data flow name argument or the job flow name argument for this option. If neither this argument nor the data flow name argument is provided and a project name and/or layer name is specified, the command considers all the job flows in the project/layer specified.

Privileges

The user requires "Studio Read" and "Studio Write" privileges for performing this action.

Example: Compile a specific data flow in a project.

$ dicmd compile -p Project_1 -l Layer_1 -d df_Snowflake_to_Hive_jnr

ExitCode:0
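
To recompile several data flows in one pass, the command can be wrapped in a shell loop. A minimal sketch, assuming the data flow names are known and that a failed compilation returns a non-zero exit status:

#!/bin/bash
# Recompile a list of data flows and report any that fail to compile.
for df in df_Snowflake_to_Hive_jnr df_oracle_to_pg
do
    if ! dicmd compile -p Project_1 -l Layer_1 -d "$df"
    then
        echo "Compilation failed for $df"
    fi
done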

  • execute: This option is used to run, rerun, abort, or stop the execution of data flows and job flows.

Syntax

dicmd execute [-u Username -w UserPassword] -c start|abort|stop|rerun|restartfromfailure -p ProjectName -l LayerName [-d DataFlowName | -s JobFlowName] [-j jobName] [-param Parameters] [-i instanceName] [-email [-mailTo mail@example.com] [-cc mail@example.com] [-subject subject] [-message message] [-logs]]

Arguments 

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)  

Password of the user. This parameter is required if username parameter is used in the command.

-c start|abort|stop|rerun|restartfromfailure 

Specify the operation to be performed on the data flow or job flow. It is mandatory to specify this argument. Only one of these operations can be specified with the argument.

start - To execute a job flow or data flow
abort - To abort an active job flow or data flow execution
stop - To stop an active job flow or data flow after completing execution
rerun - To execute a job flow or data flow with the same Purplecube generated internal run id as the prior run
restartfromfailure - To start executing a failed job flow or data flow from the point of failure of the prior run

-p ProjectName

Specify the name of the project to which the data flow or job flow to be executed belongs. It is mandatory to provide this argument. Only one project name can be specified with this argument.

-l LayerName

Specify the name of the layer to which the data flow or job flow to be executed belongs. It is mandatory to provide this argument. Only one layer name can be specified with this argument.

-d DataFlowName

Specify the name of the data flow that needs to be executed. You can provide either the data flow name argument or the job flow name argument for this option; one of the two is mandatory. Only one data flow name can be specified with this argument.

-s JobFlowName

Specify the name of the job flow that needs to be executed. You can provide either the data flow name argument or the job flow name argument for this option; one of the two is mandatory. Only one job flow name can be specified with this argument.

-j jobName (optional)  

Specify the name of the job within the job flow that needs to be executed. This argument can be specified only when the job flow name argument is provided. Only one job name can be specified with this argument.

-param Parameters (optional)

Applicable only for -c start|rerun|restartfromfailure. Specify the parameters and the values that need to be used to override the default parameter values defined in the data flow or job flow during execution. This argument allows you to override project parameters, data flow parameters and job flow parameters. For more details on these parameters, refer to Working with Project Parameters, Working with Data Flow Parameters and Working with Job Flow Parameters.

Each parameter name and value needs to be encapsulated in single quotes ('') and multiple parameters need to be separated by commas (,).

The parameter names should be prefixed with identifiers based on the type of the parameter being passed with the argument. Project parameters should be prefixed with $PP_, data flow parameters with $MP_ and job flow parameters with $FL_.

-i instanceName (optional) 

Applicable only for -c start|rerun|restartfromfailure. For start, specify a name for the instance to be associated with the execution of the job flow. This argument is used to run multiple instances of a job flow in parallel and can be specified only when the job flow name argument is provided. If no instance name is specified, the job flow name appears as-is in the Monitor. When an instance name is specified, the job flow name appears with [instance name] suffixed to it. For rerun and restartfromfailure, specify the instance name associated with the initial run that needs to be run again. For more details on running instances of job flows, refer to Executing Job Flow as an Instance.

-email (optional)

Applicable only for -c start|rerun|restartfromfailure. Use this argument to send an email notification in case of failure of the data flow or job flow being executed. Below are the associated arguments that need to be specified with this argument.

-mailTo mail@example.com

Specify the email id of the recipient to whom the email should be sent. Multiple email ids can be specified as comma separated values. It is mandatory to provide this argument to send the email. You can use project parameters, job flow parameters and system parameters to specify the value to be passed with this argument. For more details on these parameters, refer to Working with Project Parameters, Working with Job Flow Parameters and Studio System Parameters.

-cc mail@example.com (optional)

Specify the email id of the recipient to whom a copy of the email should be sent. Multiple email ids can be specified as comma separated values. You can use project parameters, job flow parameters and system parameters to specify the value to be passed with this argument. For more details on these parameters, refer to Working with Project Parameters, Working with Job Flow Parameters and Studio System Parameters.

-subject subject (optional)

Specify the subject of the email to be sent. You can use project parameters, job flow parameters and system parameters to specify the value to be passed with this argument. For more details on these parameters, refer to Working with Project Parameters, Working with Job Flow Parameters and Studio System Parameters.

-message message (optional)

Specify the message to be included in the email to be sent. You can use project parameters, job flow parameters and system parameters to specify the value to be passed with this argument. For more details on these parameters, refer to Working with Project Parameters, Working with Job Flow Parameters and Studio System Parameters.

-logs (optional)

Use this option if you want the monitor logs associated with the failed execution to be attached to the email sent.

Privileges

The user requires "Studio Read" and "Studio Execute" privileges to perform this action.

Example 1: Execute a Job Flow 

$ dicmd execute -c start -p Project_1 -l Layer_1 -s S_Netezza_to_Hive

ExitCode:0

Example 2: Execute a Job Flow by overriding project parameters used in job flow.

$ dicmd execute -c start -p Project_1 -l Layer_1 -s S_Netezza_to_Hive -param '$PP_PRD_FLTR=250','$PP_START_TIME=20181231000101'

ExitCode:0

Example 3: Execute an instance of job flow and send notification with email attachment in case of failure.

$ dicmd execute -c start -p Project_1 -l Layer_1 -s S_Netezza_to_Hive -i ins_eu -email -mailTo USER2@Purplecube.com -cc USER3@Purplecube.com -subject '$$JobFlowName Failed' -message 'Job flow $$JobFlowName in the project $$ProjectName failed. Please see attached logs for more details.' -logs

ExitCode:0

Example 4: Abort a Job Flow.

$ dicmd execute -c abort -p Project_1 -l Layer_1 -s S_Netezza_to_Hive

ExitCode:0
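
A common automation pattern is to start a job flow and, if it fails, retry it once from the point of failure. The sketch below combines the start and restartfromfailure operations, assuming a non-zero exit status indicates a failed execution:

#!/bin/bash
# Run a job flow; on failure, retry once from the point of failure.
if ! dicmd execute -c start -p Project_1 -l Layer_1 -s S_Netezza_to_Hive
then
    echo "Initial run failed; restarting from failure"
    dicmd execute -c restartfromfailure -p Project_1 -l Layer_1 -s S_Netezza_to_Hive
fi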

  • exportlog: This option is used to save the monitor log of a specific executed data flow or job flow.

Syntax

dicmd exportlog [-u Username -w UserPassword] -o ERROR|WARN|INFO|DEBUG|TRACE|ALL -p ProjectName -l LayerName -r runId [-d DataFlowName | -s JobFlowName [-j JobName]] [-t UnitName] [-f OutFileName]

Arguments

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)  

Password of the user. This parameter is required if username parameter is used in the command.

-o ERROR|WARN|INFO|DEBUG|TRACE|ALL

Specify the log level at which the monitor log needs to be saved. The log is saved only if it has details at the required level. It is mandatory to provide this argument.

ERROR - This will have all error entries in the log
WARN - This will have all the warning entries in the log
INFO - This will only have informational entries that highlight the progress of the application
DEBUG - This will have all entries in the log that provide more granular and diagnostic information
TRACE - This will have all entries in the log that are finer-grained informational events than DEBUG
ALL - This will have all the entries in the log irrespective of the level at which they were generated

-p ProjectName

Specify the name of the project to which the data flow or job flow for which the log needs to be saved belongs. It is mandatory to provide this argument. Only one project name can be specified with this argument.

-l LayerName

Specify the name of the layer to which the data flow or job flow for which the log needs to be saved belongs. It is mandatory to provide this argument. Only one layer name can be specified with this argument.

-r runId

Specify the Purplecube generated internal run id of the data flow or job flow execution for which the log needs to be saved. It is mandatory to provide this argument. Only one run id can be specified with this argument.

-d DataFlowName

Specify the name of the executed data flow for which the log needs to be saved. You can provide either the data flow name argument or the job flow name argument for this option; one of the two is mandatory. Only one data flow name can be specified with this argument.

-s JobFlowName

Specify the name of the executed job flow for which the log needs to be saved. You can provide either the data flow name argument or the job flow name argument for this option; one of the two is mandatory. Only one job flow name can be specified with this argument.

-j jobName (optional)  

Specify the name of the job within the executed job flow for which the log needs to be saved. This argument can be specified only when the job flow name argument is provided. Only one job name can be specified with this argument.

-t UnitName (optional)

Specify the name of the unit within the executed data flow or job flow for which the log needs to be saved. The unit name is the name of the transform that executed within the data flow. This argument can be specified only when the data flow name argument or job flow name argument is provided. Only one unit name can be specified with this argument.

-f OutFileName (optional)

Specify the name with which the monitor log needs to be saved. The file name should include the path where it should be saved. The user with which dicmd is executed should have write permission to the path specified. If this argument is not specified, the log is saved with a Purplecube generated file name in the folder from where dicmd is executed.

Privileges

The user requires "Monitor Read" and "Monitor Execute" privileges to perform this action.

Example: Save Monitor logs for the specified data flow.

$ dicmd exportlog -o ALL -p Project_1 -l Layer_1 -r 13052 -d d_Oracle_to_Hive -f /home/disupport/log_13052.txt

ExitCode:0
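
The run id expected by -r is the one reported by the status option, so the two commands chain naturally. Below is a minimal sketch that saves the log of the last listed failed run of a data flow; it assumes the comma separated status layout shown earlier and that the listing order puts the most recent run last.

#!/bin/bash
# Find the last listed failed run of a data flow and export its monitor log.
run_id=$(dicmd status -c FAILED -p Project_1 -l Layer_1 -d d_Oracle_to_Hive | awk -F, '$1 ~ /^[0-9]+$/ { id = $1 } END { print id }')
if [ -n "$run_id" ]
then
    dicmd exportlog -o ALL -p Project_1 -l Layer_1 -r "$run_id" -d d_Oracle_to_Hive -f "/home/disupport/log_${run_id}.txt"
fi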

  • export: This option is used to export Purplecube objects in the JSON file specification. You can export an entire project, a layer or individual objects. Whenever an object is exported, all the lower level objects used in it are also included in the exported file. This means that if a job flow is exported, all its jobs and data flows and their associated objects, such as data objects, data points and reusable expressions, are included in the exported file.

Syntax 

dicmd export [-u Username -w UserPassword] -p ProjectName [-l LayerName] [-o DOBJ|DATAPOINT|NSEQ|EXPR|UDF|DATASUBFLOW|DATAFLOW|DATASTREAM|JOBFLOW|SCHTASK|SCHCAL|SCHEMAIL|SCHFILE|SCHEDULER|PROJECTPARAMS] [-t NZ|TD|OR|FF|PG|DB|HV|MS|BI|CO|SF|ED|GP|HQ|HD|JS|HB|XD|SP|TW|FB|SK|JM|KK|CS|SY|SS|TS|SN|MY|BQ|RT|AV] [-c DataPointName] [-g GroupName] [-n ObjectName] [-f OutJSONFileName]

Arguments

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)  

Password of the user. This parameter is required if username parameter is used in the command.

-p ProjectName

Specify the name of the project which needs to be referred to when exporting. It is mandatory to provide this argument when exporting any object. Only one project name can be specified with this argument. If only the project name argument is provided, the entire project and the objects within it are exported.

-l LayerName (optional)

Specify the name of the layer which needs to be referred to when exporting. If this argument is provided, it is mandatory to also provide the project name. Only one layer name can be specified with this argument. If only the layer name and project name are provided, the entire layer and the objects within it are exported. It is mandatory to provide this argument when exporting a data flow or job flow.

-o DOBJ|DATAPOINT|NSEQ|EXPR|UDF|DATASUBFLOW|DATAFLOW|JOBFLOW|SCHTASK|SCHCAL|SCHEMAIL|SCHFILE|SCHEDULER|PROJECTPARAMS (optional)

Specify the type of object to be exported. This argument is mandatory when exporting a specific object type. You can specify one of these types for this argument.

DOBJ - To export data object
DATAPOINT - To export data point
NSEQ - To export sequence
EXPR - To export reusable expression
UDF - To export user defined functions
DATASUBFLOW - To export data subflow
DATAFLOW - To export data flow
JOBFLOW - To export job flow
SCHTASK - To export scheduler task
SCHCAL - To export scheduler calendar
SCHEMAIL - To export scheduler email event
SCHFILE - To export scheduler file watcher event
SCHEDULER - To export all tasks in scheduler
PROJECTPARAMS - To export Project Parameters

-t NZ|TD|OR|FF|PG|DB|HV|MS|BI|CO|SF|ED|GP|HQ|HD|JS|HB|XD|SP|TW|FB|SK|JM|KK|CS|SY|SS|TS|SN|MY|BQ|RT|AV (optional)

Specify the database type for the database object to be exported. This argument is mandatory when exporting group level objects - data point, data object, sequence, expression and udf. You can specify one of these database types for this argument.

NZ - Netezza
TD - Teradata
OR - Oracle
FF - Flatfile
PG - PostgreSQL
DB - DB2
HV - Hive
MS - MSSQL
BI - BigInsights
CO - Cobol
SF - Salesforce
ED - Exadata
GP - Greenplum
HQ - HAWQ
HD - HDFS
JS - JSON
HB - Hadoop
XD - XSD
SP - Splice Machine
TW - Twitter
FB - Facebook
SK - Spark
JM - JMS
KK - Kafka
CS - Cassandra
SY - Sybase
SS - SAS
TS - ThoughtSpot
SN - Snowflake
MY - MySQL
BQ - BigQuery
RT - RESTful
AV - Avro

-c DataPointName (optional)

Specify the data point associated with the database object to be exported. This argument is mandatory when exporting group level objects - data point, data object, sequence, expression and udf.

-g GroupName (optional)

Specify the name of the group to which the database object to be exported belongs. This argument is mandatory when exporting group level objects - data point, data object, sequence, expression and udf.

-n ObjectName (optional)

Specify the name of the object which needs to be exported. When exporting group level objects - data point, data object, sequence, expression and udf - it is mandatory to provide the project name, object type, database type, data point name and group name. When exporting layer level objects - data flow and job flow - it is mandatory to provide the project name and layer name. When exporting scheduler objects, it is mandatory to provide the project name.

-f OutJSONFileName (optional)

Specify the name with which the exported json file should be saved. You can specify the path along with the file name where the exported file should be placed. Make sure the user with which the export option is being run has permission to write to the path specified. If the path is not specified, the file is saved in the folder from where dicmd is being executed. If this argument is not specified, the file is named the same as the object being exported.

Privileges

The user requires "Studio Read" privilege to export studio objects and "Scheduler Read" privilege to export scheduler objects.

Example 1: Export data object

$ dicmd export -p Project_1 -o DOBJ -t NZ -c CONN_NZ -g INGESTION -n CUSTOMER -f /home/disupport/customer_nz.json

ExitCode:0

Example 2: Export data flow

$ dicmd export -p Project_1 -l Layer_1 -o DATAFLOW -n df_oracle_to_pg -f /home/disupport/df_oracle_to_pg.json

ExitCode:0
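
Because the exported JSON is suitable for backups, the export option is often scripted. A minimal sketch that takes a date-stamped backup of an entire project; the backup directory is illustrative:

#!/bin/bash
# Export an entire project to a date-stamped JSON file for backup.
backup_dir=/home/disupport/backups
stamp=$(date '+%Y%m%d')
dicmd export -p Project_1 -f "${backup_dir}/Project_1_${stamp}.json"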

Note:

The exported JSON file can be used to maintain a backup or to deploy the objects in a higher environment. If any modifications are made to the exported JSON file before importing, it can cause unexpected behavior.

  • import: This option is used to import Purplecube Studio and Scheduler code in JSON format. The json file to be imported can contain code corresponding to an entire project, a layer or individual objects.

Syntax

dicmd import [-u Username -w UserPassword] -p ProjectName [-l LayerName] -f InJSONFileName [-o ImportOptionsFile] [-g GlobalProjectName] [-h GlobalLayerName] [-s option]

Arguments

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)  

Password of the user. This parameter is required if username parameter is used in the command.

-p ProjectName 

Specify the name of the project into which the json file needs to be imported. It is mandatory to provide this argument when importing any object. Only one project name can be specified with this argument.

-l LayerName (optional)

Specify the name of the layer into which the json file needs to be imported. It is mandatory to provide this argument when importing layer level objects - data flow and job flow. Only one layer name can be specified with this argument.

-f InJSONFileName

Specify the file with the absolute path to be imported. The user running the dicmd command should have permission to read the file.

-g GlobalProjectName (optional)

Specify the name of the global project into which the global objects need to be imported. It is mandatory to provide this argument when the json file includes global objects. Only one global project name can be specified with this argument.

-h GlobalLayerName (optional)

Specify the name of the layer in the global project into which the global objects need to be imported. It is mandatory to provide this argument when the json file includes global data objects. This argument need not be provided when importing global objects at the group level - data point, data object, sequence, reusable expression and udf. Only one layer name can be specified with this argument.

-s option (optional)

Specify the import option for each type of object in the import json file. By default, all the data points and project parameters that already exist in Purplecube Studio are reused, and all other objects that already exist are replaced. If an object does not exist in Purplecube, it is imported. This default behavior can be overridden by specifying how each object type should be imported. Specify the import option for each object type as comma separated key value pairs. The import options can be replace or reuse, and the allowed object type names are as below.

all - Use this to specify the import option for all the objects in the json file.
datapoint - Use this to specify the import option for all the data points in the json file.
dataobject - Use this to specify the import option for all the data objects in the json file.
expression - Use this to specify the import option for all the expressions in the json file.
sequence - Use this to specify the import option for all the sequences in the json file.
udf - Use this to specify the import option for all the UDFs in the json file.
subflow - Use this to specify the import option for all the data subflows in the json file.
dataflow - Use this to specify the import option for all the data flows in the json file.
jobflow - Use this to specify the import option for all the job flows in the json file.
param - Use this to specify the import option for all the project parameters in the json file.
schtaskcalendar - Use this to specify the import option for all the scheduler calendars in the json file.
schtaskemail - Use this to specify the import option for all the scheduler email events in the json file.
schtaskfile - Use this to specify the import option for all the scheduler file events in the json file.
schtask - Use this to specify the import option for all the scheduler tasks in the json file.

Privileges

The user requires "Studio Write" privilege to perform this action.

Example 1: Import a data flow

$ dicmd import -p Project_2 -l Layer_1 -f /home/disupport/df_oracle_to_pg.json

ExitCode:0

Example 2: Import a data flow by replacing all objects

$ dicmd import -p Project_2 -l Layer_1 -s all=replace -f accountdetails.json

ExitCode:0

Example 3: Import a data flow by reusing data point and data object

$ dicmd import -p Project_2 -l Layer_1 -s datapoint=reuse,dataobject=reuse -f accountdetails.json

ExitCode:0
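
Export and import together form a simple promotion path between environments. A minimal sketch that exports a data flow from one project and imports it into another, reusing any data points that already exist in the target; the file path is illustrative:

#!/bin/bash
# Promote a data flow from Project_1 to Project_2, reusing existing data points.
json=/home/disupport/df_oracle_to_pg.json
dicmd export -p Project_1 -l Layer_1 -o DATAFLOW -n df_oracle_to_pg -f "$json" &&
dicmd import -p Project_2 -l Layer_1 -s datapoint=reuse -f "$json"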

  • cleanup: This option is used to perform ad hoc cleanup of Purplecube logs, temporary data files, temporary tables, operational logs and control files. Purplecube generates logs and entries in metadata operational tables for each execution of a data flow or job flow. Similarly, temporary stage files and stage tables are not deleted for failed executions of data flows or job flows. These need to be cleaned up periodically by running the cleanup option.

Syntax

dicmd cleanup [-u Username -w UserPassword] -c tform|local|applogs|opsruns|srvlogs|stage|controlfiles [-p ProjectName] [-n NumberOfDays]

Arguments

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)  

Password of the user. This parameter is required if username parameter is used in the command.

-c tform|local|applogs|opsruns|srvlogs|stage|controlfiles

Specify the type of cleanup that needs to be performed. You can specify one of these types with this argument.

tform - To drop temporary tables from the assigned tform database/schema created during execution of data flows.
local - To drop transient tables created during execution of data flows.
stage - To delete leftover data files from failed jobs created when extracting data from external systems.
applogs - To delete monitor log files created to log and display the progress of data flow and job flow execution.
opsruns - To delete the entries from metadata operational tables created for data flow, job flow and scheduler task execution.
srvlogs - To delete the Purplecube generated Controller and Agent logs.
controlfiles - To delete the control files present in the Controller and Agent.

-p ProjectName (optional)

Specify the name of the project which needs to be referred to when cleaning up. Only one project name can be specified with this argument. If this argument is provided, the cleanup is performed only for those tables and files that were created as part of the execution of jobs in the specified project. If this argument is not specified, cleanup is performed irrespective of the project where the jobs were executed.

-n NumberOfDays

Specify the retention period in days; files, tables and operational table entries older than the specified number of days are deleted. The number of days of history shown in the run history of the Monitor and Scheduler is set in Purplecube Admin. Make sure the value specified with this argument for the cleanup types applogs and opsruns is more than that defined in Admin.

Privileges

The user requires "Studio Write" privilege to perform this action.

Example: The below command cleans up stage files older than a day from the project Project_1.

$ dicmd cleanup -c stage -p Project_1 -n 1

No.of Files Deleted From Controller : 0

No.of Files Deleted From Agent[test] : 0

No.of Files Deleted From Agent[Default] : 21

ExitCode:0
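
Cleanup is typically scheduled rather than run by hand. A minimal sketch that removes stage files, monitor logs and operational entries older than seven days; note that cron does not source .bash_profile by default, so the credentials may need to be passed with -u and -w or exported in the script:

#!/bin/bash
# Nightly cleanup of stage files, monitor logs and operational entries
# older than 7 days; intended to be scheduled via cron.
for kind in stage applogs opsruns
do
    dicmd cleanup -c "$kind" -n 7
done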

  • passwd: This option is used to mask any text passed as an argument to it. It is usually used to mask the plain text password of Purplecube login ids before adding it to the profile file or using it in CLI arguments.

Syntax

dicmd passwd -e 'TextToEncrypt'

Arguments

-e 'TextToEncrypt'

Specify the text that needs to be masked or encrypted. 

Privileges

Not restricted by user specific privileges.

Example: Generate masked text for a password.

$ dicmd passwd -e 'P2wd_4321'

M2FThB8QQAsv8fTo9KxhnSmLsjXgI18POI0qCQDUYiE=

ExitCode:0

  • genkey: This option is used to generate RSA, AES and PGP encryption keys. These are used to configure Purplecube agents to encrypt data at the source while extracting and decrypt data at the target before loading. For more details on using these keys and configuring agents to encrypt data, refer to Setting up encryption keys at Agents.

Syntax 

dicmd genkey rsa|pgp|aes

Arguments

rsa|pgp|aes

Choose the type of key to be generated and used for encryption. Only one of these types can be specified with this argument.

rsa - Generates keys based on the RSA algorithm. This is an asymmetric key and generates public and private key files.
pgp - Generates keys based on the PGP algorithm. This will prompt for the identity of the signer, a secure passphrase and the agent with which the key will be associated. This generates public and private keys along with a secret key.
aes - Generates keys based on the AES algorithm. This is a symmetric key. This will prompt for a passphrase, key size and the agent with which the key will be associated. This generates a single key file.

Privileges

Not restricted by user specific privileges.

Example 1: Generate an AES 192 key for the agent named Default.

$ dicmd genkey aes

Enter AES Passphrase: gvrt_84s

Enter AES keysize(128/192/256): 192

Enter Agent Name: Default

diaes Key generated at /home/disupport/Default_diaes.key

Example 2: Generate rsa key.

$ dicmd genkey rsa

PublicKey generated at /home/disupport/dipublic.key

PrivateKey generated at /home/disupport/diprivate.key

Example 3: Generate pgp key for agent named Default.

$ dicmd genkey pgp

Enter Identity: Administrator

Enter Passphrase: gvrt_84s

Enter Agent Name: Default

PublicKey generated at /home/disupport/PGPdipublic.key

PrivateKey generated at /home/disupport/PGPdiprivate.key

SecretKey generated at /home/disupport/Default_secretKey.key

  • encrypt: This option is used to further encrypt the AES key generated by the genkey option using the RSA public key. The RSA key used to encrypt can be one generated using the genkey option or one generated externally.

Syntax

dicmd encrypt -f aesKeyFile -k rsaPublicKeyFile

Arguments

-f aesKeyFile

Specify the AES key file to be encrypted, with its fully qualified path. The command will prompt for the agent name with which the generated encrypted key file will be associated.

-k rsaPublicKeyFile

Specify the RSA public key file to be used to encrypt the AES key.

Privileges

Not restricted by user specific privileges.

Example: Generate encrypted AES key file.

$ dicmd encrypt -f /home/disupport/Default_diaes.key -k /home/disupport/dipublic.key

Enter Agent Name:default

diaes_enc key generated at /home/disupport/default_diaes_enc.key

  • scheduler: This option is used to pause and resume the entire scheduler or a specific task or task group in a project. 

Syntax 

dicmd scheduler [-u Username -w UserPassword] -c pause|resume [-p ProjectName -t TaskName|TaskGroupName]

Arguments

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)  

Password of the user. This parameter is required if username parameter is used in the command.

-c pause|resume

Specify whether the pause or resume operation needs to be performed. When paused, active tasks complete their execution, and subsequent scheduled runs wait for the scheduler to resume before executing.

-p ProjectName (optional)

Specify the name of the project to which the task or task group specified belongs. It is mandatory to provide this argument if task or task group name argument is provided. Only one project name can be specified with this argument.

-t TaskName|TaskGroupName (optional)

Specify either task name or task group name to be paused or resumed. It is mandatory to provide the project name argument if this argument is provided.

Privileges

The user requires "Scheduler Execute" privilege to perform this action.

Example: Pause a specific task group in the scheduler

$ dicmd scheduler -c pause -p Project_1 -t tg_sales_dm_incremental

ExitCode:0
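
Pausing and resuming also bracket maintenance windows nicely. A minimal sketch that pauses the entire scheduler, performs a maintenance step (a log cleanup here, as a placeholder) and resumes it:

#!/bin/bash
# Pause the entire scheduler for a maintenance window, then resume it.
dicmd scheduler -c pause
dicmd cleanup -c srvlogs -n 7
dicmd scheduler -c resume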

  • createlayer: This option is used to add new layers to existing projects using the CLI.

Syntax

dicmd createlayer [-u Username -w UserPassword] -p ProjectName -l LayerNames

Arguments

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)  

Password of the user. This parameter is required if username parameter is used in the command.

-p ProjectName  

Specify the name of the project to which the layers need to be added. Only one project name can be specified with this parameter.

-l LayerNames

Specify the layer names that need to be created. Multiple layer names can be provided, separated by commas.

Privileges

The user requires "Administrator" privilege for performing this action.

Example: Add two layers to an existing project.

$ dicmd createlayer -p Project_1 -l Layer_3,Layer_4

ExitCode:0

  • changeProjectParamValue: This option is used to modify the value of a project parameter in a project.

Syntax

dicmd changeProjectParamValue [-u Username -w UserPassword] -p ProjectName -s Param

Arguments

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)  

Password of the user. This parameter is required if username parameter is used in the command.

-p ProjectName  

Specify the name of the project in which the value of project parameter needs to be replaced. It is mandatory to provide this parameter. Only one project name can be specified with this parameter.

-s Param

Specify the project parameter and the value that needs to be assigned to it. Multiple project parameters with values can be provided, separated by commas.

Privileges

The user requires "Project Parameter Editor" privilege for performing this action.

Example: Use the dicmd command to modify a project parameter in the project.

$ dicmd changeProjectParamValue -p Project_1 -s '$PP_PRD_FLTR_PROJ=500'

ExitCode:0
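
Since multiple parameters can be provided separated by commas, several values can be updated in one call. A sketch, assuming the same quoting convention as the -param argument of the execute option:

$ dicmd changeProjectParamValue -p Project_1 -s '$PP_PRD_FLTR_PROJ=500','$PP_START_TIME=20181231000101'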

  • createqueue: This option is used to add a new scheduler queue using the CLI.

Syntax

dicmd createqueue [-u Username -w UserPassword] -q QueueName

Arguments

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)  

Password of the user. This parameter is required if username parameter is used in the command.

-q QueueName

Specify the name of the scheduler queue to be created.

Privileges

The user requires "Administrator" privilege for performing this action. 

Example: Use the dicmd command to create a scheduler queue.

$ dicmd createqueue -u USER-2 -w P2wd_4321 -q ING_Default

ExitCode:0

  • addorremoveprojectsfromqueues: This option is used to add or remove projects from scheduler queues using the CLI.

Syntax

dicmd addorremoveprojectsfromqueues [-u Username -w UserPassword] -o ADD|DELETE -q QueueName

Arguments

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)  

Password of the user. This parameter is required if username parameter is used in the command.

-o ADD|DELETE

Specify whether the add or delete operation needs to be performed on the queue. ADD is used to assign projects to a scheduler queue, and DELETE is used to remove assigned projects from a scheduler queue.

-q QueueName

Specify the name of the scheduler queue along with the project name that you want to add or remove. Multiple queue names with their associated project names can be provided, separated by semicolons.

Privileges

The user requires "Administrator" privilege for performing this action. 

Example: Use the dicmd command to add a project to a scheduler queue.

$ dicmd addorremoveprojectsfromqueues -o ADD -q "Q1=ING_Default"

ExitCode:0

  • delete: This option is used to delete objects from a specified project and layer.

Syntax

dicmd delete [-u Username -w UserPassword] -p ProjectName [-l LayerName] [-g GroupName] [-o DOBJ|DATAPOINT|NSEQ|EXPR|UDF|DATASUBFLOW|DATAFLOW|DATACONNECT|DATASTREAM|JOBFLOW|SCHTASK|SCHCAL|SCHEMAIL|SCHFILE|SCHEVENT|SCHEDULER|PP] [-t NZ|TD|OR|FF|PG|DB|HV|MS|BI|CO|SF|HD|JS|HB|XD|SP|TW|FB|SK|JM|KK|CS|SY|SS|TS|SN|MY|BQ|RT|AV|PR|AZ|GC|RS|DR|AU|OF|DC|DX|ES|GA|GW|PS|MD|BS|AD] -n ObjectName [-c ConnNm] [-f ForceDelete]

Arguments

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)

Password of the user. This parameter is required if username parameter is used in the command.

-p ProjectName

Specify the name of the project from which you want to delete objects. It is mandatory to provide this parameter when deleting any object. Only one project name can be specified with this parameter. If only the project name parameter is provided, the entire project and the objects within it are deleted.

-l LayerName (optional)

Specify the name of the layer which needs to be referred to when deleting. If this parameter is provided, it is mandatory to also provide the project name. Only one layer name can be specified with this parameter. If only the layer name and project name are provided, the entire layer and the objects within it are deleted. It is mandatory to provide this parameter when deleting a data flow or job flow.

-g GroupName (optional)

Specify the name of the group to which the database object to be deleted belongs. This parameter is mandatory when deleting group level objects - data point, data object, sequence, expression and udf.

-o DOBJ|DATAPOINT|NSEQ|EXPR|UDF|DATASUBFLOW|DATAFLOW|DATACONNECT|DATASTREAM|JOBFLOW|SCHTASK|SCHCAL|SCHEMAIL|SCHFILE|SCHEVENT|SCHEDULER|PROJECTPARAMS (optional)

Specify the type of object to be deleted. This parameter is mandatory when deleting a specific object type. You can specify one of these types for this parameter.

DOBJ - To delete data object
DATAPOINT - To delete data point
NSEQ - To delete sequence
EXPR - To delete reusable expression
UDF - To delete user defined functions
DATASUBFLOW - To delete data subflow
DATAFLOW - To delete data flow
DATACONNECT - To delete data connect
DATASTREAM - To delete data stream
JOBFLOW - To delete job flow
SCHTASK - To delete scheduler task
SCHCAL - To delete scheduler calendar
SCHEMAIL - To delete scheduler email event
SCHFILE - To delete scheduler file watcher event
SCHEVENT - To delete scheduler event
SCHEDULER - To delete all tasks in scheduler
PROJECTPARAMS - To delete Project Parameters

-t NZ|TD|OR|FF|PG|DB|HV|MS|BI|CO|SF|ED|GP|HQ|HD|JS|HB|XD|SP|TW|FB|SK|JM|KK|CS|SY|SS|TS|SN|MY|BQ|RT|AV (optional)

Specify the database type for the database object to be deleted. This parameter is mandatory when deleting group level objects - data point, data object, sequence, expression and udf. You can specify one of these database types for this parameter.

NZ - Netezza
TD - Teradata
OR - Oracle
FF - Flatfile
PG - PostgreSQL
DB - DB2
HV - Hive
MS - MSSQL
BI - BigInsights
CO - Cobol
SF - Salesforce
ED - Exadata
GP - Greenplum
HQ - HAWQ
HD - HDFS
JS - JSON
HB - Hadoop
XD - XSD
SP - Splice Machine
TW - Twitter
FB - Facebook
SK - Spark
JM - JMS
KK - Kafka
CS - Cassandra
SY - Sybase
SS - SAS
TS - ThoughtSpot
SN - Snowflake
MY - MySQL
BQ - BigQuery
RT - RESTful
AV - Avro

-n ObjectName

Specify the name of the object which needs to be deleted. When deleting group level objects - data point, data object, sequence, expression and udf - it is mandatory to provide the project name, object type, database type, data point name and group name. When deleting layer level objects - data flow and job flow - it is mandatory to provide the project name and layer name.

-c ConnNm (optional)

Specify the data point associated with the database object to be deleted. This parameter is mandatory when deleting a data object.

-f ForceDelete (optional)

Specify this parameter as Yes to forcibly delete the object even though it has other dependent objects. If you do not want to forcibly delete the object, specify the parameter as No.

Privileges

The user requires "Studio Write" privilege to delete studio objects and "Scheduler Write" privilege to delete scheduler objects.

Example: Use the dicmd command to forcibly delete a data point from a project.

$ dicmd delete -u Administrator -w Administrator -p test -g Netezza -o DATAPOINT -t NZ -n CONN_NAME_2 -f Yes

ExitCode:0

  • getdependency: This option is used to get the list of dependent objects for any object.

Syntax

dicmd getdependency [-u Username -w UserPassword] -p ProjectName [-l LayerName] [-g GroupName] [-o DOBJ|DATAPOINT|DATAFLOW|JOBFLOW|NSEQ|EXPR|UDF|DATASUBFLOW|PROJECTPARAMS] [-t NZ|TD|OR|FF|PG|DB|HV|MS|BI|CO|SF|HD|JS|HB|XD|SP|TW|FB|SK|JM|KK|CS|SY|SS|TS|SN|MY|BQ|RT|AV|PR|AZ|GC|RS|DR|AU|OF|DC|DX|ES|GA|GW|PS|MD|BS|AD] -n ObjectName [-c ConnNm]

Arguments

-u Username (optional) 

Username of the user. This parameter is required if username is not predefined in the profile file.

-w UserPassword (optional)

Password of the user. This parameter is required if username parameter is used in the command.

-p ProjectName

Specify the name of the project from which you want to get the dependencies of the object.

-l LayerName (optional)

Specify the name of the layer which needs to be referred to when getting dependencies. If this parameter is provided, it is mandatory to also provide the project name.

-g GroupName (optional)

Specify the name of the group which needs to be referred to when getting dependencies.

-o DOBJ|DATAPOINT|NSEQ|EXPR|UDF|DATASUBFLOW|DATAFLOW|DATACONNECT|DATASTREAM|JOBFLOW|SCHTASK|SCHCAL|SCHEMAIL|SCHFILE|SCHEVENT|SCHEDULER|PROJECTPARAMS (optional)

Specify the type of object to get the dependency for. This parameter is mandatory when getting the dependency for a specific object type. You can specify one of these types for this parameter.

DOBJ - To get dependency for data object
DATAPOINT - To get dependency for data point
NSEQ - To get dependency for sequence
EXPR - To get dependency for reusable expression
UDF - To get dependency for user defined functions
DATASUBFLOW - To get dependency for data subflow
DATAFLOW - To get dependency for data flow
DATACONNECT - To get dependency for data connect
DATASTREAM - To get dependency for data stream
JOBFLOW - To get dependency for job flow
SCHTASK - To get dependency for scheduler task
SCHCAL - To get dependency for scheduler calendar
SCHEMAIL - To get dependency for scheduler email event
SCHFILE - To get dependency for scheduler file watcher event
SCHEVENT - To get dependency for scheduler event
SCHEDULER - To get dependency for all tasks in scheduler
PROJECTPARAMS - To get dependency for Project Parameters

-t NZ|TD|OR|FF|PG|DB|HV|MS|BI|CO|SF|HD|JS|HB|XD|SP|TW|FB|SK|JM|KK|CS|SY|SS|TS|SN|MY|BQ|RT|AV|PR|AZ|GC|RS|DR|AU|OF|DC|DX|ES|GA|GW|PS|MD|BS|AD (optional)

Specify the database type for the database object to get the dependency for. This parameter is mandatory when getting the dependency for group level objects - data point, data object, sequence, expression and udf. You can specify one of these database types for this parameter.

NZ - Netezza
TD - Teradata
OR - Oracle
FF - Flatfile
PG - PostgreSQL
DB - DB2
HV - Hive
MS - MSSQL
BI - BigInsights
CO - Cobol
SF - Salesforce
HD - HDFS
JS - JSON
HB - Hadoop
XD - XSD
SP - Splice Machine
TW - Twitter
FB - Facebook
SK - Spark
JM - JMS
KK - Kafka
CS - Cassandra
SY - Sybase
SS - SAS
TS - ThoughtSpot
SN - Snowflake
MY - MySQL
BQ - BigQuery
RT - RESTful
AV - Avro
PR - Parquet
AZ - Amazon S3
GC - GCS
RS - Redshift
DR - Druid
AU - Aurora
OF - Office365
DC - DoubleClick
DX - Dentrix
ES - Eaglesoft
GA - Google Analytics
GW - Google AdWords
PS - Pub/Sub
MD - MariaDB
BS - Blob Storage
AD - Active Directory

-n ObjectName

Specify the name of the object to get the dependency for. When getting the dependency for group level objects - data point, data object, sequence, expression and udf - it is mandatory to provide the project name, object type, database type, data point name and group name. When getting the dependency for layer level objects - data flow and job flow - it is mandatory to provide the project name and layer name.

-c ConnNm (optional)

Specify the data point associated with the database object to get the dependency for. This parameter is mandatory when getting the dependency for a data object.

Privileges

The user requires "Studio Read" privilege to get dependencies.

Example: Use the dicmd command to get the dependency for a data point.

$ dicmd getdependency -u Administrator -w Administrator -p test -g Netezza -o DATAPOINT -t NZ -n nz_dp

ExitCode:0
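
getdependency pairs naturally with delete: reviewing dependents before forcing a removal. A minimal sketch using the data point from the examples above:

#!/bin/bash
# List the dependents of a data point, then forcibly delete it.
# In practice, review the getdependency output before running the delete step.
dicmd getdependency -p test -g Netezza -o DATAPOINT -t NZ -n nz_dp
dicmd delete -p test -g Netezza -o DATAPOINT -t NZ -n nz_dp -f Yes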