The Amazon S3 data point is used to configure connectivity to Amazon S3. A separate data point must be created for each Amazon S3 bucket. Any location within the bucket can be accessed through the data point, provided the user has the necessary privileges. The Amazon S3 data point can be associated with any type of file object, such as delimited, fixed-width, XML, or JSON.
To work with the Amazon S3 data point, follow the steps below:
Step I: Create a New Data Point
- To open and edit an existing data point, refer to Opening Data Point.
- To create a new data point, refer to Create New Data Point.
Step II: Provide connection details
1. To connect to the Amazon S3 data point, provide the following details in the Properties tab.
- Amazon S3 URL: Specify the Amazon S3 endpoint hostname.
- Bucket Name: An Amazon S3 bucket name is globally unique; the bucket namespace is shared by all AWS accounts.
- Access Key: The Access key ID created when generating access keys as AWS security credentials. To use a project parameter for the Access key, check the Use Project Parameters option; you can then view and select the required project parameter from the Access key drop-down.
- Secret Key: The Secret access key created when generating access keys as AWS security credentials. To use a project parameter for the Secret key, check the Use Project Parameters option; you can then view and select the required project parameter from the Secret key drop-down.
- Enable AWS CLI: If the AWS CLI will be used to upload files to or download files from the specified host, the keys must be added as profile variables for the agent installation user. Enable this option to have Diyotta set these automatically at runtime.
- Region: Specifies the geographic region hosting the bucket; the available options may vary depending on the type of cloud platform.
- Mandatory field names are suffixed with *. Provide values for all mandatory fields to establish the connection.
- After an upgrade, the Region field must be specified for existing Amazon S3 data points.
- All the fields in the Properties tab can be parameterized using project parameters. To parameterize the fields, refer to Working with Project Parameters.
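For reference, the profile variables managed by the Enable AWS CLI option correspond to a standard AWS CLI credentials profile for the agent installation user. A minimal sketch of such a profile (the profile name, key values, and region below are placeholders, not real credentials):

```ini
# ~/.aws/credentials — credentials profile for the agent installation user
[diyotta_s3]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

# ~/.aws/config — default region for the same profile
[profile diyotta_s3]
region = us-east-1
```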
2. Assign Agent: To assign or change the associated agent, click Change. The Change Agent window appears, displaying the list of available agents. Select the required agent name from the list.
- If a Default agent is assigned to the project, the Default agent is automatically associated with the newly created data point.
- If no Default agent is assigned to the project, no agent is assigned automatically, and an appropriate agent must be assigned to the data point.
- When connecting to the agent server, the agent installation user must have the privileges required to access the path where the file will be placed.
- When connecting to a remote server, the firewall must be opened from the agent server to that server, and the user specified for the connection must have the privileges required to access the path where the file will be placed.
Step III: Test the data point connection
To validate that the data point can connect to Amazon S3 using the details provided, refer to Test Data Point Connection.
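As a quick local pre-flight check before running the connection test, the bucket name and region values can be sanity-checked without touching the network. A minimal sketch in Python, based on the published S3 bucket-naming rules and the standard virtual-hosted-style endpoint format (the bucket and region values are illustrative, and this does not replace the connection test itself):

```python
import re

def is_valid_bucket_name(name: str) -> bool:
    """Check a name against the general S3 bucket-naming rules:
    3-63 characters; lowercase letters, digits, dots, and hyphens;
    beginning and ending with a letter or digit."""
    if not 3 <= len(name) <= 63:
        return False
    return re.fullmatch(r"[a-z0-9][a-z0-9.-]*[a-z0-9]", name) is not None

def s3_endpoint_url(bucket: str, region: str) -> str:
    """Build the virtual-hosted-style endpoint URL for a bucket in a region."""
    return f"https://{bucket}.s3.{region}.amazonaws.com"

# Illustrative values, not real resources
print(is_valid_bucket_name("my-example-bucket"))   # True
print(is_valid_bucket_name("Bad_Bucket"))          # False
print(s3_endpoint_url("my-example-bucket", "us-east-1"))
```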
Step IV: Save the data point
- To save the changes made to the data point, refer to Saving Data Point.
- To revert unsaved changes made to the data point, refer to Reverting changes in Data Point.
- Once the data point has been created and the changes have been saved, close or unlock the data point so that other users can edit it. For more information, refer to Closing Data Point and Unlocking Data Point.
Step V: Modify the configured Extract and Load properties
When moving data from one system to another, the data is extracted from the source system, moved over the network, and loaded into the target system. The SQL statements and commands generated during job execution to extract and load data are based on the properties defined for these operations. The extract and load properties should reflect the format, performance requirements, and variety of the data being moved, and they vary with the environment and the type of system. Diyotta ships with default properties that cover most common scenarios.
To modify these properties, refer to Editing Extract Properties in Amazon S3 Data Point and Editing Load Properties in Amazon S3 Data Point.
- The default values for the extract and load properties can be configured in the Admin module; these defaults are reflected in the Studio module.
- The extract and load properties set in the data point are used by default in the source and target instances of data flows and job flows.
- It is good practice to set the extract and load properties in the data point according to company standards.
- However, any specific property can be overridden in the data flow or job flow if needed.
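For orientation, when the source or target is Amazon S3, the extract and load steps ultimately resolve to S3 object downloads and uploads. A rough sketch of the underlying transfer calls using boto3 (assumptions: boto3 is installed and a credentials profile is configured; the bucket, profile, and paths are illustrative, and Diyotta generates the actual commands from the properties described above):

```python
def s3_uri(bucket: str, key: str) -> str:
    """Compose the s3:// URI for an object, as used by transfer tooling."""
    return f"s3://{bucket}/{key.lstrip('/')}"

def load_file(bucket: str, key: str, local_path: str, profile: str = "default") -> None:
    """Upload (load) a local file to S3. Requires boto3 and valid credentials."""
    import boto3  # imported lazily; assumption: boto3 is installed
    session = boto3.Session(profile_name=profile)
    session.client("s3").upload_file(local_path, bucket, key)

def extract_file(bucket: str, key: str, local_path: str, profile: str = "default") -> None:
    """Download (extract) an S3 object to a local file."""
    import boto3  # imported lazily; assumption: boto3 is installed
    session = boto3.Session(profile_name=profile)
    session.client("s3").download_file(bucket, key, local_path)

# Illustrative values only
print(s3_uri("my-example-bucket", "/landing/orders.csv"))
```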