Diyotta Data Security Features


Topic: This article explains the data security features available in Diyotta.

Environment: This article is written for Diyotta version 

Working with Secured Hadoop Cluster

Diyotta supports connecting to Hadoop services on a secure cluster configured with Kerberos. For the Diyotta application user account to make a secured connection, a user principal for the account must be created on the Kerberos server and a keytab must be generated for that principal. The keytab file must then be copied to a location on the DI Server machine that is accessible to the Diyotta application user account.
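
As an illustration, on an MIT Kerberos KDC the principal and keytab could be created as follows. The principal name diuser@EXAMPLE.COM, the keytab path, and the DI Server hostname are placeholders, not Diyotta defaults:

    # On the Kerberos server: create the user principal and export its keytab
    kadmin.local -q "addprinc -randkey diuser@EXAMPLE.COM"
    kadmin.local -q "ktadd -k /tmp/diuser.keytab diuser@EXAMPLE.COM"

    # Copy the keytab to the DI Server machine and restrict access to the Diyotta user
    scp /tmp/diuser.keytab diuser@diserver-host:/home/diuser/keys/
    ssh diuser@diserver-host "chmod 600 /home/diuser/keys/diuser.keytab"

    # Verify the keytab by obtaining a ticket with it
    kinit -kt /home/diuser/keys/diuser.keytab diuser@EXAMPLE.COM
    klist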
The following table details the information required to create a connection object for a Kerberos-secured Hadoop cluster.

Host Name: Hive Server 2 host name or IP address.

Port: Port on which Hive Server 2 is listening.

Security Authentication: Choose "None" for an unsecured cluster or "Kerberos" for a secured cluster.

Hive Principal: Hive service principal used to connect to the Hive service, e.g. hive/hiveserver2.scotia.com@Kerberos.Realm.Com.

User Principal: Principal of the Diyotta application user account that is authenticated against the Hadoop services, e.g. diuser@Kerberos.Realm.Com.

KeyTab Name: Keytab file created for the Diyotta user principal and copied over from the Kerberos server.

Default Realm: Name of the Kerberos realm.

Kerberos Server: Hostname of the Kerberos server; required to generate the krb.conf file.

KDC Host: Hostname of the KDC; required to generate the krb.conf file. (Alternatively, the krb.conf file can be copied directly to $DI_HOME/app/control, from where the Diyotta application loads it automatically when making the connection.)

DFS Principal: Principal for accessing the HDFS service from Diyotta.

Yarn Principal: Principal for accessing the YARN service from Diyotta.

Map Reduce Principal: Principal for accessing the MapReduce service from Diyotta.
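
Where the krb.conf file is supplied manually, a minimal file might look like the sketch below. The realm and hostnames are placeholders, and the assumption that Diyotta's krb.conf follows the standard MIT Kerberos krb5.conf layout is ours:

    cat > $DI_HOME/app/control/krb.conf <<'EOF'
    [libdefaults]
        default_realm = EXAMPLE.COM

    [realms]
        EXAMPLE.COM = {
            kdc = kdc.example.com
            admin_server = kerberos.example.com
        }
    EOF

Independently of Diyotta, the Hive principal, host, and port from the table can be sanity-checked with beeline after obtaining a ticket via kinit (port 10000 is the common Hive Server 2 default, not a Diyotta setting):

    beeline -u "jdbc:hive2://hiveserver2.scotia.com:10000/default;principal=hive/hiveserver2.scotia.com@Kerberos.Realm.Com"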


Encryption and Compression – Agent-to-Agent Data Transport

Diyotta supports payload encryption and compression for data transported between agents. Both the source and target agents must be configured to enable encryption and compression.
The diagram below gives a high-level view of this feature.



Enabling the encryption/compression feature:
The Diyotta admin configures this feature from the Admin module.

1. Set the flag flagent.encryption to "Y" on both the source and target agents.

Note: The default value is "N".

Without an encrypted AES secret key:

2. Set flagent.encryptedAESSecretKey to "N" on both the source and target agents.

Note: The default value is "N".

3. Set the flagent.encryption type to 'AES128', 'AES192', or 'AES256' on both the source and target agents.

Note: The flagent encryption type must be the same on both agents (see the settings sketch after step 4).

4. Start using the secured agents in a stream.
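
Taken together, steps 1–3 amount to the following settings on both the source and target agents. The values are set through the Admin module; rendering them as key=value pairs here is only for clarity, and the property-file syntax is an assumption:

    flagent.encryption = Y                # enable payload encryption
    flagent.encryptedAESSecretKey = N     # use a plain (non-encrypted) AES secret key
    flagent.encryption type = AES256      # AES128, AES192, or AES256; must match on both agents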

With an encrypted AES secret key:

5. Set flagent.encryptedAESSecretKey to "Y" on both the source and target agents.

Note: The default value is "N".

6. Set the flagent.encryption type to 'AES128', 'AES192', or 'AES256' on both the source and target agents.

Note: The flagent encryption type must be the same on both agents.

Generate the cryptographic keys in your environment using the DICMD command. This generates a public key and a private key under the following path: "diserver/keys/".

7. Generate the AES key using the DICMD command.

8. When prompted, enter the text to encrypt (the secret key already generated on the source agent).

9. The diaes key is now generated.

10. Copy the public key and the diaes key to the target agent under the following path, as in the sketch below:

Path: diagent/keys/
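
A minimal sketch of this copy step, assuming the key file names shown here (the actual names come from the DICMD output) and a reachable target agent host:

    # Hypothetical file names; substitute the names produced by DICMD
    scp diserver/keys/public.key diserver/keys/diaes.key diuser@target-agent-host:diagent/keys/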

11. Start using the secured agents in a stream.

How to verify the encrypted and decrypted data:

12. Set the log level to TRACE on both the source and target agents.

13. Execute the stream.

14. Connect to the source or target agent and check diserver.log for the encrypted and decrypted data (a grep sketch follows below).
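
For example, assuming diserver.log sits in the agent's log directory (the path and the exact wording of the TRACE messages are assumptions, not documented Diyotta log formats):

    # Surface trace-level encryption/decryption entries from the log
    grep -i -E "encrypt|decrypt" diserver.log | less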



Attribute-Level Data Security Using Masking, Encryption and Tokenization


The Diyotta installation comes with pre-packaged Hadoop functions that users can apply in expressions to mask, encrypt, decrypt, or tokenize attributes.
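
The names of the pre-packaged Diyotta functions are not listed here. As an analogous illustration only, Hive's own built-in UDFs can mask or encrypt a column inside an expression; mask_show_last_n and aes_encrypt are standard Hive functions, not Diyotta functions, and the table and column names are hypothetical:

    beeline -u "jdbc:hive2://hiveserver2.scotia.com:10000/default" -e "
      SELECT mask_show_last_n(ssn, 4),                           -- mask all but the last 4 characters
             base64(aes_encrypt(card_no, '1234567890123456'))    -- AES-128 encrypt with a 16-byte key
      FROM customers;"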