The DFS data object groups multiple file systems into a single logical DFS namespace whose folders are distributed across multiple servers. To users, the files appear to reside on a single server, although they are actually stored on several. Users navigate to the namespace (the Hive DFS location) to access the files; they do not directly access the servers or the underlying file systems where the data is located.
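
The mapping from a logical namespace to physical storage can be pictured with a minimal sketch. The folder names, server names, and the `resolve` helper here are hypothetical and only illustrate the idea; they do not describe how the product resolves paths internally.

```python
from typing import Dict, Tuple

# Hypothetical namespace map: logical folder -> (server, physical root path).
NAMESPACE_MAP: Dict[str, Tuple[str, str]] = {
    "/corp/sales": ("server-a", "/data/vol1/sales"),
    "/corp/hr": ("server-b", "/export/hr"),
}


def resolve(logical_path: str) -> Tuple[str, str]:
    """Translate a logical namespace path into (server, physical path)."""
    # Collect every namespace folder that covers the requested path.
    matches = [
        prefix
        for prefix in NAMESPACE_MAP
        if logical_path == prefix or logical_path.startswith(prefix + "/")
    ]
    if not matches:
        raise KeyError(f"no namespace folder covers {logical_path}")
    # Prefer the longest prefix so that nested folders take precedence.
    prefix = max(matches, key=len)
    server, root = NAMESPACE_MAP[prefix]
    return server, root + logical_path[len(prefix):]
```

The user only ever supplies the logical path; the server and physical path stay hidden behind the namespace.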

For more details on these data points, refer to Working with Data Point.

To work with a DFS Data Object, follow the steps below:

Step I: Edit, create or import a DFS data object

It is recommended to create the data object by importing the structure from an HDFS location using a Hive data point.

Step II: Configuring attributes

The DFS Data Object contains the following fields. To edit a field, click it and enter the required value.

  • Attribute: Displays the name of the field at a given position in the file. This is a text field, and the name can be defined as needed. For a DFS data object, the field name is immaterial; it is used only when this data object is used to create data objects of other types.
  • Data Type: Displays the data type associated with the field. Select the appropriate type from the list of applicable data types.
  • Precision: Displays the precision for the varchar, number, and decimal data types. This is a text field, and the value can be defined as per the data in the file.
  • Scale: Displays the scale for the number and decimal data types. This is a text field, and the value can be defined as per the data in the file.
  • Not Null: Indicates whether the field can contain NULL values. This is a checkbox and can be set as per the data in the file. For a DFS data object, this property is immaterial; it is used only when this data object is used to create data objects of other types.
  • Key Type: Indicates whether the field is a key field. This is a dropdown with the values Primary Key and Foreign Key; select one of them if the field is such a key. For a DFS data object, this property is immaterial; it is used only when this data object is used to create data objects of other types.
  • Partition: Specifies the partition column. Partitioning can improve the performance of queries that restrict results by the partitioned column.
  • Order: Rearranges the order of the columns to improve performance.
  • Description: Displays any details provided for the field. This is a text field, and you can enter any details as needed.
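
The field rules above can be summarized in a small sketch. The `Attribute` class and `validate` function are hypothetical names used only for illustration; the check encodes just what is stated above (precision applies to varchar, number, and decimal; scale applies to number and decimal; Key Type is either Primary Key or Foreign Key) and is not the product's own validation logic.

```python
from dataclasses import dataclass
from typing import List, Optional

# Data types for which each property is meaningful, as listed above.
PRECISION_TYPES = {"varchar", "number", "decimal"}
SCALE_TYPES = {"number", "decimal"}
KEY_TYPES = {None, "Primary Key", "Foreign Key"}


@dataclass
class Attribute:
    """One row of the attribute grid (hypothetical in-memory model)."""
    name: str
    data_type: str
    precision: Optional[int] = None
    scale: Optional[int] = None
    not_null: bool = False
    key_type: Optional[str] = None
    partition: bool = False
    description: str = ""


def validate(attr: Attribute) -> List[str]:
    """Return a list of human-readable problems; empty means consistent."""
    problems: List[str] = []
    data_type = attr.data_type.lower()
    if attr.precision is not None and data_type not in PRECISION_TYPES:
        problems.append(f"{attr.name}: precision does not apply to {data_type}")
    if attr.scale is not None and data_type not in SCALE_TYPES:
        problems.append(f"{attr.name}: scale does not apply to {data_type}")
    if attr.key_type not in KEY_TYPES:
        problems.append(f"{attr.name}: unknown key type {attr.key_type!r}")
    return problems
```

For example, a decimal attribute with precision 10 and scale 2 passes, while a varchar attribute that also carries a scale is reported.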

Note:

  • The following operations are allowed on the database entries: Add, Cut, Copy, Paste, Up, Down, Delete, and Search.
  • You can select multiple attributes from the list and apply these operations to all of them.
  • To add a new attribute, click Add. By default, the new attribute is added in the last row. To add an attribute at a specific position, select the attribute in the position before it and click Add.
  • To search for a specific attribute, enter a keyword in the search bar; the page displays the matching attributes.
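
The positioning rule for Add can be illustrated with a short sketch. The `add_attribute` function is a hypothetical stand-in for the button's behavior as described above, using a plain list of attribute names.

```python
from typing import List, Optional


def add_attribute(
    attrs: List[str], new_name: str, selected: Optional[int] = None
) -> List[str]:
    """Return a new list with new_name added.

    selected is the zero-based index of the currently selected attribute,
    or None when nothing is selected.
    """
    result = list(attrs)  # leave the caller's list untouched
    if selected is None:
        # No selection: the new attribute goes in the last row.
        result.append(new_name)
    else:
        # Selection: the new attribute goes directly after the selected row.
        result.insert(selected + 1, new_name)
    return result
```

With attributes `id`, `name`, `city`, adding `age` with no selection places it last, while selecting `id` first places it between `id` and `name`.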

Step III: Configuring properties

The Properties tab displays the object-level details relevant to extracting data from the file.

The grid displays the following fields:

| Property | Description | Default Value | Other possible values |
|---|---|---|---|
| Column Delimiter | Specifies the delimiter used in the extracted file data. | , | Any ASCII character |
| Row Delimiter | Specifies the character that indicates the end of a row in the extracted data. | \\n (New Line Character) | Any ASCII character |
| Date Format | Specifies how to interpret the date format. | YYYY-MM-DD | |
| Time Format | Specifies how to interpret the time format. | yyyy-MM-dd HH:mm:ss | |
| Escape Characters | The character immediately following the escape character is escaped. This needs to be specified if the text qualifier is provided and the text qualifier character can appear in the source data. | 04 | \\ (Recommended), Any ASCII character |
| Null Value | Specifies the string literal that indicates a null value in the extracted data. During the data load, a column value matching this string is loaded as null in the target. | NULL | Any string literal |
| Text Qualifier | Specifies whether the text columns in the source data need to be enclosed in quotes. | Empty | Single, Double |
| DFS Name | Displays the selected DFS file name. | CLIENT.txt | |
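
To show how these properties interact when a file is read, here is a minimal sketch using Python's standard `csv` module. The `parse_rows` function and its parameter names are hypothetical, and the escape character is assumed to be a single backslash; this illustrates the semantics described in the grid and is not the product's extraction logic.

```python
import csv
from typing import List, Optional


def parse_rows(
    text: str,
    column_delimiter: str = ",",
    row_delimiter: str = "\n",
    escape_char: str = "\\",  # assumption: a single backslash
    null_value: str = "NULL",
    text_qualifier: Optional[str] = None,  # None = Empty, else "'" or '"'
) -> List[List[Optional[str]]]:
    """Split delimited text into rows, mapping the null literal to None."""
    # Naive split on the row delimiter. Limitation of this sketch: a row
    # delimiter inside a qualified text column is not supported.
    lines = [line for line in text.split(row_delimiter) if line]
    if text_qualifier:
        reader = csv.reader(
            lines,
            delimiter=column_delimiter,
            quotechar=text_qualifier,
            escapechar=escape_char,
            doublequote=False,  # qualifiers in data must be escaped
        )
    else:
        reader = csv.reader(
            lines,
            delimiter=column_delimiter,
            quoting=csv.QUOTE_NONE,
            escapechar=escape_char,
        )
    # Any column value equal to the null literal is loaded as None.
    return [[None if value == null_value else value for value in row]
            for row in reader]
```

With a double-quote text qualifier, a value such as `"Smith, John"` is kept as one column even though it contains the column delimiter, and an escaped qualifier inside the value is read as a literal quote. This is why the escape character matters whenever the qualifier can appear in the source data.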

Step IV: Save the changes

To save the changes made to the data object, refer to Saving Data Object.

Note: