COBOL to Flat File - Data Ingestion

Introduction

Diyotta supports various file types, such as COBOL, Apache Avro, Fixed Width, Fixed Length, Double Quotes, Fixed Block, and Flat File. It provides optimized data ingestion from any of these file data sources to target platforms. This document illustrates how Diyotta ingests data from a COBOL file into a Flat File.

Data Ingestion 

To create a Data Flow, the first step is to create a Data Point. The next sections will guide you through creating a Data Flow.

Create a Data Point

1. Click New Data Point on the right menu bar under the Actions tab. This launches a window prompting you to select the data source type.

2. Select File Server as the data source and click OK, as illustrated in the screenshot below.

3. The Data Point details are displayed on the canvas. On the General tab, you can change the name, add a short description, or declare the Data Point as private.

4. Click the Properties tab to select the location of the file. You have the option of changing the Agent from here by clicking the Change link. A File type Data Point is now created.

Import a Data Object

The next step is to import the schema from the COBOL file at the location specified when you created the Data Point.

1. On the right menu bar, click Import Data Object. The Import Data Objects window opens. Select COBOL as the data source type and, from the list on the right, choose the Data Point that you created earlier.

2. Click OK.

The Import COBOL window opens, and the wizard will take you through the process of importing the COBOL schema.

3. Select the location of the file, whether Local or Server. In this case, click Server.

4. A list of all the files on the server appears. Click the file to be imported.


5. The details are now populated in the Import COBOL window. Click Next.


6. You can see the attributes in this window. Next, select the name and data type.

7. Click Finish, and then save the Data Object.

8. Open the Data Object on the canvas. Under the General tab, the general details pertaining to the Data Object such as the name, group name, associated Data Point etc. are displayed.
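
The schema that the wizard imports comes from a COBOL record description (copybook). As a rough illustration of how such a description maps to levels, attribute names, and formats, the following Python sketch parses a tiny copybook. The copybook contents, the `ENTRY` pattern, and `parse_copybook` are hypothetical names made up for this example. This is not how Diyotta parses COBOL, and the pattern deliberately ignores clauses such as OCCURS and REDEFINES.

```python
import re

# Hypothetical copybook used only for illustration.
COPYBOOK = """\
01 CUSTOMER-REC.
   05 CUST-ID      PIC 9(4).
   05 CUST-NAME    PIC X(10).
   05 BALANCE      PIC S9(5)V99 COMP-3.
"""

# Simplified pattern: level number, name, optional PIC clause, optional usage.
ENTRY = re.compile(
    r"^\s*(?P<level>\d{2})\s+(?P<name>[A-Z0-9-]+)"
    r"(?:\s+PIC\s+(?P<pic>\S+))?"
    r"(?:\s+(?P<usage>COMP(?:-[1-3])?|BINARY))?\s*\.\s*$"
)


def parse_copybook(text):
    """Return one dictionary per copybook entry that matches ENTRY."""
    attributes = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            attributes.append({
                "level": int(match["level"]),
                "name": match["name"],
                "picture": match["pic"],
                "usage": match["usage"],
                # An entry without a PIC clause is treated as a group item.
                "is_group": match["pic"] is None,
            })
    return attributes


for attribute in parse_copybook(COPYBOOK):
    print(attribute)
```

In this sketch, the 01-level entry has no PIC clause and is therefore reported as a group item, while the 05-level entries are elementary items.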


Attributes Tab 

The Attributes tab has the following fields. Enter the details as per the description against each field.


| Sr. No. | Field Name | Description |
| --- | --- | --- |
| 1 | Level | The level number specifies the level of a data item within a record. Level numbers describe the hierarchy of the declared data items and differentiate elementary items from group items. Valid levels are 01 to 49 and the special-purpose levels 66, 77, and 88. |
| 2 | Attribute | Lists all the attributes of the data file. |
| 3 | Data Type | Describes the characteristics of the data. Six data types are available: int, nstring, number, bigint, string, and param. |
| 4 | Prec | Specifies the maximum number of characters that the selected attribute can accommodate. |
| 5 | Scale | Specifies the number of digits that the selected attribute can accommodate after the decimal point. |
| 6 | Key Type | Primary or Foreign. A primary key is a field, or a combination of fields, that is unique to each record and is used to identify a particular record. A primary key column cannot have NULL values, and no two records can have the same value in the primary key field(s). A foreign key is used to link two tables together: it is a column, or a combination of columns, whose values match a primary key in a different table. |
| 7 | Dist Value | Distinguished Value. Complex data sets usually cannot store all their data in just one record type, so they have multiple record types. The distinguished value identifies the record type and applies to every record of the corresponding record format. Note: The size of this field is specified in the Properties tab. |
| 8 | Occurs | Specifies how many times a field or group of fields is repeated. |
| 9 | Redefines | Defines the same storage with a different data description. If two or more data items are never used at the same time, the same storage can be used for each of them, so the same storage can be referred to by different data items. Note: The redefined item and the redefining item must have the same level number, and that level number cannot be 66 or 88. Do not use a VALUE clause with a redefining item. In the File Section, do not use a REDEFINES clause at the 01 level. The redefining entry must immediately follow the data description that it redefines. Because both items share the same storage, a redefining item always has the same value as the redefined item. |
| 10 | Depending On | COBOL permits tables that occur a variable number of times, depending on the Value field. This is similar to Occurs, except that the number of occurrences varies from record to record. |
| 11 | Value | In any particular record, the number of occurrences is defined by the value given in this field. This creates records that vary in size from record to record. |
| 12 | Format | COBOL defines several binary data types. The available options are: BINARY (binary format, usually in two's complement and usually 2, 4, or 8 bytes); COMP (the COBOL standard intends COMP to be implemented using the most efficient data type for a particular machine, so the compiler vendor chooses the best type for the CPU, probably binary); COMP-1 (similar to Real or Float, represented as a single-precision floating-point number); COMP-2 (similar to Long or Double, represented as a double-precision floating-point number); COMP-3 (packed decimal format, in which each digit occupies half a byte, or one nibble, and the sign is stored in the rightmost nibble); PACKED (packed decimal, usually implemented as COMP-3). The USAGE clause specifies the format in which the data is stored internally. If the USAGE clause is specified at the group level, it applies to all elementary items under the group item. Note: USAGE cannot be used with level numbers 66, 77, or 88. USAGE COMP stores data in binary or another type; USAGE COMP-1, USAGE COMP-2, and USAGE COMP-3 store data in the COMP-1, COMP-2, and COMP-3 formats described above. |
| 13 | Options | S, T, L, or R. These options declare how the sign of a numeric data type is represented. S (Signed data): declares the data type as signed. T (Trailing type): indicates that the sign is at the end of the data. L (Leading type): indicates that the sign is at the beginning of the data. R (Real Decimal): indicates whether the data is a real decimal. |
| 14 | Description | Specifies any other description of the attribute. |
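
To make the Format and Scale fields more concrete, here is a minimal Python sketch of decoding BINARY/COMP and COMP-3 values. It assumes big-endian byte order and the common packed-decimal sign nibbles (0xD or 0xB for negative, anything else treated as non-negative). Both are assumptions about the source system, and the function names are illustrative rather than part of Diyotta.

```python
from decimal import Decimal


def decode_binary(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a two's-complement integer (assumed big-endian) and apply the scale."""
    value = int.from_bytes(raw, byteorder="big", signed=True)
    return Decimal(value).scaleb(-scale)


def decode_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal field: two digits per byte, sign in the last nibble."""
    if not raw:
        raise ValueError("empty packed-decimal field")
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)      # high nibble
        digits.append(byte & 0x0F)    # low nibble
    digits.append(raw[-1] >> 4)       # last byte: high nibble is still a digit
    sign_nibble = raw[-1] & 0x0F      # last byte: low nibble carries the sign
    if any(digit > 9 for digit in digits):
        raise ValueError("invalid digit nibble in packed-decimal field")
    magnitude = int("".join(str(digit) for digit in digits))
    if sign_nibble in (0x0B, 0x0D):   # assumed negative markers
        magnitude = -magnitude
    return Decimal(magnitude).scaleb(-scale)


print(decode_comp3(bytes([0x12, 0x34, 0x5C]), scale=2))  # 123.45
print(decode_binary(b"\xff\xfe"))                        # -2
```

Using `Decimal` rather than `float` keeps the implied decimal point exact, which matters for monetary fields.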


Properties Tab

Define each field according to the description given in the following table. 

| Sr. No. | Field | Description |
| --- | --- | --- |
| 1 | Conversion Table | Specifies whether the conversion is to ASCII or CP037. |
| 2 | Distinguished Field Size | Specifies the size of the distinguished value. |
| 3 | New Line Size | Specifies the size of the newline, if the COBOL data file has newlines between records. |
| 4 | Offset | Specifies the offset after which the distinguished value is defined. |
| 5 | Header Offset | Specifies whether the header of the COBOL file needs to be ignored. |
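
The following Python sketch shows one way the properties above could be applied to raw data. It assumes that Header Offset is a byte count to skip, that New Line Size is the number of newline bytes between fixed-size records, and that Offset is the byte position of the distinguished value within a record. These interpretations, the `record_size` parameter, and the function names are assumptions made for illustration, not Diyotta's implementation. Python's built-in `cp037` codec is used for the CP037 conversion.

```python
from typing import Iterator


def split_records(data: bytes, record_size: int,
                  header_offset: int = 0, newline_size: int = 0) -> Iterator[bytes]:
    """Yield fixed-size records, skipping the header and any newline bytes."""
    step = record_size + newline_size
    for start in range(header_offset, len(data), step):
        record = data[start:start + record_size]
        if len(record) == record_size:  # ignore a truncated trailing fragment
            yield record


def distinguished_value(record: bytes, offset: int, size: int,
                        encoding: str = "cp037") -> str:
    """Decode the bytes that identify the record type."""
    return record[offset:offset + size].decode(encoding)


# Hypothetical sample: a 3-byte header, then two 7-byte records,
# each followed by a 1-byte newline, all encoded as CP037.
sample = ("HDR" + "01ALICE" + "\n" + "02BOB  " + "\n").encode("cp037")

for record in split_records(sample, record_size=7, header_offset=3, newline_size=1):
    record_type = distinguished_value(record, offset=0, size=2)
    print(record_type, record.decode("cp037"))
```

Here the first two bytes of each record act as the distinguished value, so a reader could choose a different record layout for type `01` than for type `02`.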

Creating a Data Flow

Once a Data Point and a Data Object are created, you are ready to create a Data Flow. To create a Data Flow, follow these steps:

1. On the right menu bar, click New Data Flow. This launches the New Data Flow window, prompting you to choose the processing platform.

2. In this case, select Generic.

3. Change the name of the Data Flow to suit the kind of Data Flow you are creating.

4. Choose the layer where your Data Flow will reside and click OK.


5. The Data Flow opens on the canvas, and you can start creating the design. The next step is to create a source. On the right menu bar, click the Transformations tab.

6. From the Sources panel, click the COBOL icon or drag it onto the canvas.

7. A new window opens. Select the Data Object from the list and click OK.


8. You can now see the source Data Object on the canvas. In the lower pane, all the details of the source Data Object are visible.


9. Click to select the source Data Object and on the right menu bar click Create as Target.

10. This will open a new window. Select the target database type. In this case, select File.

11. From the right menu, select the Data Point and click OK.


The canvas should now show the COBOL source and the File target in the Data Flow.


12. Click Run on the right menu bar to execute the Data Flow.

13. Select the target object. In the lower pane, click the Data tab to get a preview of the data.


You have successfully ingested data from a COBOL file into a Flat File.
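
Conceptually, the result is a delimited text file built from fixed-position COBOL fields. The sketch below shows that idea in Python, using a hypothetical layout (the attribute names, start positions, and lengths are invented for the example) and records that have already been converted to text. It only illustrates the shape of the output, not how Diyotta produces it.

```python
import csv
import io

# Hypothetical layout: (attribute name, start position, length) in characters.
LAYOUT = [("cust_id", 0, 4), ("name", 4, 10), ("balance", 14, 7)]


def to_flat_file(records, layout, delimiter="|"):
    """Return delimited text with a header row and one line per record."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow([name for name, _, _ in layout])
    for record in records:
        writer.writerow(
            [record[start:start + length].strip() for _, start, length in layout]
        )
    return buffer.getvalue()


# Fixed-width sample records, padded to match the layout above.
records = [
    "0001" + "Alice".ljust(10) + "0012345",
    "0002" + "Bob".ljust(10) + "0000099",
]
print(to_flat_file(records, LAYOUT), end="")
```

The `csv` module is used instead of a plain string join so that a field containing the delimiter is quoted correctly.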


