How to Export Data From Biginsights To MSSQL


1 Overview
2 Environment
3 Export Data from BigInsights to MSSQL
3.1 Import DataObjects from BigInsights database:
3.2 Design Creation:
3.3 Stream creation:
4 Glossary


Overview

This article explains about how to Export Data from BigInsights to MSSQL using various Extract and Load properties.
Diyotta DI Suite supports data Export from Big SQL tables, to commonly available RDBMS data platforms. To mention a few, DB2, Oracle, IBM pure data systems for Analytics (Netezza) are also supported. Typically this feature of DI Suite is used for exporting data from a data warehouse or Big SQL Native Environment.

Environment

This knowledge base document applies to Diyotta Data Integration Suite version v3.4.

Export Data from BigInsights to MSSQL

Import DataObjects from BigInsights database

To import BigInsights DataObject as a source into studio Refer the 'How to Import Dataobjects from Database' article.

Design Creation

The below steps are involved in creation of design to export data from a Big SQL table to MSSQL platform.

  1. Right click on the 'Designs' group of project Explorer and click on the 'Create Design'.
  2. You will be prompted to give a name for the new design, Name for the design is given and layer under which the design is grouped is chosen


3. The source DataObject which should have been already imported from BigInsights, is dragged on to the design canvas. You will be prompted to give a name for the instance of this data object, a name for the instance of this data object is provided. Now the source table is BigInsights, it automatically appears as an "Native" source instance as seen in the figure below-

4. Drag the MSSQL data object as a target into design canvas, if it is not already present in the database and not imported as a dataobject, there is a convenient method to create the dataobject during this design creation step as shown in the figure below, it automatically appears as an "External" Target instance. 


5. Right-clicking on the source instance of the design and choosing "Create as Target" provides user with the option of creating dataobject which is not already present in Studio.

6. The created design thus appears as shown below-

7. During the run time, for creation of target table on the Netezza database, the option on the target instance "Create Target" is chosen so that the target table is created if the same does not already exist in the database.


Stream Creation

The steps involved in creating stream for the above design is as follows,

  1. Right click on 'Streams' group under the Project Explorer and click on the 'Create Stream'.
  2. You will be prompted to give a name for the new stream, provide an appropriate name for the stream. The stream should remain in the same layer as the design

3. The design is dragged onto the stream canvas to create an instance of the same in the stream. The instance of the design is called design job. As it is dragged, a name for the design is job is chosen.


4. The next step is assign connections to the source and target instances of the design job. The target table is in MSSQL database, hence a MSSQL connection is chosen for target and the source being a BigInsights table, a BigInsights connection is chosen. These connection objects should have been created and available in studio already.

5. The DI Suite provides the user with different data extraction and Load options. This is available on clicking the arrow icon next to connection parameter drop down.



The following table explains each of the available load/extraction property in detail-

Property Name

Values

Default

Description

Extraction Type

BI Extract

BI Extract

BI Extract– This option provides the user to unload data from Big SQL by making use of BI Extract Option

Extraction Mode

File

Pipe|Pipe|The extraction mode can either be File or Pipe. Pipe is chosen for executing both the extraction and loading process simultaneously|

File Type

Delimited

Delimited

This option define that the extracted data is in text format with delimited column data

Column Delimiter

Single byte string literal or ASCII equivalent

01

This input option defines the delimiter on the extracted file data

Null Value

String literal

NULL

The value for this input property defines what string in the extracted data should be treated as null by the database e.g. NULL

Load Type

JDBC

BCP|JDBC|The load approach to load from the extracted file to that target table. For extraction type of "BI Extract" the available load type option is "JDBC"|

Batchsize

Integer

500000

JDBC/BCP execution batch size

6. In addition to the basic properties as explained above, for specific use cases that demand additional extraction and load properties as supported by Big SQL and MSSQL respectively are also available. These properties can be defined by clicking on "Add Advanced Properties" from the "Design Job Properties" dialog shown above

7. These additional properties are detailed in the table below-

Property Name

Values

Default

Description

Extraction properties

Date Style

YMD

YMD

Specifies how to interpret the date format

Text Qualifier

  • Single Quote

Double Quote|Empty|Can be a single quotes or double quotes|

Time Style

24HOUR

24HOUR

Specifies the time format used in the data file.

Time Delimiter

Colon ((smile)

Colon ((smile)

Specifies the single-byte character that separates the time components. If you specify the timeDelim option as an empty string, you must specify the hour, minutes, and optional seconds as two-digit numbers. If you specify the 12-hour format, you can precede the AM or PM token with a single space. The AM and PM tokens are not case-sensitive

Date Delimiter

Hyphen (minus)

Hyphen (minus)

Specifies the delimiter character that separates the date components used with the dateStyle option.

Escape character

Empty
Backslash ()

Empty

The character immediately following the '\' is escaped. The only supported value is '\', and the default is no escaping

8. Execute the stream by clicking on Execute icon presents in the Icon tool bar or Right click on Stream canvas and click on 'Execute'.





Glossary

  • Database: database is a way to logically group objects such as tables, views, stored procedures etc. 
  • Connection: Connections are objects that contain the source and target platform credentials, host and database details and other properties as applicable to that platforms. The connections defined are used to communicate with the source/target platforms during a stream execution.
  • Design: A design represents a dataflow form source database to target database along with Transformations.
  • Stream: A stream controls the execution of tasks such as design jobs, command job, DB command job.



 
© Copyright Diyotta Inc.