Data Input/Output Operations
CSV Operations
Read from CSV File
Reads data from a comma (or any character) separated text file. The data is then fed into the next tasks for further processing. The file can be a comma-separated file but you can also specify other delimiters such as semi-colon.
Input parameters:
File Path - Full path and file name of the file to be read. E.g. C:\data\data.csv or \server\data\data.csv
Separator - Separator between values. Defaults to “,”
Has Header - Does the file have a header? If the file has a header it is used for naming of the columns by subsequent operations. Defaults to True.
Quote Character - If values are in quotes (e.g. “AB”;“AC;AD”;“AE”), specify double-quote as the quoteCharacter.
Write to CSV File
Writes all the data passed to it be previous tasks into a comma-separated file. If there is an existing file with the same name, it will be overwritten.
Input parameters:
File Path - Full path and file name of the file to be written. E.g. c:\data\data.csv or \server\data\data.csv
Append to existing CSV - If true will add data to end of existing any CSV at the save location. If false will overwrite any CSV at save location.
Excel Operations
Excel Reader
Read operation for reading an Excel file. The data is then fed into the next tasks for further processing.
Input parameters:
File Path - The path to the Excel file. It should support both .xls and .xlsx files.
Sheet Name - The name of the sheet to read from. If it’s left blank, it will attempt to extract the name from the Cell Range, or failing that, fall back to the first sheet.
Cell Range - The range of cells to read using the standard cell reference syntax supported by Excel (e.g. A1:B20). If left blank, it will attempt to guess the range from the data available on the specified sheet.
Has Header - This is a boolean field which specifies whether the first row of data should be treated as a header and used for the column names, much like the CSV Reader.
Assign to - what should the output of this operation be called, if this is left blank then the output will be sent to the next operation in the task queue by default.
Excel Writer
Write operation for writing to an Excel file.
Input parameters:
File Path - The path to the Excel file. If the file doesn’t exist, a new spreadsheet will be created (Excel 97 format if the extension is .xls, else the Excel 2007 format).
Sheet Name - The name of the sheet to write to. If it’s left blank, it will attempt to extract the name from the Cell Name reference. If that is also blank it will create a new sheet.
Cell Name - The starting cell position at which the data will be written, using the standard cell reference syntax supported by Excel. If left blank, it will use the top left corner (i.e. A1).
Include Header - If this is set to true (the default) the column names are written out as a header in the first row.
Assign to - what should the output of this operation be called, if this is left blank then the output will be sent to the next operation in the task queue by default.
Style Background
Style operation for setting the background colour.
Input parameters:
Cell Range - The range of cells to which the style should be applied.
Colour - The name of the colour.
Condition - The condition takes the form of a standard Excel boolean expression. The style will only be applied to a cell in the given range if the expression evaluates to true for that cell.
Style Boarders
Style operation for setting the cell borders.
Input parameters:
Cell Range - The range of cells to which the style should be applied.
Arrangement - How the borders will be applied: outline or grid.
Style - The style of the borders: thin, medium, dashed, hair, thick, double, dotted, medium dashed, dash dot, medium dash dot, dash dot dot, medium dash dot dot, slanted dash dot.
Colour - The colour of the borders.
Style Font
Style operation for setting the font.
Input parameters:
Cell Range - The range of cells to which the style should be applied.
Font Size - The font size in points.
Font Colour - The colour of the font, in the Excel colour palette e.g. Blue
Bold - Whether the font should be bold or not. (true/false)
Italic - Whether the font should be italics or not. (true/false)
Condition - The condition takes the form of a standard Excel boolean expression. The style will only be applied to a cell in the given range if the expression evaluates to true for that cell.
Assign to - what should the output of this operation be called, if this is left blank then the output will be sent to the next operation in the task queue by default.
Console Operations
Database Operations
DB Reader
Reads data from a database based on the provided SQL statement. The data is then fed into the next tasks for further processing.
Input parameters:
DB driver - Database driver. E.g. org.hsqldb.jdbc.JDBCDriver
DB connection String - Connection string to the database. E.g. jdbc:hsqldb:hsql://localhost:9001
Username - User name for login into the database.
Password - Password for login into the database.
SQL query - SQL query to be executed. E.g. Select * from Table1.
Assign to - what should the output of this operation be called, if this is left blank then the output will be sent to the next operation in the task queue by default.
DB Writer
Write input data to a database table. If the table doesn’t exist the operation will fail with an exception.
Input parameters:
DB driver - Database driver. E.g. org.hsqldb.jdbc.JDBCDriver
DB connection String - Connection string to the database. E.g. jdbc:hsqldb:hsql://localhost:9001
Username - User name for login into the database.
Password - Password for login into the database.
Target table - Name of the database table to write to.
Assign to - what should the output of this operation be called, if this is left blank then the output will be sent to the next operation in the task queue by default.
Get Table Schema
Gets table structure from the database
Input parameters:
DB driver - Database driver. E.g. org.hsqldb.jdbc.JDBCDriver
DB connection String - Connection string to the database. E.g. jdbc:hsqldb:hsql://localhost:9001
Username - User name for login into the database
Password - Password for login into the database
SQL query - Query that will retrieve the schema
SQL Statement
Performs a task on or retreves data from a database.
Input parameters:
DB driver - Database driver. E.g. org.hsqldb.jdbc.JDBCDriver
DB connection String - Connection string to the database. E.g. jdbc:hsqldb:hsql://localhost:9001
Username - User name for login into the database
Password - Password for login into the database
SQL statement - The SQL statement to be executed
Text Reader Operations
This operation is called “Regex Reader” in the newest version of ORQA.
Text Reader
Parses almost any kind of text file into a record set using the provided regular expression parameters. The data is then fed into the next tasks for further processing.
Input parameters:
File Path - Full path and file name of the file to be read.
Has Header - Does the file have a header? If the file has a header it is used for naming of the columns by subsequent operations. Defaults to True.
Row Separator - A regular expression defining the character sequence that separates rows.
Column Separator - A regular expression defining the character sequence that separates columns.
Row Extract - An optional regular expression whose capturing group defines the part of each row that you wish to retain.
Column Extract - An optional regular expression whose capturing group defines the part of each column that you wish to retain.
Row Filter - An optional regular expression which, if matched, will result in the input line being excluded from further processing.
Examples
Example 1:
Your data:
Name#@Surname#@DOB
Jane#@Johnson#@15-12-78
This can easily be parsed with a basic separator:
Column Separator: #@
Example 2:
Your data:
“Name”#@“Surname”#@“DOB”
“Jane”#@“Johnson”#@“15-12-78”
So for that we’d also need a Column Extract parameter:
Column Separator: #@
Column Extract: “(.*)”
Example 3:
Your data:
“Name”#@“Surname”#@“DOB”
“Jane#@Jane”#@“Johnson”#@“15-12-78”
We can solve this by including the quotes as part of the separator (with zero-width lookbehind and lookahead) which makes it more specific:
Column Separator: (?<=“)#@(?=”)
Column Extract: “(.*)”
Example 4:
Your data:
For this we need to filter out certain rows (anything comprised entirely of pluses and minuses), extract only part of the row (removing the outer bars), and extract part of the column (trimming whitespace):
Row Filter: [-+]+
Row Extract: \|(.*)\|
Column Separator: \|
Column Extract: \s*(.*?)\s*
Example 5:
Your data:
—- Data for processing —-
Name,Address,DOB; Gender
John,45 Some Road,1984;Male
Emma,63 Some other road,1974;Female
Peter,258 Yet another road,1986;Male
—- End of Data —-
You can simply specify these:
Column Separator: ,|;
Row Filter: —-.*
Learn More
Learn about other ORQA operations
Control Flow
Data Aggregation
Data Filters
Data Input/Output
Data Reconciliation
Data Transformations
File System
Web
XML
Groovy