CSV Delimited and Fixed Format
actionETL uses the http://www.filehelpers.net/ library via the FileHelpers NuGet package to read and write ETL dataflow rows to and from delimited (e.g. CSV) and fixed format files, Streams, and strings.
This allows you to create FileHelpers ETL applications, fully integrated with all the other actionETL dataflow and control flow capabilities.
Note
You must add a dependency to the FileHelpers
package in your project since
you will be using FileHelpers
attributes to specify what ETL data to read or write.
- Source workers:
- Target workers and factories:
The data format is specified by:
- Fields and auto properties (including their order) in the row class, which corresponds to the data fields to read or write
- Adding attributes to the row class and its members
- Worker
Engine
property, which provides access to the underlying FileHelpers class performing the processing - Target workers
HeaderText
andFooterText
properties, to set custom headers and footers
Important
- Columns are not mapped by name
- When importing, use the
IgnoreFirst(int numberOfLines)
attribute on the row class to exclude any header lines - By default the CSV file column order must match the order of the row class fields and auto-properties. You can override this with the FieldOrder attribute.
- When importing, use the
- Auto properties are supported, but non-auto properties are not supported by FileHelpers
using FileHelpers;
using System;
[DelimitedRecord(",")]
public class Category
{
public int CategoryId { get; set; }
public string CategoryName { get; set; }
[FieldQuoted]
public string CategoryDescription { get; set; }
[FieldConverter(ConverterKind.Date, "yyyy-MM-dd")]
public DateTime CreationDate { get; set; }
}
The above row class specifies:
[DelimitedRecord(",")]
- A comma delimited (CSV) format[FieldQuoted]
- TheCategoryDescription
column can be quoted with"
- The column order is
CategoryId
,CategoryName
,CategoryDescription
,CreationDate
This examples uses the above Category
class to read a
CSV file
into a database table, after truncating the database table:
using actionETL;
using actionETL.Adb;
using actionETL.Adb.SqlClientExternal;
using actionETL.FileHelper;
class Program
{
static void Main()
{
const string categoryTableName = "Media.Category";
new WorkerSystem()
.Root(ws =>
{
var acs = AdbSqlClientProvider.Get()
.CreateConnectionString(ws.Config["DemoDatabase"]);
var truncate = new AdbTableNonQueryWorker(ws, acs.CreateConnectionBuilder()
, AdbTableNonQueryOperation.TruncateTable, categoryTableName);
new FileHelperFileSource<Category>(ws, "Read CSV"
, () => truncate.IsSucceeded, ws.Config["CategoryFilename"])
.Output.Link.AdbInsertTarget("Insert", acs.CreateConnectionBuilder()
, categoryTableName);
})
.Start()
.Exit();
}
}
To instead write delimited files, see this example.
Row Class Attributes
Note:
- Either of
DelimitedRecord
orFixedLengthRecord
must be specified - D/F shows if the attribute is applicable to Delimited files and/or Fixed format files
- S/T shows if the attribute is applicable to Source workers and/or Target workers
- In the external examples below, you don't need to use the FileHelpers engine to explicitly read and write records, that is done automatically by the actionETL workers
D/F | S/T | Attribute | Description |
---|---|---|---|
D, F | S | ConditionalRecord(RecordCondition condition, string conditionSelector) | Set which records to include or exclude while reading. Also see RecordCondition. |
D | S, T | DelimitedRecord(string delimiter) | Delimited format, e.g. CSV. Read and write external examples. |
F | S, T | FixedLengthRecord(FixedMode fixedMode) | Fixed length format, also see FixedMode, and external Mapping Class, read, read, and write examples. |
D, F | S | IgnoreEmptyLines(bool ignoreSpaces) | Skip empty lines. true to also ignore whitespace lines. |
D, F | S | IgnoreFirst(int numberOfLines) | Skip initial lines |
D, F | S | IgnoreLast(int numberOfLines) | Skip ending lines |
D, F | S, T | IgnoreInheritedClass | Ignore members inherited from base classes |
Row Member Attributes
D/F | S/T | Attribute | Description |
---|---|---|---|
F | T | FieldAlign(int length) | Fixed format AlignMode |
D | T | FieldCaption(string caption) | Sets header text for this field. Default is the member name. |
D, F | S, T | FieldConverter(...) | The ConverterKind or other options for data conversion, see external examples with predefined and custom converters. |
D | S, T | FieldDelimiter(string separator) | Use a different field end delimiter for this field |
F | S, T | FieldFixedLength(int length) | The length of a fixed format field |
D, F | S, T | FieldHidden | Ignore the member completely. Also see FieldValueDiscarded . |
D, F | S, T | FieldInNewLine | Records have multiple lines, and this field has a new line before this value |
D | S | FieldNotEmpty | Error on reading an empty field |
D, F | S | FieldNullValue(...) | Default value in case of null . External example. |
D, F | S | FieldOptional | Field is optional, and will default to null . Consider instead using a nullable field type, see external example. |
D | S, T | FieldOrder(int order) | Read or write columns in increasing order. If specified, must be on all fields. External example. |
D, F | S, T | FieldQuoted(...) | Quoting and single vs. multi line field |
F | S | FieldTrim(...) | Fixed format TrimMode. External example. |
D | S | FieldValueDiscarded | Read the field from the source, but don't populate the row class. Also see FieldHidden . |
FileHelpers Engine
The FileHelper workers (only) use the FileHelperAsyncEngine class to read and write the ETL data, which is also available in the workers as the Engine property. Use:
- Encoding to set the text
encoding, e.g.
UTF8
. Default is the system's current ANSI code page. - NewLineForWrite to set the newline characters when writing
- Options to set
RecordOptions at runtime
(instead of with attributes):
IgnoreCommentedLines
,IgnoreEmptyLines
,IgnoreFirstLines
,IgnoreLastLines
, andRecordCondition
- Events to preprocess and post process the ETL data records, both of which happen separate from the actionETL dataflow. See external examples INotifyRead, INotifyWrite, Before/After Read Event Handling, and Before/After Write Event Handling
Note
Retrieving and inspecting the Engine
property is thread-safe, but modifying the Engine
state is not thread-safe, and should only be done from a single thread, before
the worker has started.