Search Results for

    Show / Hide Table of Contents

    XLSX (Excel) Spreadsheets

    You can read and write XLSX (Microsoft® Excel) spreadsheet files in your ETL dataflow with:

    • Source worker: XlsxSource<TOutput>
    • Target worker and factory: XlsxTarget<TInput>, XlsxTargetFactory

    This allows you to create Excel ETL applications, fully integrated with all the other actionETL dataflow and control flow capabilities.

    It uses the https://github.com/JanKallman/EPPlus library via the EPPlus NuGet package version 4.x (which is free) to process the XLSX (Excel) files. No driver installation is required, which avoids otherwise common issues with 32-bit vs. 64-bit drivers.

    Apart from this article, also see the ReadSortWriteXlsx Sample.

    Example

    In this simple Excel ETL application we read dataflow rows from one XLSX (Excel) file and write them to a new XLSX file. You can easily add transformations to the dataflow rows before writing them out.

    The row class specifies possible columns to read and write, and column names to optionally map to:

    using System;
    
    public class Budget
    {
        public DateTime Month { get; set; }
        public decimal OfflineRevenue { get; set; }
        public decimal OnlineRevenue { get; set; }
    }
    

    The worker system reads all row columns from the top left corner of the first sheet in an XLSX (Excel) file, and writes to a different XLSX file, starting at cell "B2" on sheet "Brief", overwriting any existing file:

    using actionETL;
    using actionETL.EPPlus;
    
    public static partial class XlsxReadWrite
    {
        public static SystemOutcomeStatus Run()
        {
            return new WorkerSystem()
                .Root(root =>
                {
                    new XlsxSource<Budget>(root, $"Read XLSX"
                        , @"Src/XlsxReadWrite/Budget.xlsx"
                        , xsc => xsc.FromAllByName())
    
                    // Add any dataflow transformations here
    
                    .Output.Link.XlsxTarget("Write XLSX"
                        , xtc => xtc.Address("Brief!B2").FromAll()
                        , @"Src/XlsxReadWrite/BriefBudget.xlsx"
                        , XlsxTargetWriteOption.Overwrite);
                })
                .Start();
        }
    }
    
    Note
    • Column name matching is ordinal case insensitive, but a case sensitive match takes precedence over a case insensitive match.
    • Mapping commands that can map multiple columns ignore already mapped columns. Use the Name(string, string) command to map the same column multiple times.

    XlsxSource Configuration

    The constructor parameter xlsxSourceCommands and the XlsxSourceCommands property defines mappings between the XLSX (Excel) spreadsheet file source columns and downstream row columns. Note that multiple commands can be chained together, e.g.:

    xsc => xsc
        .Address("'My Sheet'!B5")
        .ToNames("Category", "Subcategory")
        .Name("Id", "ProductId").Ignore()
    

    IXlsxSourceCommand methods detail all the mapping commands.

    It supports reading from a rectangular area on one sheet in the XLSX (Excel) file. The default address is the top left cell (i.e. "A1") on the first sheet.

    If the HeaderRow property is false or the column header is missing, use ordinal position or source column addresses ("B", "AD" etc.) to specify which columns to import.

    Note that some of the mapping commands explicitly control in which order to read and populate columns, while others use OrderAttribute added to the row members. You can mix the two approaches.

    By default, any column errors are rejected to the ErrorOutput.

    Furthermore, XlsxSource properties allow e.g. providing a file password.

    XlsxTarget Configuration

    The factory method parameter xlsxTargetCommands) and the XlsxTargetCommands property defines mappings between upstream row columns and the XLSX (Excel) spreadsheet file target columns, where to write, and what formatting to use. Note that multiple commands can be chained together, e.g.:

    xtc => xtc
        .Address("'My Sheet'!B5")
        .FromNames("Category", "Subcategory")
        .Name("Date").Format("yyyy-mm-dd")
    

    IXlsxTargetCommand methods detail all the mapping commands.

    It supports writing to a rectangular area on one sheet in the XLSX (Excel) file. The default address is the top left cell (i.e. "A1") on the first sheet. If there is no existing sheet, a new sheet called "Sheet1" will be created.

    Note that some of the mapping commands explicitly control in which order to write columns, while others use OrderAttribute added to the row members. You can mix the two approaches.

    Furthermore, XlsxTarget properties allow e.g. providing a file password, and specifying an XLSX (Excel) file to use as a template file.

    Supported Data Types

    The following types are supported as dataflow column types when reading and writing XLSX (Excel) files:

    .NET Type Display Name Nullable Type
    System.Boolean System.Boolean?
    System.Byte System.Byte?
    System.DateTime System.DateTime?
    System.Decimal System.Decimal?
    System.Double System.Double?
    System.Int16 System.Int16?
    System.Int32 System.Int32?
    System.Int64 System.Int64?
    System.SByte System.SByte?
    System.Single System.Single?
    System.String
    System.UInt16 System.UInt16?
    System.UInt32 System.UInt32?
    System.UInt64 System.UInt64?

    See Also

    • Common Tasks
    • Release Notes
    • Getting Started
    • Worker System
      • Configuration
    • Workers
    • Dataflow
    • Data Formats
    • ReadSortWriteXlsx Sample
    • https://github.com/JanKallman/EPPlus
    In This Article
    Back to top Copyright © 2021 Envobi Ltd