Search Results for

    Show / Hide Table of Contents

    SQL Database Access

    The Adb infrastructure (in actionETL.Adb and its child namespaces) is used in ETL applications for accessing databases via SQL, both on-premises, and in the cloud. This includes both relational SQL databases, NoSQL databases, and many other querying systems, accessible via dedicated providers or via the Adb ODBC Provider.

    Adb consists of wrappers around the standard .NET database routines residing in System.Data and similar namespaces, adding ETL functionality such as:

    • Get tables and columns available
    • Support for provider-specific data types, which are very useful for avoiding conversion issues
    • Enhanced automated mapping between .NET CLR types and database types
    • Encapsulating database connections, to support both short-term connections that are opened and closed as needed inside a worker, and connections that are kept open until explicitly closed by the library user, e.g. for using a single transaction or temporary table across multiple workers
    • Execute predefined table commands, e.g. delete all rows, truncate, drop, exists
    • Provide additional information about the database, which assists with writing provider-independent code, so that a worker (such as AdbInsertTarget) can support all Adb database providers.
      • Note: This aspect is mainly used when developing your own custom Adb workers
    Note

    The wrappers that have a corresponding System.Data class (e.g. AdbCommand that wraps a DbCommand) behaves as their underlying .NET classes do, except where otherwise noted

    Database Support

    For Adb workers to access a database, actionETL requires that:

    • An underlying .NET database provider for the database is available, and
    • actionETL has implemented an Adb wrapper for that database provider

    An actionETL application can use different Adb providers for different workers, including for the same data source.

    There are targeted Adb wrappers for specific databases (MariaDB, MySql, PostgreSQL, SQLite, SQL Server), as well as universal ODBC wrappers compatible with most ODBC data sources. Please see Adb Supported Databases for full details.

    Data Type Support

    actionETL supports both regular .NET data types (int, string etc.), as well as (when using a dedicated provider e.g. "Microsoft.Data.SqlClient") many database specific data types, e.g. System.Data.SqlTypes.SqlDateTime. The latter are very useful for avoiding conversion issues, for further details see:

    • Dataflow Columns
    • Database Parameter Data Types
    • Adb Supported Databases

    Adb Workers

    The type names of all Adb workers (which are listed below) are prefixed with "Adb".

    Non-Dataflow Adb Workers

    Worker Description
    AdbExecuteNonQueryWorker Execute a query on an SQL database (e.g. INSERT, UPDATE, DELETE and DML statements) without returning any result set. RecordsAffected is returned.
    Examples: Adb Commands and Parameters, Adb Keep Open Connection, Slowly Changing Dimension Example.
    AdbExecuteScalarWorker<TResult> Execute a query on an SQL database and return a single result value.
    Examples: See AdbExecuteNonQueryWorker above.
    AdbTableNonQueryWorker Execute a query on a table-like object in an SQL database: DELETE all rows (with RecordsAffected), IF EXISTS, DROP, TRUNCATE.
    Requires the Adb provider to support IAdbTableCommandService.
    Examples: SQL Database Access, Samples.
    AdbTransactionActionWorker Creates an AdbKeepOpenConnection and a local AdbTransaction, runs a user supplied callback inside the transaction, and commits/rolls back and disposes the transaction and connection. Example: Adb Keep Open Connection.

    Adb Dataflow Sources

    Worker Description
    AdbDataReaderSource<TOutput> Execute an SQL query on a database, and pass result set rows to the downstream worker.
    Examples: AdbConnectionString Example, Dataflow Lookups, Slowly Changing Dimension Example.

    Adb Dataflow Targets

    Factory Class Description and Worker Class
    AdbExecuteNonQueryTargetFactory Execute an SQL query for each incoming row. Worker: AdbExecuteNonQueryTarget<TInputError>.
    AdbInsertTargetFactory Inserts incoming rows into an SQL table or view. Worker: AdbInsertTarget<TInputError>.
    Examples: SQL Database Access, column mapping example 1 and example 2, Batch Insert details SQL Database Access, Process Incoming Files Example, AggregateCsvCreateInsertTable Sample
    AdbMySqlConnectorBulkInsertTargetFactory High performance bulk insert of incoming records into a MySQL compatible table or view using MySqlConnector.MySqlBulkCopy. Worker: AdbMySqlConnectorBulkInsertTarget<TInput>.
    Examples: See the similar AdbInsertTarget in SQL Database Access and column mapping examples, see the similar AdbSqlClientBulkInsertTarget in Slowly Changing Dimension Example.
    AdbSqlClientBulkInsertTargetFactory High performance bulk insert of incoming records into a SQL Server table or non-partitioned view using SqlBulkCopy. Worker: AdbSqlClientBulkInsertTarget<TInput>.
    Examples: See the similar AdbInsertTarget in SQL Database Access and column mapping examples, Slowly Changing Dimension Example.

    Classes and Interfaces

    These are the main Adb infrastructure types:

    Name Description
    AdbProvider Wraps an ADO.NET database provider and supplies related services. Creates AdbConnectionString, AdbConnectionBuilder, AdbKeepOpenConnectionBuilder, and AdbCommandBuilder instances.
    AdbConnectionString A read-only connection string for a particular provider. Creates AdbConnectionBuilder, AdbKeepOpenConnectionBuilder, and AdbCommandBuilder instances.
    AdbConnectionBuilder Builder for a single AdbConnection, used by a single worker. Also creates AdbCommandBuilder instances.
    AdbKeepOpenConnectionBuilder Builder for a single AdbKeepOpenConnection connection that can be kept open across multiple workers. Also creates AdbCommandBuilder instances.
    IAdbConnectionBuilder Interface for a connection builder, use this for method parameters etc. (rather than AdbConnectionBuilder and AdbKeepOpenConnectionBuilder classes).
    AdbConnection A single connection for a single worker. Can be opened, closed, and disposed. Similar to a DbConnection with extra information and functionality. Creates AdbCommand instances.
    AdbKeepOpenConnection A single connection that can be kept open across multiple workers, closed, and disposed. Similar to a DbConnection with extra information and functionality. Creates AdbCommand instances.
    IAdbConnection Interface for a connection, implemented by AdbConnection and AdbKeepOpenConnection classes.
    AdbCommandBuilder Builder for a single AdbCommand database command and its parameters.
    AdbCommand A database command, similar to DbCommand, with extra information and functionality. Executed against the database.
    AdbParameterCollection Collection of AdbParameter, similar to DbParameterCollection. Used by AdbCommandBuilder and AdbCommand. Main way to create, add, and inspect parameters.
    AdbParameter Database parameter for passing data to and from a database, similar to DbParameter. Includes methods for setting parameter database type, which affects how values are converted.
    AdbTransaction A transaction instance, similar to a DbTransaction with extra information and functionality. See Adb Keep Open Connection.
    AdbDataSourceInformation Data source information: quoting character, parameter markers etc.
    AdbTableInformation Optional service that checks if table exists, returns table column information, manipulates table identifiers.
    AdbTableCommand Optional service that executes table commands (truncate, drop, delete all rows).

    Builders and Factories

    Adb workers that require a single database connection take as a parameter one of the following, from which the worker gets a single actual connection:

    • IAdbConnectionBuilder connection builder (from AdbConnectionBuilder or AdbKeepOpenConnectionBuilder)
    • AdbCommandBuilder command builder (with an embedded connection builder)

    Workers that require many or an unknown number of database connections instead take an AdbConnectionString connection builder factory as a parameter, from which the worker can create many connection builders, and from them, multiple actual connections.

    In particular, these builders and factory specifies which database provider and connection string to use when the actual connection is later created.

    This separation allows workers to be agnostic about the type of the connection - Adb workers execute create/open/close/dispose commands on all actual Adb connections, and only the library user knows or cares what type (AdbConnection or AdbKeepOpenConnection) the connection is.

    A minor second benefit is that the most common cases do not require the library user to dispose the connection builder or factory, only the more rarely used AdbKeepOpenConnectionBuilder requires explicit disposal.

    Note

    An ETL worker can of course take more than one Adb parameter, e.g. an AdbConnectionString for creating multiple connections to one particular database, as well as an IAdbConnectionBuilder for accessing a different database.

    AdbConnectionBuilder Example

    This ETL example loads a CSV file into a database table, which includes passing connection details (using a connection builder) to a database worker:

        using actionETL;
        using actionETL.Adb;
        using actionETL.Adb.SqlClientExternal;
        using actionETL.FileHelper;
        using FileHelpers;
    
        public static partial class CreateWorkerViaCallback
        {
            [DelimitedRecord(",")]
            private class Category
            {
                public int CategoryId;
                public string CategoryName;
            }
    
            public static SystemOutcomeStatus RunExample()
            {
                var provider = AdbSqlClientProvider.Get();
    
                return new WorkerSystem()
                    .Root(root =>
                    {
                        new FileHelperFileSource<Category>(root, "Read CSV"
                            , @"Src/CreateWorkerViaCallback/Category.csv")
    
                        .Output.Link.AdbInsertTarget("Insert"
                            , provider.CreateConnectionBuilder(root.Config["SqlServer"])
                            , "dbo.Category");
                    })
                    .Start();
            }
        }
    
        /* The example assumes the actionetl.aconfig.json file contains:
    {
      "configurations": [
        // Notice escaped backslash:
        { "SqlServer": "Trusted_Connection=True;server=(localdb)\\MSSQLLocalDB;database=MyDatabase" }
      ]
    }
        */
    
        /* The example assumes the following table already exists:
            CREATE TABLE Category
            (
                CategoryId int,
                CategoryName varchar(50)
            )
         */
    

    Key points:

    • The Category class is used for:
      • The CSV column specification
      • The schema and data rows passed between the two dataflow workers
      • The schema for inserting into the database table "Category"
        • By default, AdbInsertTarget maps dataflow columns with database columns by name. Custom mappings can also be specified.
    • An AdbConnectionBuilder is created from the connection string named "SqlServer" in the app.config (or web.config) file. It is passed to the worker, which uses it to create a database connection.

    Providers

    Any Adb database work starts with getting an AdbProvider instance (e.g. a SQL Server instance or MySQL or MariaDB instance), and use it to create any required Adb connection strings, Adb connection builders, and Adb command builders.

    Note that there can be multiple 'Adb provider' instances for a single type of data source, each with different settings. For instance, use Get(Boolean, Boolean) to get a SQL Server provider instance that uses ANSI double quotes for quoting identifiers.

    Provider Services

    An Adb provider instance consists of a set of services, and the user can easily create a new, modified provider instance using AdbProvider methods, by supplying a new or modified service, which will change the behavior of the workers that use them.

    Some services (IAdbInsertStatementService, IAdbTableCommandService and IAdbTableInformationService) are optional, and any workers that rely on a service that a particular provider does not supply are unsupported. Dedicated providers such as AdbSqlClientProvider and AdbMySqlClientProvider will typically provide all services, while universal providers such as AdbOdbcProvider will typically lack at least IAdbTableCommandService. See the documentation for the individual services for details.

    Batch Insert

    By default, the AdbInsertTarget<TInputError> worker uses batch insert (multiple rows per insert statement) and multiple statements per periodic transaction to improve throughput while reducing database round-trips as well as client and database load.

    Batch and transaction sizes can be set, and explicit transactions can be disabled altogether. In decreasing precedence:

    1. If the AdbInsertTarget.ErrorOutput port is linked to a downstream worker, the creation of transactions is automatically disabled.
    2. For individual workers, via SetRowLimits(Int32, Int64).
    3. For all or individual workers, via the AConfig Configuration Facility using the InsertStatementValuesPerBatch and InsertStatementValuesPerTransaction string keys.
    4. For all AdbInsertTarget<TInputError> workers using a specific Adb provider instance, set the provider limits using WithSql92InsertStatement(Int32, Int32, Int64). To e.g. get single row batches without explicit transactions:

      var provider = AdbSqlClientProvider.Get()
          .WithSql92InsertStatement(1, 0, -1);
      
    Note
    • While 2. above (SetRowLimits(Int32, Int64)) uses 'rows per batch and transaction', 3. and 4. above use 'values (or parameters) per batch and transaction', since that is more useful across different row widths. To get the number of values, multiply the row count with the number of columns you are inserting in each row.
    • An insert error will reject the whole multi-row batch or transaction, and the error message won't explicitly specify which row caused the error. Only the failed batch insert will be sent to the ErrorOutput port, even if transactions are used and some or all previous batches are rolled back.
    • AdbInsertTarget participates in any pre-existing transaction on the connection, which will then include all rows the worker inserts.
      • Using both a transaction and linking the ErrorOutput port to a downstream worker is incompatible, and would fail the worker.
    • Bulk Insert (supported by the Adb MySqlConnector Provider and the Adb SqlClient Provider) is much faster than batch insert for multi-row inserts.
    • Furthermore, Adb SqlClient Provider batch insert performance is currently impacted by an issue in the underlying Microsoft.Data.SqlClient provider. Hopefully this will soon be resolved by the SqlClient team, but until then you can alleviate the issue by doing either of:
      • Use Bulk Insert
      • Use the Adb ODBC Provider for SQL Server) for the AdbInsertTarget<TInputError> worker

    Bulk Insert

    Bulk insert also improves throughput while reducing database round-trips as well as client and database load, typically even more than batch insert.

    Bulk insert however requires a database specific facility. This is currently supported for SQL Server (AdbSqlClientBulkInsertTargetFactory) and MySQL compatible databases with the MySqlConnector provider (AdbMySqlConnectorBulkInsertTargetFactory).

    Note

    With bulk insert an insert error will reject the whole batch, and the error message won't explicitly specify which row caused the error.

    See e.g. the Slowly Changing Dimension Example which uses bulk insert workers:

    var insertScd = lookup.FoundOutput.Link.AdbSqlClientBulkInsertTarget(
          "INSERT SCD Rows"
        , rcc => rcc.AutoName(type2Tri.ColumnCount)
        , _adbConnectionString.CreateConnectionBuilder()
        , SqlBulkCopyOptions.Default, "dbo.Update_Dim" + _tableName);
    

    Connection Strings

    You choose where to store and retrieve your connection strings from. actionETL does however have direct support for retrieving them from:

    • The AConfig Configuration Facility, as in the above example, which uses the worker system Config property and is the primary and often simplest option
    • The .NET Framework Run-time Settings. Note that this is only supported on .NET Framework, and the 'providerName' setting is not used or checked:
    <connectionStrings>
      <add name="SqlServer"
        providerName="Microsoft.Data.SqlClient"
        connectionString="Trusted_Connection=True;server=(local);Database=actionETLDocumentationDatabase" />
    </connectionStrings>
    

    Retrieving from app.config / web.config uses one of the *FromAppConfig() methods (also see ConnectionStringSettingsCollectionExtensions):

    var connectionBuilder = provider.CreateConnectionBuilderFromAppConfig("SqlServer")
    

    http://www.connectionstrings.com/ is a good external resource for how to set connection strings for different databases and versions.

    Thread Safety

    A connection builder can only create a single actual connection. A particular actual connection (i.e. an AdbConnection or AdbKeepOpenConnection instance) must never be used by more than one thread at any one time. This means that:

    • An AdbConnectionBuilder instance must be passed to no more than a single Adb worker
    • An AdbKeepOpenConnectionBuilder instance can be passed to multiple workers, but the library user must ensure that no more than one of those workers run at the same time:
      • Create these workers serially (i.e using normal, sequential code)
      • Start constraints, worker grouping, and (less commonly) fully blocking dataflow workers must ensure they don't run in parallel

    AdbConnectionString Example

    This example truncates the database target table before copying all rows from a source table to the target table, so we use an AdbConnectionString to create multiple connection builders. Instead of hard coding TRUNCATE TABLE, we use AdbTableNonQueryWorker, which can also handle DROP, IF EXISTS, and DELETE all rows.

    using actionETL;
    using actionETL.Adb;
    using actionETL.Adb.SqlClientExternal;
    
    public static partial class AdbSynchronizeTable
    {
        private class Category
        {
            public int CategoryId;
            public string CategoryName;
        }
    
        private const string _sourceTableName = "dbo.CategorySource";
        private const string _targetTableName = "dbo.Category";
    
        private static void RunExample()
        {
            new WorkerSystem()
                .Root(root =>
                {
                    var acs = AdbSqlClientProvider.Get()
                        .CreateConnectionString(root.Config["SqlServer"]);
    
                    var truncate = new AdbTableNonQueryWorker(root
                        , acs.CreateConnectionBuilder()
                        , AdbTableNonQueryOperation.TruncateTable, _sourceTableName);
    
    
                    _ = new AdbDataReaderSource<Category>(root, "Extract Table"
                        , () => truncate.IsSucceeded, acs.CreateConnectionBuilder()
                        , "SELECT CategoryId, CategoryName FROM " + _targetTableName
                        , cmca => cmca.AutoName())
    
                    .Output.Link.AdbInsertTarget("Insert Table"
                        , acs.CreateConnectionBuilder(), _targetTableName);
                })
                .Start()
                .ThrowOnFailure();
        }
    }
    

    You can obviously expand this simple example. At the other end of the scale, also see these two large examples: Process Incoming Files Example and Slowly Changing Dimension Example.

    See Also

    • Common Tasks
    • Release Notes
    • Getting Started
    • Worker System
      • Configuration
    • Workers
    • Dataflow
      • Dataflow Columns
    • SQL Database Access
      • Adb Supported Databases
        • Adb MySqlClient Provider
        • Adb MySqlConnector Provider
        • Adb Npgsql Provider
        • Adb ODBC Provider
        • Adb ODBC Provider for SQL Server
        • Adb SqlClient Provider
        • Adb SQLite Provider
      • Adb Keep Open Connection
      • Adb Commands and Parameters
      • Adb Information
      • Custom Adb Development
      • Custom Non-Adb Development
    • .NET Framework Data Providers (external)
    • Data Formats
    • Transfer Protocols
    In This Article
    Back to top Copyright © 2021 Envobi Ltd