    Class DictionaryLookupTransform<TInputOutputError, TKey, TValue>

    A dataflow worker that performs a lookup in an IReadOnlyDictionary<TKey,TValue> collection for each Input row, optionally modifying the input rows, before sending them to the output or error port. All ports have the same row type.

    Note: Use the factory methods in DictionaryLookupTransformFactory to create instances of this class.

    To customize the key lookup, e.g. to make it case-insensitive, either process the row data in the selectRowKeyFunc callback so that it matches the case of the lookup reference keys, or create the underlying Dictionary with a case-insensitive comparer (see e.g. Dictionary<TKey,TValue>(IEqualityComparer<TKey>)).
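
    The two approaches above can be sketched with plain .NET collections. This is a minimal illustration rather than the library's own code: the class name, the method names and the sample data are hypothetical, and NormalizeKey only shows the kind of processing a selectRowKeyFunc-style callback might perform.

```csharp
using System;
using System.Collections.Generic;

public static class CaseInsensitiveLookupSketch
{
    // Option 1: create the dictionary with a case-insensitive comparer,
    // so "se", "SE" and "Se" all resolve to the same entry.
    public static Dictionary<string, string> CreateCountryNames()
    {
        return new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            ["SE"] = "Sweden",
            ["NO"] = "Norway",
        };
    }

    // Option 2 (hypothetical helper): keep a default, case-sensitive dictionary
    // with upper-case keys, and convert each row key to upper case before the lookup.
    public static string NormalizeKey(string rowKey) => rowKey.ToUpperInvariant();
}
```

    The comparer approach keeps the row data untouched, while the normalizing approach works with any existing dictionary whose keys already follow a known casing.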

    When supplied with a pre-populated dictionary, the worker by default performs a fully cached lookup, which is both the most common configuration and the easiest to set up.

    It is however also possible to implement a partially cached lookup by starting with an empty or partially pre-populated dictionary, and then adding missing dictionary items on the fly in the notFoundKeyFunc callback. This avoids loading dictionary items that will never be used, which can be advantageous when it is impractical to retrieve all keys and lookup values ahead of time.
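
    The idea behind a partially cached lookup can be sketched independently of the worker. The class below is a hypothetical illustration using only standard .NET types; loadFromSource stands in for whatever slow lookup (database, service) would supply missing items, and is assumed to return null for unknown keys.

```csharp
using System;
using System.Collections.Generic;

public sealed class PartialCacheSketch
{
    // Starts empty; items are added only when a row actually needs them.
    private readonly Dictionary<int, string> _cache = new Dictionary<int, string>();

    // Hypothetical slow source; assumed to return null when the key is unknown.
    private readonly Func<int, string> _loadFromSource;

    // Counts how often the slow source was consulted.
    public int SourceCalls { get; private set; }

    public PartialCacheSketch(Func<int, string> loadFromSource)
    {
        _loadFromSource = loadFromSource ?? throw new ArgumentNullException(nameof(loadFromSource));
    }

    public bool TryLookup(int key, out string value)
    {
        // Fast path: the item is already cached.
        if (_cache.TryGetValue(key, out value))
            return true;

        // Cache miss: consult the source once, and cache any hit.
        SourceCalls++;
        value = _loadFromSource(key);
        if (value == null)
            return false;

        _cache.Add(key, value);
        return true;
    }
}
```

    Note that in this sketch keys unknown to the source are not cached, so repeated misses for the same key consult the source each time.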

    Note that multiple rows can often match the same lookup key and value. To avoid issues where modifying one row inadvertently also changes another row, best practice is to make the lookup value consist only of value types and/or immutable types. If the lookup value is, or contains, a mutable reference type, the user must ensure either that no lookup value references are shared and modified across rows, or that the lookup value is cloned so that each row gets its own unique instance.
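
    The difference between the two kinds of lookup value can be illustrated as follows. All type and member names here are hypothetical and only serve the example: CustomerInfo is immutable and can be shared freely, while MutableTags must be cloned before it is attached to a row.

```csharp
using System;
using System.Collections.Generic;

// Immutable lookup value: safe to share between any number of rows.
public sealed class CustomerInfo
{
    public CustomerInfo(string name, string region) { Name = name; Region = region; }
    public string Name { get; }
    public string Region { get; }
}

// Mutable lookup value: each row needs its own copy.
public sealed class MutableTags
{
    public List<string> Items { get; } = new List<string>();

    public MutableTags Clone()
    {
        var copy = new MutableTags();
        copy.Items.AddRange(Items);
        return copy;
    }
}

public static class LookupValueSharingSketch
{
    // Simulates two rows matching the same key: each row receives a clone,
    // so later changes to one row's tags do not leak into the other row
    // or into the dictionary itself.
    public static (MutableTags First, MutableTags Second) AssignToTwoRows(
        IReadOnlyDictionary<string, MutableTags> dictionary, string key)
    {
        var shared = dictionary[key];
        return (shared.Clone(), shared.Clone());
    }
}
```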

    Also see DictionaryLookupTransform<TInputOutputError, TDictionaryInput, TKey, TValue>, which loads the dictionary from a second input port, and DictionaryLookupSplitTransform<TInputOutputError, TKey, TValue>, which has a separate output port for unmatched rows.

    Also see Dataflow Lookups.

    Inheritance
    Object
    WorkerParent
    WorkerBase
    WorkerBase<DictionaryLookupTransform<TInputOutputError, TKey, TValue>>
    DictionaryLookupTransform<TInputOutputError, TKey, TValue>
    Implements
    IDisposeOnFinished
    Inherited Members
    WorkerBase<DictionaryLookupTransform<TInputOutputError, TKey, TValue>>.AddCompletedCallback(Func<DictionaryLookupTransform<TInputOutputError, TKey, TValue>, OutcomeStatus, Task<OutcomeStatus>>)
    WorkerBase<DictionaryLookupTransform<TInputOutputError, TKey, TValue>>.AddRanCallback(Func<DictionaryLookupTransform<TInputOutputError, TKey, TValue>, OutcomeStatus, WorkerParentChildrenState, Task<OutcomeStatus>>)
    WorkerBase<DictionaryLookupTransform<TInputOutputError, TKey, TValue>>.AddStartingCallback(Func<DictionaryLookupTransform<TInputOutputError, TKey, TValue>, Task<ProgressStatus>>)
    WorkerBase.AddCompletedCallback(Func<WorkerBase, OutcomeStatus, Task<OutcomeStatus>>)
    WorkerBase.AddRanCallback(Func<WorkerBase, OutcomeStatus, WorkerParentChildrenState, Task<OutcomeStatus>>)
    WorkerBase.AddStartingCallback(Func<WorkerBase, Task<ProgressStatus>>)
    WorkerBase.DefaultIsStartable()
    WorkerBase.ErroredPortErrorsWorkerProtected
    WorkerBase.ErrorOutputs
    WorkerBase.EscalateError
    WorkerBase.Inputs
    WorkerBase.IsStartable
    WorkerBase.Outputs
    WorkerBase.Parent
    WorkerBase.SucceededSequence<TLastWorker>(WorkerBase, WorkerBase, WorkerBase, WorkerBase, WorkerBase, TLastWorker)
    WorkerBase.SucceededSequence<TLastWorker>(WorkerBase, WorkerBase, WorkerBase, WorkerBase, TLastWorker)
    WorkerBase.SucceededSequence<TLastWorker>(WorkerBase, WorkerBase, WorkerBase, TLastWorker)
    WorkerBase.SucceededSequence<TLastWorker>(WorkerBase, WorkerBase, TLastWorker)
    WorkerBase.SucceededSequence<TLastWorker>(WorkerBase, TLastWorker)
    WorkerBase.SucceededSequence<TLastWorker>(TLastWorker)
    WorkerParent.AddChildCompletedCallback(Action<WorkerBase>)
    WorkerParent.AddStartingChildrenCallback(Func<WorkerParent, Task<ProgressStatus>>)
    WorkerParent.BytesPerRowBuffer
    WorkerParent.Children
    WorkerParent.DisposeOnFinished<TDisposable>(TDisposable)
    WorkerParent.GetDownstreamFactory<TInput>()
    WorkerParent.HasChildren
    WorkerParent.IsCanceled
    WorkerParent.IsCompleted
    WorkerParent.IsCreated
    WorkerParent.IsError
    WorkerParent.IsFailed
    WorkerParent.IsFatal
    WorkerParent.IsRunning
    WorkerParent.IsSucceeded
    WorkerParent.KeepChildrenLevels
    WorkerParent.Locator
    WorkerParent.LogFactory
    WorkerParent.Logger
    WorkerParent.MaxRunningChildren
    WorkerParent.Name
    WorkerParent.RemoveChildren()
    WorkerParent.RescheduleChildren()
    WorkerParent.RunChildrenAsync(Boolean)
    WorkerParent.RunChildrenAsync()
    WorkerParent.Status
    WorkerParent.Item[String]
    WorkerParent.ToString()
    WorkerParent.WorkerSystem
    WorkerParent.DebugCommands
    WorkerParent.AggregateErrorOutputRows
    WorkerParent.AggregateOutputRows
    WorkerParent.AggregateWorkersCompleted
    WorkerParent.InstantCompleted
    WorkerParent.InstantCreated
    WorkerParent.InstantStarted
    WorkerParent.RunningDuration
    Namespace: actionETL
    Assembly: actionETL.dll
    Syntax
    public class DictionaryLookupTransform<TInputOutputError, TKey, TValue> : WorkerBase<DictionaryLookupTransform<TInputOutputError, TKey, TValue>>, IDisposeOnFinished where TInputOutputError : class
    Type Parameters
    Name Description
    TInputOutputError

    The type of all input and output port rows.

    TKey

    The type of the lookup key.

    TValue

    The type of the lookup value.

    Remarks

    The concrete dictionary used by the transform is often a Dictionary<TKey,TValue>, which must be provided either to the worker constructor or via Dictionary. Note that it can be shared by other workers if it's not modified after the other workers start to run.

    If the Dictionary<TKey,TValue> is only used by a single worker at a time, the worker can modify its dictionary while it runs. This can be used to e.g. build up the dictionary on the fly: in the notFoundKeyFunc callback, attempt to add a new dictionary entry for the missing key, and if successful, invoke the foundKeyFunc or foundKeyAction callback on the row and return Found; otherwise, return NotFound.
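
    The control flow described above might look roughly like the sketch below. It deliberately avoids the library's own types: RowTreatmentSketch is a hypothetical stand-in for the Found and NotFound results, OrderRow is an invented row type, and the actual callback signatures expected by the factory methods are not shown here and may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the library's Found / NotFound results.
public enum RowTreatmentSketch { Found, NotFound }

// Hypothetical row type.
public sealed class OrderRow
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
}

public sealed class OnTheFlyLookupSketch
{
    private readonly Dictionary<int, string> _products;    // the original, mutable instance
    private readonly Func<int, string> _fetchProductName;  // hypothetical; returns null when unknown

    public OnTheFlyLookupSketch(Dictionary<int, string> products, Func<int, string> fetchProductName)
    {
        _products = products ?? throw new ArgumentNullException(nameof(products));
        _fetchProductName = fetchProductName ?? throw new ArgumentNullException(nameof(fetchProductName));
    }

    // Found-key logic: copy the lookup value onto the row.
    public void OnFound(OrderRow row, string productName) => row.ProductName = productName;

    // Not-found logic: try to add the missing entry, then treat the row as found.
    public RowTreatmentSketch OnNotFound(OrderRow row)
    {
        var name = _fetchProductName(row.ProductId);
        if (name == null)
            return RowTreatmentSketch.NotFound;

        _products[row.ProductId] = name; // later rows with this key hit the dictionary directly
        OnFound(row, name);
        return RowTreatmentSketch.Found;
    }
}
```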

    The dictionary can of course be any IReadOnlyDictionary<TKey,TValue> implementation. E.g.:

    • Set initial capacity (to reduce re-allocations) and/or a custom equality comparer (e.g. case insensitive) using an appropriate Dictionary<TKey,TValue> constructor
    • Use ConcurrentDictionary<TKey,TValue> to allow multiple threads (i.e. workers) to use and modify the dictionary simultaneously. This can save memory by sharing a single mutable dictionary copy; note however that it can be an order of magnitude slower than a non-concurrent dictionary.
    • Use a caching dictionary that discards seldom used items, thereby limiting its memory use
    • Create an IReadOnlyDictionary<TKey,TValue> wrapper around an out of process web lookup service
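
    The first two options in the list above can be sketched with standard .NET collections only. The class and method names are hypothetical; both methods return their result typed as IReadOnlyDictionary<TKey,TValue>, which both Dictionary<TKey,TValue> and ConcurrentDictionary<TKey,TValue> implement.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class DictionaryChoicesSketch
{
    // Pre-sized dictionary with a case-insensitive comparer.
    public static IReadOnlyDictionary<string, int> CreatePresized(int expectedItems)
    {
        var dictionary = new Dictionary<string, int>(expectedItems, StringComparer.OrdinalIgnoreCase);
        dictionary.Add("alpha", 1);
        dictionary.Add("beta", 2);
        return dictionary;
    }

    // A single mutable dictionary filled from several threads at once.
    // GetOrAdd is safe to call concurrently; the value factory may run more
    // than once for a key, but only one value is ever stored per key.
    public static IReadOnlyDictionary<string, int> FillConcurrently(int iterations)
    {
        var shared = new ConcurrentDictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        Parallel.For(0, iterations, i =>
        {
            shared.GetOrAdd("key" + (i % 2), key => key.Length);
        });
        return shared;
    }
}
```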

    Properties

    Dictionary

    Gets or sets the dictionary to look up incoming rows against. The property can only be accessed when the worker is not running.

    A dictionary must be provided either here, or via the worker constructor. The callback methods (normally notFoundKeyFunc) can optionally be used to populate the dictionary on the fly.

    Note: To modify the dictionary in the callback methods, use a reference to the original dictionary instance rather than this property, since the property has a read-only type and accessing it also adds synchronization overhead.
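
    Why the original instance is needed can be seen in a small sketch. LookupHolderSketch is a hypothetical class that only mimics a property typed as IReadOnlyDictionary<TKey,TValue>; it is not the worker itself.

```csharp
using System.Collections.Generic;

// Hypothetical holder that mimics a property with a read-only interface type.
public sealed class LookupHolderSketch
{
    public IReadOnlyDictionary<string, int> Dictionary { get; set; }
}

public static class ModifyOriginalInstanceSketch
{
    public static int Demo()
    {
        // Keep a reference to the concrete, mutable dictionary.
        var original = new Dictionary<string, int> { ["a"] = 1 };
        var holder = new LookupHolderSketch { Dictionary = original };

        // holder.Dictionary.Add("b", 2);  // would not compile: the interface has no Add
        original.Add("b", 2);              // modify via the original reference instead

        // Both references point at the same underlying instance.
        return holder.Dictionary.Count;
    }
}
```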

    Note that the dictionary can optionally be set to null after the worker has completed, to allow it to be garbage collected before the worker itself is garbage collected, which in rare circumstances can be useful with a large collection.

    Note: This property is thread-safe.

    Declaration
    public IReadOnlyDictionary<TKey, TValue> Dictionary { get; set; }
    Property Value
    Type Description
    IReadOnlyDictionary<TKey, TValue>

    The dictionary.

    Exceptions
    Type Condition
    InvalidOperationException

    Cannot set the dictionary while the worker is running.

    ErrorOutput

    Gets the error output port for sending error rows to logging and an optional downstream worker.

    It will receive the first row that throws an exception, rows where the lookup key was not found, and rejected rows.

    Declaration
    public ErrorOutputPort<TInputOutputError> ErrorOutput { get; }
    Property Value
    Type Description
    ErrorOutputPort<TInputOutputError>

    Input

    Gets the input port for receiving rows from an upstream worker.

    Declaration
    public InputPort<TInputOutputError> Input { get; }
    Property Value
    Type Description
    InputPort<TInputOutputError>

    Output

    Gets the output port for sending rows where the key is found, and (optionally) rows where it is not found, to the downstream worker.

    Declaration
    public OutputPort<TInputOutputError> Output { get; }
    Property Value
    Type Description
    OutputPort<TInputOutputError>

    Methods

    RunAsync()

    This method can be overridden to add custom functionality to the derived worker that runs before and after the row processing. In this case, the base class method base.RunAsync() must be called for the worker to function correctly.

    Typically, this worker is used without overriding this method.

    Declaration
    protected override Task<OutcomeStatus> RunAsync()
    Returns
    Type Description
    Task<OutcomeStatus>

    A Task describing the success or failure of the worker. An async implementation would e.g. return OutcomeStatus.Succeeded on success, while a synchronous implementation would return OutcomeStatus.SucceededTask.

    Overrides
    WorkerParent.RunAsync()

    Implements

    IDisposeOnFinished

    See Also

    DictionaryLookupTransformFactory
    DictionaryLookupRowTreatment
    DictionaryLookupTransform<TInputOutputError, TDictionaryInput, TKey, TValue>
    DictionaryLookupSplitTransform<TInputOutputError, TKey, TValue>
    DictionaryLookupSplitTransform<TInputOutputError, TDictionaryInput, TKey, TValue>
    DictionaryTarget<TInput, TKey, TValue>
    Copyright © 2023 Envobi Ltd