Creating Mendz.ETL-based Source Adapters

Mendz.ETL suggests three basic ingredients for an ETL solution: a source adapter, a mapper and a target adapter. Here are some ideas for your source adapters.

Source adapters extract data from a source. In ETL, they perform the "E", or extract, part. When you derive from SourceAdapterBase, the only method you need to implement is ExtractInput(). Basically, a source adapter reads data from the source and streams it back as IEnumerable data. The source can be a file, a database or a web service, for example. The implementation should take care of opening access to the source, reading from it and releasing resources when done.
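To make the open, read and release responsibilities concrete, here is a minimal sketch of a line-reading adapter. It does not reference Mendz.ETL: SketchSourceAdapterBase is a hypothetical stand-in for SourceAdapterBase, whose real ExtractInput() signature may differ, and TextLinesSourceAdapter and its openSource parameter are names made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for the library's base class. The real
// SourceAdapterBase may declare ExtractInput() differently.
public abstract class SketchSourceAdapterBase
{
    // Callers use Extract(); derived classes only supply ExtractInput().
    public IEnumerable<object> Extract()
    {
        foreach (object input in ExtractInput())
        {
            yield return input;
        }
    }

    protected abstract IEnumerable<object> ExtractInput();
}

// Illustrative adapter: streams a text source back one line at a time.
public sealed class TextLinesSourceAdapter : SketchSourceAdapterBase
{
    private readonly Func<TextReader> _openSource;

    public TextLinesSourceAdapter(Func<TextReader> openSource)
    {
        _openSource = openSource ?? throw new ArgumentNullException(nameof(openSource));
    }

    protected override IEnumerable<object> ExtractInput()
    {
        // Open access to the source.
        using (TextReader reader = _openSource())
        {
            while (true)
            {
                // Read from the source until it is exhausted.
                var line = reader.ReadLine();
                if (line == null)
                {
                    yield break;
                }

                yield return line;
            }
        } // The reader is disposed when enumeration ends or the enumerator is disposed.
    }
}
```

With this sketch, something like new TextLinesSourceAdapter(() => File.OpenText("orders.txt")) would stream a file (the file name is only an example). Taking a factory rather than an open reader keeps the adapter in charge of both opening and releasing the source.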

Mendz.Data.Common provides the ready-to-use FullSourceAdapter and LineSourceAdapter classes. FullSourceAdapter suits small document sources that can be read and streamed in full through the ETL flow. LineSourceAdapter suits document sources that you want to read line by line in the ETL flow.

For example, FullSourceAdapter can be used to read a small XML file, which can then be passed to an XSLT mapper, the result of which can be passed to a FullTargetAdapter. Since .NET does not support streaming XSLT (yet), and as long as the XML file is small, this makes sense. In this context, the call to Router.Route(source, mapper, target); is straight to the point: extract the full source, transform via XSLT, and load the full result to a target.
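The same extract, transform and load sequence can be sketched with plain .NET types, without Mendz.ETL. This is not how FullSourceAdapter, the XSLT mapper or FullTargetAdapter are implemented; FullDocumentFlowSketch and its method names are hypothetical, and only XslCompiledTransform, XmlReader and the System.IO readers and writers are real framework APIs.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Illustrative only: mirrors "extract full, transform via XSLT, load full".
public static class FullDocumentFlowSketch
{
    // Extract: read the whole source document into memory.
    public static string ExtractFull(TextReader source)
    {
        return source.ReadToEnd();
    }

    // Transform: apply an XSLT stylesheet to the full document.
    public static string TransformWithXslt(string xml, string stylesheet)
    {
        var transform = new XslCompiledTransform();
        using (var stylesheetReader = XmlReader.Create(new StringReader(stylesheet)))
        {
            transform.Load(stylesheetReader);
        }

        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            // No stylesheet arguments are passed in this sketch.
            transform.Transform(input, null, output);
            return output.ToString();
        }
    }

    // Load: write the full result to the target.
    public static void LoadFull(string result, TextWriter target)
    {
        target.Write(result);
    }
}
```

Because the whole document is held as a string at each step, this shape only fits small inputs, which is the same constraint the paragraph above places on FullSourceAdapter.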

Mendz.ETL itself also provides IData* interfaces, which can be used to create data-level conversions, filters, insertions, joins and splits. Implement IData* to create reusable data manipulation utilities in your ETL solution. IData* implementations can be used in adapters, mappers, validators and/or joiners.

Callers invoke the source adapter's Extract() method. Router.Route(), for example, consumes the source adapter passed to it by calling its Extract() method. Source adapters derived from SourceAdapterBase inherit an Extract() implementation that works as follows:
  1. If set, raises OnSourceAdapterStart.
  2. If set, calls SourceValidator.Validate(SourceSpecification). Otherwise, the source is assumed valid.
  3. If the source is valid:
    1. If set, raises OnExtracting.
    2. Loops through ExtractInput().
      1. If set, raises OnExtracted.
      2. Yields the extracted input.
  4. If set, raises OnSourceAdapterEnd.
An event handler is invoked only if it is set (not null). If no event handlers and no validator are set, the call is basically a loop that yields the results of the ExtractInput() implementation.
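The steps above can be sketched as a C# iterator. This is not the SourceAdapterBase source code: ExtractFlowSketch and NumbersSource are hypothetical, the event delegate types are guesses, and the validator is modelled as a simple Func<string, bool> rather than the library's validator type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in that mirrors the numbered steps, not the real base class.
public abstract class ExtractFlowSketch
{
    public event EventHandler OnSourceAdapterStart;
    public event EventHandler OnExtracting;
    public event EventHandler<object> OnExtracted;
    public event EventHandler OnSourceAdapterEnd;

    // Assumed shape: returns true when the source specification is valid.
    public Func<string, bool> SourceValidator { get; set; }
    public string SourceSpecification { get; set; }

    public IEnumerable<object> Extract()
    {
        OnSourceAdapterStart?.Invoke(this, EventArgs.Empty);        // step 1

        // Step 2: with no validator set, the source is assumed valid.
        bool valid = SourceValidator?.Invoke(SourceSpecification) ?? true;

        if (valid)                                                  // step 3
        {
            OnExtracting?.Invoke(this, EventArgs.Empty);            // step 3.1
            foreach (object input in ExtractInput())                // step 3.2
            {
                OnExtracted?.Invoke(this, input);                   // step 3.2.1
                yield return input;                                 // step 3.2.2
            }
        }

        OnSourceAdapterEnd?.Invoke(this, EventArgs.Empty);          // step 4
    }

    protected abstract IEnumerable<object> ExtractInput();
}

// Tiny concrete source used only to exercise the sketch.
public sealed class NumbersSource : ExtractFlowSketch
{
    protected override IEnumerable<object> ExtractInput()
    {
        yield return 1;
        yield return 2;
    }
}
```

In this sketch Extract() is an iterator, so nothing runs, not even the start event, until the caller begins enumerating the result. The end event fires only when the caller enumerates to completion.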

Get Mendz.ETL and start building your source adapters like a pro!
