Importing Data with Dataimport

This technical reference covers the standard approach to importing data from external sources. Everything you need is in the dataimport module in hot-deploy.

Opentaps Data Import Strategy

The goal of the Data Import module is not to build a set of data import tools against a particular "standard," but rather to recognize that each organization has legacy or external data in its own unique format. Therefore, the Data Import module is a set of flexible tools which you can use as a reference point for setting up your own custom import and export. The existing services and entities can be used "as is" or with little modification if your data happens to be similar, or you can add to and extend them if you have additional data.

The Data Import module sets up "bridge entities," which are de-normalized and laid out in a way that is similar to most applications' data definitions. They have no foreign key relationships to any other opentaps entity, so any data can be imported into them. You use your own database's import tools to load records into the bridge entities, then run one of the Data Import module's import services to transform the bridge data into opentaps entities. The Data Import services all follow a common standard:

  1. Each row of data in a bridge entity is wrapped in its own transaction when it is imported and succeeds or fails on its own.
  2. When a row of data in a bridge entity is imported successfully, the importStatusId field will be set to DATAIMP_IMPORTED.
  3. If the import failed for the row, the status will be DATAIMP_FAILED and the importError field will contain any error messages.
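
The three rules above can be illustrated with a minimal, self-contained sketch. It does not use the opentaps API: plain maps stand in for bridge entity rows, the try/catch block stands in for the per-row transaction, and the class name, the customerName field, and the validation rule are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: plain maps stand in for bridge entity rows. */
class BridgeRowStatusSketch {

    static final String IMPORTED = "DATAIMP_IMPORTED";
    static final String FAILED = "DATAIMP_FAILED";

    /** Hypothetical per-row work; a real import would create opentaps entities here. */
    static void importRow(Map<String, Object> row) {
        Object name = row.get("customerName"); // hypothetical bridge field
        if (name == null || name.toString().trim().isEmpty()) {
            throw new IllegalArgumentException("customerName is required");
        }
    }

    /** Processes every row independently so one bad row never blocks the others. */
    static void importAll(List<Map<String, Object>> rows) {
        for (Map<String, Object> row : rows) {
            try {
                // In the real module this step runs inside its own transaction.
                importRow(row);
                row.put("importStatusId", IMPORTED);
                row.put("importError", null);
            } catch (Exception e) {
                row.put("importStatusId", FAILED);
                row.put("importError", e.getMessage());
            }
        }
    }

    /** Builds a mutable row with a single hypothetical field. */
    static Map<String, Object> row(String customerName) {
        Map<String, Object> r = new HashMap<>();
        r.put("customerName", customerName);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row("Acme Corp"));
        rows.add(row(""));       // fails validation
        rows.add(row("Globex")); // still processed despite the failure above
        importAll(rows);
        for (Map<String, Object> r : rows) {
            System.out.println(r.get("customerName") + " -> "
                    + r.get("importStatusId") + " " + r.get("importError"));
        }
    }
}
```

Because failures are recorded on the row itself, a failed import can be diagnosed by querying the bridge entity for rows whose status is DATAIMP_FAILED.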

Import Framework

To support this pattern, we have created a simple and extensible import framework. All the difficult details of setting up an import, starting transactions, and handling errors are encapsulated in the OpentapsImporter class. Additionally, there is an interface called ImportDecoder, which is responsible for processing a single row from the bridge entity and mapping it onto a set of opentaps entities.

When the framework is used properly, you can focus most of your development effort on mapping the import data into the opentaps model. You can also take advantage of polymorphism to re-use common mapping patterns or customize existing ones for the particularities of your data.
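
As a rough illustration of the polymorphism idea, the sketch below shows a shared mapping class with one overridable step and a subclass that customizes it for a particular data source. The interface is a hypothetical stand-in and is not the actual ImportDecoder signature; the field names (companyName, phone, groupName) are likewise assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a row decoder; not the actual ImportDecoder signature. */
interface RowMapperSketch {
    Map<String, String> map(Map<String, String> bridgeRow);
}

/** Common mapping that every customer import could share. */
class CustomerMapperSketch implements RowMapperSketch {
    @Override
    public Map<String, String> map(Map<String, String> bridgeRow) {
        Map<String, String> party = new LinkedHashMap<>();
        party.put("groupName", bridgeRow.get("companyName"));
        party.put("phone", normalizePhone(bridgeRow.get("phone")));
        return party;
    }

    /** Hook that subclasses override for source-specific formats. */
    protected String normalizePhone(String raw) {
        return raw == null ? null : raw.trim();
    }
}

/** Customization for a legacy source that stores phone numbers with punctuation. */
class LegacyCustomerMapperSketch extends CustomerMapperSketch {
    @Override
    protected String normalizePhone(String raw) {
        // Keep digits only; everything else in map() is inherited unchanged.
        return raw == null ? null : raw.replaceAll("[^0-9]", "");
    }
}
```

Only the step that differs between sources is overridden, so the rest of the mapping stays in one place.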

Overview of Import Process

A brief outline of the import process is as follows:

  1. Break your original data into a set of suitably de-normalized CSV files.
    1. For example, put all your customer data in one CSV and all your product data in another.
    2. The goal here is to minimize the amount of data manipulation at this stage. Any transformation will be handled in the import service.
  2. For each CSV file, create an Opentaps Import Entity (i.e., the bridge table) that has the same fields as the CSV.
    1. Add three more fields for use by the import system: importStatusId, importError, and processedTimestamp.
  3. Import your CSV data into this table using standard SQL procedures for your database.
  4. Define a transactionless opentaps service that will execute your import (use-transaction="false").
    1. You may wish to implement the opentapsImporterInterface service, which defines parameters to control the way the import runs.
  5. Create an implementation of ImportDecoder, which requires a decode() method
    1. In the decode() method, you are passed a row from the bridge entity.
    2. Use the row data to create the equivalent set of opentaps entities.
    3. If there are problems that should cause the row to not be imported, throw any kind of exception. The exception message will be stored in importError.
    4. All operations in decode() will be rolled back if an exception is thrown.
    5. Return a list of opentaps entities to persist. They will be stored in one update operation for efficiency.
  6. In the service implementation, create an instance of OpentapsImporter.
    1. Specify the name of your Opentaps Import Entity in the constructor.
    2. Specify the ImportDecoder that you just created.
    3. Run the import by calling opentapsImporter.runImport().
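
Steps 5 and 6 can be sketched in plain Java as follows. This is a simplified model under stated assumptions, not the framework itself: DecoderSketch and ImporterSketch are hypothetical stand-ins for ImportDecoder and OpentapsImporter, whose real signatures are not shown in this document, and an in-memory list replaces the database. The bridge entity name MyProductImport and the productId and price fields are also invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stand-in for ImportDecoder; the real interface's signature may differ. */
interface DecoderSketch {
    List<Map<String, Object>> decode(Map<String, Object> bridgeRow) throws Exception;
}

/** Stand-in for OpentapsImporter, using an in-memory list instead of a database. */
class ImporterSketch {
    private final String bridgeEntityName;
    private final DecoderSketch decoder;
    final List<Map<String, Object>> store = new ArrayList<>(); // pretend target tables

    ImporterSketch(String bridgeEntityName, DecoderSketch decoder) {
        this.bridgeEntityName = bridgeEntityName;
        this.decoder = decoder;
    }

    /** Decodes each row on its own and returns the number imported successfully. */
    int runImport(List<Map<String, Object>> bridgeRows) {
        int imported = 0;
        for (Map<String, Object> row : bridgeRows) {
            try {
                List<Map<String, Object>> toStore = decoder.decode(row);
                store.addAll(toStore); // one batched write per bridge row
                row.put("importStatusId", "DATAIMP_IMPORTED");
                imported++;
            } catch (Exception e) {
                // Nothing was written for this row, which models the rollback.
                row.put("importStatusId", "DATAIMP_FAILED");
                row.put("importError", e.getMessage());
            }
        }
        System.out.println(bridgeEntityName + ": imported " + imported
                + " of " + bridgeRows.size() + " rows");
        return imported;
    }
}

/** Example decoder: one bridge row becomes a product and a price record. */
class ProductDecoderSketch implements DecoderSketch {
    @Override
    public List<Map<String, Object>> decode(Map<String, Object> bridgeRow) {
        String id = (String) bridgeRow.get("productId");
        if (id == null || id.isEmpty()) {
            throw new IllegalArgumentException("productId is missing");
        }
        // Throws NumberFormatException on bad input, which fails only this row.
        double price = Double.parseDouble((String) bridgeRow.get("price"));

        Map<String, Object> product = new HashMap<>();
        product.put("entity", "Product");
        product.put("productId", id);

        Map<String, Object> productPrice = new HashMap<>();
        productPrice.put("entity", "ProductPrice");
        productPrice.put("productId", id);
        productPrice.put("price", price);

        List<Map<String, Object>> result = new ArrayList<>();
        result.add(product);
        result.add(productPrice);
        return result;
    }
}
```

A caller would construct the importer with the bridge entity name and the decoder, for example `new ImporterSketch("MyProductImport", new ProductDecoderSketch())`, and then call `runImport` with the rows to process. Since the decoder only returns entities and never writes them itself, a row that throws partway through leaves nothing behind.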