Difference between revisions of "Transforming Data with Kettle"

From Opentaps Wiki
Jump to navigationJump to search
Line 5: Line 5:
 
== Setting up Kettle ==
 
== Setting up Kettle ==
  
 +
If you want to design transformation with Kettle, you have first to setup your databases connections. To do this you have to edit the file simple-jndi/jdbc.properties in the Kettle installation directory.
  
If you want to design transformations between the opentaps enterprise database and the warehouse, then it is recommended to also set up a database for the Pentaho Data Integration project (also known as Kettle).
+
Here is an example on how to setup this file
  
  create database opentaps_analytics_kettle default character set utf8 collate utf8_general_ci;
+
<pre>
  grant all privileges on opentaps_analytics_kettle.* to 'opentaps'@'localhost' identified by 'opentaps';
+
opentaps/type=javax.sql.DataSource
 +
opentaps/driver=org.postgresql.Driver
 +
opentaps/url=jdbc:postgresql://127.0.0.1/ofbiz
 +
opentaps/user=ofbiz
 +
opentaps/password=ofbiz
  
Next we need to start Spoon and have it populate the kettle database with information. When you start Spoon, create a new repository named Opentaps Analytics Kettle. We will have to add a Database Connection, call it Opentaps Analytics Kettle. [[Image:Kettle database connection.png|thumb]] Specify the connection information to localhost and add the following option under the Options tab,
+
opentaps_analytics/type=javax.sql.DataSource
 +
opentaps_analytics/driver=com.mysql.jdbc.Driver
 +
opentaps_analytics/url=jdbc:mysql://localhost/opentaps_analytics
 +
opentaps_analytics/user=opentaps
 +
opentaps_analytics/password=opentaps
  
  MySQL  characterEncoding  UTF8
+
opentaps_analytics_testing/type=javax.sql.DataSource
 
+
opentaps_analytics_testing/driver=com.mysql.jdbc.Driver
You can use the Test button to make sure you can connect. Once done, use the Create or Update button to populate this database with kettle seed data. Now you can log in with user admin password admin.
+
opentaps_analytics_testing/url=jdbc:mysql://localhost/opentaps_analytics_testing
 +
opentaps_analytics_testing/user=opentaps
 +
opentaps_analytics_testing/password=opentaps
 +
</pre>
  
 
== Designing Transformation and Job Templates ==
 
== Designing Transformation and Job Templates ==
Line 26: Line 38:
 
### Specify the connection information to opentaps_analytics, our data warehouse.  Also ensure characterEncoding of UTF8 in options as above.
 
### Specify the connection information to opentaps_analytics, our data warehouse.  Also ensure characterEncoding of UTF8 in options as above.
 
### Test the connection and ensure it works
 
### Test the connection and ensure it works
 +
 +
On the right is a screenshot of the way to configure a database.
 +
 +
[[Image:Kettle_database_configuration.png|thumb]]
  
 
I have created a kettle template for generating the seed data.  You can import this in Spoon and change the database configuration if necessary.  That is only if you need to edit the template.
 
I have created a kettle template for generating the seed data.  You can import this in Spoon and change the database configuration if necessary.  That is only if you need to edit the template.
Line 33: Line 49:
 
To edit an existing transformation in Spoon, use File > Import from an XML file, and import the file (for example, StoreDimensionTemplate.ktr from analytics/templates/ directory.)  When you are done, use File > Export to an XML file to save the transformation as an XML file again, instead of storing it in the database.
 
To edit an existing transformation in Spoon, use File > Import from an XML file, and import the file (for example, StoreDimensionTemplate.ktr from analytics/templates/ directory.)  When you are done, use File > Export to an XML file to save the transformation as an XML file again, instead of storing it in the database.
  
Here is a screenshot of the way to configure a database.
+
== Debuging a Transformation ==
  
[[Image:Kettle_database_configuration.png|thumb]]
+
After clicking on the run button in the menubar, there is an option to activate logs with more or less verbosity.
 +
 
 +
You can use the preview button in the menubar, which let watch the whole result of one step

Revision as of 15:44, 21 May 2008

These are preliminary notes.

Setting up Kettle

If you want to design transformation with Kettle, you have first to setup your databases connections. To do this you have to edit the file simple-jndi/jdbc.properties in the Kettle installation directory.

Here is an example on how to setup this file

opentaps/type=javax.sql.DataSource
opentaps/driver=org.postgresql.Driver
opentaps/url=jdbc:postgresql://127.0.0.1/ofbiz
opentaps/user=ofbiz
opentaps/password=ofbiz

opentaps_analytics/type=javax.sql.DataSource
opentaps_analytics/driver=com.mysql.jdbc.Driver
opentaps_analytics/url=jdbc:mysql://localhost/opentaps_analytics
opentaps_analytics/user=opentaps
opentaps_analytics/password=opentaps

opentaps_analytics_testing/type=javax.sql.DataSource
opentaps_analytics_testing/driver=com.mysql.jdbc.Driver
opentaps_analytics_testing/url=jdbc:mysql://localhost/opentaps_analytics_testing
opentaps_analytics_testing/user=opentaps
opentaps_analytics_testing/password=opentaps

Designing Transformation and Job Templates

To create our date dimension seed data using Spoon, the steps are,

  1. Create a New Transformation using New -> New Transformation
  2. On the left, you'll see three collapsed sections, Main Tree, Core Objects and Favorite Steps. Expand the Main Tree section.
    1. Under Transformations right click Database Connections and select New
      1. Specify the connection information to opentaps_analytics, our data warehouse. Also ensure characterEncoding of UTF8 in options as above.
      2. Test the connection and ensure it works

On the right is a screenshot of the way to configure a database.

Kettle database configuration.png

I have created a kettle template for generating the seed data. You can import this in Spoon and change the database configuration if necessary. That is only if you need to edit the template.

Editing a Transformation

To edit an existing transformation in Spoon, use File > Import from an XML file, and import the file (for example, StoreDimensionTemplate.ktr from analytics/templates/ directory.) When you are done, use File > Export to an XML file to save the transformation as an XML file again, instead of storing it in the database.

Debuging a Transformation

After clicking on the run button in the menubar, there is an option to activate logs with more or less verbosity.

You can use the preview button in the menubar, which let watch the whole result of one step