
By far, the most popular way for PDI users to load data into LucidDB is to use the PDI Streaming Loader.

In some ways, we have to admit, we released this piece of software too soon. In fact, until PDI 4.2 GA and LucidDB 0.9.4 GA it’s pretty problematic unless you run through the process of patching LucidDB outlined on this page: Known Issues.

Releasing early and often comes with some risk, and many have felt the pain of the issues that have been discovered with the streaming loader. In some ways, we’ve built an unnatural approach to loading for PDI: PDI wants to PUSH data into a database. LucidDB wants to PULL data from remote sources, with its integrated ELT and DML-based approach (with connectors to databases, Salesforce, etc). Our streaming loader “fakes” a pull data source and allows PDI to “push” into it. There are multiple threads involved, and when exceptions happen users have received cruddy error messages such as “Broken Pipe” that are unhelpful at best and frustrating at worst. Most of these contortions will have sorted themselves out by the time PDI 4.2 GA and LucidDB 0.9.4 GA are released, and the streaming loader should be working A-OK.
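To make the “pull” idiom concrete, here’s a minimal sketch of what DML-based loading looks like on the LucidDB side, assuming the SYS_JDBC foreign data wrapper; the server name, driver, connection URL, and table names are illustrative, not taken from this post:

```sql
-- Register a remote JDBC database as a foreign server
-- (connection details are hypothetical placeholders).
CREATE SERVER mysql_src
FOREIGN DATA WRAPPER SYS_JDBC
OPTIONS (
    DRIVER_CLASS 'com.mysql.jdbc.Driver',
    URL 'jdbc:mysql://example-host/sales',
    USER_NAME 'etl',
    PASSWORD 'secret'
);

-- LucidDB then pulls the rows itself with ordinary DML.
INSERT INTO warehouse."fact_orders"
SELECT * FROM mysql_src."sales"."orders";
```

The streaming loader has to dress a PDI push stream up as one of these pull sources, which is where the extra threads and the cryptic errors come in.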

Some users would just as soon avoid the patch instructions above, and they have posed the question: in a general sense, if not the streaming loader, how would I load data into LucidDB? Again, LucidDB likes to “pull” data from remote sources.

Here’s a nice, easy, quick (30k rows/s on my MacBook) method to load a million rows using PDI and LucidDB:
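The shape of the method, as a hedged sketch rather than the exact original steps: use a PDI transformation to write the million rows to a CSV file, then let LucidDB pull that file in through its flat-file foreign data wrapper. The SQL below assumes the SYS_FILE_WRAPPER wrapper and hypothetical directory, file, and table names.

```sql
-- Point a foreign server at the directory where the PDI
-- transformation wrote its CSV output (path is hypothetical).
CREATE SERVER csv_staging
FOREIGN DATA WRAPPER SYS_FILE_WRAPPER
OPTIONS (
    DIRECTORY '/tmp/staging/',
    FILE_EXTENSION 'csv',
    FIELD_DELIMITER ',',
    WITH_HEADER 'yes'
);

-- LucidDB pulls the file with ordinary DML; "million_rows" maps to
-- /tmp/staging/million_rows.csv (the BCP schema assumes a matching
-- million_rows.bcp control file describing the columns).
INSERT INTO warehouse."big_table"
SELECT * FROM csv_staging.BCP."million_rows";
```

Because LucidDB is the one doing the pull here, this path sidesteps the streaming loader entirely, along with its patching requirements.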
