- ELT leverages RDBMS engine hardware for scalability – but also taxes DB resources meant for query optimization.
- Data needs to travel across one more layer before it lands in the data mart – unless the mart is just another output of the ETL process, as is typical of multi-target Voracity operations.
- Reduced flexibility due to dependency on the ETL tool vendor – I'm not sure how that's improved by relying on a single ELT/appliance vendor instead; isn't vendor independence the key to flexibility and cost savings?
- Specialized skills and a learning curve are required for implementing the ETL tool – unless you're using an ergonomic GUI like Voracity's, which provides multiple job design options in the same Eclipse IDE.
- Possible reduced performance of the row-based approach – right, which is why Voracity's ability to profile, acquire, transform, and output data in larger chunks is faster.
- Extra cost of building an ETL system or licensing ETL tools – which are still cheaper than ELT appliances, but cheaper still are IRI tools like Voracity, which combine Fast Extract (FACT) and CoSort to speed ETL without such complexity.
- Additional hardware investment is needed for ETL engines – unless you run them on the database server(s).
- ETL processes information row-by-row, and that seems to work well with data integration into third-party products – better still, though, is full block, table, or file(s)-at-a-time processing, which Voracity runs in volume.
- ETL can run on SMP or MPP hardware – which again you can manage and exploit more cost-effectively, without worrying about performance contention with the DB.
- ETL captures huge amounts of metadata lineage today – how well or intuitively can one staging DB do that?
- ETL does not require co-location of data sets in order to do its work – allowing you to maintain existing data source platforms without data synchronization worries.
- ETL can process data in-stream, as it transfers from source to target – or in batch if that makes sense, too.
- ETL can handle partitioning and parallelism independent of the data model, database layout, and source data model architecture – though Voracity's CoSort SortCL jobs needn't be partitioned at all …
- ETL can scale with separate hardware – on commodity boxes you can source and maintain yourself at much lower cost than single-vendor appliances.
- ETL can perform more complex operations in single data flow diagrams via data maps – like with Voracity mapping and workflow diagrams that also abstract short, open 4GL scripts vs.
- ETL can balance the workload and share the workload with the RDBMS – and in fact remove that workload by transforming data via a SortCL program or Hadoop without coding in Voracity.

In ELT, the extracts are fed into the single staging database that also handles the transformations.

ETL remains prevalent because the marketplace flourishes with proven players like Informatica, IBM, Oracle - and IRI with Voracity, which combines FACT (Fast Extract), CoSort or Hadoop transforms, and bulk loading in the same Eclipse GUI - to extract and transform data. This approach prevents burdening databases designed for storage and retrieval (query optimization) with the overhead of large-scale data transformation.
Full disclosure: As this article is authored by an ETL-centric company with its strong suit in manipulating big data outside of databases, what follows will not seem objective to many. Nevertheless, it is still meant to present food for thought and to open the floor to discussion.

Since their beginnings, data warehouse architects (DWAs) have been tasked with creating and populating a data warehouse with disparately sourced and formatted data. Due to dramatic growth in data volumes, these same DWAs are challenged to implement their data integration and staging operations more efficiently. The question of whether data transformation will occur inside or outside the target database has become a critical one because of the performance, convenience, and financial consequences involved.

In ETL (extract, transform, load) operations, data are extracted from different sources, transformed separately, and loaded to a DW database and possibly other targets.
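To make the inside-versus-outside question concrete, here is a minimal sketch of the same small job done both ways. It is an illustration only, not how Voracity or any vendor tool works: Python's built-in `sqlite3` stands in for the warehouse, and the sample rows, the table names (`staging_orders`, `customer_totals`), and the function names are all hypothetical.

```python
import sqlite3

# Hypothetical extract: (order_id, customer, amount as text, currency)
SOURCE_ROWS = [
    (1, " alice ", "10.50", "usd"),
    (2, "BOB", "20.00", "usd"),
    (3, "alice", "4.50", "usd"),
]


def etl(rows):
    """ETL: transform outside the database, then load only the finished result."""
    totals = {}
    for _order_id, customer, amount, _currency in rows:
        name = customer.strip().lower()          # cleanse in the ETL layer
        totals[name] = totals.get(name, 0.0) + float(amount)

    target = sqlite3.connect(":memory:")         # stand-in for the DW
    target.execute(
        "CREATE TABLE customer_totals (customer TEXT PRIMARY KEY, total REAL)"
    )
    target.executemany(
        "INSERT INTO customer_totals VALUES (?, ?)", sorted(totals.items())
    )
    target.commit()
    return target


def elt(rows):
    """ELT: load raw rows into a staging table, then transform with SQL in the DB."""
    target = sqlite3.connect(":memory:")         # stand-in for the DW
    target.execute(
        "CREATE TABLE staging_orders "
        "(order_id INTEGER, customer TEXT, amount TEXT, currency TEXT)"
    )
    target.executemany("INSERT INTO staging_orders VALUES (?, ?, ?, ?)", rows)
    # The database engine itself does the cleansing and aggregation here.
    target.execute(
        """
        CREATE TABLE customer_totals AS
        SELECT lower(trim(customer)) AS customer,
               SUM(CAST(amount AS REAL)) AS total
        FROM staging_orders
        GROUP BY lower(trim(customer))
        """
    )
    target.commit()
    return target


def read_totals(conn):
    """Return the final table contents in a stable order."""
    return conn.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(read_totals(etl(SOURCE_ROWS)))
    print(read_totals(elt(SOURCE_ROWS)))
```

Both functions end with the same `customer_totals` table. The difference is where the work happens: in `etl` the database only receives finished rows, while in `elt` it also stores the raw staging data and spends its own cycles on the transformation.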