Stop wasting time on 'no-code visual programming'. Define your ETL processes as code in a declarative language instead. Write it in your favorite code editor, store it in a git repo, and deploy it via your favorite CI service.
ETL code that feels familiar from the first SELECT.
Open Source and a simple object model.
Batch execution. Fully interactive debugging.
Hadoop File Systems, S3-compatible*, JDBC*
* Adapters are pluggable and can be extended via the Java API.
Data Cooker Operations are like SQL stored procedures or UDFs, except that they are written against the low-level Spark RDD API. As a result, they run much faster than either.
Date and time, geohashing, data series calculation, population statistics, geofencing, track data analysis: 22 Operations out of the box.
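To give a feel for what a geohashing step computes, here is a minimal sketch of the standard public geohash encoding in plain Java. It is not Data Cooker's own code and does not use its API or Spark; the class name `GeohashSketch` and the method `encode` are made up for illustration only.

```java
// Illustrative only: standard geohash encoding, NOT Data Cooker's implementation.
public final class GeohashSketch {
    // Geohash base32 alphabet (digits plus letters without a, i, l, o).
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // Encodes a latitude/longitude pair into a geohash of the given length.
    public static String encode(double lat, double lon, int precision) {
        double latMin = -90.0, latMax = 90.0;
        double lonMin = -180.0, lonMax = 180.0;
        StringBuilder hash = new StringBuilder();
        boolean lonTurn = true; // bits alternate, starting with longitude
        int bits = 0;
        int value = 0;

        while (hash.length() < precision) {
            if (lonTurn) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) {
                    value = (value << 1) | 1;
                    lonMin = mid;
                } else {
                    value <<= 1;
                    lonMax = mid;
                }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) {
                    value = (value << 1) | 1;
                    latMin = mid;
                } else {
                    value <<= 1;
                    latMax = mid;
                }
            }
            lonTurn = !lonTurn;

            // Every 5 bits become one base32 character.
            if (++bits == 5) {
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(57.64911, 10.40744, 6)); // prints u4pruy
    }
}
```

Points that are close together usually share a geohash prefix, which is why such values are handy as grouping keys in geodata pipelines.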
Pluggable Transforms (written in Java, like Operations), combined with object-oriented SELECT capabilities, let you flexibly and easily convert any supported data format into another.
21 Transforms out of the box!
...we had two dozen operations, the same number of transforms, a bunch of storage adapters, and a^W Oops, this is the wrong genre.
What we really want to convey: if what comes out of the box is not enough, you can implement your own! The code is open and the extension API is simple. Also, docs for your object model extensions are generated automagically.
Batch Local, Batch On-Cluster, Local Interactive, Server On-Cluster, Interactive Console Client... all with additional options.
Simply put, a single FatJAR includes everything necessary for testing and production environments. And thanks to the simple REST protocol between Client and Server, you can easily integrate Data Cooker ETL into any browser-based Dashboard or Notebook your data analysts prefer.
We're madly in love with Open Source and won't mind if you just fork the code and never call us back.
But we also have six years of real-world production experience in a serious geoinformatics analysis project that implemented Data Cooker ETL in the Amazon cloud. So we are happy to share our expertise, accumulated over hundreds of ETL processes executed many thousands of times, for a reasonable fee.