Big data ETL: ETL tools combine the three functions (extract, transform, load) needed to get data out of one big data environment and into another data environment.
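To make the three functions concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not how any particular ETL tool works internally: the inline CSV text, the `payments` table, and the function names `extract`, `transform`, `load`, and `run_etl` are all hypothetical, and an in-memory SQLite database stands in for the target environment.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real job would read from files, a database, or an API.
SOURCE_CSV = """id,name,amount
1, alice ,10.50
2,BOB,3
3,carol,not-a-number
"""

def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: tidy names, cast types, and drop rows whose amount does not parse."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip malformed records
        clean.append((int(row["id"]), row["name"].strip().title(), amount))
    return clean

def load(records, conn):
    """Load: write the cleaned records into the (hypothetical) target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", records)
    conn.commit()

def run_etl(text, conn):
    """Chain the three steps and return what ended up in the target."""
    load(transform(extract(text)), conn)
    return conn.execute(
        "SELECT id, name, amount FROM payments ORDER BY id"
    ).fetchall()

if __name__ == "__main__":
    print(run_etl(SOURCE_CSV, sqlite3.connect(":memory:")))
```

Keeping the three steps as separate functions mirrors how ETL tools let you swap a source or a target without touching the transformation logic.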
ETL is evolving to support integration across much more than traditional data warehouses. It can now span transactional systems, operational data stores, BI platforms, MDM hubs, the cloud, and Hadoop platforms.
ETL tools are needed to load and convert structured and unstructured data into Hadoop. Advanced ETL tools can read and write multiple files in parallel from and to Hadoop, which simplifies how data is merged into a common transformation process.
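The idea of reading several files in parallel and merging them into one transformation can be sketched as follows. This is only an illustration under assumptions: local temporary CSV files stand in for files stored in Hadoop, a thread pool stands in for whatever parallelism a real ETL tool uses, and the names `read_part`, `transform`, `read_and_merge`, and `write_demo_parts` are hypothetical.

```python
import csv
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def read_part(path):
    """Read one CSV part file and return its rows as dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))

def transform(row):
    """Common transformation applied to every row, whatever file it came from."""
    return {"user": row["user"].lower(), "clicks": int(row["clicks"])}

def read_and_merge(paths, max_workers=4):
    """Read all part files in parallel, then run them through one transform."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the order of `paths`, so the merge is deterministic.
        parts = list(pool.map(read_part, paths))
    return [transform(row) for part in parts for row in part]

def write_demo_parts(directory):
    """Create two small hypothetical part files for the demo."""
    contents = {
        "part-0000.csv": "user,clicks\nAnn,3\nBen,5\n",
        "part-0001.csv": "user,clicks\nCAT,7\n",
    }
    paths = []
    for name, text in contents.items():
        path = Path(directory) / name
        path.write_text(text)
        paths.append(path)
    return paths

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(read_and_merge(write_demo_parts(tmp)))
```

The reads run concurrently, but every row still passes through the same `transform` function, which is the "common transformation process" described above.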
One of the leaders in big data ETL is Talend (see our detailed comparison here). Talend is the first ETL tool on the market with native support for Spark as well as MapReduce, and it has surpassed Informatica in adopting emerging big data technologies.
Talend has a free version that you can download here.
Why choose Talend for your big data project?
- Talend is an open-source data integration tool, with a full suite around it (ESB, MDM, BPM, DQ).
- It uses a code-generating approach. Jobs are designed in a GUI built on Eclipse RCP that is intuitive to use.
- It has a very large community and more than 800 connectors (the biggest connector library).