February 2016 - Silicon Tern

Talend: How to load a text file into Hive?

In this tutorial we will show you how to load a file into a Hive table. First, create the following components: tHiveConnection, tHiveCreateTable, and tHiveRow. In your HDFS distribution, create a small file with only two columns: name and age. Here is a screenshot of the job. Configure the tHiveRow…
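The excerpt does not show the query entered in tHiveRow, so the following is only a sketch of the kind of HiveQL such a job typically runs: a `CREATE TABLE` for the two columns, then a `LOAD DATA INPATH` that moves the HDFS file into the table. The helper name `hive_load_statements`, the table name, the HDFS path, and the comma delimiter are all assumptions made for illustration.

```python
def hive_load_statements(table, hdfs_path, delimiter=","):
    """Build the two HiveQL statements as plain strings (nothing is executed).

    Assumption: the file has two delimited columns, name and age,
    as described in the tutorial excerpt.
    """
    create = (
        f"CREATE TABLE IF NOT EXISTS {table} (name STRING, age INT) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}' "
        "STORED AS TEXTFILE"
    )
    # LOAD DATA INPATH moves the file from its HDFS location into the table.
    load = f"LOAD DATA INPATH '{hdfs_path}' INTO TABLE {table}"
    return [create, load]


# Hypothetical table name and path, for illustration only.
for statement in hive_load_statements("people", "/user/demo/people.txt"):
    print(statement)
```

In a job like the one described, the first statement would roughly correspond to what tHiveCreateTable sets up and the second to a query pasted into tHiveRow.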


Talend: tJoin vs tMap

tJoin: accepts only 2 inputs (1 main flow + 1 lookup). tJoin joins two tables by doing an exact match on several columns. It compares columns from the main flow with reference columns from the lookup flow and outputs the main flow data and/or the rejected data.

tMap: accepts multiple inputs (it can join multiple tables on multiple keys). tMap is an advanced component, which integrates itself as a plugin to…
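To make the tJoin behaviour concrete, here is a minimal sketch of an exact-match join between a main flow and a single lookup flow, with unmatched main rows sent to a reject output. This is not Talend's implementation: the function name `exact_match_join`, the choice to keep only the first lookup row per key, and the decision to merge lookup columns into the matched rows are all assumptions made for illustration.

```python
def exact_match_join(main_rows, lookup_rows, keys):
    """Join main rows to lookup rows on exact equality of all key columns.

    Returns (matched, rejected): matched rows carry the lookup columns,
    rejected rows are main rows with no lookup match.
    """
    # Index the lookup flow by key tuple; the first row seen for a key wins.
    index = {}
    for row in lookup_rows:
        index.setdefault(tuple(row[k] for k in keys), row)

    matched, rejected = [], []
    for row in main_rows:
        hit = index.get(tuple(row[k] for k in keys))
        if hit is None:
            rejected.append(row)
        else:
            # Main-flow values take precedence if column names collide.
            matched.append({**hit, **row})
    return matched, rejected


main = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
lookup = [{"id": 1, "city": "Paris"}]
matched, rejected = exact_match_join(main, lookup, ["id"])
print(matched)
print(rejected)
```

A tMap-style join would generalise this to several lookup flows, each with its own key columns.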


Split file rows into multiple files depending on a column’s value

In this tutorial we will show you how to split a file into multiple smaller files depending on the value in one specific column. Scenario: first, collect all the values of the “pivot” column chosen for splitting the file. Then, for each value of this column, extract the records that match the current value and save them in a new file. Let’s take a file…
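The two-step scenario can be sketched outside Talend as well. The example below assumes a delimited file with a header row, small enough to fit in memory, whose pivot values are safe to use as file names; the function name `split_by_column` and the `<value>.csv` naming scheme are hypothetical.

```python
import csv
import os


def split_by_column(src_path, out_dir, column, delimiter=","):
    """Write one file per distinct value of `column`; return the values found."""
    with open(src_path, newline="") as f:
        reader = csv.DictReader(f, delimiter=delimiter)
        fieldnames = reader.fieldnames
        rows = list(reader)

    # Step 1: collect the distinct pivot values, in first-seen order.
    values = list(dict.fromkeys(row[column] for row in rows))

    # Step 2: for each value, write the matching records to their own file.
    os.makedirs(out_dir, exist_ok=True)
    for value in values:
        out_path = os.path.join(out_dir, f"{value}.csv")
        with open(out_path, "w", newline="") as out:
            writer = csv.DictWriter(out, fieldnames=fieldnames, delimiter=delimiter)
            writer.writeheader()
            writer.writerows(r for r in rows if r[column] == value)
    return values
```

Calling `split_by_column("customers.csv", "out", "country")` on a hypothetical file would produce one file per country under `out/`, each repeating the original header.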
