Talend split flow to multiple files

Split file rows into multiple files depending on a column’s values

In this tutorial we will show you how to split a file into multiple smaller files, one for each distinct value of a specific column.

Scenario :

[Image: SplitaFile8]

  1. Collect all the distinct values of the “pivot” column chosen for splitting the file.
  2. Then, for each value of this column, extract the records that match the current value and save them in a new file.
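The two steps above can be sketched outside Talend in plain Java, to make the logic explicit before building the job. This is only an illustration and not the code Talend generates: the class name `SplitByColumnSketch`, the `;` delimiter, the `district_` file prefix and the sample rows are all assumptions, and the naive split does not handle quoted fields.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

class SplitByColumnSketch {

    // Returns the pivot field of one row (naive split, no quoted-field support).
    static String field(String row, int pivotIndex, String delimiter) {
        return row.split(Pattern.quote(delimiter), -1)[pivotIndex];
    }

    // Step 1: collect the distinct values of the pivot column, in first-seen order.
    static Set<String> distinctValues(List<String> rows, int pivotIndex, String delimiter) {
        Set<String> values = new LinkedHashSet<>();
        for (String row : rows) {
            values.add(field(row, pivotIndex, delimiter));
        }
        return values;
    }

    // Step 2: for each distinct value, keep the rows whose pivot field equals it.
    static Map<String, List<String>> split(List<String> rows, int pivotIndex, String delimiter) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String value : distinctValues(rows, pivotIndex, delimiter)) {
            List<String> matching = new ArrayList<>();
            for (String row : rows) {
                if (field(row, pivotIndex, delimiter).equals(value)) {
                    matching.add(row);
                }
            }
            groups.put(value, matching);
        }
        return groups;
    }

    // Writes one file per group, repeating the header line in each file.
    static void writeGroups(Map<String, List<String>> groups, String header, Path outDir)
            throws IOException {
        Files.createDirectories(outDir);
        for (Map.Entry<String, List<String>> group : groups.entrySet()) {
            List<String> lines = new ArrayList<>();
            lines.add(header);
            lines.addAll(group.getValue());
            Files.write(outDir.resolve("district_" + group.getKey() + ".csv"), lines);
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample rows: ID;District;Type
        List<String> rows = List.of("1;011;THEFT", "2;007;BATTERY", "3;011;ASSAULT");
        Map<String, List<String>> groups = split(rows, 1, ";");
        Path outDir = Files.createTempDirectory("split");
        writeGroups(groups, "ID;District;Type", outDir);
        System.out.println("Wrote " + groups.size() + " files to " + outDir);
    }
}
```

Like the job described in this tutorial, the sketch goes over the rows once per distinct value. A single pass that groups rows as it reads them would also work, but it would no longer mirror the two steps of the scenario.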

Let’s take a file (Chicago crime data). We will use only a small sample of its rows (around 20,000) and create a separate file for each value of the District column.

Here is the schema of the file :

[Image: SplitaFile7]

For this we need the following components :

  1. tFileInputDelimited
  2. tSampleRow
  3. tFilterColumns
  4. tUniqRow
  5. tFlowToIterate
  6. tFileInputDelimited
  7. tSampleRow
  8. tFilterRow
  9. tFileOutputDelimited
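The link between the two halves of the job is tFlowToIterate: with its default settings it stores each column of the current row in `globalMap` under the key `<row name>.<column name>`, and the components of the second subjob can read that value in their Java expressions. The sketch below emulates this in plain Java so it can be run on its own. The row name `row4`, the output folder `C:/talend/out/` and the assumption that District is a String column are all hypothetical; in the real job only the two expressions would be typed into the component fields, and the tutorial's own job may use different ones.

```java
import java.util.HashMap;
import java.util.Map;

class IterateExpressionsSketch {

    // Stand-in for the globalMap that Talend makes available inside a job.
    static final Map<String, Object> globalMap = new HashMap<>();

    // Expression one might use as the "File Name" of tFileOutputDelimited:
    // the current District value becomes part of the output file name.
    static String outputFileName() {
        return "C:/talend/out/district_" + ((String) globalMap.get("row4.District")) + ".csv";
    }

    // Condition one might use in the advanced mode of tFilterRow, where
    // input_row.District would take the place of the inputRowDistrict parameter.
    static boolean keepRow(String inputRowDistrict) {
        return inputRowDistrict != null
                && inputRowDistrict.equals((String) globalMap.get("row4.District"));
    }

    public static void main(String[] args) {
        // tFlowToIterate would set this once per iteration.
        globalMap.put("row4.District", "011");
        System.out.println(outputFileName());
        System.out.println(keepRow("011"));
        System.out.println(keepRow("007"));
    }
}
```

If District is read as an Integer rather than a String, the casts and the comparison would need to be adapted accordingly.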

[Image: SplitaFile9]

Download this job here :

Talend Job : Split file rows into multiple files depending on a column’s value
