Code review of Talend jobs by reading the Talend job .item file


In this post I would like to explain a solution you can develop and use to run basic checks (a code review) on your Talend jobs against your coding and development standards.


What this means is that if you have multiple developers working on your Talend project and you have defined coding standards or conventions for Talend jobs, you can validate all your jobs against them with no manual effort. This saves time, reduces manual errors, and makes sure jobs conform to your standards and best practices. Coding standards could start with a naming convention for your jobs and a naming convention for the components in a job, and extend to advanced component properties such as the “use cursor” option for tJDBCInput, commit size, batch size, and so on.

Reviewing jobs manually is a cumbersome and daunting task, and it is not foolproof: it is highly prone to errors. With Talend, every developer can have their own way of writing jobs, so the review task becomes increasingly difficult as the team grows and there are more jobs to review.

Logic for the review utility job

A Talend job’s entire metadata is available in its *.item, *.properties and *.screenshot files.
Of these three files, the *.item file is the most important, because it contains the XML that describes every component in the job and its settings (components, comments, settings, properties, etc.).

You would have to create a new Talend job that reads the .item file of every job in a loop and processes its XML structure with XPath, down to the level or depth at which you want to retrieve information. For example, you can read the XML at element parameter level (“/talendfile:ProcessType/node/elementParameter”) to fetch the properties of each component, then validate each of these properties against your review checks and build your review report.
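To make the extraction step concrete, here is a minimal sketch of the same idea written in plain Python rather than as a Talend job. It assumes that each component is a "node" element carrying a "componentName" attribute, that each "elementParameter" carries "name" and "value" attributes, and that the component label is stored in a parameter called UNIQUE_NAME. Check these names against your own .item files before relying on them. The function names and the directory argument are made up for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def parse_components(root):
    """Collect the parameters of every component under a ProcessType root.

    Assumption: <node> and <elementParameter> are unqualified child elements,
    which is why no namespace prefix is needed in the paths below.
    """
    components = []
    for node in root.findall("node"):
        # One dict entry per elementParameter: parameter name -> value
        params = {
            p.get("name"): p.get("value")
            for p in node.findall("elementParameter")
        }
        components.append({
            "component": node.get("componentName"),
            "unique_name": params.get("UNIQUE_NAME", ""),
            "params": params,
        })
    return components


def read_all_jobs(process_dir):
    """Loop over every .item file below process_dir (hypothetical location)."""
    return {
        str(path): parse_components(ET.parse(path).getroot())
        for path in sorted(Path(process_dir).rglob("*.item"))
    }
```

Calling read_all_jobs on the folder that holds your job files returns one list of components per .item file, which is the raw material for the review checks.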

You can design a code standards repository or metadata table against which you validate your jobs’ metadata, raising deviations and alerts wherever a job does not comply.
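As a rough illustration of such a standards table, the sketch below keeps the rules in a plain Python list and compares them with component parameters that have already been extracted from a job. Everything specific in it is an assumption made for the example: the parameter names USE_CURSOR and COMMIT_EVERY, the expected values, the job naming pattern, and the shape of the input (one dict per component with "component", "unique_name" and "params" keys). In practice the rules would more likely live in a database table or a file.

```python
import re

# Hypothetical standards table: one row per check.
RULES = [
    {"component": "tJDBCInput", "param": "USE_CURSOR", "expected": "true"},
    {"component": "tJDBCOutput", "param": "COMMIT_EVERY", "expected": "10000"},
]

# Hypothetical job naming convention: three capitals, underscore, lower-case name.
JOB_NAME_PATTERN = re.compile(r"^[A-Z]{3}_[a-z0-9_]+$")


def review_job(job_name, components, rules=RULES):
    """Return a list of (job, component, message) deviations for one job."""
    deviations = []
    if not JOB_NAME_PATTERN.match(job_name):
        deviations.append((job_name, "", "job name does not match naming convention"))
    for comp in components:
        for rule in rules:
            if comp["component"] != rule["component"]:
                continue  # rule applies to a different component type
            actual = comp["params"].get(rule["param"])
            if actual != rule["expected"]:
                message = f"{rule['param']} is {actual!r}, expected {rule['expected']!r}"
                deviations.append((job_name, comp["unique_name"], message))
    return deviations
```

The returned list can be written straight to a file or table to form the review report, with one row per deviation.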

