July 2015 - Silicon Tern

Run Spark job with Talend v6

Talend v6 supports native Spark integration, which means you don’t need to write code to process your data. With Talend Spark jobs you can, for example: read files from HDFS, convert them to Avro, convert them to Parquet, perform a join between tables, and select aggregates. For DW projects, you can build the fact tables using Spark.

Advanced Analytics with Spark

Authored by a substantial portion of Cloudera’s Data Science team (Sean Owen, Sandy Ryza, Uri Laserson, Josh Wills), Advanced Analytics with Spark (currently in Early Release from O’Reilly Media) is the newest addition to the pipeline of ecosystem books by Cloudera engineers. I talked to the authors recently.

Why did you decide to write this book?

We think it’s mostly to fill a gap between what a lot of people need to know to be productive with large-scale analytics on…

How to use tGoogleAddressRow

This component is available in the Palette of Talend Studio only if you have subscribed to one of the Talend Platform products. tGoogleAddressRow accesses the Google Geocoding API via an HTTP request to obtain geographic coordinates and other geographic information for the address information you provide. For further information about Google Geocoding API, see The Google Geocoding API. tGoogleAddressRow allows you to convert human-readable addresses into geographic coordinates and other geographic information you can…
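
To give a rough idea of the kind of HTTP request involved, here is a minimal Python sketch of a call to the Google Geocoding API. It is not how tGoogleAddressRow is implemented internally; the helper names `build_geocode_url` and `extract_coordinates` are hypothetical, and `SAMPLE_RESPONSE` is a hand-written, trimmed illustration of the JSON response shape rather than real API output.

```python
import json
from urllib.parse import urlencode

# Public JSON endpoint of the Google Geocoding API.
GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, api_key=None):
    """Build the request URL for a human-readable address (hypothetical helper)."""
    params = {"address": address}
    if api_key:
        params["key"] = api_key
    return GEOCODE_ENDPOINT + "?" + urlencode(params)


def extract_coordinates(response_text):
    """Return (lat, lng) from a Geocoding JSON response, or None if no match."""
    payload = json.loads(response_text)
    if payload.get("status") != "OK" or not payload.get("results"):
        return None
    location = payload["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


# Hand-written sample for illustration only, not a real API response.
SAMPLE_RESPONSE = json.dumps({
    "status": "OK",
    "results": [
        {
            "formatted_address": "Example Street 1, Example City",
            "geometry": {"location": {"lat": 37.42, "lng": -122.08}},
        }
    ],
})

if __name__ == "__main__":
    print(build_geocode_url("Example Street 1, Example City", "YOUR_API_KEY"))
    print(extract_coordinates(SAMPLE_RESPONSE))
```

To send a real request, the URL could be passed to `urllib.request.urlopen` and the response body handed to `extract_coordinates`; that step is left out here so the sketch runs without network access or an API key.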