[Tutorial] Talend machine learning job using Kafka & Cassandra on Spark

This job generates your recommendation pipeline.

It is a Spark Streaming job, so it keeps running until you stop it.

After starting this job, open and run the Push_Clickstream_Data_Kafka job to simulate a flow of real-time website traffic. Then come back to this job and watch
the recommendations appear in the execution window as the clickstream data streams in through Kafka. The recommendation data is also written to a flat file for
big data analytics and to a Cassandra NoSQL database table that a web UI can query.
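
To make the data flow described above more concrete, here is a minimal Python sketch of the same fan-out idea, using only the standard library. It is not the code Talend generates and it does not connect to Kafka, Spark, or Cassandra. The `RECOMMENDATIONS` lookup table, the `user` and `product` message fields, and the `recommend` and `process_batch` functions are all hypothetical names chosen for illustration, and the in-memory list of batches stands in for the messages arriving from the Kafka topic.

```python
import csv
import io
import json

# Hypothetical stand-in for the trained model: product viewed -> recommended products.
RECOMMENDATIONS = {
    "laptop": ["laptop bag", "mouse"],
    "phone": ["phone case"],
}


def recommend(event_json):
    """Turn one clickstream message into (user, product) recommendation rows."""
    event = json.loads(event_json)
    products = RECOMMENDATIONS.get(event["product"], [])
    return [(event["user"], product) for product in products]


def process_batch(messages, flat_file, table):
    """Process one micro-batch and send each result to all three sinks."""
    writer = csv.writer(flat_file)
    for message in messages:
        for user, product in recommend(message):
            print(f"{user} -> {product}")               # execution window
            writer.writerow([user, product])            # flat file for analytics
            table.setdefault(user, []).append(product)  # stand-in for the Cassandra table


# Simulated stream: each inner list is one micro-batch of messages (assumed JSON format).
batches = [
    ['{"user": "u1", "product": "laptop"}'],
    ['{"user": "u2", "product": "phone"}', '{"user": "u1", "product": "tablet"}'],
]

flat_file = io.StringIO()  # an open file on disk would be used in practice
table = {}                 # in-memory dict standing in for the database table
for batch in batches:
    process_batch(batch, flat_file, table)
```

In the real job the loop never ends, because new micro-batches keep arriving until the job is stopped. The sketch only shows how each incoming message can produce rows that go to the console, a file, and a keyed store at the same time.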

The tKafkaCreateTopic component creates a Kafka topic that the other Kafka components can use.

The tKafkaInput component reads messages from a Kafka topic and passes them to the Job, which runs transformations over these messages.

Cassandra: this component is available in the Palette of Talend Studio only if you have subscribed to one of the Talend solutions with Big Data.
