Hadoop MapReduce Tutorial with Python

Prerequisites

 

Download Canopy

In the Canopy console, install the mrjob MapReduce package: !pip install mrjob

 

Create the MapReduce job class:

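from mrjob.job import MRJob


class MRRatingCounter(MRJob):
    def mapper(self, key, line):
        # Each input line is tab-separated: userID, movieID, rating, timestamp
        (userID, movieID, rating, timestamp) = line.split('\t')
        # Emit a count of 1 for the rating found on this line
        yield rating, 1

    def reducer(self, rating, occurrences):
        # Add up the counts emitted for each distinct rating
        yield rating, sum(occurrences)


if __name__ == '__main__':
    MRRatingCounter.run()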

 

Save this file in your current workspace, for example as RatingCounter.py
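To see what the job computes before running it through mrjob, the map, group, and reduce steps can be imitated in plain Python. This is only a rough sketch under stated assumptions: the sample rows and the run_local helper are made up for illustration and are not part of mrjob, and the grouping step stands in for the shuffle that the framework normally performs between the mapper and the reducer.

```python
from collections import defaultdict

# Hypothetical sample rows in the same tab-separated layout the mapper expects:
# userID, movieID, rating, timestamp
SAMPLE_LINES = [
    "196\t242\t3\t881250949",
    "186\t302\t3\t891717742",
    "22\t377\t1\t878887116",
    "244\t51\t2\t880606923",
    "166\t346\t1\t886397596",
]


def mapper(line):
    # Same logic as MRRatingCounter.mapper: emit (rating, 1) for each line
    userID, movieID, rating, timestamp = line.split('\t')
    yield rating, 1


def reducer(rating, occurrences):
    # Same logic as MRRatingCounter.reducer: total the counts per rating
    yield rating, sum(occurrences)


def run_local(lines):
    # Illustrative helper (not an mrjob API): map, group by key, then reduce
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)

    results = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            results[out_key] = out_value
    return results


if __name__ == '__main__':
    for rating, count in run_local(SAMPLE_LINES).items():
        print(rating, count)
```

Note that the ratings stay strings here because the mapper never converts them, so the keys are '1', '2', '3' rather than integers.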
