Hadoop MapReduce Tutorial with Python



Download and install Canopy (the Enthought Python distribution).

In the Canopy console, install the mrjob MapReduce package:

!pip install mrjob


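Create the MapReduce job class:

from mrjob.job import MRJob

class MRRatingCounter(MRJob):

    # Mapper: called once per input line; emits the pair (rating, 1).
    def mapper(self, key, line):
        (userID, movieID, rating, timestamp) = line.split('\t')
        yield rating, 1

    # Reducer: receives every 1 emitted for a given rating and sums them.
    def reducer(self, rating, occurrences):
        yield rating, sum(occurrences)

if __name__ == '__main__':
    # Standard mrjob entry point: parse the command line and run the job.
    MRRatingCounter.run()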
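To see what the mapper and reducer do between them, it can help to walk through the same logic without mrjob or Hadoop. The sketch below is only an illustration, not how mrjob is implemented: `run_locally` and `SAMPLE_LINES` are hypothetical names, and the sample rows are made up in the same tab-separated layout (userID, movieID, rating, timestamp) that the job expects.

```python
from collections import defaultdict

# Made-up sample rows: userID, movieID, rating, timestamp (tab-separated).
SAMPLE_LINES = [
    "196\t242\t3\t881250949",
    "186\t302\t3\t891717742",
    "22\t377\t1\t878887116",
    "244\t51\t2\t880606923",
    "166\t346\t1\t886397596",
]


def mapper(line):
    """Emit (rating, 1) for one input line, mirroring MRRatingCounter.mapper."""
    user_id, movie_id, rating, timestamp = line.split("\t")
    yield rating, 1


def reducer(rating, occurrences):
    """Sum the 1s collected for one rating, mirroring MRRatingCounter.reducer."""
    yield rating, sum(occurrences)


def run_locally(lines):
    """Hypothetical helper: map every line, group values by key, then reduce."""
    grouped = defaultdict(list)
    # Map phase: collect every emitted value under its key (the "shuffle").
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    # Reduce phase: one reducer call per distinct key, in sorted key order.
    results = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            results[out_key] = out_value
    return results


rating_counts = run_locally(SAMPLE_LINES)
print(rating_counts)  # {'1': 2, '2': 1, '3': 2}
```

Note that the ratings stay strings here because `split` returns strings; the job above behaves the same way, so the keys in its output are the rating values as text.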


Save this file in your current workspace, e.g.:
