Use MapReduce to join two datasets

The two datasets are:

To join the two tables above on “student id”, we need to use MultipleInputs, which lets each input path be read by its own mapper class. The code is:
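As a rough illustration of the idea behind such a reduce-side join, here is a minimal sketch in plain Java that needs no Hadoop dependency. It is not the job itself: the class name `ReduceSideJoinSketch`, the `S:`/`C:` tags, the comma-separated record layout and the sample rows are all assumptions made up for this example. In a real job, each mapper would be registered for its input file with `MultipleInputs.addInputPath(job, path, inputFormatClass, mapperClass)`, and the framework would do the grouping by key that the `TreeMap` imitates here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReduceSideJoinSketch {

    // "Map" step: emit (studentId, tag + rest of record).
    // One call per input file stands in for one mapper class per input path.
    static void map(List<String> lines, String tag, Map<String, List<String>> shuffle) {
        for (String line : lines) {
            String[] parts = line.split(",", 2);
            if (parts.length < 2) {
                continue; // skip malformed records
            }
            shuffle.computeIfAbsent(parts[0].trim(), k -> new ArrayList<>())
                   .add(tag + parts[1].trim());
        }
    }

    // "Reduce" step: all values for one student id arrive together.
    // Split them by tag, then pair every name with every score (inner join).
    static List<String> reduce(String id, List<String> values) {
        List<String> names = new ArrayList<>();
        List<String> scores = new ArrayList<>();
        for (String v : values) {
            if (v.startsWith("S:")) {
                names.add(v.substring(2));
            } else if (v.startsWith("C:")) {
                scores.add(v.substring(2));
            }
        }
        List<String> out = new ArrayList<>();
        for (String n : names) {
            for (String s : scores) {
                out.add(id + "\t" + n + "\t" + s);
            }
        }
        return out;
    }

    // Runs both "mappers", groups by key (sorted, as the shuffle would), then reduces.
    static List<String> join(List<String> students, List<String> scores) {
        Map<String, List<String>> shuffle = new TreeMap<>();
        map(students, "S:", shuffle);
        map(scores, "C:", shuffle);
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : shuffle.entrySet()) {
            out.addAll(reduce(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: "id,name" and "id,score".
        List<String> students = Arrays.asList("1,Alice", "2,Bob", "3,Carol");
        List<String> scores = Arrays.asList("1,90", "2,85", "2,70", "4,60");
        join(students, scores).forEach(System.out::println);
    }
}
```

Tagging each value with its source is what lets the reducer tell the two tables apart once their records are mixed under the same key. Ids that appear in only one input (3 and 4 in the sample rows) produce no output, so this sketch behaves like an inner join.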

Compile and run it:

And the result in /my is:
