Use MapReduce to join two datasets

The two datasets are:

To join the two tables above on “student id”, we need to use MultipleInputs, which lets each input path be read by its own mapper within a single job. The code is:
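As a rough illustration of the idea, here is a minimal plain-Java sketch of a reduce-side join. It does not use the Hadoop API and is not the job's actual code; it only simulates the three steps in memory. The record layouts (`<student id>,<name>` and `<student id>,<score>`), the sample rows, the `S:`/`C:` tags, and all class and method names are assumptions made up for this example. In a real job, each mapper would typically be bound to its input with `MultipleInputs.addInputPath(job, path, inputFormatClass, mapperClass)`, and the framework would do the grouping by key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory simulation of a reduce-side join (hypothetical names and data).
class ReduceSideJoinSketch {

    // "Mapper" for the first input. Assumed line format: "<student id>,<name>".
    // Emits (student id, "S:" + name) so the reducer can tell the sources apart.
    public static Map.Entry<String, String> mapStudent(String line) {
        String[] parts = line.split(",", 2);
        return Map.entry(parts[0].trim(), "S:" + parts[1].trim());
    }

    // "Mapper" for the second input. Assumed line format: "<student id>,<score>".
    public static Map.Entry<String, String> mapScore(String line) {
        String[] parts = line.split(",", 2);
        return Map.entry(parts[0].trim(), "C:" + parts[1].trim());
    }

    // "Reducer": receives every tagged value for one student id and emits
    // one joined row per (name, score) pair. Ids missing from either side
    // produce no output, so this is an inner join.
    public static List<String> reduce(String id, List<String> taggedValues) {
        List<String> names = new ArrayList<>();
        List<String> scores = new ArrayList<>();
        for (String value : taggedValues) {
            if (value.startsWith("S:")) {
                names.add(value.substring(2));
            } else {
                scores.add(value.substring(2));
            }
        }
        List<String> out = new ArrayList<>();
        for (String name : names) {
            for (String score : scores) {
                out.add(id + "\t" + name + "\t" + score);
            }
        }
        return out;
    }

    // Runs both "mappers", groups the tagged values by key (the shuffle step),
    // then calls the "reducer" once per key in sorted key order.
    public static List<String> join(List<String> studentLines, List<String> scoreLines) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : studentLines) {
            Map.Entry<String, String> kv = mapStudent(line);
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        for (String line : scoreLines) {
            Map.Entry<String, String> kv = mapScore(line);
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            result.addAll(reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows, not the datasets from this post.
        List<String> students = List.of("1,Alice", "2,Bob", "3,Carol");
        List<String> scores = List.of("1,90", "2,85", "1,70");
        join(students, scores).forEach(System.out::println);
    }
}
```

Tagging each value with its source is what allows a single reducer to handle both tables: once the values for a student id arrive together, the reducer splits them by tag and pairs them up.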

Compile and run it:

And the result in /my is:
