Use Oozie to run terasort

      No Comments on Use Oozie to run terasort

The better choice of “Action” for running terasort test case in Oozie is “Java Action” instead of “Mapreduce Action” because terasort need to run

first and then load ‘partitonFile’ by “TotalOrderPartitioner”. It’s not a simple Mapreduce job which need merely a few propertyies.

The directory of this”TerasortApp” which using “Java Action” of Oozie looks just like:

The core of this App is “workflow.xml”:

Note 1. In Cloudera environment, The Web UI will fail in the last step of creating sharelib for Oozie Service. To fix this problem:

Note 2. We can’t use property of ‘mapred.map.tasks’ to change the number of mappers in Terasort because it is actually decided by class ‘TotalOrderPartitioner’. Therefore I use ‘mapreduce.input.fileinputformat.split.minsize’ property to limit the number of mappers.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.