[Hadoop]MapReduce中的Partitioner
partitioner在处理输入数据集时就像条件表达式(condition)一样工作。分区阶段发生在Map阶段之后,Reduce阶段之前。partitioner的个数等于reducer的个数(The number of partitioners is equal to the number of reducers)。这就意味着一个partitioner将根据reducer的个数来划分数据(That means a partitioner will divide the data according to the number of reducers)。因此,从一个单独partitioner传递过来的数据将会交由一个单独的reducer处理(the data passed from a single partitioner is processed by a single Reducer)。 1. Partitioner partitioner对Map中间输出结果的键值对进行分区。使用用户自定义的分区条件来对数据进行分区,它的工作方式类似于hash函数。partitioner的总个数与作...