Learning Spark: Basic RDD Operators
collect

collect returns all elements of the RDD to the driver. Since it materializes the entire dataset in driver memory, it should only be used on small RDDs.

scala> var input = sc.parallelize(Array(-1, 0, 1, 2, 2))
input: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[15] at parallelize at <console>:27
scala> var result = input.collect
result: Array[Int] = Array(-1, 0, 1, 2, 2)

count, countByValue

count returns the number of elements in the RDD; countByValue returns the number of occurrences of each distinct value.

scala> var input = sc.parallelize(Array(-1, 0, 1, 2, 2))
scala> var result = input.count
result: Long = 5
scala> var result = input.countByValue
result: scala.collection.Map[Int,Long] = Map(0 -> 1, 1 -> 1, 2 -> 2, -1 -> ...
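Semantically, countByValue behaves like grouping the elements by their value and counting each group. As a rough sketch of what these actions compute, here is a plain-Scala equivalent on a local collection (the helper names are hypothetical and this does not use the Spark API, just ordinary Scala collections):

```scala
object RddActionSemantics {
  // Analogue of rdd.collect: materialize every element as a local array.
  def collectLike[A](xs: Seq[A]): Array[A] = xs.toArray

  // Analogue of rdd.count: the number of elements, as a Long.
  def countLike[A](xs: Seq[A]): Long = xs.size.toLong

  // Analogue of rdd.countByValue: group by the value itself,
  // then map each group to its size.
  def countByValueLike[A](xs: Seq[A]): Map[A, Long] =
    xs.groupBy(identity).map { case (k, vs) => k -> vs.size.toLong }
}
```

For the sample data above, `countByValueLike(Seq(-1, 0, 1, 2, 2))` yields `Map(-1 -> 1, 0 -> 1, 1 -> 1, 2 -> 2)`, matching the REPL output.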

