[Spark][Python][RDD][DataFrame] Example: constructing a DataFrame from an RDD
# Assumes the pyspark shell, where sc and sqlContext are already defined
from pyspark.sql.types import *

# Schema: column name, column type, nullable flag
schema = StructType( [
    StructField("age",IntegerType(),True),
    StructField("name",StringType(),True),
    StructField("pcode",StringType(),True)
] )

# Build an RDD of tuples, then apply the schema to get a DataFrame
myrdd = sc.parallelize([(40,"Abram","01601"),(16,"Lucia","87501")])
mydf = sqlContext.createDataFrame(myrdd,schema)
mydf.limit(5).show()

+---+-----+-----+
|age| name|pcode|
+---+-----+-----+
| 40|Abram|01601|
| 16|Lucia|87501|
+---+-----+-----+