[Spark][Python][DataFrame][SQL]Spark对DataFrame直接执行SQL处理的例子
[Spark][Python][DataFrame][SQL]Spark对DataFrame直接执行SQL处理的例子 $cat people.json {"name":"Alice","pcode":"94304"} {"name":"Brayden","age":30,"pcode":"94304"} {"name":"Carla","age":19,"pcoe":"10036"} {"name":"Diana","age":46} {"name":"Etienne","pcode":"94104"} $ hdfs dfs -put people.json $pyspark sqlContext = HiveContext(sc) peopleDF = sqlContext.read.json("people.json") peopleDF.registerTempTable("people") tmpDF=sqlContext.sql(""" select * FROM people WHERE name like "A%" """) tmpDF.limit(3).show() ...