Spark Streaming Programming Guide
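Reference: http://spark.incubator.apache.org/docs/latest/streaming-programming-guide.html

Overview

Spark Streaming supports many kinds of stream input, such as Kafka, Flume, Twitter, ZeroMQ, or plain old TCP sockets. Transformations can be applied to these streams, and the results can be stored in HDFS, a database, or a dashboard.

Spark's built-in machine learning algorithms and graph processing algorithms can also be applied to Spark Streaming data, which is an interesting possibility.

The principle behind Spark Streaming, which the accompanying figure shows clearly, is to discretize the stream data. The concept it introduces, the DStream, is really just a sequence of RDDs.

Spark Streaming is an extension of the core Spark API that enables high-throughput, fault-tolerant...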
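To make the "DStream is a sequence of RDDs" idea concrete, here is a minimal sketch in plain Python. It does not use the Spark API: the names `discretize`, `map_batches`, and `batch_interval` are hypothetical, and an ordinary list stands in for each RDD. It only illustrates how a stream of timestamped records could be cut into fixed-interval batches and then transformed batch by batch.

```python
from collections import defaultdict


def discretize(records, batch_interval):
    """Group (timestamp, value) records into consecutive time batches.

    Each batch stands in for one RDD; the ordered list of batches
    stands in for a DStream. This is an illustration, not Spark code.
    """
    batches = defaultdict(list)
    for timestamp, value in records:
        # Records whose timestamps fall in the same interval share a batch.
        batches[int(timestamp // batch_interval)].append(value)
    if not batches:
        return []
    first, last = min(batches), max(batches)
    # Intervals with no data still yield an (empty) batch.
    return [batches.get(i, []) for i in range(first, last + 1)]


def map_batches(dstream, fn):
    """Apply fn to every element of every batch, like a per-batch map."""
    return [[fn(x) for x in batch] for batch in dstream]


# Hypothetical input: (arrival time in seconds, payload).
records = [(0.2, "a"), (0.9, "b"), (1.5, "c"), (3.1, "d")]
dstream = discretize(records, batch_interval=1.0)
upper = map_batches(dstream, str.upper)

print(dstream)  # [['a', 'b'], ['c'], [], ['d']]
print(upper)    # [['A', 'B'], ['C'], [], ['D']]
```

The point of the sketch is that a transformation on the stream reduces to the same transformation applied to each batch in turn, which is why the ordinary batch-oriented operations can be reused on streaming data.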