Partitioning Flink Output on HDFS by Event Time: A Solution
0x1 Abstract

In a Hive offline data warehouse, almost every table is partitioned to make querying and analysis easier, most commonly by day. Flink writes data to HDFS with the following configuration:

    BucketingSink<Object> sink = new BucketingSink<>(path);
    // This is how the data gets split into per-day partitions
    sink.setBucketer(new DateTimeBucketer<>("yyyy/MM/dd"));
    sink.setWriter(new StringWriter<>());
    sink.setBatchSize(1024 * 1024 * 256L);
    sink.setBatchRolloverInterval(30 * 60 * 1000L);
    sink.setInactiveBucketThreshold(3 * 60 * 1000L);
    sink.setInactiveBucketCheckInterval(30 * 1000L);
    sink.setInProgressSuffix(".in-progress");
    sink.setPe...
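DateTimeBucketer picks the bucket directory from the sink's clock (processing time), not from a timestamp carried in the record. The following is a minimal sketch, assuming each record carries an epoch-millisecond event time, of how a per-day directory could be derived from that timestamp instead. The class `EventTimeBucketPath`, the method `bucketPath`, and the explicit `zone` parameter are hypothetical names for illustration and do not come from the original; in a real job this logic would most likely live inside a custom `Bucketer` implementation's `getBucketPath` method.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: maps an event timestamp to a "yyyy/MM/dd" directory.
public class EventTimeBucketPath {

    // Same pattern as the DateTimeBucketer configuration in the text.
    private static final DateTimeFormatter DAY_FORMAT =
            DateTimeFormatter.ofPattern("yyyy/MM/dd");

    /**
     * Builds the bucket directory for one record.
     *
     * @param basePath        base output directory (assumed, e.g. the sink path)
     * @param eventTimeMillis event time of the record in epoch milliseconds (assumed field)
     * @param zone            time zone that defines the day boundary (assumed)
     */
    public static String bucketPath(String basePath, long eventTimeMillis, ZoneId zone) {
        // Drop a trailing slash so the result never contains "//".
        String base = basePath.endsWith("/")
                ? basePath.substring(0, basePath.length() - 1)
                : basePath;
        String day = DAY_FORMAT.withZone(zone).format(Instant.ofEpochMilli(eventTimeMillis));
        return base + "/" + day;
    }

    public static void main(String[] args) {
        long eventTime = 1_700_000_000_000L; // sample timestamp, for illustration only
        System.out.println(bucketPath("/data/events", eventTime, ZoneId.of("UTC")));
        System.out.println(bucketPath("/data/events", eventTime, ZoneId.of("Asia/Shanghai")));
    }
}
```

The time zone is passed explicitly because the day boundary depends on it: the same record can fall into different daily partitions under different zones, so the zone should match the one the Hive partitions are defined in.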