
Big Data Tools: Flume 1.4 Installation and Deployment Guide

Date: 2014-01-16

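1. Introduction

flume-ng is a distributed, highly reliable, and efficient log collection system. The name flume-ng refers to the new version of Flume, where "ng" stands for "next generation". At the time of writing, flume-ng 1.4 is the latest release. flume-ng differs considerably from the original Flume. I had been running Flume 0.9 and had never upgraded to flume-ng; a recent project required the upgrade, and I ran into a few problems along the way, which I am recording here to share.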

2. Version

flume-ng 1.4.0

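3. Installation Steps

Downloading and extracting the package, installing the JDK, and setting environment variables are already covered by plenty of introductory articles, so they are not described here. One point worth noting: flume-ng does not require ZooKeeper, so there is nothing to set up for it.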
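For readers who want a starting point anyway, here is a minimal sketch of the environment variables that the flume-ng script reads (JAVA_HOME and FLUME_HOME). Both paths are assumptions for illustration only; substitute the locations where the JDK and the extracted Flume package actually live.

```shell
#!/bin/bash
# Assumed install locations (hypothetical); existing values are kept if set.
export JAVA_HOME="${JAVA_HOME:-/usr/java/default}"
export FLUME_HOME="${FLUME_HOME:-/opt/apache-flume-1.4.0-bin}"

# Put the flume-ng launcher script on the PATH.
export PATH="$FLUME_HOME/bin:$PATH"

echo "JAVA_HOME=$JAVA_HOME"
echo "FLUME_HOME=$FLUME_HOME"
```

These lines would typically go into the login profile of the user that runs Flume.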

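4. The flume-ng Script Bug

After installation, running flume-ng produces error messages, mainly because of a problem in the shell script. The complete modified flume-ng script is posted below; the line following each #zhangzl marker is one that needs to be changed. The full script: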

```shell
#!/bin/bash
#
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

################################
# constants
################################

FLUME_AGENT_CLASS="org.apache.flume.node.Application"
FLUME_AVRO_CLIENT_CLASS="org.apache.flume.client.avro.AvroCLIClient"
FLUME_VERSION_CLASS="org.apache.flume.tools.VersionInfo"
FLUME_TOOLS_CLASS="org.apache.flume.tools.FlumeToolsMain"

CLEAN_FLAG=1
################################
# functions
################################

info() {
  if [ ${CLEAN_FLAG} -ne 0 ]; then
    local msg=$1
    echo "Info: $msg" >&2
  fi
}

warn() {
  if [ ${CLEAN_FLAG} -ne 0 ]; then
    local msg=$1
    echo "Warning: $msg" >&2
  fi
}

error() {
  local msg=$1
  local exit_code=$2

  echo "Error: $msg" >&2

  if [ -n "$exit_code" ] ; then
    exit $exit_code
  fi
}

# If avail, add Hadoop paths to the FLUME_CLASSPATH and to the
# FLUME_JAVA_LIBRARY_PATH env vars.
# Requires Flume jars to already be on FLUME_CLASSPATH.
add_hadoop_paths() {
  local HADOOP_IN_PATH=$(PATH="${HADOOP_HOME:-${HADOOP_PREFIX}}/bin:$PATH" \
      which hadoop 2>/dev/null)

  if [ -f "${HADOOP_IN_PATH}" ]; then
    info "Including Hadoop libraries found via ($HADOOP_IN_PATH) for HDFS access"

    # determine hadoop java.library.path and use that for flume
    local HADOOP_CLASSPATH=""
    local HADOOP_JAVA_LIBRARY_PATH=$(HADOOP_CLASSPATH="$FLUME_CLASSPATH" \
        ${HADOOP_IN_PATH} org.apache.flume.tools.GetJavaProperty \
        java.library.path)

    # look for the line that has the desired property value
    # (considering extraneous output from some GC options that write to stdout)
    # IFS = InternalFieldSeparator (set to recognize only newline char as delimiter)
    IFS=$'\n'
    for line in $HADOOP_JAVA_LIBRARY_PATH; do
      #if [[ $line =~ ^java\.library\.path=(.*)$ ]]; then
      if [[ "$line" =~ "^java\.library\.path=(.*)$" ]]; then
        HADOOP_JAVA_LIBRARY_PATH=${BASH_REMATCH[1]}
        break
      fi
    done
    unset IFS

    if [ -n "${HADOOP_JAVA_LIBRARY_PATH}" ]; then
      FLUME_JAVA_LIBRARY_PATH="$FLUME_JAVA_LIBRARY_PATH:$HADOOP_JAVA_LIBRARY_PATH"
    fi

    # determine hadoop classpath
    HADOOP_CLASSPATH=$($HADOOP_IN_PATH classpath)

    # hack up and filter hadoop classpath
    local ELEMENTS=$(sed -e 's/:/ /g' <<<${HADOOP_CLASSPATH})
    local ELEMENT
    for ELEMENT in $ELEMENTS; do
      local PIECE
      for PIECE in $(echo $ELEMENT); do
        #zhangzl
        if [[ $PIECE =~ "slf4j-(api|log4j12).*\.jar" ]]; then
          info "Excluding $PIECE from classpath"
          continue
        else
          FLUME_CLASSPATH="$FLUME_CLASSPATH:$PIECE"
        fi
      done
    done

  fi
}
add_HBASE_paths() {
  local HBASE_IN_PATH=$(PATH="${HBASE_HOME}/bin:$PATH" \
      which hbase 2>/dev/null)

  if [ -f "${HBASE_IN_PATH}" ]; then
    info "Including HBASE libraries found via ($HBASE_IN_PATH) for HBASE access"

    # determine HBASE java.library.path and use that for flume
    local HBASE_CLASSPATH=""
    local HBASE_JAVA_LIBRARY_PATH=$(HBASE_CLASSPATH="$FLUME_CLASSPATH" \
        ${HBASE_IN_PATH} org.apache.flume.tools.GetJavaProperty \
        java.library.path)

    # look for the line that has the desired property value
    # (considering extraneous output from some GC options that write to stdout)
    # IFS = InternalFieldSeparator (set to recognize only newline char as delimiter)
    IFS=$'\n'
    for line in $HBASE_JAVA_LIBRARY_PATH; do
      #zhangzl
      if [[ $line =~ "^java\.library\.path=(.*)$" ]]; then
        HBASE_JAVA_LIBRARY_PATH=${BASH_REMATCH[1]}
        break
      fi
    done
    unset IFS

    if [ -n "${HBASE_JAVA_LIBRARY_PATH}" ]; then
      FLUME_JAVA_LIBRARY_PATH="$FLUME_JAVA_LIBRARY_PATH:$HBASE_JAVA_LIBRARY_PATH"
    fi

    # determine HBASE classpath
    HBASE_CLASSPATH=$($HBASE_IN_PATH classpath)

    # hack up and filter HBASE classpath
    local ELEMENTS=$(sed -e 's/:/ /g' <<<${HBASE_CLASSPATH})
    local ELEMENT
    for ELEMENT in $ELEMENTS; do
      local PIECE
      for PIECE in $(echo $ELEMENT); do
        #zhangzl
        if [[ $PIECE =~ "slf4j-(api|log4j12).*\.jar" ]]; then
          info "Excluding $PIECE from classpath"
          continue
        else
          FLUME_CLASSPATH="$FLUME_CLASSPATH:$PIECE"
        fi
      done
    done
    FLUME_CLASSPATH="$FLUME_CLASSPATH:$HBASE_HOME/conf"

  fi
}

set_LD_LIBRARY_PATH(){
#Append the FLUME_JAVA_LIBRARY_PATH to whatever the user may have specified in
#flume-env.sh
  if [ -n "${FLUME_JAVA_LIBRARY_PATH}" ]; then
    export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${FLUME_JAVA_LIBRARY_PATH}"
  fi
}

display_help() {
  cat <<EOF
Usage: $0 <command> [options]...

commands:
  help                  display this help text
  agent                 run a Flume agent
  avro-client           run an avro Flume client
  version               show Flume version info

global options:
  --conf,-c <conf>      use configs in <conf> directory
  --classpath,-C <cp>   append to the classpath
  --dryrun,-d           do not actually start Flume, just print the command
  --plugins-path <dirs> colon-separated list of plugins.d directories. See the
                        plugins.d section in the user guide for more details.
                        Default: \$FLUME_HOME/plugins.d
  -Dproperty=value      sets a Java system property value
  -Xproperty=value      sets a Java -X option

agent options:
  --conf-file,-f <file> specify a config file (required)
  --name,-n <name>      the name of this agent (required)
  --help,-h             display help text

avro-client options:
  --rpcProps,-P <file>   RPC client properties file with server connection params
  --host,-H <host>       hostname to which events will be sent
  --port,-p <port>       port of the avro source
  --dirname <dir>        directory to stream to avro source
  --filename,-F <file>   text file to stream to avro source (default: std input)
  --headerFile,-R <file> File containing event headers as key/value pairs on each new line
  --help,-h              display help text

  Either --rpcProps or both --host and --port must be specified.

Note that if <conf> directory is specified, then it is always included first
in the classpath.

EOF
}

run_flume() {
  local FLUME_APPLICATION_CLASS

  if [ "$#" -gt 0 ]; then
    FLUME_APPLICATION_CLASS=$1
    shift
  else
    error "Must specify flume application class" 1
  fi

  if [ ${CLEAN_FLAG} -ne 0 ]; then
    set -x
  fi
  $EXEC $JAVA_HOME/bin/java $JAVA_OPTS -cp "$FLUME_CLASSPATH" \
      -Djava.library.path=$FLUME_JAVA_LIBRARY_PATH "$FLUME_APPLICATION_CLASS" $*
}

################################
# main
################################

# set default params
FLUME_CLASSPATH=""
FLUME_JAVA_LIBRARY_PATH=""
JAVA_OPTS="-Xmx20m"
LD_LIBRARY_PATH=""

opt_conf=""
opt_classpath=""
opt_plugins_dirs=""
opt_java_props=""
opt_dryrun=""

mode=$1
shift

case "$mode" in
  help)
    display_help
    exit 0
    ;;
  agent)
    opt_agent=1
    ;;
  node)
    opt_agent=1
    warn "The \"node\" command is deprecated. Please use \"agent\" instead."
    ;;
  avro-client)
    opt_avro_client=1
    ;;
  tool)
    opt_tool=1
    ;;
  version)
    opt_version=1
    CLEAN_FLAG=0
    ;;
  *)
    error "Unknown or unspecified command '$mode'"
    echo
    display_help
    exit 1
    ;;
esac

args=""
while [ -n "$*" ] ; do
  arg=$1
  shift

  case "$arg" in
    --conf|-c)
      [ -n "$1" ] || error "Option --conf requires an argument" 1
      opt_conf=$1
      shift
      ;;
    --classpath|-C)
      [ -n "$1" ] || error "Option --classpath requires an argument" 1
      opt_classpath=$1
      shift
      ;;
    --dryrun|-d)
      opt_dryrun="1"
      ;;
    --plugins-path)
      opt_plugins_dirs=$1
      shift
      ;;
    -D*)
      opt_java_props="$opt_java_props $arg"
      ;;
    -X*)
      opt_java_props="$opt_java_props $arg"
      ;;
    *)
      args="$args $arg"
      ;;
  esac
done

# make opt_conf absolute
if [[ -n "$opt_conf" && -d "$opt_conf" ]]; then
  opt_conf=$(cd $opt_conf; pwd)
fi

# allow users to override the default env vars via conf/flume-env.sh
if [ -z "$opt_conf" ]; then
  warn "No configuration directory set! Use --conf <dir> to override."
elif [ -f "$opt_conf/flume-env.sh" ]; then
  info "Sourcing environment configuration script $opt_conf/flume-env.sh"
  source "$opt_conf/flume-env.sh"
fi

# append command-line java options to stock or env script JAVA_OPTS
if [ -n "${opt_java_props}" ]; then
  JAVA_OPTS="${JAVA_OPTS} ${opt_java_props}"
fi

# prepend command-line classpath to env script classpath
if [ -n "${opt_classpath}" ]; then
  if [ -n "${FLUME_CLASSPATH}" ]; then
    FLUME_CLASSPATH="${opt_classpath}:${FLUME_CLASSPATH}"
  else
    FLUME_CLASSPATH="${opt_classpath}"
  fi
fi

if [ -z "${FLUME_HOME}" ]; then
  FLUME_HOME=$(cd $(dirname $0)/..; pwd)
fi

# prepend $FLUME_HOME/lib jars to the specified classpath (if any)
if [ -n "${FLUME_CLASSPATH}" ] ; then
  FLUME_CLASSPATH="${FLUME_HOME}/lib/*:$FLUME_CLASSPATH"
else
  FLUME_CLASSPATH="${FLUME_HOME}/lib/*"
fi

# load plugins.d directories
PLUGINS_DIRS=""
if [ -n "${opt_plugins_dirs}" ]; then
  PLUGINS_DIRS=$(sed -e 's/:/ /g' <<<${opt_plugins_dirs})
else
  PLUGINS_DIRS="${FLUME_HOME}/plugins.d"
fi

unset plugin_lib plugin_libext plugin_native
for PLUGINS_DIR in $PLUGINS_DIRS; do
  if [[ -d ${PLUGINS_DIR} ]]; then
    for plugin in ${PLUGINS_DIR}/*; do
      if [[ -d "$plugin/lib" ]]; then
        plugin_lib="${plugin_lib}${plugin_lib+:}${plugin}/lib/*"
      fi
      if [[ -d "$plugin/libext" ]]; then
        plugin_libext="${plugin_libext}${plugin_libext+:}${plugin}/libext/*"
      fi
      if [[ -d "$plugin/native" ]]; then
        plugin_native="${plugin_native}${plugin_native+:}${plugin}/native"
      fi
    done
  fi
done

if [[ -n "${plugin_lib}" ]]
then
  FLUME_CLASSPATH="${FLUME_CLASSPATH}:${plugin_lib}"
fi

if [[ -n "${plugin_libext}" ]]
then
  FLUME_CLASSPATH="${FLUME_CLASSPATH}:${plugin_libext}"
fi

if [[ -n "${plugin_native}" ]]
then
  if [[ -n "${FLUME_JAVA_LIBRARY_PATH}" ]]
  then
    FLUME_JAVA_LIBRARY_PATH="${FLUME_JAVA_LIBRARY_PATH}:${plugin_native}"
  else
    FLUME_JAVA_LIBRARY_PATH="${plugin_native}"
  fi
fi

# find java
if [ -z "${JAVA_HOME}" ] ; then
  warn "JAVA_HOME is not set!"
  # Try to use Bigtop to autodetect JAVA_HOME if it's available
  if [ -e /usr/libexec/bigtop-detect-javahome ] ; then
    . /usr/libexec/bigtop-detect-javahome
  elif [ -e /usr/lib/bigtop-utils/bigtop-detect-javahome ] ; then
    . /usr/lib/bigtop-utils/bigtop-detect-javahome
  fi

  # Using java from path if bigtop is not installed or couldn't find it
  if [ -z "${JAVA_HOME}" ] ; then
    JAVA_DEFAULT=$(type -p java)
    [ -n "$JAVA_DEFAULT" ] || error "Unable to find java executable. Is it in your PATH?" 1
    JAVA_HOME=$(cd $(dirname $JAVA_DEFAULT)/..; pwd)
  fi
fi

# look for hadoop libs
add_hadoop_paths
add_HBASE_paths

# prepend conf dir to classpath
if [ -n "$opt_conf" ]; then
  FLUME_CLASSPATH="$opt_conf:$FLUME_CLASSPATH"
fi

set_LD_LIBRARY_PATH
# allow dryrun
EXEC="exec"
if [ -n "${opt_dryrun}" ]; then
  warn "Dryrun mode enabled (will not actually initiate startup)"
  EXEC="echo"
fi

# finally, invoke the appropriate command
if [ -n "$opt_agent" ] ; then
  run_flume $FLUME_AGENT_CLASS $args
elif [ -n "$opt_avro_client" ] ; then
  run_flume $FLUME_AVRO_CLIENT_CLASS $args
elif [ -n "${opt_version}" ] ; then
  run_flume $FLUME_VERSION_CLASS $args
elif [ -n "${opt_tool}" ] ; then
  run_flume $FLUME_TOOLS_CLASS $args
else
  error "This message should never appear" 1
fi

exit 0
```
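A caveat on the quoted patterns in the #zhangzl lines: to my knowledge, bash 3.2 changed the `=~` operator so that a quoted right-hand side is compared as a literal string rather than as a regular expression. The quoted form therefore suits older bash releases, but on newer ones it may never match. A commonly used workaround that should behave the same on both is to store the pattern in a variable and expand it unquoted. The sketch below illustrates the idea; the function name `is_slf4j_jar` and the sample jar paths are hypothetical and are not part of the flume-ng script.

```shell
#!/bin/bash
# Hypothetical helper (not part of flume-ng): succeeds when the given path
# looks like an slf4j-api or slf4j-log4j12 jar.
is_slf4j_jar() {
  local piece=$1
  # Keep the regex in a variable and expand it unquoted after =~ so that
  # it is treated as a regular expression.
  local re='slf4j-(api|log4j12).*\.jar'
  [[ $piece =~ $re ]]
}

# Hypothetical sample paths, for illustration only.
for jar in /opt/hadoop/lib/slf4j-api-1.6.1.jar /opt/hadoop/lib/commons-io-2.1.jar; do
  if is_slf4j_jar "$jar"; then
    echo "exclude $jar"
  else
    echo "keep $jar"
  fi
done
```

The same pattern-in-a-variable approach could be applied to the `java.library.path` checks if the script has to run on a mix of bash versions.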

5. Test Configuration File

Create a file named example-conf.properties in the conf directory with the following properties:

```
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
# Write the data to the log
a1.sinks.k1.type = logger


# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```
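The part of this file that is easiest to get wrong is the wiring: every channel named by a source or sink has to be declared in `a1.channels`. As a rough sanity check, the sketch below reads the properties and verifies that. The helpers `get_prop` and `check_wiring` are hypothetical names written for this article, not Flume tools, and the parsing assumes simple `key = value` lines.

```shell
#!/bin/bash
# Hypothetical helpers, not part of Flume.

# Print the value of an exact key from a "key = value" properties file.
get_prop() {
  awk -F= -v key="$2" '
    { name = $1; gsub(/[ \t]/, "", name) }
    name == key { val = $2; gsub(/^[ \t]+|[ \t]+$/, "", val); print val; exit }
  ' "$1"
}

# Check that every channel used by a source or sink is declared for the agent.
check_wiring() {
  local file=$1 agent=$2 declared src sink ch
  declared=" $(get_prop "$file" "$agent.channels") "
  for src in $(get_prop "$file" "$agent.sources"); do
    for ch in $(get_prop "$file" "$agent.sources.$src.channels"); do
      case "$declared" in
        *" $ch "*) ;;
        *) echo "source $src: undeclared channel $ch"; return 1 ;;
      esac
    done
  done
  for sink in $(get_prop "$file" "$agent.sinks"); do
    ch=$(get_prop "$file" "$agent.sinks.$sink.channel")
    case "$declared" in
      *" $ch "*) ;;
      *) echo "sink $sink: undeclared channel '$ch'"; return 1 ;;
    esac
  done
  echo "wiring ok"
}

# Demo on a temporary copy of the wiring lines from the file above.
demo=$(mktemp)
cat > "$demo" <<'EOF'
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF
check_wiring "$demo" a1
rm -f "$demo"
```

Against the real file, the call would be `check_wiring example-conf.properties a1`, run from the conf directory.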

6. Running the Commands

6.1 Start the agent

[hadoop@hadoop1 conf]$ flume-ng agent -n a1 -f example-conf.properties
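The flume-ng script prints "No configuration directory set!" when `--conf`/`-c` is omitted, and its help text lists a `--dryrun`/`-d` option that prints the command instead of starting Flume. The sketch below combines the two to preview the launch. Using `.` as the conf directory assumes the command is run from inside conf, and `-Dflume.root.logger=INFO,console` assumes the stock log4j.properties shipped with Flume, which as far as I recall reads that property.

```shell
#!/bin/bash
# Same agent as above, plus a conf directory and console logging.
# "." assumes the current directory is Flume's conf directory.
agent_cmd=(flume-ng agent -c . -n a1 -f example-conf.properties
           -Dflume.root.logger=INFO,console)

if command -v flume-ng >/dev/null 2>&1; then
  # -d (dry run) makes the script print the java command instead of running it.
  "${agent_cmd[@]}" -d
else
  echo "flume-ng not on PATH; would run: ${agent_cmd[*]}"
fi
```

Dropping the trailing `-d` starts the agent for real.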

6.2 Start the avro-client to send data to the agent (this needs a separate terminal window)

[hadoop@hadoop1 conf]$ flume-ng avro-client -H localhost -p 44444 -F file01
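The input file file01 has to exist before this command is run. A minimal sketch for creating it follows; the content "hello world" is an assumption inferred from the event body shown in the results section, and as far as I understand the avro-client sends each line of the file as one event.

```shell
#!/bin/bash
# Create the input file; the content is assumed from the logged event body.
printf 'hello world\n' > file01

# Send it only when flume-ng is available, so the sketch is harmless elsewhere.
if command -v flume-ng >/dev/null 2>&1; then
  flume-ng avro-client -H localhost -p 44444 -F file01
else
  echo "flume-ng not found on PATH; created file01 only" >&2
fi
```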

7. Checking the Results

```
14/01/16 22:26:34 INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:54289 => /127.0.0.1:44444] OPEN
14/01/16 22:26:34 INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:54289 => /127.0.0.1:44444] BOUND: /127.0.0.1:44444
14/01/16 22:26:34 INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:54289 => /127.0.0.1:44444] CONNECTED: /127.0.0.1:54289
14/01/16 22:26:36 INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:54289 :> /127.0.0.1:44444] DISCONNECTED
14/01/16 22:26:36 INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:54289 :> /127.0.0.1:44444] UNBOUND
14/01/16 22:26:36 INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:54289 :> /127.0.0.1:44444] CLOSED
14/01/16 22:26:36 INFO ipc.NettyServer: Connection to /127.0.0.1:54289 disconnected.
14/01/16 22:26:38 INFO sink.LoggerSink: Event: { headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64 hello world }
```
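The logger sink prints the event body as hex bytes followed by its text form. To confirm that the two agree, the bytes can be decoded by hand. The helper `decode_hex_body` below is a hypothetical name written for this article; it relies on the `%b` format of bash's `printf` builtin to expand `\xHH` escapes.

```shell
#!/bin/bash
# Hypothetical helper: turn space-separated hex bytes into text.
decode_hex_body() {
  local hex out=""
  for hex in "$@"; do
    # Build a string of \xHH escapes, one per byte.
    out="$out\\x$hex"
  done
  # %b expands the backslash escapes into the actual bytes.
  printf '%b\n' "$out"
}

# Bytes copied from the LoggerSink line above.
decode_hex_body 68 65 6C 6C 6F 20 77 6F 72 6C 64
```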

 


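Author: 张子良
Source: http://www.cnblogs.com/hadoopdev
Copyright of this article belongs to the author. Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original article must be placed prominently on the page; otherwise the author reserves the right to pursue legal liability.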

Original link: https://yq.aliyun.com/articles/438569