Hadoop2.7实战v1.0之Hive-2.0.0+MySQL远程模式安装
环境:Apache Hadoop2.7分布式集群环境(HDFS HA,Yarn HA,HBase HA) 元数据库mysql部署在hadoop-01机器上 user:hive password:hive database:hive_remote_meta hive服务端部署在hadoop-01机器上 hive客户端部署在hadoop-02机器上 1.Install MySQL5.6.23 on hadoop-01 2.Create db and user hadoop-01:mysqladmin:/usr/local/mysql:>mysql -uroot -p mysql> create database hive_remote_meta; Query OK, 1 row affected (0.04 sec) mysql> create user 'hive' identified by 'hive'; Query OK, 0 rows affected (0.05 sec) mysql> grant all privileges on hive_remote_meta.* to 'hive'@'%'; Query OK, 0 rows affected (0.03 sec) mysql> flush privileges; Query OK, 0 rows affected (0.01 sec) 3.在安装Install hive-2.0.0 [root@hadoop-01 tmp]# wget http://apache.communilink.net/hive/hive-2.0.0/apache-hive-2.0.0-bin.tar.gz [root@hadoop-01 tmp]# tar zxvf apache-hive-2.0.0-bin.tar.gz [root@hadoop-01 tmp]# mv apache-hive-2.0.0-bin /hadoop/hive-remote-server [root@hadoop-01 tmp]# cd /hadoop/hive-remote-server [root@hadoop-01 hive-remote-server]# ll total 588 drwxr-xr-x 3 root root 4096 Mar 29 23:19 bin drwxr-xr-x 2 root root 4096 Mar 29 23:19 conf drwxr-xr-x 4 root root 4096 Mar 29 23:19 examples drwxr-xr-x 7 root root 4096 Mar 29 23:19 hcatalog drwxr-xr-x 4 root root 12288 Mar 29 23:19 lib -rw-r--r-- 1 root root 26335 Jan 22 12:28 LICENSE -rw-r--r-- 1 root root 513 Jan 22 12:28 NOTICE -rw-r--r-- 1 root root 4348 Feb 10 09:50 README.txt -rw-r--r-- 1 root root 527063 Feb 10 09:56 RELEASE_NOTES.txt drwxr-xr-x 4 root root 4096 Mar 29 23:19 scripts [root@hadoop-01 hive-remote-server]# 4.Configure profile on hadoop-01 [root@hadoop-01 ~]# vi /etc/profile export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera export HADOOP_HOME=/hadoop/hadoop-2.7.2 export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib" export HBASE_HOME=/hadoop/hbase-1.2.0 export ZOOKEEPER_HOME=/hadoop/zookeeper export HIVE_HOME=/hadoop/hive-remote-server export PATH=.:$HADOOP_HOME/bin:$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin:$HIVE_HOME/bin:$PATH [root@hadoop-01 ~]# source /etc/profile [root@hadoop-01 ~]# 5.configure jdbc jar [root@hadoop-01 tmp]# wget http://ftp.nchu.edu.tw/Unix/Database/MySQL/Downloads/Connector-J/mysql-connector-java-5.1.36.tar.gz [root@hadoop-01 tmp]# tar zxvf mysql-connector-java-5.1.36.tar.gz [root@hadoop-01 tmp]# cd mysql-connector-java-5.1.36 [root@hadoop-01 mysql-connector-java-5.1.36]# ll total 1428 -rw-r--r-- 1 root root 90430 Jun 20 2015 build.xml -rw-r--r-- 1 root root 235082 Jun 20 2015 CHANGES -rw-r--r-- 1 root root 18122 Jun 20 2015 COPYING drwxr-xr-x 2 root root 4096 Mar 29 23:35 docs -rw-r--r-- 1 root root 972009 Jun 20 2015 mysql-connector-java-5.1.36-bin.jar -rw-r--r-- 1 root root 61423 Jun 20 2015 README -rw-r--r-- 1 root root 63674 Jun 20 2015 README.txt drwxr-xr-x 8 root root 4096 Jun 20 2015 src [root@hadoop-01 mysql-connector-java-5.1.36]# cp mysql-connector-java-5.1.36-bin.jar $HIVE_HOME/lib/ 6.Configure Hive服务端 [root@hadoop-01 ~]# cd $HIVE_HOME/conf [root@hadoop-01 conf]# cp hive-default.xml.template hive-default.xml # "hdfs://mycluster"是指$HADOOP_HOME/etc/hadoop/core-site.xml文件的fs.defaultFS的值(NameNode HA URI) #格式:jdbc:mysql:///?createDatabaseIfNotExist=true 点击(此处)折叠或打开 [root@hadoop-01 conf]# vi hive-site.xml <?xml version="1.0" encoding="UTF-8" standalone="no"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <configuration> <property> <name>hive.metastore.warehouse.dir</name> <value>hdfs://mycluster/user/hive_remote/warehouse</value> </property> <property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:mysql://hadoop-01:3306/hive_remote_meta?createDatabaseIfNotExist=true</value> <description>JDBC connect string for a JDBC metastore</description> </property> <property> <name>javax.jdo.option.ConnectionDriverName</name> <value>com.mysql.jdbc.Driver</value> <description>Driver class name for a JDBC metastore</description> </property> <property> <name>javax.jdo.option.ConnectionUserName</name> <value>hive</value> <description>username to use against metastore database</description> </property> <property> <name>javax.jdo.option.ConnectionPassword</name> <value>hive</value> <description>password to use against metastore database</description> </property> <property> <name>hive.hwi.war.file</name> <value>${HIVE_HOME}/lib/hive-hwi-2.0.0.jar</value> <description>This sets the path to the HWI war file, relative to ${HIVE_HOME}. </description> </property> </configuration> "hive-site.xml" 26L, 1056C written [root@hadoop-01 conf]# 7.scp hive to client [root@hadoop-01 hadoop]# pwd /hadoop [root@hadoop-01 hadoop]# scp -r hive-remote-server root@hadoop-02:/hadoop/hive-remote-client 8.Configure profile on hadoop-01 [root@hadoop-01 ~]# vi /etc/profile export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera export HADOOP_HOME=/hadoop/hadoop-2.7.2 export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib" export HBASE_HOME=/hadoop/hbase-1.2.0 export ZOOKEEPER_HOME=/hadoop/zookeeper export HIVE_HOME=/hadoop/hive-remote-client export PATH=.:$HADOOP_HOME/bin:$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin:$HIVE_HOME/bin:$PATH [root@hadoop-01 ~]# source /etc/profile [root@hadoop-01 ~]# 9.Configure Hive客户端 点击(此处)折叠或打开 [root@hadoop-02 conf]# vi hive-site.xml <?xml version="1.0" encoding="UTF-8" standalone="no"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <configuration> <!-- thrift://<host_name>:<port> 默认端口是9083 --> <property> <name>hive.metastore.uris</name> <value>thrift://hadoop-01:9083</value> <description>Thrift uri for the remote metastore. Used by metastore client to connect to remote metastore.</description> </property> <!-- hive表的默认存储路径 --> <property> <name>hive.metastore.warehouse.dir</name> <value>hdfs://mycluster/user/hive_remote/warehouse</value> </property> </configuration> "hive-site.xml" 26L, 1056C written [root@hadoop-01 conf]# 10.hive服务端如果是第一次需要执行初始化命令:schematool -initSchema -dbType mysql [root@hadoop-01 bin]# schematool -initSchema -dbType mysql Metastore connection URL: jdbc:mysql://hadoop-01.telenav.cn:3306/hive_remote_meta?createDatabaseIfNotExist=true Metastore Connection Driver : com.mysql.jdbc.Driver Metastore connection User: hive Starting metastore schema initialization to 2.0.0 Initialization script hive-schema-2.0.0.mysql.sql Initialization script completed schemaTool completed [root@hadoop-01 bin]# 11.启动hive服务端和客户端 【服务端】: hive --service metastore -p [port] 如果不加端口默认启动:hive --service metastore,则默认监听端口是:9083 ,注意客户端中的端口配置需要和启动监听的端口一致。服务端启动正常后,客户端就可以执行hive操作了 ### & 以后台方式启动 [root@hadoop-01 bin]# hive --service metastore & Starting Hive Metastore Server 【客户端】: [root@hadoop-02 hive-remote-client]# cd bin [root@hadoop-02 bin]# hive Logging initialized using configuration in jar:file:/hadoop/hive-remote-client/lib/hive-common-2.0.0.jar!/hive-log4j2.properties Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases. hive> 12.创建表,load数据验证 ## "tab制表符"分隔 [root@hadoop-02 bin]# vi /tmp/studentInfo.txt 1 a 26 110 1 a 26 113 2 b 11 222 [root@hadoop-02 bin]# hive Logging initialized using configuration in jar:file:/hadoop/hive-remote-client/lib/hive-common-2.0.0.jar!/hive-log4j2.properties Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. tez, spark) or using Hive 1.X releases. hive> > create table studentinfo (id int,name string, age int,tel string) > row format delimited fields terminated by '\t' > stored as textfile; hive> load data local inpath '/tmp/studentInfo.txt' overwrite into table studentinfo; Loading data to table default.studentinfo Moved: 'hdfs://mycluster/user/hive_remote/warehouse/studentinfo/studentInfo.txt' to trash at: hdfs://mycluster/user/root/.Trash/Current OK Time taken: 2.941 seconds hive> select * from studentinfo; OK 1 a 26 113 2 b 11 222 Time taken: 1.607 seconds, Fetched: 2 row(s) hive 13.查看hdfs文件系统 [root@hadoop-01 bin]# ps -ef|grep hive root 11629 4509 1 19:39 pts/0 00:00:21 /usr/java/jdk1.7.0_67-cloudera/bin/java -Xmx256m -Djava.library.path=/hadoop/hadoop-2.7.2/lib -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/hadoop/hadoop-2.7.2/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/hadoop/hadoop-2.7.2 -Dhadoop.id.str=root -Dhadoop.root.logger=INFO,console -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -Dlog4j.configurationFile=hive-log4j2.properties -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /hadoop/hive-remote-server/lib/hive-service-2.0.0.jar org.apache.hadoop.hive.metastore.HiveMetaStore root 13351 4509 0 19:57 pts/0 00:00:00 grep hive [root@hadoop-01 bin]# hadoop fs -ls /user/hive_remote/warehouse/studentinfo Found 1 items -rwx------ 3 root root 22 2016-04-16 19:54 /user/hive_remote/warehouse/studentinfo/studentInfo.txt [root@hadoop-01 bin]# hadoop fs -cat /user/hive_remote/warehouse/studentinfo/studentInfo.txt 1 a 26 113 2 b 11 222 [root@hadoop-01 bin]# 14.参考 https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin