If you have used any relational database before, the operations in this section will look very familiar.
CREATE TABLE member (name string, age int, sex int);
hive> CREATE TABLE member (name string, age int, sex int);
OK
Time taken: 0.687 seconds
hive>
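To see exactly what DDL Hive recorded for a table, SHOW CREATE TABLE prints the complete statement; a sketch (output not captured from the session above):
hive> SHOW CREATE TABLE member;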
Create a table from existing data (CTAS):
hive> create table newtable as select * from oldtable;
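CTAS accepts an arbitrary query, so you can project and filter while copying; a sketch with made-up names:
-- newtable2, oldtable and the age filter are illustrative only
hive> create table newtable2 as select name, age from oldtable where age > 18;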
hive> SHOW TABLES;
OK
test
Time taken: 0.041 seconds, Fetched: 1 row(s)
hive>
Match table names with a wildcard:
show tables '*t*';
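The pattern also accepts '|' to match against several globs at once; a sketch:
hive> SHOW TABLES 'member|*user*';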
hive> DROP TABLE test;
OK
Time taken: 1.337 seconds
hive>
hive> CREATE TABLE member (name string, age int, sex int);
OK
Time taken: 0.273 seconds
hive> desc member;
OK
name string
age int
sex int
Time taken: 0.035 seconds, Fetched: 3 row(s)
hive>
Add a phone field of type string:
hive> ALTER TABLE member ADD COLUMNS (phone String);
OK
Time taken: 0.188 seconds
hive> desc member;
OK
name string
age int
sex int
phone string
Time taken: 0.033 seconds, Fetched: 4 row(s)
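ALTER TABLE can also rename or retype an existing column with CHANGE COLUMN; a sketch that renames the phone column added above to mobile:
hive> ALTER TABLE member CHANGE COLUMN phone mobile string;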
Rename the test table to vipuser:
hive> CREATE TABLE test (name string, age int, sex int);
OK
Time taken: 0.311 seconds
hive> ALTER TABLE test RENAME TO vipuser;
OK
Time taken: 0.115 seconds
hive> desc vipuser;
OK
name string
age int
sex int
Time taken: 0.032 seconds, Fetched: 3 row(s)
hive>
CREATE TABLE ... LIKE creates only the table structure; no data is copied over.
hive> CREATE TABLE news_2017 LIKE news;
OK
Time taken: 0.311 seconds
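Since only the schema came over, a count on the new table should return 0; a quick check (this launches a job):
hive> select count(*) from news_2017;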
Create a partitioned table:
hive> CREATE TABLE invites (foo INT, bar STRING) PARTITIONED BY (ds STRING);
A partition can also be added by hand; this assumes the table was created with a matching PARTITIONED BY column (here, province):
hive> alter table member add partition (province='shenzhen');
The passwd table below is partitioned by computer and splits fields on ':' so that /etc/passwd can be loaded directly:
hive> CREATE TABLE passwd (a string, b string, c string, d string, e string, f string) PARTITIONED BY (computer string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ':';
OK
Time taken: 0.323 seconds
hive> load data local inpath '/etc/passwd' overwrite into table passwd partition(computer="hive");
Loading data to table default.passwd partition (computer=hive)
OK
Time taken: 0.499 seconds
hive> select * from passwd;
OK
root x 0 0 root /root hive
bin x 1 1 bin /bin hive
daemon x 2 2 daemon /sbin hive
adm x 3 4 adm /var/adm hive
lp x 4 7 lp /var/spool/lpd hive
sync x 5 0 sync /sbin hive
shutdown x 6 0 shutdown /sbin hive
halt x 7 0 halt /sbin hive
mail x 8 12 mail /var/spool/mail hive
operator x 11 0 operator /root hive
games x 12 100 games /usr/games hive
ftp x 14 50 FTP User /var/ftp hive
nobody x 99 99 Nobody / hive
dbus x 81 81 System message bus / hive
polkitd x 999 998 User for polkitd / hive
avahi x 70 70 Avahi mDNS/DNS-SD Stack /var/run/avahi-daemon hive
avahi-autoipd x 170 170 Avahi IPv4LL Stack /var/lib/avahi-autoipd hive
postfix x 89 89 /var/spool/postfix hive
sshd x 74 74 Privilege-separated SSH /var/empty/sshd hive
ntp x 38 38 /etc/ntp hive
rpc x 32 32 Rpcbind Daemon /var/lib/rpcbind hive
qemu x 107 107 qemu user / hive
unbound x 998 996 Unbound DNS resolver /etc/unbound hive
rpcuser x 29 29 RPC Service User /var/lib/nfs hive
nfsnobody x 65534 65534 Anonymous NFS User /var/lib/nfs hive
saslauth x 997 76 "Saslauthd user" /run/saslauthd hive
radvd x 75 75 radvd user / hive
nagios x 1000 1000 /home/nagios hive
apache x 48 48 Apache /usr/share/httpd hive
exim x 93 93 /var/spool/exim hive
tss x 59 59 Account used by the trousers package to sandbox the tcsd daemon /dev/null hive
git x 996 994 /var/opt/gitlab hive
gitlab-www x 995 993 /var/opt/gitlab/nginx hive
gitlab-redis x 994 992 /var/opt/gitlab/redis hive
gitlab-psql x 993 991 /var/opt/gitlab/postgresql hive
nginx x 992 990 nginx user /var/cache/nginx hive
www x 80 80 Web Application /www hive
mysql x 27 27 MySQL Server /var/lib/mysql hive
redis x 991 989 Redis Database Server /var/lib/redis hive
epmd x 990 988 Erlang Port Mapper Daemon /tmp hive
rabbitmq x 989 987 RabbitMQ messaging server /var/lib/rabbitmq hive
solr x 1001 1001 Apache Solr /srv/solr hive
mongodb x 184 986 MongoDB Database Server /var/lib/mongodb hive
test x 1002 1002 /home/test hive
sysaccount x 988 985 /home/sysaccount hive
systemd-bus-proxy x 987 983 systemd Bus Proxy / hive
systemd-network x 986 982 systemd Network Management / hive
elasticsearch x 985 980 elasticsearch user /home/elasticsearch hive
zabbix x 984 979 Zabbix Monitoring System /var/lib/zabbix hive
mysqlrouter x 983 978 MySQL Router /var/lib/mysqlrouter hive
hadoop x 1003 1003 /home/hadoop hive
Time taken: 0.118 seconds, Fetched: 51 row(s)
So far only one partition exists:
hive> SHOW PARTITIONS passwd;
OK
computer=hive
Time taken: 0.058 seconds, Fetched: 1 row(s)
After partitions computer=hadoop and computer=hbase have also been created (by further loads or ALTER TABLE ... ADD PARTITION), SHOW PARTITIONS lists each of them:
hive> SHOW PARTITIONS passwd;
OK
computer=hadoop
computer=hbase
computer=hive
Time taken: 0.056 seconds, Fetched: 3 row(s)
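A partition (and, for a managed table, its data) can be removed again with DROP PARTITION; a sketch:
hive> ALTER TABLE passwd DROP PARTITION (computer='hbase');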
Create a view over the member table:
hive> CREATE VIEW v_test AS SELECT name,age FROM member where age>20;
hive> select * from v_test;
hive> drop view v_test;
OK
Time taken: 0.276 seconds
Use IF EXISTS to avoid an error when the view may not exist:
hive> DROP VIEW IF EXISTS v_test;
OK
Time taken: 0.495 seconds
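A view's query can also be replaced in place with ALTER VIEW, and Hive 2.2+ adds SHOW VIEWS to list views separately from tables; a sketch (assuming v_test had not just been dropped):
hive> ALTER VIEW v_test AS SELECT name, age FROM member WHERE age > 30;
hive> SHOW VIEWS;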
First create a text file as follows:
[root@localhost ~]# cat /tmp/hive.txt
1 2 3
2 3 4
3 4 5
6 7 8
hive> CREATE TABLE test (a int, b int, c int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
OK
Time taken: 0.294 seconds
hive> LOAD DATA LOCAL INPATH '/tmp/hive.txt' OVERWRITE INTO TABLE test;
Loading data to table default.test
OK
Time taken: 0.541 seconds
hive> select * from test;
OK
1 2 3
2 3 4
3 4 5
6 7 8
Time taken: 0.952 seconds, Fetched: 4 row(s)
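Anything beyond a bare SELECT * (filters, joins, aggregates) is compiled into a job on the execution engine; a small sketch against the same table:
-- sum column c for each distinct value of column a
hive> select a, sum(c) from test group by a;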
Create a new table directly from a query:
hive> CREATE TABLE mytable AS SELECT * FROM anytable;
17.3.4.3. Query data from another table and insert it into a specified table
INSERT OVERWRITE TABLE mytable SELECT * FROM other;
17.3.4.4. Query data from an existing table and insert it into a new partitioned table
hive> insert into table table2 partition(created_date) select * from table1;
hive> insert into table newtable partition(type='1') select * from oldtable;
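The first form above is a dynamic-partition insert: the value of created_date comes from the data itself, and the partition column must be the last column of the SELECT. By default Hive rejects it until dynamic partitioning is enabled:
hive> set hive.exec.dynamic.partition=true;
hive> set hive.exec.dynamic.partition.mode=nonstrict;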
Hive stores its tables under /user/hive/warehouse on HDFS, where they can be inspected and copied directly:
[hadoop@localhost ~]$ hdfs dfs -ls /user/hive/warehouse
Found 3 items
drwxrwxr-x - hadoop supergroup 0 2017-06-29 03:36 /user/hive/warehouse/member
drwxrwxr-x - hadoop supergroup 0 2017-06-29 03:32 /user/hive/warehouse/test
drwxrwxr-x - hadoop supergroup 0 2017-06-29 03:41 /user/hive/warehouse/vipuser
[hadoop@localhost ~]$ hdfs dfs -cp /user/hive/warehouse/vipuser /user/hive/warehouse/vipuser2
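Copying the directory only duplicates the files; the metastore still knows nothing about vipuser2. One way to make the copy queryable is to point a new, external table at it; a sketch assuming vipuser kept the default text format:
hive> CREATE EXTERNAL TABLE vipuser2 (name string, age int, sex int) LOCATION '/user/hive/warehouse/vipuser2';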
Export query results to a local directory:
hive> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/test' SELECT * FROM test;
hive> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/test' SELECT * FROM member;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = hadoop_20170629040540_ddeda146-efed-44c4-bb20-a6453c21cc8e
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1498716998098_0001, Tracking URL = http://localhost:8088/proxy/application_1498716998098_0001/
Kill Command = /srv/apache-hadoop/bin/hadoop job -kill job_1498716998098_0001
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2017-06-29 04:05:49,221 Stage-1 map = 0%, reduce = 0%
Ended Job = job_1498716998098_0001
Moving data to local directory /tmp/test
MapReduce Jobs Launched:
Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 10.54 seconds
Omit LOCAL to write to an HDFS directory instead:
hive> insert overwrite directory '/usr/tmp/test' select * from test;
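By default the exported files use Hive's \001 (^A) field delimiter; since Hive 0.11 a readable delimiter can be set on the export itself, a sketch:
hive> insert overwrite local directory '/tmp/test' row format delimited fields terminated by ',' select * from test;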
Original source: the Netkiller series of handbooks
Author: 陈景峯 (Neo Chen)
Please contact the author before reprinting, and always retain the original source, author information, and this notice.