[Hive] Hive multi-partition operations

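Business background

mobile_log records mobile logs. These logs now need to be stored in a Hive table so that they can later be aggregated by date and by hour. To support this, we need to create a Hive partitioned table with date and hour partitions.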

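Implementation

Hive partitioned tables come in two kinds: single-partition tables and multi-partition tables. A table can have many partitions, and each partition is stored as its own folder under the table's directory. For details, see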

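Code to create the multi-partition table

-- The first statement is truncated in the source ("pms.test_mobile_log;"); a DROP TABLE is assumed here.
drop table if exists pms.test_mobile_log;
create table pms.test_mobile_log (
  id bigint,
  infomation string
)
partitioned by (ds string, hour string)
row format delimited
  fields terminated by '\t'
  lines terminated by '\n';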

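There are several ways to import data into the multi-partition table:

Insert the data right after creating the table, for example:

-- The first statement is truncated in the source ("pms.test_mobile_log;"); a DROP TABLE is assumed here.
drop table if exists pms.test_mobile_log;
create table pms.test_mobile_log (
  id bigint,
  infomation string
)
partitioned by (ds string, hour string)
row format delimited
  fields terminated by '\t'
  lines terminated by '\n';

insert overwrite table pms.test_mobile_log partition (ds='2015-05-26', hour='13')
select id, category_name
from category;

Import the data with LOAD DATA, for example:

load data inpath '/user/pms/workspace/ouyangyewei/temp2/category.txt'
overwrite into table pms.test_mobile_log partition (ds='2015-05-26', hour='15');

Attach the data while adding a new partition (Hive treats LOCATION as the directory that holds the partition's data files), for example:

alter table pms.test_mobile_log add partition (ds='2015-05-27', hour='14')
location '/user/pms/workspace/ouyangyewei/temp2/category.txt';

Experimental results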

Table structure

CREATE TABLE pms.test_mobile_log(
  id bigint,
  infomation string)
PARTITIONED BY (
  ds string,
  hour string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/user/hive/pms/test_mobile'
TBLPROPERTIES (
  'numPartitions'='2',
  'numFiles'='2',
  'transient_lastDdlTime'='1432711793',
  'numRows'='0',
  'totalSize'='3517',
  'rawDataSize'='0')

Table partitions

Found 2 items
drwxr... pms supergroup ... /hour=13
drwxr... pms pms ... /hour=15
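
The same partitions can also be inspected from inside Hive, and the per-date, per-hour statistics mentioned in the business background can be run against them. The following is only a minimal sketch, assuming the table and the 2015-05-26 partitions created above; the column alias cnt is a hypothetical name chosen for illustration.

```sql
-- List the partitions registered in the metastore for the table.
show partitions pms.test_mobile_log;

-- Count rows per hour for a single day. Filtering on the partition column ds
-- lets Hive read only the matching partition directories.
select ds, hour, count(*) as cnt
from pms.test_mobile_log
where ds = '2015-05-26'
group by ds, hour;
```

Because ds and hour are partition columns, they can be used in WHERE and GROUP BY like ordinary columns even though they are not stored in the data files themselves.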
