Alex 的 Hadoop 菜鸟教程: 第21课 不只是在HBase中用SQL:Phoeni

声明

这里我用4.2.2,下载好后,解压开,把 phoenix-4.2.2-server.jar 拷贝到所有RegionServer的lib目录下 /usr/lib/hbase/lib

cp phoenix-4.2.2-server.jar /usr/lib/hbase/lib

然后重启所有regionserver

service hbase-regionserver restart

使用Phoenix

把phoenix-4.2.2-bin.tar.gz解压出来的phoenix-4.2.2-bin 文件夹也上传到host1上,然后到bin目录下执行(其实在本机也可以,只是我本机没有Python环境而Centos天生有Python)

如果是windows下下载的,得先在centos上给bin文件夹里面的.py文件赋上执行权限

[root@host1 ~]# cd phoenix-4.2.2-bin/[root@host1 phoenix-4.2.2-bin]# cd bin[root@host1 bin]# chmod +x *.py

phoenix可以用4种方式调用

批处理方式

我们建立sql 名叫us_population.sql 内容是

CREATE TABLE IF NOT EXISTS us_population ( state CHAR(2) NOT NULL, city VARCHAR NOT NULL, population BIGINT CONSTRAINT my_pk PRIMARY KEY (state, city));

建立一个文件us_population.csv

NY,New York,8143197CA,Los Angeles,3844829IL,Chicago,2842518TX,Houston,2016582PA,Philadelphia,1463281AZ,Phoenix,1461575TX,San Antonio,1256509CA,San Diego,1255540TX,Dallas,1213825CA,San Jose,912332再创建一个文件us_population_queries.sqlSELECT state as "State",count(city) as "City Count",sum(population) as "Population Sum" FROM us_population GROUP BY state ORDER BY sum(population) DESC; 然后一起执行phoenix-4.2.2-bin/bin/psql.py host1,host2:2181 us_population.sql us_population.csv us_population_queries.sql这边记得把 host1 和 host2 换成你的zookeeper地址

这条命令你同时做了 创建一个表,插入数据,查询结果 三件事情

[root@host1 ~]# phoenix-4.2.2-bin/bin/psql.py host1,host2:2181 us_population.sql us_population.csv us_population_queries.sql15/03/04 17:14:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable15/03/04 17:14:24 WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-phoenix.properties,hadoop-metrics2.propertiesno rows upsertedTime: 0.726 sec(s)csv columns from database.CSV Upsert complete. 10 rows upsertedTime: 0.103 sec(s)StCity CountPopulation Sum — —————————————- —————————————- NY18143197 CA36012701 TX34486916 IL12842518 PA11463281 AZ11461575 Time: 0.048 sec(s)用hbase shell 看下会发现多出来一个US_POPULATION 表,用scan 命令查看一下这个表的数据hbase(main):002:0> scan ‘US_POPULATION’ROWCOLUMN+CELLAZPhoenixcolumn=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x16MGAZPhoenixcolumn=0:_0, timestamp=1425460467206, value=CALos Angelescolumn=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00:\xAA\xDDCALos Angelescolumn=0:_0, timestamp=1425460467206, value=CASan Diegocolumn=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x13(tCASan Diegocolumn=0:_0, timestamp=1425460467206, value=CASan Josecolumn=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x0D\xEB\xCCCASan Josecolumn=0:_0, timestamp=1425460467206, value=ILChicagocolumn=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00+_\x96ILChicagocolumn=0:_0, timestamp=1425460467206, value=NYNew Yorkcolumn=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00|A]NYNew Yorkcolumn=0:_0, timestamp=1425460467206, value=PAPhiladelphiacolumn=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x16S\xF1PAPhiladelphiacolumn=0:_0, timestamp=1425460467206, value=TXDallascolumn=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x12\x85\x81TXDallascolumn=0:_0, timestamp=1425460467206, value=TXHoustoncolumn=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x1E\xC5FTXHoustoncolumn=0:_0, timestamp=1425460467206, value=TXSan Antoniocolumn=0:POPULATION, timestamp=1425460467206, value=\x80\x00\x00\x00\x00\x13,=TXSan Antoniocolumn=0:_0, timestamp=1425460467206, value=10 row(s) in 0.2220 seconds会发现

命令行方式

然后执行sqlline.py

$ ./sqlline.py localhost

可以进入命令行模式

0: jdbc:phoenix:localhost>

退出命令行的方式是执行 !quit

0: jdbc:phoenix:localhost>!quit命令开头需要一个感叹号,,使用help可以打印出所有命令

0: jdbc:phoenix:localhost> help!allExecute the specified SQL against all the current connections!autocommitSet autocommit mode on or off!batchStart or execute a batch of statements!briefSet verbose mode off!callExecute a callable statement!closeClose the current connection to the database!closeallClose all current open connections!columnsList all the columns for the specified table!commitCommit the current transaction (if autocommit is off)!connectOpen a new connection to the database.!dbinfoGive metadata information about the database!describeDescribe a table!dropallDrop all tables in the current database!exportedkeysList all the exported keys for the specified table!goSelect the current connection!helpPrint a summary of command usage!historyDisplay the command history!importedkeysList all the imported keys for the specified table!indexesList all the indexes for the specified table!isolationSet the transaction isolation for this connection!listList the current connections!manualDisplay the SQLLine manual!metadataObtain metadata information!nativesqlShow the native SQL for the specified statement!outputformatSet the output format for displaying results(table,vertical,csv,tsv,xmlattrs,xmlelements)!primarykeysList all the primary keys for the specified table!proceduresList all the procedures!propertiesConnect to the database specified in the properties file(s)!quitExits the program!reconnectReconnect to the database!recordRecord all output to the specified file!rehashFetch table and column names for command completion!rollbackRoll back the current transaction (if autocommit is off)!runRun a script from the specified file!saveSave the current variabes and aliases!scanScan for installed JDBC drivers!scriptStart saving a script to a file!setSet a sqlline variable!sqlExecute a SQL command!tablesList all the tables in the database!typeinfoDisplay the type map for the current connection!verboseSet verbose mode on

建立employee的映射表数据准备然后我们来建立一个映射表,映射我之前建立过的一个hbase表 employee

每一件事都要用多方面的角度来看它

Alex 的 Hadoop 菜鸟教程: 第21课 不只是在HBase中用SQL:Phoeni

相关文章:

你感兴趣的文章:

标签云: