本文以orders.tbl
为例,详细为您介绍如何将普通文本文件转成ORC格式的文件。
操作步骤
- 创建OSS Schema。
CREATE SCHEMA dla_oss_db with DBPROPERTIES( catalog='oss', location 'oss://dlaossfile1/' )
- 在DLA中创建orders_txt表,LOCATION为OSS中orders.tbl的路径。
CREATE EXTERNAL TABLE orders_txt ( O_ORDERKEY INT, O_CUSTKEY INT, O_ORDERSTATUS STRING, O_TOTALPRICE DOUBLE, O_ORDERDATE DATE, O_ORDERPRIORITY STRING, O_CLERK STRING, O_SHIPPRIORITY INT, O_COMMENT STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE LOCATION 'oss://dlaossfile1/orders/orders.tbl'
- 在DLA中创建目标表orders_orc,LOCATION设置为OSS中的您需要的位置。
CREATE EXTERNAL TABLE orders_orc ( O_ORDERKEY INT, O_CUSTKEY INT, O_ORDERSTATUS STRING, O_TOTALPRICE DOUBLE, O_ORDERDATE DATE, O_ORDERPRIORITY STRING, O_CLERK STRING, O_SHIPPRIORITY INT, O_COMMENT STRING ) STORED AS ORC LOCATION 'oss://dlaossfile1/orders_orc/'
说明STORED AS ORC
:指定文件的存储格式为ORC。- LOCATION必须是OSS中已经存在的目录,并以
/
结尾。
- 执行
INSERT...SELECT
语句,将orders_txt表中的数据插入orders_orc表中。INSERT INTO orders_orc SELECT * FROM orders_txt;
INSERT...SELECT
语句执行成功后,在OSS中查看生成的ORC数据文件。