数据库中常常需要处理JSON数据,例如,将JSON数据中的某些属性字段进行展开转换,转成行、列的关系型形式。JSON数据的展开方式灵活多变,本文通过详细的SQL示例,介绍如何展开JSON数据列。
基本思路
使用JSON函数,对JSON字符串进行解析和数据提取;
提取、转换为ARRAY或者MAP的数据结构,也可以使用Lambda函数式表达式进行转换处理;
使用UNNEST语法进行列展开。
使用UNNEST对MAP进行关系型展开
SQL示例:
SELECT t.m, t.n
FROM (
SELECT MAP(ARRAY['foo', 'bar'], ARRAY[1, 2]) as map_data
)
CROSS JOIN unnest(map_data) AS t(m, n);
结果:
+------+------+
| m | n |
+------+------+
| foo | 1 |
| bar | 2 |
+------+------+
使用UNNEST对JSON数据进行关系型展开
SQL示例1:
SELECT json_extract(t.a, '$.a') AS a,
json_extract(t.a, '$.b') AS b
FROM (
SELECT cast(json_extract('{"x":[{"a":1,"b":2},{"a":3,"b":4}]}', '$.x')
AS array<JSON>) AS package_array
)
CROSS JOIN UNNEST(package_array) AS t(a);
结果:
+------+------+
| a | b |
+------+------+
| 1 | 2 |
| 3 | 4 |
+------+------+
SQL示例2:
SELECT t.m AS _col1, t.n AS _col2
FROM (
SELECT cast(json_extract('{"x":[{"a":1,"b":2},{"a":3,"b":4}]}', '$.x')
AS array<JSON>) AS array_1,
cast(json_extract('{"x":[{"a":5,"b":6}, {"a":7,"b":8}, {"a":9,"b":10}, {"a":11,"b":12}]}', '$.x')
AS array<JSON>) AS array_2
)
CROSS JOIN UNNEST(array_1, array_2) AS t(m, n);
结果:
+---------------+-----------------+
| _col1 | _col2 |
+---------------+-----------------+
| {"a":1,"b":2} | {"a":5,"b":6} |
| {"a":3,"b":4} | {"a":7,"b":8} |
| NULL | {"a":9,"b":10} |
| NULL | {"a":11,"b":12} |
+---------------+-----------------+
SQL示例3:
SELECT json_extract(t.m, '$.a') AS _col1,
json_extract(t.m, '$.b') AS _col2,
json_extract(t.n, '$.a') AS _col3,
json_extract(t.n, '$.b') AS _col4
FROM (
SELECT cast(json_extract('{"x":[{"a":1,"b":2},{"a":3,"b":4}]}', '$.x')
AS array<JSON>) AS array_1,
cast(json_extract('{"x":[{"a":5,"b":6}, {"a":7,"b":8}, {"a":9,"b":10}, {"a":11,"b":12}]}', '$.x')
AS array<JSON>) AS array_2
)
CROSS JOIN UNNEST(array_1, array_2) AS t(m, n);
结果:
+-------+-------+-------+-------+
| _col1 | _col2 | _col3 | _col4 |
+-------+-------+-------+-------+
| 1 | 2 | 5 | 6 |
| 3 | 4 | 7 | 8 |
| NULL | NULL | 9 | 10 |
| NULL | NULL | 11 | 12 |
+-------+-------+-------+-------+
结合Lambda表达式,使用UNNEST对JSON数据进行关系型展开
SQL示例:
SELECT count(*) AS cnt,
package_name
FROM (
SELECT t.a AS package_name
FROM (
SELECT transform(packages_map_array, x -> Element_at(x, 'packageName'))
AS package_array
FROM (
SELECT cast(Json_extract(data_json, '$.packages')
AS array<map<VARCHAR, VARCHAR>>) AS packages_map_array
FROM (
SELECT json_parse(data) AS data_json
FROM (
SELECT '{
"packages": [
{
"appName": "铁路12306",
"packageName": "com.MobileTicket",
"versionName": "4.1.9",
"versionCode": "194"
},
{
"appName": "QQ飞车",
"packageName": "com.tencent.tmgp.speedmobile",
"versionName": "1.11.0.13274",
"versionCode": "1110013274"
},
{
"appName": "掌阅",
"packageName": "com.chaozh.iReaderFree",
"versionName": "7.11.0",
"versionCode": "71101"
}
]
}'
AS data
)
)
)
) AS x (package_array)
CROSS JOIN UNNEST(package_array) AS t (a)
)
GROUP BY package_name
ORDER BY cnt DESC;
结果:
+------+------------------------------+
| cnt | package_name |
+------+------------------------------+
| 1 | com.MobileTicket |
| 1 | com.tencent.tmgp.speedmobile |
| 1 | com.chaozh.iReaderFree |
+------+------------------------------+
文档内容是否对您有帮助?