当前位置: 首页 > news >正文

动易网站模板制作方法微商怎么做推广加好友

动易网站模板制作方法,微商怎么做推广加好友,免费申请qq邮箱,最安全的网站语言业务场景中Hive解析Json常用案例 json在线工具 json格式转换在线工具 https://tool.lu/json/format格式互转: // 格式化可以合并整行显示 {"name":"John Doe","age":35,"email":"johnexample.com"}// 格式化…

业务场景中Hive解析Json常用案例

json在线工具

  • json格式转换在线工具
https://tool.lu/json/

  • format格式互转:
// 格式化可以合并整行显示
{"name":"John Doe","age":35,"email":"john@example.com"}// 格式化可以展开显示,示例数据更清晰
{"name": "John Doe","age": 35,"email": "john@example.com"
}
  • 转化常用yaml格式

工具提供了各种跨格式转化

age: 35
email: john@example.com
name: 'John Doe'

根据json串自动生成生成 Java 类等功能也非常实用

示例准备

需要Hive环境

  • test测试表
-- 在Hive中建测试表
CREATE TABLE json_test_table (id INT,json_column STRING
) STORED AS TEXTFILE;

json_column 字段用来存储需要处理的 json 串

示例 1: 提取单个字段

假设 JSON 数据包含用户信息

{"name": "John Doe","age": 35,"email": "john@example.com"
}
  • 样例数据
insert into json_test_table values (1,'{"name":"John Doe","age":35,"email":"john@example.com"}');select * from json_test_table where id = 1;
+------+---------------------------------------------------------+
| id   | json_column                                             |
+------+---------------------------------------------------------+
|    1 | {"name":"John Doe","age":35,"email":"john@example.com"} |
+------+---------------------------------------------------------+

使用 get_json_object 函数从 JSON 中提取字段:

SELECT get_json_object(json_column, '$.name') AS name FROM json_test_table where id = 1;+-----------+
|   name    |
+-----------+
| John Doe  |
+-----------+
1 row selected (19.655 seconds)SELECT get_json_object(json_column, '$.name') AS name,get_json_object(json_column, '$.age') AS age FROM json_test_table where id = 1;
+-----------+------+
|   name    | age  |
+-----------+------+
| John Doe  | 35   |
+-----------+------+
1 row selected (19.396 seconds)

示例 2: 提取嵌套字段

在 JSON 数据中有嵌套字段的情况下:

{"user": {"name": "Alice","address": {"city": "New York","zipcode": "10001"}}
}
  • 样例数据
insert into json_test_table values (2,'{"user":{"name":"Alice","address":{"city":"New York","zipcode":"10001"}}}');select * from json_test_table where id = 2;
+------+---------------------------------------------------------------------------+
| id   | json_column                                                               |
+------+---------------------------------------------------------------------------+
|    2 | {"user":{"name":"Alice","address":{"city":"New York","zipcode":"10001"}}} |
+------+---------------------------------------------------------------------------+

提取嵌套字段的值:

SELECT get_json_object(json_column, '$.user.name') AS user_name,get_json_object(json_column, '$.user.address.city') AS city
FROM json_test_table where id = 2;+------------+-----------+
| user_name  |   city    |
+------------+-----------+
| Alice      | New York  |
+------------+-----------+
1 row selected (19.38 seconds)

示例 3: 提取数组中的值

假设 JSON 数据包含一个数组:

{"tags": ["apple", "banana", "orange"]
}
  • 样例数据
insert into json_test_table values (3,'{"tags":["apple","banana","orange"]}');select * from json_test_table where id = 3;
+------+--------------------------------------+
| id   | json_column                          |
+------+--------------------------------------+
|    3 | {"tags":["apple","banana","orange"]} |
+------+--------------------------------------+

从数组中提取值:

SELECT get_json_object(json_column, '$.tags[0]') AS first_tag,get_json_object(json_column, '$.tags[1]') AS second_tag
FROM json_test_table where id = 3;+------------+-------------+
| first_tag  | second_tag  |
+------------+-------------+
| apple      | banana      |
+------------+-------------+
1 row selected (18.488 seconds)

示例 4: 提取数组列表长度

获取数组的长度:

-- 复用 示例3中样例数据
select * from json_test_table where id = 3;
+------+--------------------------------------+
| id   | json_column                          |
+------+--------------------------------------+
|    3 | {"tags":["apple","banana","orange"]} |
+------+--------------------------------------+
-- 可以看到数组tags对应的值有3个SELECT size(split(get_json_object(json_column, '$.tags'), ',')) AS tags_length
FROM json_test_table
WHERE id = 3;+--------------+
| tags_length  |
+--------------+
| 3            |
+--------------+
1 row selected (35.766 seconds)

示例 5:业务场景-解析公司基本信息案例

  • 假设JSON数据包含公司的基本信息,比如公司名称、注册资本等字段
{"company_name": "wangting_company","registration": {"registered_capital": 1000000,"registered_date": "2020-01-01","registered_address": "123 Main St, City"}
}{"company_name":"wangting_company","registration":{"registered_capital":1000000,"registered_date":"2020-01-01","registered_address":"123 Main St, City"}}
-- 创建表
CREATE TABLE company_info (company_name STRING,registered_capital INT,registered_date STRING,registered_address STRING
) STORED AS TEXTFILE;-- 转化插入数据
INSERT INTO company_info
SELECTget_json_object(json_data, '$.company_name') AS company_name,get_json_object(json_data, '$.registration.registered_capital') AS registered_capital,get_json_object(json_data, '$.registration.registered_date') AS registered_date,get_json_object(json_data, '$.registration.registered_address') AS registered_address
FROM(SELECT '{"company_name": "wangting_company", "registration": {"registered_capital": 1000000, "registered_date": "2020-01-01", "registered_address": "123 Main St, City"}}' AS json_data) t;+----------------------------+----------------------------------+-------------------------------+----------------------------------+
| company_info.company_name  | company_info.registered_capital  | company_info.registered_date  | company_info.registered_address  |
+----------------------------+----------------------------------+-------------------------------+----------------------------------+
| wangting_company           | 1000000                          | 2020-01-01                    | 123 Main St, City                |
+----------------------------+----------------------------------+-------------------------------+----------------------------------+
1 row selected (0.411 seconds)

示例 6: 使用 LATERAL VIEW 解析数组

使用 LATERAL VIEWexplode 解析 JSON 数组:

-- 复用示例3样例数据
SELECT id, tag
FROM json_test_table
LATERAL VIEW explode(split(get_json_object(json_column, '$.tags'), ',')) exploded_tags AS tag
WHERE id = 3;+-----+------------+
| id  |    tag     |
+-----+------------+
| 3   | ["apple"   |
| 3   | "banana"   |
| 3   | "orange"]  |
+-----+------------+
3 rows selected (17.318 seconds)-- 仅展示解析数据,实际使用如需去除方括号,可以一并清洗

示例 7: 提取数组对象的值

如果数组包含对象,则提取对象的值:

{"users": [{"name": "Alice","age": 28},{"name": "Bob","age": 35}]
}{"users":[{"name":"Alice","age":28},{"name":"Bob","age":35}]}
  • 样例数据
insert into json_test_table values (7,'{"users":[{"name":"Alice","age":28},{"name":"Bob","age":35}]}');select * from json_test_table where id = 7;
+------+---------------------------------------------------------------+
| id   | json_column                                                   |
+------+---------------------------------------------------------------+
|    7 | {"users":[{"name":"Alice","age":28},{"name":"Bob","age":35}]} |
+------+---------------------------------------------------------------+
SELECT get_json_object(json_column, '$.users[0].name') AS user1_name,get_json_object(json_column, '$.users[1].name') AS user2_name
FROM json_test_table where id = 7;+-------------+-------------+
| user1_name  | user2_name  |
+-------------+-------------+
| Alice       | Bob         |
+-------------+-------------+
1 row selected (17.372 seconds)

示例 8: 处理日期

处理 JSON 中的日期字段:

{"event_date": "2023-11-21"
}{"event_date":"2023-11-21"}
  • 样例数据
insert into json_test_table values (8,'{"event_date":"2023-11-21"}');select * from json_test_table where id = 8;
+------+-----------------------------+
| id   | json_column                 |
+------+-----------------------------+
|    8 | {"event_date":"2023-11-21"} |
+------+-----------------------------+
SELECT get_json_object(json_column, '$.event_date') AS event_date
FROM json_test_table where id = 8;+-------------+
| event_date  |
+-------------+
| 2023-11-21  |
+-------------+
1 row selected (17.436 seconds)SELECT id, CAST(get_json_object(json_column, '$.event_date') AS DATE) AS event_date,YEAR(CAST(get_json_object(json_column, '$.event_date') AS DATE)) AS event_year,MONTH(CAST(get_json_object(json_column, '$.event_date') AS DATE)) AS event_month
FROM json_test_table
WHERE id = 8;+-----+-------------+-------------+--------------+
| id  | event_date  | event_year  | event_month  |
+-----+-------------+-------------+--------------+
| 8   | 2023-11-21  | 2023        | 11           |
+-----+-------------+-------------+--------------+
1 row selected (17.363 seconds)

示例 9: 提取布尔值

提取 JSON 中的布尔字段:

{"is_active": true
}{"is_active":true}
  • 样例数据
insert into json_test_table values (9,'{"is_active":true}');select * from json_test_table where id = 9;
+------+--------------------+
| id   | json_column        |
+------+--------------------+
|    9 | {"is_active":true} |
+------+--------------------+
SELECT get_json_object(json_column, '$.is_active') AS is_active
FROM json_test_table where id = 9;
+------------+
| is_active  |
+------------+
| true       |
+------------+
1 row selected (18.401 seconds)--  使用WHERE子句根据布尔值进行条件过滤。
SELECT *
FROM json_test_table
WHERE get_json_object(json_column, '$.is_active') = 'true';+---------------------+------------------------------+
| json_test_table.id  | json_test_table.json_column  |
+---------------------+------------------------------+
| 9                   | {"is_active":true}           |
+---------------------+------------------------------+
1 row selected (17.327 seconds)

示例 10: json_tuple计算 JSON 对象数量

计算 JSON 对象中包含的键值对数量:

SELECT json_tuple(json_column, 'name', 'age') AS (name, age)
FROM json_test_table;+-----------+-------+
|   name    |  age  |
+-----------+-------+
| John Doe  | 35    |
| NULL      | NULL  |
| NULL      | NULL  |
| NULL      | NULL  |
| NULL      | NULL  |
| NULL      | NULL  |
+-----------+-------+
6 rows selected (17.344 seconds)-- 只有id=1符合条件 不符合条件均为NULL

示例 11: 使用 json_tuple 提取值

从 JSON 对象中使用 json_tuple 提取多个字段:

{"name": "John","age": 30,"address": "New York"
}{"name":"John","age":30,"address":"New York"}
  • 样例数据
insert into json_test_table values (11,'{"name":"John","age":30,"address":"New York"}');select * from json_test_table where id = 11;
+------+-----------------------------------------------+
| id   | json_column                                   |
+------+-----------------------------------------------+
|   11 | {"name":"John","age":30,"address":"New York"} |
+------+-----------------------------------------------+
SELECT json_tuple(json_column, 'name', 'age', 'address') AS (name, age, address)
FROM json_test_table where id = 11;+-------+------+-----------+
| name  | age  |  address  |
+-------+------+-----------+
| John  | 30   | New York  |
+-------+------+-----------+
1 row selected (17.414 seconds)

示例 12: 提取数字字段

从 JSON 中提取数字字段:

{"count": 50
}{"count":50}
  • 样例数据
insert into json_test_table values (12,'{"count":50}');select * from json_test_table where id = 12;
+------+--------------+
| id   | json_column  |
+------+--------------+
|   12 | {"count":50} |
+------+--------------+
SELECT get_json_object(json_column, '$.count') AS count
FROM json_test_table where id = 12;+--------+
| count  |
+--------+
| 50     |
+--------+
1 row selected (19.401 seconds)

示例 13: 使用 get_json_object 过滤数据

根据 JSON 中的字段值进行过滤:

SELECT *
FROM json_test_table
WHERE get_json_object(json_column, '$.age') > 30;+---------------------+----------------------------------------------------+
| json_test_table.id  |            json_test_table.json_column             |
+---------------------+----------------------------------------------------+
| 1                   | {"name":"John Doe","age":35,"email":"john@example.com"} |
+---------------------+----------------------------------------------------+
1 row selected (18.402 seconds)SELECT *
FROM json_test_table
WHERE get_json_object(json_column, '$.age') > 10;+---------------------+----------------------------------------------------+
| json_test_table.id  |            json_test_table.json_column             |
+---------------------+----------------------------------------------------+
| 1                   | {"name":"John Doe","age":35,"email":"john@example.com"} |
| 11                  | {"name":"John","age":30,"address":"New York"}      |
+---------------------+----------------------------------------------------+
2 rows selected (17.298 seconds)

示例 14: 处理空值

处理 JSON 数据中可能的空值:

{"status": null
}{"status":null}
  • 样例数据
insert into json_test_table values (14,'{"status":null}');select * from json_test_table where id = 14;
+------+-----------------+
| id   | json_column     |
+------+-----------------+
|   14 | {"status":null} |
+------+-----------------+
SELECT get_json_object(json_column, '$.status') AS status
FROM json_test_table where id = 14;+---------+
| status  |
+---------+
| NULL    |
+---------+
1 row selected (17.345 seconds)--  使用IS NULL或IS NOT NULL来检查字段是否为空
SELECT id, json_column
FROM json_test_table
WHERE id = 14 AND get_json_object(json_column, '$.status') IS NULL;+-----+------------------+
| id  |   json_column    |
+-----+------------------+
| 14  | {"status":null}  |
+-----+------------------+
1 row selected (17.47 seconds)-- 使用CASE语句对NULL值进行处理或者转换为其他值。
SELECT id, CASE WHEN get_json_object(json_column, '$.status') IS NULL THEN 'No Status' ELSE get_json_object(json_column, '$.status') END AS status
FROM json_test_table
WHERE id = 14;+-----+------------+
| id  |   status   |
+-----+------------+
| 14  | No Status  |
+-----+------------+
1 row selected (17.31 seconds)

示例 15: 使用 get_json_object 判断是否存在字段

判断 JSON 是否包含特定字段:

SELECT CASE WHEN get_json_object(json_column, '$.name') IS NOT NULL THEN 'Exists' ELSE 'Not Exists' END AS field_status
FROM json_test_table;
+---------------+
| field_status  |
+---------------+
| Exists        |
| Not Exists    |
| Exists        |
| Not Exists    |
| Not Exists    |
| Not Exists    |
| Not Exists    |
| Not Exists    |
| Not Exists    |
+---------------+SELECT CASE WHEN get_json_object(json_column, '$.age') IS NOT NULL THEN 'Exists' ELSE 'Not Exists' END AS field_status
FROM json_test_table;+---------------+
| field_status  |
+---------------+
| Exists        |
| Not Exists    |
| Exists        |
| Not Exists    |
| Not Exists    |
| Not Exists    |
| Not Exists    |
| Not Exists    |
| Not Exists    |
+---------------+SELECT CASE WHEN get_json_object(json_column, '$.aaaaaaaaa') IS NOT NULL THEN 'Exists' ELSE 'Not Exists' END AS field_status
FROM json_test_table;+---------------+
| field_status  |
+---------------+
| Not Exists    |
| Not Exists    |
| Not Exists    |
| Not Exists    |
| Not Exists    |
| Not Exists    |
| Not Exists    |
| Not Exists    |
| Not Exists    |
+---------------+
9 rows selected (17.337 seconds)

示例 16: 提取多层嵌套对象的值

从多层嵌套的 JSON 对象中提取值:

{"user": {"details": {"name": "Alice","age": 30}}
}{"user":{"details":{"name":"Alice","age":30}}}
  • 样例数据
insert into json_test_table values (16,'{"user":{"details":{"name":"Alice","age":30}}}');select * from json_test_table where id = 16;
+------+------------------------------------------------+
| id   | json_column                                    |
+------+------------------------------------------------+
|   16 | {"user":{"details":{"name":"Alice","age":30}}} |
+------+------------------------------------------------+
SELECT get_json_object(json_column, '$.user.details.name') AS user_name,get_json_object(json_column, '$.user.details.age') AS user_age
FROM json_test_table where id = 16;+------------+-----------+
| user_name  | user_age  |
+------------+-----------+
| Alice      | 30        |
+------------+-----------+
1 row selected (16.253 seconds)

示例 17: 提取嵌套对象数组的值

从嵌套的对象数组中提取值:

{"employees": [{"name": "Alice","department": "HR"},{"name": "Bob","department": "Engineering"}]
}{"employees":[{"name":"Alice","department":"HR"},{"name":"Bob","department":"Engineering"}]}
  • 样例数据
insert into json_test_table values (17,'{"employees":[{"name":"Alice","department":"HR"},{"name":"Bob","department":"Engineering"}]}');select * from json_test_table where id = 17;
+------+----------------------------------------------------------------------------------------------+
| id   | json_column                                                                                  |
+------+----------------------------------------------------------------------------------------------+
|   17 | {"employees":[{"name":"Alice","department":"HR"},{"name":"Bob","department":"Engineering"}]} |
+------+----------------------------------------------------------------------------------------------+
SELECT get_json_object(json_column, '$.employees[0].name') AS employee1_name,get_json_object(json_column, '$.employees[1].name') AS employee2_name
FROM json_test_table where id = 17;+-----------------+-----------------+
| employee1_name  | employee2_name  |
+-----------------+-----------------+
| Alice           | Bob             |
+-----------------+-----------------+
1 row selected (17.273 seconds)

示例 18: 提取数组对象的多个值

取数组对象的多个值

从数组对象中提取多个值:

{"items": [{"id": 1,"name": "Item 1"},{"id": 2,"name": "Item 2"}]
}{"items":[{"id":1,"name":"Item 1"},{"id":2,"name":"Item 2"}]}
  • 样例数据
insert into json_test_table values (18,'{"items":[{"id":1,"name":"Item 1"},{"id":2,"name":"Item 2"}]}');select * from json_test_table where id = 18;
+------+---------------------------------------------------------------+
| id   | json_column                                                   |
+------+---------------------------------------------------------------+
|   18 | {"items":[{"id":1,"name":"Item 1"},{"id":2,"name":"Item 2"}]} |
+------+---------------------------------------------------------------+
SELECT get_json_object(json_column, '$.items[0].id') AS item1_id,get_json_object(json_column, '$.items[0].name') AS item1_name,get_json_object(json_column, '$.items[1].id') AS item2_id,get_json_object(json_column, '$.items[1].name') AS item2_name
FROM json_test_table where id = 18;+-----------+-------------+-----------+-------------+
| item1_id  | item1_name  | item2_id  | item2_name  |
+-----------+-------------+-----------+-------------+
| 1         | Item 1      | 2         | Item 2      |
+-----------+-------------+-----------+-------------+
1 row selected (17.386 seconds)

示例19: json_serde解析映射数据文件的使用介绍

  • json_serde库提供了一种在Hive中直接将JSON格式的数据解析成表的方式,可简化处理流程。

如果现在有一个json文件:

[app@ali-wangting wangt]$ cat wangt.json

{"name":"wangting01","age":21,"email":"wangting01@example.com"}
{"name":"wangting02","age":22,"email":"wangting02@example.com"}
{"name":"wangting03","age":23,"email":"wangting03@example.com"}
{"name":"wangting04","age":24,"email":"wangting04@example.com"}
{"name":"wangting05","age":25,"email":"wangting05@example.com"}
{"name":"wangting06","age":26,"email":"wangting06@example.com"}
{"name":"wangting07","age":27,"email":"wangting07@example.com"}
{"name":"wangting08","age":28,"email":"wangting08@example.com"}

现在希望建立一张表,使得表可以直接映射上数据文件

  • hive中建表
CREATE TABLE IF NOT EXISTS wangt_666_json(name string,age int,email string
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
stored as textfile;desc wangt_json_table;
+-----------+------------+--------------------+
| col_name  | data_type  |      comment       |
+-----------+------------+--------------------+
| name      | string     | from deserializer  |
| age       | int        | from deserializer  |
| email     | string     | from deserializer  |
+-----------+------------+--------------------+
  • 将json文件上传至建表对应hdfs路径
[app@ali-wangting wangt]$ hdfs dfs -ls /user/hive/warehouse/wangt.db/wangt_666_json/
[app@ali-wangting wangt]$ hdfs dfs -put wangt.json /user/hive/warehouse/wangt.db/wangt_666_json/
[app@ali-wangting wangt]$ hdfs dfs -ls /user/hive/warehouse/wangt.db/wangt_666_json/
Found 1 items
-rwxrwx--x+  2 hive hive        512 2023-12-04 16:38 /user/hive/warehouse/wangt.db/wangt_666_json/wangt.json# 也可以通过hive的sql命令行直接load数据文件
load data local inpath '/home/wangt/wangt.json' into table wangt_666_json;
  • 查看表数据验证
select * from wangt_666_json;
+----------------------+---------------------+-------------------------+
| wangt_666_json.name  | wangt_666_json.age  |  wangt_666_json.email   |
+----------------------+---------------------+-------------------------+
| wangting01           | 21                  | wangting01@example.com  |
| wangting02           | 22                  | wangting02@example.com  |
| wangting03           | 23                  | wangting03@example.com  |
| wangting04           | 24                  | wangting04@example.com  |
| wangting05           | 25                  | wangting05@example.com  |
| wangting06           | 26                  | wangting06@example.com  |
| wangting07           | 27                  | wangting07@example.com  |
| wangting08           | 28                  | wangting08@example.com  |
+----------------------+---------------------+-------------------------+
8 rows selected (0.5 seconds)

可以看到数据文件8行记录,成功映射对应了hive中wangt_666_json表的8条数据

http://www.dt0577.cn/news/49184.html

相关文章:

  • 视频网站用什么做的seo是什么意思?
  • 郑州公司网站制作网站在线优化检测
  • 已有网站做移动网站2345网址导航怎么下载
  • 做网站时新闻的背景图网页关键词排名优化
  • 沈阳网站建设定制做网站的公司有哪些
  • 方正网站制作百度客户端在哪里打开
  • 网站建设的栏目策划内容营销是什么意思
  • 黑河做网站公司网推技巧
  • 网站开发的测试域名申请的流程
  • 北京网站建设 案例谷歌seo网站推广怎么做
  • 蓝天网站建设中国站长
  • 国内网站11月将现新冠感染高峰
  • 西安直播网站建设什么是淘宝seo
  • 营销型网站建设公司推荐重庆网站优化
  • 自己做公司网站成都seo推广
  • 厦门网站建设企业合肥优化
  • 印团网网站是哪家做的seo优化及推广如何运营
  • 火车头采集器网站被k谷歌官方seo入门指南
  • 淄博网站建设报价网站查询进入
  • 网站开发怎么样百度竞价排名又叫什么
  • 网站做cpa如何广告推广
  • 个人网站名称要求四年级小新闻50字左右
  • 学校网站建设必要性seo推广需要多少钱
  • 做购物网站写数据库的流程爱站seo工具包下载
  • 模板做网站如何做网站推广的策略
  • 免费企业网站建站学大教育培训机构怎么样
  • 建设雅马哈摩托车官方网站网络销售技巧和话术
  • b2c网站的功能包括百度网盘官网登录首页
  • 网站建设站点无法发布外链大全
  • 时空网站建设的可行性分析做企业网站建设的公司