DSQ工具简介
一、简介
1、简介
DSQ是一个用于针对JSON、CSV、Excel、Parquet等运行SQL查询的命令行工具,它基于SQLite构建。
2、安装
下载对应平台的压缩包后解压即可。
二、用法
测试数据如下:
data.json
[{
"id": 1,
"username": "anthony",
"title": "learn Russian",
"completed": false
},
{
"id": 2,
"username": "nicole",
"title": "play badminton",
"completed": true
},
{
"id": 3,
"username": "danika",
"title": "go Shopping",
"completed": false
},
{
"id": 4,
"username": "nicole",
"title": "play piano",
"completed": true
},
{
"id": 5,
"username": "peter",
"title": "do homework",
"completed": false
},
{
"id": 6,
"username": "venus",
"title": "play a game",
"completed": false
},
{
"id": 7,
"username": "danika",
"title": "go on a date",
"completed": true
},
{
"id": 8,
"username": "zane",
"title": "go on a journey",
"completed": true
}]
country.csv
id,country
1,Russia
2,Germany
3,France
4,Greece
5,Egypt
6,England
7,America
8,Korea
1、基础用法
- 查询
dsq data.json "select * from {} where id > 6"
结果如下:
[{"completed":true,"id":7,"title":"go on a date","username":"danika"},
{"completed":true,"id":8,"title":"go on a journey","username":"zane"}]
- 分组统计
dsq data.json "select username,count(1) ct from {} group by username order by ct desc limit 3"
结果如下:
[{"ct":2,"username":"nicole"},
{"ct":2,"username":"danika"},
{"ct":1,"username":"zane"}]
2、美化输出
dsq -p data.json "select * from {} where username='peter'"
结果如下:
3、多文件连接
可以将多个文件传递给DSQ,每个文件当作一个表,可以通过{N}
访问,其中N为命令行传递的文件列表中文件的索引(从0开始)。
由于DSQ使用标准SQL,因此也可以给使用别名。
dsq data.json country.csv "select d.id, d.username, c.country from {0} d join {1} c on d.id = c.id"
结果如下:
[{"country":"Russia","id":1,"username":"anthony"},
{"country":"Germany","id":2,"username":"nicole"},
{"country":"France","id":3,"username":"danika"},
{"country":"Greece","id":4,"username":"nicole"},
{"country":"Egypt","id":5,"username":"peter"},
{"country":"England","id":6,"username":"venus"},
{"country":"America","id":7,"username":"danika"},
{"country":"Korea","id":8,"username":"zane"}]
4、将数据转换为JSON
可以将DSQ支持的文件类型转换为JSON:
dsq country.csv "select * from {}"
或者简写为:
dsq country.csv
结果如下:
[{"country":"Russia","id":"1"},
{"country":"Germany","id":"2"},
{"country":"France","id":"3"},
{"country":"Greece","id":"4"},
{"country":"Egypt","id":"5"},
{"country":"England","id":"6"},
{"country":"America","id":"7"},
{"country":"Korea","id":"8"}]
5、对象嵌套
- 对象中的嵌套数组
数据文件user.json
内容如下:
{
"id": 100,
"info": {
"name": "albert",
"age": 30,
"friends": [
{"name": "Tom", "age": 28},
{"name": "Jack", "age": 32},
{"name": "Jones", "age": 30}
]
}
}
其中数组friends的路径为:info.friends
。
dsq user.json "select * from {'info.friends'} order by age"
结果如下:
[{"age":28,"name":"Tom"},
{"age":30,"name":"Jones"},
{"age":32,"name":"Jack"}]
- 嵌套对象值
数据文件user.json
内容如下:
[{
"id": 100,
"info": {
"name": "albert",
"age": 30,
"friends": [
{"name": "Tom", "age": 28},
{"name": "Jack", "age": 32},
{"name": "Jones", "age": 30}
]
}
},
{
"id": 101,
"info": {
"name": "gary",
"age": 22,
"friends": [
{"name": "clement", "age": 18},
{"name": "twinkle", "age": 25}
]
}
}]
可以使用如下的方式查询嵌套字段:
dsq user.json "select id, \"info.name\", \"info.age\" from {} "
结果如下:
[{"id":100,"info.name":"albert","info.age":30},
{"id":101,"info.name":"gary","info.age":22}]
6、正则表达式
语法:x REGEXP 'y'
,其中x为列名,y为正则表达式。
dsq data.json "select * from {} where title REGEXP '.* a .*'"
结果如下:
[{"completed":false,"id":6,"title":"play a game","username":"venus"},
{"completed":true,"id":7,"title":"go on a date","username":"danika"},
{"username":"zane","completed":true,"id":8,"title":"go on a journey"}]
7、输出列顺序
如果要保证列的顺序,可以使用jq
过滤(管道数据在Windows上不起作用):
dsq x.csv 'SELECT a, b FROM {}' | jq --sort-keys
8、交互式REPL
使用-i
或--interactive
可以开启交互式REPL,可以在其中运行多个SQL查询。