一、简介

1、简介

DSQ是一个用于针对JSON、CSV、Excel、Parquet等运行SQL查询的命令行工具,它基于SQLite构建。

2、安装

下载对应平台的压缩包后解压即可。

二、用法

测试数据如下:

  • data.json
[{
	"id": 1,
	"username": "anthony",
	"title": "learn Russian",
	"completed": false
},
{
	"id": 2,
	"username": "nicole",
	"title": "play badminton",
	"completed": true
},
{
	"id": 3,
	"username": "danika",
	"title": "go Shopping",
	"completed": false
},
{
	"id": 4,
	"username": "nicole",
	"title": "play piano",
	"completed": true
},
{
	"id": 5,
	"username": "peter",
	"title": "do homework",
	"completed": false
},
{
	"id": 6,
	"username": "venus",
	"title": "play a game",
	"completed": false
},
{
	"id": 7,
	"username": "danika",
	"title": "go on a date",
	"completed": true
},
{
	"id": 8,
	"username": "zane",
	"title": "go on a journey",
	"completed": true
}]
  • country.csv
id,country
1,Russia
2,Germany
3,France
4,Greece
5,Egypt
6,England
7,America
8,Korea

1、基础用法

  • 查询
dsq data.json "select * from {} where id > 6"

结果如下:

[{"completed":true,"id":7,"title":"go on a date","username":"danika"},
{"completed":true,"id":8,"title":"go on a journey","username":"zane"}]
  • 分组统计
dsq data.json "select username,count(1) ct from {} group by username order by ct desc limit 3"

结果如下:

[{"ct":2,"username":"nicole"},
{"ct":2,"username":"danika"},
{"ct":1,"username":"zane"}]

2、美化输出

dsq -p data.json "select * from {} where username='peter'"

结果如下:

3、多文件连接

可以将多个文件传递给DSQ,每个文件当作一个表,可以通过{N}访问,其中N为命令行传递的文件列表中文件的索引(从0开始)。

由于DSQ使用标准SQL,因此也可以给使用别名。

dsq data.json country.csv "select d.id, d.username, c.country from {0} d join {1} c on d.id = c.id"

结果如下:

[{"country":"Russia","id":1,"username":"anthony"},
{"country":"Germany","id":2,"username":"nicole"},
{"country":"France","id":3,"username":"danika"},
{"country":"Greece","id":4,"username":"nicole"},
{"country":"Egypt","id":5,"username":"peter"},
{"country":"England","id":6,"username":"venus"},
{"country":"America","id":7,"username":"danika"},
{"country":"Korea","id":8,"username":"zane"}]

4、将数据转换为JSON

可以将DSQ支持的文件类型转换为JSON:

dsq country.csv "select * from {}"

或者简写为:

dsq country.csv

结果如下:

[{"country":"Russia","id":"1"},
{"country":"Germany","id":"2"},
{"country":"France","id":"3"},
{"country":"Greece","id":"4"},
{"country":"Egypt","id":"5"},
{"country":"England","id":"6"},
{"country":"America","id":"7"},
{"country":"Korea","id":"8"}]

5、对象嵌套

  • 对象中的嵌套数组

数据文件user.json内容如下:

{
	"id": 100,
	"info": {
		"name": "albert",
		"age": 30,
		"friends": [
			{"name": "Tom", "age": 28},
			{"name": "Jack", "age": 32},
			{"name": "Jones", "age": 30}
		]
	}
}

其中数组friends的路径为:info.friends

dsq user.json "select * from {'info.friends'} order by age"

结果如下:

[{"age":28,"name":"Tom"},
{"age":30,"name":"Jones"},
{"age":32,"name":"Jack"}]
  • 嵌套对象值

数据文件user.json内容如下:

[{
	"id": 100,
	"info": {
		"name": "albert",
		"age": 30,
		"friends": [
			{"name": "Tom", "age": 28},
			{"name": "Jack", "age": 32},
			{"name": "Jones", "age": 30}
		]
	}
},
{
	"id": 101,
	"info": {
		"name": "gary",
		"age": 22,
		"friends": [
			{"name": "clement", "age": 18},
			{"name": "twinkle", "age": 25}
		]
	}
}]

可以使用如下的方式查询嵌套字段:

dsq user.json "select id, \"info.name\", \"info.age\" from {} "

结果如下:

[{"id":100,"info.name":"albert","info.age":30},
{"id":101,"info.name":"gary","info.age":22}]

6、正则表达式

语法:x REGEXP 'y',其中x为列名,y为正则表达式。

dsq data.json "select * from {} where title REGEXP '.* a .*'"

结果如下:

[{"completed":false,"id":6,"title":"play a game","username":"venus"},
{"completed":true,"id":7,"title":"go on a date","username":"danika"},
{"username":"zane","completed":true,"id":8,"title":"go on a journey"}]

7、输出列顺序

如果要保证列的顺序,可以使用jq过滤(管道数据在Windows上不起作用):

dsq x.csv 'SELECT a, b FROM {}' | jq --sort-keys

8、交互式REPL

使用-i--interactive可以开启交互式REPL,可以在其中运行多个SQL查询。

参考资料: