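I have a set of pricing data for many stocks (about 1.1 million rows).
I'm having trouble parsing all of this data in memory, so I'd like to split it into separate files by stock symbol and import the data only when needed.
From: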
stockprices.json
To:
AAPL.json
ACN.json
...
etc.
stockprices.json currently has the following structure:
[{
"date": "2016-03-22 00:00:00",
"symbol": "ACN",
"open": "121.029999",
"close": "121.470001",
"low": "120.720001",
"high": "122.910004",
"volume": "711400.0"
},
{
"date": "2016-03-23 00:00:00",
"symbol": "AAPL",
"open": "121.470001",
"close": "119.379997",
"low": "119.099998",
"high": "121.470001",
"volume": "444200.0"
},
{
"date": "2016-03-24 00:00:00",
"symbol": "AAPL",
"open": "118.889999",
"close": "119.410004",
"low": "117.639999",
"high": "119.440002",
"volume": "534100.0"
},
...{}....]
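I believe jq is the right tool for this job, but I'm having trouble understanding it.
How can I take the data above and split it by the symbol field using jq?
For example, I would like to end up with: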
AAPL.json:
[{
"date": "2016-03-23 00:00:00",
"symbol": "AAPL",
"open": "121.470001",
"close": "119.379997",
"low": "119.099998",
"high": "121.470001",
"volume": "444200.0"
},
{
"date": "2016-03-24 00:00:00",
"symbol": "AAPL",
"open": "118.889999",
"close": "119.410004",
"low": "117.639999",
"high": "119.440002",
"volume": "534100.0"
}]
and ACN.json:
[{
"date": "2016-03-22 00:00:00",
"symbol": "ACN",
"open": "121.029999",
"close": "121.470001",
"low": "120.720001",
"high": "122.910004",
"volume": "711400.0"
},
{
"date": "2016-03-22 00:00:00",
"symbol": "ACN",
"open": "121.029999",
"close": "121.470001",
"low": "120.720001",
"high": "122.910004",
"volume": "711400.0"
}
]
Please consider the following approach:
You need a loop, but it can be done with a single jq invocation:
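# group_by(.symbol) yields one array per symbol; each output line is "SYMBOL<TAB>compact JSON array"
jq -rc 'group_by(.symbol)[] | "\(.[0].symbol)\t\(.)"' stockprices.json |
while IFS=$'\t' read -r symbol content; do
  # printf avoids echo's shell-dependent handling of backslashes in the JSON text
  printf '%s\n' "${content}" > "${symbol}.json"
done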
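To check the approach before running it on the full file, the loop above can be tried on a small sample. The following is a minimal sketch, assuming jq is installed and `mktemp -d` is available; the scratch directory and the `tab`, `aapl_count`, and `acn_count` variables are illustrative names, not part of the original answer. It uses `printf '\t'` rather than `$'\t'` so that it also runs under a plain POSIX `sh`. Note that `group_by` still needs jq to read the whole input array once, so this reduces memory use when the per-symbol files are consumed later, not during the split itself.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so the demo does not touch real data.
workdir="$(mktemp -d)"
cd "$workdir"

# Three records copied from the sample in the question.
cat > stockprices.json <<'EOF'
[
  {"date": "2016-03-22 00:00:00", "symbol": "ACN", "open": "121.029999", "close": "121.470001", "low": "120.720001", "high": "122.910004", "volume": "711400.0"},
  {"date": "2016-03-23 00:00:00", "symbol": "AAPL", "open": "121.470001", "close": "119.379997", "low": "119.099998", "high": "121.470001", "volume": "444200.0"},
  {"date": "2016-03-24 00:00:00", "symbol": "AAPL", "open": "118.889999", "close": "119.410004", "low": "117.639999", "high": "119.440002", "volume": "534100.0"}
]
EOF

# A literal tab character, used as the field separator for read.
tab="$(printf '\t')"

# One jq call emits "SYMBOL<TAB>compact JSON array" per symbol;
# the loop writes each array to SYMBOL.json.
jq -rc 'group_by(.symbol)[] | "\(.[0].symbol)\t\(.)"' stockprices.json |
while IFS="$tab" read -r symbol content; do
  printf '%s\n' "$content" > "${symbol}.json"
done

# Count the records that ended up in each file.
aapl_count="$(jq 'length' AAPL.json)"
acn_count="$(jq 'length' ACN.json)"
echo "AAPL.json: ${aapl_count} records, ACN.json: ${acn_count} records"
```

With the three sample records, AAPL.json should hold two entries and ACN.json one. Each file contains a compact, single-line JSON array; running `jq . AAPL.json` pretty-prints it if the indented layout from the question is preferred.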