I have a dataset of pricing data for many stocks (about 1.1 million rows).

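I'm having trouble parsing all of this data in memory, so I'd like to split it by stock symbol into separate files and import the data only when it is needed.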

From:

stockprices.json 

To:

AAPL.json 
ACN.json 
... 

and so on.

stockprices.json currently has the following structure:

[{ 
    "date": "2016-03-22 00:00:00", 
    "symbol": "ACN", 
    "open": "121.029999", 
    "close": "121.470001", 
    "low": "120.720001", 
    "high": "122.910004", 
    "volume": "711400.0" 
}, 
{ 
    "date": "2016-03-23 00:00:00", 
    "symbol": "AAPL", 
    "open": "121.470001", 
    "close": "119.379997", 
    "low": "119.099998", 
    "high": "121.470001", 
    "volume": "444200.0" 
}, 
{ 
    "date": "2016-03-24 00:00:00", 
    "symbol": "AAPL", 
    "open": "118.889999", 
    "close": "119.410004", 
    "low": "117.639999", 
    "high": "119.440002", 
    "volume": "534100.0" 
}, 
...{}....] 

I believe jq is the right tool for this job, but I can't figure out how to use it.

How can I take the data above and use jq to split it by the symbol field?

For example, I'd like to end up with:

AAPL.json:

[{ 
    "date": "2016-03-23 00:00:00", 
    "symbol": "AAPL", 
    "open": "121.470001", 
    "close": "119.379997", 
    "low": "119.099998", 
    "high": "121.470001", 
    "volume": "444200.0" 
}, 
{ 
    "date": "2016-03-24 00:00:00", 
    "symbol": "AAPL", 
    "open": "118.889999", 
    "close": "119.410004", 
    "low": "117.639999", 
    "high": "119.440002", 
    "volume": "534100.0" 
}] 

and ACN.json:

[{ 
    "date": "2016-03-22 00:00:00", 
    "symbol": "ACN", 
    "open": "121.029999", 
    "close": "121.470001", 
    "low": "120.720001", 
    "high": "122.910004", 
    "volume": "711400.0" 
}, 
{
    "date": "2016-03-22 00:00:00", 
    "symbol": "ACN", 
    "open": "121.029999", 
    "close": "121.470001", 
    "low": "120.720001", 
    "high": "122.910004", 
    "volume": "711400.0" 
} 
] 

Answer:

You need a shell loop to write the files, but the splitting itself can be done in a single call to jq:

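# Group the records by symbol and emit one line per group: "<symbol><TAB><JSON array>"
jq -rc 'group_by(.symbol)[] | "\(.[0].symbol)\t\(.)"' stockprices.json |
# Split each line at the tab and write the array to <symbol>.json
while IFS=$'\t' read -r symbol content; do
    printf '%s\n' "${content}" > "${symbol}.json"
done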
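
To try the approach on a small input before running it on the full file, here is a minimal sketch. It assumes bash and jq are installed; the function name split_by_symbol, the temporary directory, and the trimmed three-record sample are made up for illustration and are not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split_by_symbol INPUT_FILE OUTPUT_DIR
# Writes one <symbol>.json file (a JSON array) per distinct .symbol value.
split_by_symbol() {
    local input=$1 outdir=$2
    mkdir -p "$outdir"
    # One output line per group: "<symbol><TAB><compact JSON array>"
    jq -rc 'group_by(.symbol)[] | "\(.[0].symbol)\t\(.)"' "$input" |
    while IFS=$'\t' read -r symbol content; do
        printf '%s\n' "$content" > "$outdir/$symbol.json"
    done
}

# Illustrative sample data (fields trimmed for brevity).
workdir=$(mktemp -d)
cat > "$workdir/stockprices.json" <<'EOF'
[{"date":"2016-03-22 00:00:00","symbol":"ACN","close":"121.470001"},
 {"date":"2016-03-23 00:00:00","symbol":"AAPL","close":"119.379997"},
 {"date":"2016-03-24 00:00:00","symbol":"AAPL","close":"119.410004"}]
EOF

split_by_symbol "$workdir/stockprices.json" "$workdir/out"

# Report how many records ended up in each per-symbol file.
for f in "$workdir"/out/*.json; do
    printf '%s: %s records\n' "$(basename "$f")" "$(jq 'length' "$f")"
done
```

Note that group_by operates on the whole array, so jq still has to hold the full input in memory for this one pass; the saving comes afterwards, when only the per-symbol files need to be loaded.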

