Sunday, November 15, 2020
Sven interesting shell command -
 hadoop fs -text /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dwh_le/event_type=SatisfactionEvent/year=2020/month=11/day=11/* | grep  "\"conversationHandlerAccountId\":{\"string\"" | sed -E 's/.*\"accountId\":\{\"string\":\"([0-9]+)\"\}.*/\1/g' | sort --human-numeric-sort | uniq | wc -l
hadoop fs -text /liveperson/data/remote/DC=VA/storage_Shared/data_Platform/dwh_le/event_type=SatisfactionEvent/year=2020/month=11/day=11/* | grep -v "\"conversationHandlerAccountId\":{\"string\"" | sed -E 's/.*\"accountId\":\{\"string\":\"([0-9]+)\"\}.*/\1/g' | sort --human-numeric-sort | uniq | wc -l
Pay attention to this part:
sed -E 's/.*\"accountId\":\{\"string\":\"([0-9]+)\"\}.*/\1/g'
This takes the accountId value that comes right after "accountId":{"string":" in a record such as:
{"header":{"schemaRevision":"5.0.0.1266","eventTimeStamp":1605078051486,"eventUniqueId":{"string":"7AkrXepoQD2zRajMk3LDnA"},"globalSessionId":null,"globalUserId":null,"accountId":{"string":"60270350"},"encrypted":"NONE","platform":"DEFAULT","component
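So, if I understand correctly, it searches each line for the regex "accountId":{"string":"([0-9]+)"} and replaces the whole line with the number matched by [0-9]+, which the \1 token refers to in the sed syntax. Because the leading .* is greedy, the last such match on the line is the one that gets captured, and lines that do not match the pattern pass through unchanged.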
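To check that reading, here is a minimal sketch that runs the same sed expression on one line. The record is a shortened, hypothetical variant of the sample above (only the field values are copied from it), and the backslashes before the double quotes are dropped because they are not needed inside single quotes.

```shell
# Shortened, hypothetical variant of the sample record; values copied from it.
record='{"header":{"eventTimeStamp":1605078051486,"accountId":{"string":"60270350"},"encrypted":"NONE"}}'

# The whole line is replaced by capture group 1, i.e. the digits of the account ID.
account_id=$(printf '%s\n' "$record" | sed -E 's/.*"accountId":\{"string":"([0-9]+)"\}.*/\1/')
echo "$account_id"

# A line that does not contain the pattern is printed by sed unchanged.
other=$(printf '%s\n' '{"header":{"accountId":null}}' | sed -E 's/.*"accountId":\{"string":"([0-9]+)"\}.*/\1/')
echo "$other"
```

The second case matters for the count: any record whose accountId is not a string of digits stays in the stream as a full line and is counted by wc -l as one more distinct value.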
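Pay attention to this as well:
sort --human-numeric-sort
and to this:
uniq
uniq only removes adjacent duplicate lines, so the account IDs have to be sorted first. Any sort order would do here, since the only goal is to put equal lines next to each other. The final wc -l then counts the distinct account IDs.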
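The effect of sorting before uniq can be tried on a small list. This is only a sketch: the IDs below are made up for illustration, and tr -d ' ' is there because some wc implementations pad the count with spaces.

```shell
# Hypothetical account IDs, one per line, with repeats that are not adjacent.
ids=$(printf '%s\n' 60270350 123 60270350 123 9)

# uniq alone only collapses adjacent repeats, so nothing is removed here.
unsorted_count=$(printf '%s\n' "$ids" | uniq | wc -l | tr -d ' ')

# Sorting first places equal values next to each other, so uniq can drop them.
distinct_count=$(printf '%s\n' "$ids" | sort --human-numeric-sort | uniq | wc -l | tr -d ' ')

echo "$unsorted_count $distinct_count"
```

Without the sort the count is 5, with it the count is 3, which is the number of distinct IDs in the list.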