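I have some data and want to summarize its sentences so that I can draw conclusions from them. The example below is unrelated to my actual data, but it makes the idea clear and reproducible.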
Employee Suzie signed one time.
Employee Dan signed one time.
Employee Jordan signed one time.
Employee Suzie signed one time.
Employee Suzie signed one time.
Employee Harold signed one time.
Employee Sebastian signed one time.
Employee Jordan signed one time.
Employee Suzie signed one time.
Employee Suzan signed one time.
I would like to summarize these sentences as follows:
Jordan signed 2 time(s)
Dan signed 1 time(s)
Suzie signed 4 time(s)
Suzan signed 1 time(s)
Sebastian signed 1 time(s)
Harold signed 1 time(s)
I played around with awk, but it seemed hard to do this with it. I then tried sed, without success; sed seems to be meant for finding and changing things.
Best answer 1
A common approach is:
$ awk '{ count[$2]++ }
END {
for (name in count)
printf("%s signed %d time(s)\n", name, count[name])
}' <file
Harold signed 1 time(s)
Dan signed 1 time(s)
Sebastian signed 1 time(s)
Suzie signed 4 time(s)
Jordan signed 2 time(s)
Suzan signed 1 time(s)
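That is, use an associative array (hash) to store the number of times each name has been seen. In the END block, loop over all the names and print a summary line for each one.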
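Note that the order in which `for (name in count)` visits the names is unspecified, which is why the output above does not follow the order of the input. If a stable order matters, one option is to pipe the counts through sort. The following is only a sketch under assumptions: the function names `sample_input` and `summarize_sorted` are made up for illustration, and the choice of ordering (highest count first, ties broken by name) is mine, not something the question asked for.

```shell
# Hypothetical helper: reproduces the sample sentences from the question.
sample_input() {
    cat <<'EOF'
Employee Suzie signed one time.
Employee Dan signed one time.
Employee Jordan signed one time.
Employee Suzie signed one time.
Employee Suzie signed one time.
Employee Harold signed one time.
Employee Sebastian signed one time.
Employee Jordan signed one time.
Employee Suzie signed one time.
Employee Suzan signed one time.
EOF
}

# Hypothetical helper: count names with awk, order the counts with sort,
# then format each "count name" pair as a summary sentence.
summarize_sorted() {
    awk '{ count[$2]++ }
    END {
        for (name in count)
            printf("%d %s\n", count[name], name)
    }' |
    LC_ALL=C sort -k1,1nr -k2,2 |
    awk '{ printf("%s signed %d time(s)\n", $2, $1) }'
}

sample_input | summarize_sorted
```

With the sample data, this should list Suzie first and Jordan second, followed by the four names that appear once in alphabetical order.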
For nicer formatting, change the %s placeholder in the printf() call to %-10s, which reserves 10 characters for the name (left-justified):
$ awk '{ count[$2]++ }
END {
for (name in count)
printf("%-10s signed %d time(s)\n", name, count[name])
}' <file
Harold     signed 1 time(s)
Dan        signed 1 time(s)
Sebastian  signed 1 time(s)
Suzie      signed 4 time(s)
Jordan     signed 2 time(s)
Suzan      signed 1 time(s)
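The width of 10 is hard-coded, so a longer name would break the alignment. One possible variation is to track the longest name while counting and build the format string from it. This is a sketch, not part of the original answer: the names `sample_input`, `summarize_aligned`, `width` and `fmt` are all introduced here for illustration.

```shell
# Hypothetical helper: reproduces the sample sentences from the question.
sample_input() {
    cat <<'EOF'
Employee Suzie signed one time.
Employee Dan signed one time.
Employee Jordan signed one time.
Employee Suzie signed one time.
Employee Suzie signed one time.
Employee Harold signed one time.
Employee Sebastian signed one time.
Employee Jordan signed one time.
Employee Suzie signed one time.
Employee Suzan signed one time.
EOF
}

# Hypothetical helper: same counting as before, but the column width is
# the length of the longest name seen rather than a fixed 10.
summarize_aligned() {
    awk '{
        count[$2]++
        if (length($2) > width)
            width = length($2)
    }
    END {
        # Build a format such as "%-9s signed %d time(s)\n".
        fmt = "%-" width "s signed %d time(s)\n"
        for (name in count)
            printf(fmt, name, count[name])
    }'
}

sample_input | summarize_aligned
```

The format string is assembled by concatenation rather than with `%-*s`, since support for `*` as a width in awk's printf varies between implementations.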
Let's fiddle with the output a bit more (out of boredom):
$ awk '{ count[$2]++ }
END {
for (name in count)
printf("%-10s signed %d time%s\n", name, count[name],
count[name] > 1 ? "s" : "" )
}' <file
Harold     signed 1 time
Dan        signed 1 time
Sebastian  signed 1 time
Suzie      signed 4 times
Jordan     signed 2 times
Suzan      signed 1 time