Finding and replacing duplicate entries

Answer 1

Here is a sed solution that works on the exact input format given and runs quickly.

sed -rz 's:[ \t]+:,:g;s:$:,:mg;:l;s:,([^,]+),(.*),\1,:,\1,\2,:;tl;s:,$::mg;s:^([^,]+),:\1\t:mg' file.csv

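How it works:

The -z flag (a GNU sed extension) makes sed split its input on NUL bytes instead of newlines. A text file normally contains no NUL bytes, so the whole file is loaded as a single record, and the sed script is applied once to the entire file rather than once per line, as it would be by default.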
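The effect of -z can be seen with a minimal sketch. It assumes GNU sed, and the two-line input is made up for illustration. The same s command is run with and without the flag:

```shell
# Default: the s command runs once per line, so every line gets the prefix.
per_line=$(printf 'a\nb\n' | sed 's/^/>/')

# With -z (GNU sed): the input has no NUL bytes, so it is one record.
# The s command runs once and only the start of the input is prefixed.
whole=$(printf 'a\nb\n' | sed -z 's/^/>/')

printf '%s\n---\n%s\n' "$per_line" "$whole"
# >a
# >b
# ---
# >a
# b
```

This is also why the one-liner adds the m flag to the substitutions that use ^ or $: in a single multi-line record, m lets those anchors match at every line boundary again.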

#transform input format to actual CSV format
s:[ \t]+:,:g;s:$:,:mg;
#loop while the s command can still find and replace
:l;
    #main code: find two identical cell values anywhere and delete the latter
    #on a very big file this can suffer from backtracking nightmare
    s:,([^,]+),(.*),\1,:,\1,\2,:;
tl;
#transform format back
s:,$::mg;s:^([^,]+),:\1\t:mg
