Filtering multiple files to keep only the rows whose first column is common to all of them

Answer 1

Using GNU awk for "inplace" editing, this works even if a given column-1 value can appear more than once in an input file:

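$ cat tst.awk
BEGIN {
    # Pass 1: read every file named on the command line and reduce
    # common[] to the column-1 values that occur in all of them.
    for (fileNr=1; fileNr<ARGC; fileNr++) {
        file = ARGV[fileNr]
        delete thisFile
        while ( (getline < file) > 0 ) {
            thisFile[$1]
            if ( fileNr == 1 ) {
                common[$1]          # seed with the first file's values
            }
        }
        close(file)
        for ( val in common ) {
            if ( !(val in thisFile) ) {
                delete common[val]  # missing from this file, so not common
            }
        }
    }
}
# Pass 2: keep each file's header line and the rows whose key is common.
(FNR == 1) || ($1 in common)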

$ awk -i inplace -f tst.awk file{1..3}

$ tail -n +1 file{1..3}
==> file1 <==
Column1 Column2 Column3
geneA   11  C
geneF   34  A

==> file2 <==
Column1 Column2 Column3
geneA   34  A
geneF   67  G

==> file3 <==
Column1 Column2 Column3
geneA   22  A
geneF   7   T
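
Where GNU awk's inplace extension is not available, the same two-pass idea can be sketched portably. This is a variant, not the script above: it counts in how many files each column-1 value occurs instead of deleting from an array, and it writes each result to a hypothetical `<name>.out` file rather than editing in place. The sample data (including geneB, geneC and geneX) is made up for illustration.

```shell
#!/bin/sh
# Hypothetical demo data: only geneA and geneF occur in all three files,
# and geneA occurs twice in file1.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'Column1 Column2 Column3' 'geneA 11 C' 'geneB 5 T' 'geneF 34 A' 'geneA 12 G' > file1
printf '%s\n' 'Column1 Column2 Column3' 'geneA 34 A' 'geneF 67 G' 'geneX 1 C' > file2
printf '%s\n' 'Column1 Column2 Column3' 'geneC 9 G' 'geneA 22 A' 'geneF 7 T' > file3

awk '
BEGIN {
    # Pass 1: count, per column-1 value, how many files contain it.
    nFiles = ARGC - 1
    for (fileNr = 1; fileNr <= nFiles; fileNr++) {
        file = ARGV[fileNr]
        while ((getline line < file) > 0) {
            split(line, fld)
            key = fld[1]
            # Count each value at most once per file, so repeats are harmless.
            if (!((fileNr, key) in seen)) {
                seen[fileNr, key]
                count[key]++
            }
        }
        close(file)
    }
}
# Pass 2: keep the header and every row whose key was seen in all files.
(FNR == 1) || (count[$1] == nFiles) { print > (FILENAME ".out") }
' file1 file2 file3

cat file1.out
```

With this data, file1.out keeps the header, both geneA rows and the geneF row. Once the output has been checked, each `.out` file can be moved over its original.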

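However, if each column-1 value can appear at most once per file, it can be shorter. A value seen 3 times across the 3 files must then be present in every one of them (note that `cut -f1` assumes tab-separated input):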

$ awk -i inplace -v comm="$(cut -f1 file{1..3} | sort | uniq -c | awk '$1==3{print $2}')" '
    BEGIN{split(comm,tmp); for (i in tmp) common[tmp[i]]} (FNR == 1) || ($1 in common)
' file{1..3}

Or, if you don't have an awk with inplace editing:

$ comm="$(cut -f1 file{1..3} | sort | uniq -c | awk '$1==3{print $2}')"
$ for file in file{1..3}; do
    awk -v comm="$comm" '
        BEGIN{split(comm,tmp); for (i in tmp) common[tmp[i]]} (FNR == 1) || ($1 in common)
    ' "$file" > tmp && mv tmp "$file"
done
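
To see what the counting pipeline produces, here is a minimal sketch on made-up data. The file names `a.txt`, `b.txt`, `c.txt` and their contents are hypothetical. As assumptions beyond the commands above, it uses `awk '{print $1}'` instead of `cut -f1` so that space-separated input also works, and it passes the number of files as `n` instead of hard-coding 3.

```shell
#!/bin/sh
# Hypothetical input: each column-1 value appears at most once per file.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'Column1 Column2 Column3' 'geneA 11 C' 'geneB 5 T' 'geneF 34 A' > a.txt
printf '%s\n' 'Column1 Column2 Column3' 'geneA 34 A' 'geneF 67 G' 'geneX 1 C' > b.txt
printf '%s\n' 'Column1 Column2 Column3' 'geneC 9 G' 'geneA 22 A' 'geneF 7 T' > c.txt

# Put the file list in the positional parameters so "$#" is the file count.
set -- a.txt b.txt c.txt

# uniq -c prefixes each distinct value with its count; keep the values
# whose count equals the number of files and print only the value itself.
common=$(awk '{print $1}' "$@" | sort | uniq -c | awk -v n="$#" '$1 == n {print $2}')

printf '%s\n' "$common"
```

Here `common` holds Column1, geneA and geneF, one per line. The header name is included because it also appears once in every file.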
