How to remove duplicated names and print the array after the unique name

Answer 1

Something like this:

awk '$1 != p { if (p>"") {printf "\n"} printf "%s",$1; p=$1 } { printf "\t%s",$2 } END { if(p>"") {printf "\n"} }' datafile

K00002  gene_65472      gene_212051     gene_403626
K00003  gene_666        gene_5168       gene_7635       gene_12687      gene_175295     gene_647659     gene_663019
K00004  gene_88381
K00005  gene_30485      gene_193699     gene_256294     gene_307497

If you don't want tab separators, change the \t to a space.
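As a sketch of that change, the separator can be passed in as an awk variable instead of being edited into the program text. The function name `join_values` and the variable name `sep` are hypothetical, chosen only for this illustration, and the sample lines are assumed input in the one-key-one-value layout that the output above implies.

```shell
# Hypothetical helper: group values by key, joined with the separator in $1.
# Reads "key value" lines from standard input, already grouped by key.
join_values() {
    awk -v sep="$1" '
        $1 != p { if (p > "") { printf "\n" } printf "%s", $1; p = $1 }
        { printf "%s%s", sep, $2 }
        END { if (p > "") { printf "\n" } }'
}

# Assumed sample input, joined with a single space instead of a tab
printf 'K00004 gene_88381\nK00005 gene_30485\nK00005 gene_193699\n' | join_values ' '
```

Calling `join_values '\t'` should reproduce the tab-separated form, since awk interprets escape sequences in `-v` assignments.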

Here is how it works:

# Each line is processed in turn. "p" is the previous line's key field value

# Key field isn't the same as before
$1 != p {
    # Flush this line if we have printed something already
    if (p > "") { printf "\n" }

    # Print the key field name and set it as the current key field
    printf "%s", $1; p = $1
}

# Every line, print the second value on the line
{ printf "\t%s", $2 }

# No more input. Flush the line if we have already printed something
END {
    if (p > "") { printf "\n" }
}
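To watch the three rules run, here is a small self-contained trace. The input lines are an assumption reconstructed from the output shown above (one key and one value per line, with equal keys on adjacent lines), and `group_by_key` is a hypothetical wrapper name.

```shell
# Hypothetical wrapper around the program above; reads from standard input.
group_by_key() {
    awk '
        # New key: end the previous output line (if any), then print the key
        $1 != p { if (p > "") { printf "\n" } printf "%s", $1; p = $1 }
        # Every line: append the value to the current output line
        { printf "\t%s", $2 }
        # End of input: terminate the last output line
        END { if (p > "") { printf "\n" } }'
}

# Assumed sample input: two lines for K00002, one for K00004
printf 'K00002 gene_65472\nK00002 gene_212051\nK00004 gene_88381\n' | group_by_key
```

Because the program only compares each key with the previous one, it relies on equal keys being adjacent; unsorted input would need a `sort` in front of it.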

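Judging from the comments you have made on everyone's answers, the underlying problem is that you are using a data file generated on a Windows system and expecting it to work on a UNIX/Linux platform. Don't do that. Or, if you must, convert the file to the correct format first.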

dos2unix < datafile | awk '...'       # As above

tr -d '\r' < datafile | awk '...'     # Also as above
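If neither tool is convenient, another option (not part of the commands above, only a sketch) is to drop the trailing carriage return inside awk itself before the other rules run. The function name `strip_cr_and_group` is hypothetical, and the CRLF sample lines are made up for the demonstration.

```shell
# Hypothetical variant: remove a trailing CR from each record, then group.
strip_cr_and_group() {
    awk '
        # Modifying $0 makes awk re-split the fields, so $2 loses the CR
        { sub(/\r$/, "") }
        $1 != p { if (p > "") { printf "\n" } printf "%s", $1; p = $1 }
        { printf "\t%s", $2 }
        END { if (p > "") { printf "\n" } }'
}

# Made-up input with Windows-style CRLF line endings
printf 'K00002 gene_1\r\nK00002 gene_2\r\n' | strip_cr_and_group
```

Without the `sub` call, each value would still end in a carriage return, which is what makes the terminal output look scrambled.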
