cut command does not extract fields correctly from aligned columns

Answer 1

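The whitespace between the columns does not appear to be all tabs, so cut cannot do what you want here. I suggest using awk instead. It is more flexible than cut for parsing columns of data like this: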

$ awk '{print $3,$4,$5,$8}' data.txt

Example

$ awk '{print $3,$4,$5,$8}' data.txt 
4567 Harrison Joel Accountant
4587 Mitchell Barbara Admin
3589 Olson Timothy Supervisor
4591 Moore Sarah Dept
4527 Polk John Accountant
4567 Harrison Joel Accountant
1557 Harrison James Supervisor

You can also space the output out more evenly with the column command:

$ awk '{print $3,$4,$5,$8}' data.txt | column -t
4567  Harrison  Joel     Accountant
4587  Mitchell  Barbara  Admin
3589  Olson     Timothy  Supervisor
4591  Moore     Sarah    Dept
4527  Polk      John     Accountant
4567  Harrison  Joel     Accountant
1557  Harrison  James    Supervisor

You can also do all of the work with just awk and printf:

$ awk '{printf "%s\t%-20s\t%s\n",$3,$4" "$5,$8}' data.txt 
4567    Harrison Joel           Accountant
4587    Mitchell Barbara        Admin
3589    Olson Timothy           Supervisor
4591    Moore Sarah             Dept
4527    Polk John               Accountant
4567    Harrison Joel           Accountant
1557    Harrison James          Supervisor
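
As a small illustration of the format string used above, this sketch shows how a width and the "-" flag affect padding. The values and the variable name are hypothetical; brackets are added only to make the padding visible:

```shell
#!/bin/sh
# %-10s left-justifies and pads to 10 characters; %5s right-justifies in 5.
formatted=$(awk 'BEGIN { printf "[%-10s][%5s]\n", "Polk John", "4527" }')

printf '%s\n' "$formatted"
```

A width only pads short values. It does not truncate longer ones, so a name longer than the width pushes the following column to the right.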

Revisiting cut

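The methods above work well, but they do not handle rows whose column values contain whitespace. For example, the row containing "Dept Manager" is truncated to just "Dept".

If the data is guaranteed to be structured as shown, you can use cut after all. Instead of splitting on a delimiter, give it the actual character positions you want displayed.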

Example

This cuts the text in the file data.txt and prints everything at character positions 9 through 13, 14 through 35, and so on.

$ cut -c 9-13,14-35,43-58 data.txt 
4567 Harrison     Joel     Accountant      
4587 Mitchell     Barbara  Admin Asst      
3589 Olson        Timothy  Supervisor      
4591 Moore        Sarah    Dept Manager    
4527 Polk         John     Accountant      
4567 Harrison     Joel     Accountant      
1557 Harrison     James    Supervisor
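
The difference between field splitting and character positions can be reproduced on a tiny made-up table. In this sketch the layout (id in columns 1-5, name in 6-15, a one-letter code in 16-17, title from column 18) and the helper make_row are assumptions for illustration, not the layout of data.txt:

```shell
#!/bin/sh
# Hypothetical fixed-width layout:
#   id: columns 1-5, name: 6-15, code: 16-17, title: column 18 onward.
make_row() { printf '%-5s%-10s%-2s%s\n' "$1" "$2" "$3" "$4"; }

rows=$(make_row 4591 Moore H 'Dept Manager'; make_row 4527 Polk S Accountant)

# Whitespace splitting breaks the two-word title apart: $4 is only "Dept".
by_field=$(printf '%s\n' "$rows" | awk '{print $4}')

# Character positions keep the title whole; "18-" means column 18 to end of line.
by_position=$(printf '%s\n' "$rows" | cut -c 1-5,18-)

printf '%s\n---\n%s\n' "$by_field" "$by_position"
```

Note that cut -c only works when every row really uses the same column positions. A single misaligned row shifts its output.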

Revisiting awk

You can also make awk extract text by position rather than by delimiter. It is more verbose, but for completeness, here is how:

$ awk '{
    printf "%s\t%-20s\t%s\n",substr($0,9,5),substr($0,14,22),substr($0,43,16)
  }' data.txt
4567    Harrison     Joel       Accountant      
4587    Mitchell     Barbara    Admin Asst      
3589    Olson        Timothy    Supervisor      
4591    Moore        Sarah      Dept Manager    
4527    Polk         John       Accountant      
4567    Harrison     Joel       Accountant      
1557    Harrison     James      Supervisor
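
The substr output above still carries the padding spaces of each column. If that is unwanted, one possible refinement is to strip trailing spaces with sub(). This is a sketch on a hypothetical two-column record (id in columns 1-5, title in columns 6-21), not the answer's own method:

```shell
#!/bin/sh
# Hypothetical fixed-width record: id in columns 1-5, title in columns 6-21.
record=$(printf '%-5s%-16s' 4591 'Dept Manager')

trimmed=$(printf '%s\n' "$record" | awk '{
    id    = substr($0, 1, 5)     # substr(string, start, length)
    title = substr($0, 6, 16)
    sub(/ +$/, "", id)           # drop the trailing pad spaces
    sub(/ +$/, "", title)
    printf "%s|%s\n", id, title
}')

printf '%s\n' "$trimmed"
```

The "|" separator is used here only so that any leftover padding would be easy to spot.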

awk FIELDWIDTHS

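If you are using the GNU variant of awk, you can use the FIELDWIDTHS variable to specify a static size for each field. If you have access to it, this is much cleaner than the substr method. It also lets you effectively glue together fields that would otherwise be parsed as separate fields.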

$ awk 'BEGIN { FIELDWIDTHS="4 4 5 24 5 16 11" }{ print $3,$4,$5,$6 }' data.txt 
4567  Harrison     Joel     M  4540  Accountant      
4587  Mitchell     Barbara  C  4541  Admin Asst      
3589  Olson        Timothy  H  4544  Supervisor      
4591  Moore        Sarah    H  4500  Dept Manager    
4527  Polk         John     S  4520  Accountant      
4567  Harrison     Joel     M  4540  Accountant      
1557  Harrison     James    M  4544  Supervisor
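
FIELDWIDTHS is a GNU awk extension. Where only a POSIX awk is available, the same split-by-width idea can be approximated with a loop over substr. The sketch below is an assumption-laden illustration: the widths "5 10 12", the sample record, and the variable names are all hypothetical.

```shell
#!/bin/sh
# Hypothetical record laid out with the widths 5, 10 and 12.
record=$(printf '%-5s%-10s%-12s' 4591 Moore 'Dept Manager')

split_result=$(printf '%s\n' "$record" | awk -v widths='5 10 12' '
BEGIN { n = split(widths, w, " ") }      # w[1..n] holds the column widths
{
    pos = 1
    out = ""
    for (i = 1; i <= n; i++) {
        field = substr($0, pos, w[i])    # take the next w[i] characters
        sub(/ +$/, "", field)            # strip the column padding
        out = out (i > 1 ? "|" : "") field
        pos += w[i]
    }
    print out
}')

printf '%s\n' "$split_result"
```

Because the title column is taken as one 12-character slice, "Dept Manager" stays a single field, which is the same effect the wider FIELDWIDTHS entries achieve above.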
