I have two files: file A contains all of the data, and file B contains only IDs. I want to compare file B against file A and retrieve the data belonging to those IDs. I am using SUSE Linux.
File A:
C 02020 Two-component system [PATH:aap02020]
D NT05HA_1798 sensor protein CpxA
D NT05HA_1797 CpxR K07662 cpxR
C 02030 *Bacterial chemotaxis* [PATH:aap02030]
D NT05HA_0919 maltose-binding periplasmic protein
D NT05HA_0918 maltose-binding periplasmic protein
C 03070 *Bacterial secretion system* [PATH:aap03070]
D NT05HA_1309 protein-export membrane protein SecD
D NT05HA_1310 protein-export membrane protein SecF
D NT05HA_1819 preprotein translocase subunit SecE
D NT05HA_1287 protein-export membrane protein
C 02060 Phosphotransferase system (PTS) [PATH:aap02060]
D NT05HA_0618 phosphoenolpyruvate-protein
D NT05HA_0617 phosphocarrier protein HPr
D NT05HA_0619 pts system
File B:
Bacterial chemotaxis
Bacterial secretion system
Desired output:
C 02030 *Bacterial chemotaxis* [PATH:aap02030]
D NT05HA_0919 maltose-binding periplasmic protein
D NT05HA_0918 maltose-binding periplasmic protein
C 03070 *Bacterial secretion system* [PATH:aap03070]
D NT05HA_1309 protein-export membrane protein SecD
D NT05HA_1310 protein-export membrane protein SecF
D NT05HA_1819 preprotein translocase subunit SecE
D NT05HA_1287 protein-export membrane protein
Best answer 1
You can use awk:
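# Needs an awk that accepts a multi-character RS, such as gawk or mawk.
awk 'NR==FNR{                     # While reading the first file (fileB),
        a[$0]                     # store each line as a key of the array a
        next
     }
     {                            # For each record of the second file (fileA),
        sub(/\n$/, "")            # drop the newline left at the end of the last record
        for (i in a)              # loop over every stored name
            if (index($0, i)) {   # if the name occurs anywhere in the record,
                # print it, restoring the "C" consumed by the record separator
                # (the first record still starts with its own "C")
                print (FNR > 1 ? "C" : "") $0
                next
            }
     }' fileB RS='\nC' fileA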
Output:
C 02030 *Bacterial chemotaxis* [PATH:aap02030]
D NT05HA_0919 maltose-binding periplasmic protein
D NT05HA_0918 maltose-binding periplasmic protein
C 03070 *Bacterial secretion system* [PATH:aap03070]
D NT05HA_1309 protein-export membrane protein SecD
D NT05HA_1310 protein-export membrane protein SecF
D NT05HA_1819 preprotein translocase subunit SecE
D NT05HA_1287 protein-export membrane protein
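If a multi-character RS is not available (POSIX leaves its behaviour unspecified), a line-by-line variant might work instead. The following is only a sketch and not part of the answer above: it assumes every section header starts with "C " and that file B holds one name per line. The temporary directory, the abbreviated sample data, and the names `names` and `keep` are hypothetical and used for illustration only.

```shell
#!/bin/sh
# Build small sample files (abbreviated from the question) in a temp directory.
tmp=$(mktemp -d)
cat > "$tmp/fileA" <<'EOF'
C 02020 Two-component system [PATH:aap02020]
D NT05HA_1798 sensor protein CpxA
C 02030 *Bacterial chemotaxis* [PATH:aap02030]
D NT05HA_0919 maltose-binding periplasmic protein
C 02060 Phosphotransferase system (PTS) [PATH:aap02060]
D NT05HA_0618 phosphoenolpyruvate-protein
EOF
printf '%s\n' 'Bacterial chemotaxis' > "$tmp/fileB"

result=$(awk '
# First file: remember every non-empty line as a wanted section name.
NR == FNR { if ($0 != "") names[$0]; next }
# A header line decides whether the section that follows is kept.
/^C / { keep = 0; for (n in names) if (index($0, n)) { keep = 1; break } }
# Print the header and its D lines while the current section is wanted.
keep
' "$tmp/fileB" "$tmp/fileA")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Because the flag `keep` is reset at every header, the D lines are printed only while they belong to a matching section, and no record separator has to be restored.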