I have the following three files.
File 1:
ko00980 Metabolism of xenobiotics by cytochrome P450 (5)
ko:K00121 frmA; S-(hydroxymethyl)glutathione dehydrogenase / alcohol dehydrogenase [EC:1.1.1.284 1.1.1.1]
ko:K00699 UGT; glucuronosyltransferase [EC:2.4.1.17]
ko:K00799 GST; glutathione S-transferase [EC:2.5.1.18]
ko:K07408 CYP1A1; cytochrome P450, family 1, subfamily A, polypeptide 1 [EC:1.14.14.1]
ko:K07409 CYP1A2; cytochrome P450, family 1, subfamily A, polypeptide 2 [EC:1.14.14.1]
ko00982 Drug metabolism - cytochrome P450 (5)
ko:K00121 frmA; S-(hydroxymethyl)glutathione dehydrogenase / alcohol dehydrogenase [EC:1.1.1.284 1.1.1.1]
ko:K00485 FMO; dimethylaniline monooxygenase (N-oxide forming) [EC:1.14.13.8]
ko:K00699 UGT; glucuronosyltransferase [EC:2.4.1.17]
ko:K00799 GST; glutathione S-transferase [EC:2.5.1.18]
ko:K07409 CYP1A2; cytochrome P450, family 1, subfamily A, polypeptide 2 [EC:1.14.14.1]
ko00983 Drug metabolism - other enzymes (4)
ko:K00088 guaB; IMP dehydrogenase [EC:1.1.1.205]
ko:K00699 UGT; glucuronosyltransferase [EC:2.4.1.17]
ko:K00857 tdk; thymidine kinase [EC:2.7.1.21]
ko:K00876 udk; uridine kinase [EC:2.7.1.48]
File 2:
ko00980 Metabolism of xenobiotics by cytochrome P450 (6)
ko:K00001 E1.1.1.1; alcohol dehydrogenase [EC:1.1.1.1]
ko:K00079 CBR1; carbonyl reductase 1 [EC:1.1.1.184 1.1.1.189 1.1.1.197]
ko:K00121 frmA; S-(hydroxymethyl)glutathione dehydrogenase / alcohol dehydrogenase [EC:1.1.1.284 1.1.1.1]
ko:K00799 GST; glutathione S-transferase [EC:2.5.1.18]
ko:K07408 CYP1A1; cytochrome P450, family 1, subfamily A, polypeptide 1 [EC:1.14.14.1]
ko:K07409 CYP1A2; cytochrome P450, family 1, subfamily A, polypeptide 2 [EC:1.14.14.1]
ko00982 Drug metabolism - cytochrome P450 (4)
ko:K00001 E1.1.1.1; alcohol dehydrogenase [EC:1.1.1.1]
ko:K00121 frmA; S-(hydroxymethyl)glutathione dehydrogenase / alcohol dehydrogenase [EC:1.1.1.284 1.1.1.1]
ko:K00799 GST; glutathione S-transferase [EC:2.5.1.18]
ko:K07409 CYP1A2; cytochrome P450, family 1, subfamily A, polypeptide 2 [EC:1.14.14.1]
ko00983 Drug metabolism - other enzymes (8)
ko:K00088 guaB; IMP dehydrogenase [EC:1.1.1.205]
ko:K00106 XDH; xanthine dehydrogenase/oxidase [EC:1.17.1.4 1.17.3.2]
ko:K00760 hprT; hypoxanthine phosphoribosyltransferase [EC:2.4.2.8]
ko:K00876 udk; uridine kinase [EC:2.7.1.48]
ko:K01431 UPB1; beta-ureidopropionase [EC:3.5.1.6]
ko:K01464 DPYS; dihydropyrimidinase [EC:3.5.2.2]
ko:K01519 ITPA; inosine triphosphate pyrophosphatase [EC:3.6.1.19]
ko:K13421 UMPS; uridine monophosphate synthetase [EC:2.4.2.10 4.1.1.23]
File 3:
ko00980 Metabolism of xenobiotics by cytochrome P450 (7)
ko:K00001 E1.1.1.1; alcohol dehydrogenase [EC:1.1.1.1]
ko:K00079 CBR1; carbonyl reductase 1 [EC:1.1.1.184 1.1.1.189 1.1.1.197]
ko:K00121 frmA; S-(hydroxymethyl)glutathione dehydrogenase / alcohol dehydrogenase [EC:1.1.1.284 1.1.1.1]
ko:K00699 UGT; glucuronosyltransferase [EC:2.4.1.17]
ko:K00799 GST; glutathione S-transferase [EC:2.5.1.18]
ko:K07408 CYP1A1; cytochrome P450, family 1, subfamily A, polypeptide 1 [EC:1.14.14.1]
ko:K07409 CYP1A2; cytochrome P450, family 1, subfamily A, polypeptide 2 [EC:1.14.14.1]
ko00982 Drug metabolism - cytochrome P450 (6)
ko:K00001 E1.1.1.1; alcohol dehydrogenase [EC:1.1.1.1]
ko:K00121 frmA; S-(hydroxymethyl)glutathione dehydrogenase / alcohol dehydrogenase [EC:1.1.1.284 1.1.1.1]
ko:K00485 FMO; dimethylaniline monooxygenase (N-oxide forming) [EC:1.14.13.8]
ko:K00699 UGT; glucuronosyltransferase [EC:2.4.1.17]
ko:K00799 GST; glutathione S-transferase [EC:2.5.1.18]
ko:K07409 CYP1A2; cytochrome P450, family 1, subfamily A, polypeptide 2 [EC:1.14.14.1]
ko00983 Drug metabolism - other enzymes (8)
ko:K00088 guaB; IMP dehydrogenase [EC:1.1.1.205]
ko:K00207 DPYD; dihydropyrimidine dehydrogenase (NADP+) [EC:1.3.1.2]
ko:K00699 UGT; glucuronosyltransferase [EC:2.4.1.17]
ko:K00857 tdk; thymidine kinase [EC:2.7.1.21]
ko:K00876 udk; uridine kinase [EC:2.7.1.48]
ko:K01431 UPB1; beta-ureidopropionase [EC:3.5.1.6]
ko:K01489 cdd; cytidine deaminase [EC:3.5.4.5]
ko:K01951 guaA; GMP synthase (glutamine-hydrolysing) [EC:6.3.5.2]
Each file has header lines that start with ko***** and end with the number of subheader lines in parentheses. For example:
ko00980 Metabolism of xenobiotics by cytochrome P450 (5)
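Subheader lines start with ko:K*****.
I would like to merge the subheader lines under each header line across the three files and remove duplicates (uniq), to get a result like this: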
ko00980:
ko:K00121 frmA; S-(hydroxymethyl)glutathione dehydrogenase / alcohol dehydrogenase [EC:1.1.1.284 1.1.1.1]
ko:K00699 UGT; glucuronosyltransferase [EC:2.4.1.17]
ko:K00799 GST; glutathione S-transferase [EC:2.5.1.18]
ko:K07408 CYP1A1; cytochrome P450, family 1, subfamily A, polypeptide 1 [EC:1.14.14.1]
ko:K07409 CYP1A2; cytochrome P450, family 1, subfamily A, polypeptide 2 [EC:1.14.14.1]
ko:K00001 E1.1.1.1; alcohol dehydrogenase [EC:1.1.1.1]
ko:K00079 CBR1; carbonyl reductase 1 [EC:1.1.1.184 1.1.1.189 1.1.1.197]
ko00982:
ko:K00121 frmA; S-(hydroxymethyl)glutathione dehydrogenase / alcohol dehydrogenase [EC:1.1.1.284 1.1.1.1]
ko:K00485 FMO; dimethylaniline monooxygenase (N-oxide forming) [EC:1.14.13.8]
ko:K00699 UGT; glucuronosyltransferase [EC:2.4.1.17]
ko:K00799 GST; glutathione S-transferase [EC:2.5.1.18]
ko:K07409 CYP1A2; cytochrome P450, family 1, subfamily A, polypeptide 2 [EC:1.14.14.1]
ko:K00001 E1.1.1.1; alcohol dehydrogenase [EC:1.1.1.1]
ko00983:
ko:K00088 guaB; IMP dehydrogenase [EC:1.1.1.205]
ko:K00699 UGT; glucuronosyltransferase [EC:2.4.1.17]
ko:K00857 tdk; thymidine kinase [EC:2.7.1.21]
ko:K00876 udk; uridine kinase [EC:2.7.1.48]
ko:K00106 XDH; xanthine dehydrogenase/oxidase [EC:1.17.1.4 1.17.3.2]
ko:K00760 hprT; hypoxanthine phosphoribosyltransferase [EC:2.4.2.8]
ko:K01431 UPB1; beta-ureidopropionase [EC:3.5.1.6]
ko:K01464 DPYS; dihydropyrimidinase [EC:3.5.2.2]
ko:K01519 ITPA; inosine triphosphate pyrophosphatase [EC:3.6.1.19]
ko:K13421 UMPS; uridine monophosphate synthetase [EC:2.4.2.10 4.1.1.23]
ko:K00207 DPYD; dihydropyrimidine dehydrogenase (NADP+) [EC:1.3.1.2]
ko:K01489 cdd; cytidine deaminase [EC:3.5.4.5]
ko:K01951 guaA; GMP synthase (glutamine-hydrolysing) [EC:6.3.5.2]
Best answer
With awk, you can do the following:
awk '/^ko[^:]/{fn=$1;next};/./{id=fn$1;if (!(seen[id]++)){print > fn}}' file[123]
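On each header line, it saves the ko***** identifier as fn. On each subheader line, it builds the key id by concatenating fn and $1 [1] and uses it as an index into the array seen; if that id is being seen for the first time, it writes the line to the file named fn. The result is one output file per pathway (ko00980, ko00982, ko00983) in the current directory.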
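[1]: You can also use fn$0 as the key, which deduplicates on the whole line rather than on the ko:K***** identifier alone.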
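The one-liner above writes each pathway to its own file. If a single combined listing in the "ko00980:" layout shown in the question is preferred, a variant along the following lines might work. This is only a sketch: the function name merge_ko and the two shortened sample files are made up for illustration, and it assumes every subheader line is preceded by a header line in the same file.

```shell
#!/bin/sh
# Sketch: merge subheader lines per pathway across files, print one combined report.
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Hypothetical, shortened inputs in the same layout as the question's files.
cat > "$tmp/file1" <<'EOF'
ko00982 Drug metabolism - cytochrome P450 (2)
ko:K00121 frmA; S-(hydroxymethyl)glutathione dehydrogenase / alcohol dehydrogenase [EC:1.1.1.284 1.1.1.1]
ko:K00799 GST; glutathione S-transferase [EC:2.5.1.18]
ko00983 Drug metabolism - other enzymes (1)
ko:K00088 guaB; IMP dehydrogenase [EC:1.1.1.205]
EOF
cat > "$tmp/file2" <<'EOF'
ko00982 Drug metabolism - cytochrome P450 (2)
ko:K00001 E1.1.1.1; alcohol dehydrogenase [EC:1.1.1.1]
ko:K00121 frmA; S-(hydroxymethyl)glutathione dehydrogenase / alcohol dehydrogenase [EC:1.1.1.284 1.1.1.1]
ko00983 Drug metabolism - other enzymes (2)
ko:K00088 guaB; IMP dehydrogenase [EC:1.1.1.205]
ko:K00876 udk; uridine kinase [EC:2.7.1.48]
EOF

merge_ko() {
  awk '
    /^ko[^:]/ {                      # header line: first field is the pathway ID
      fn = $1
      if (!(fn in count)) {          # remember pathways in order of first appearance
        order[++n] = fn
        count[fn] = 0
      }
      next
    }
    /./ {                            # non-empty subheader line
      if (!seen[fn, $1]++)           # keep only the first occurrence per pathway
        lines[fn, ++count[fn]] = $0
    }
    END {
      for (i = 1; i <= n; i++) {
        fn = order[i]
        print fn ":"
        for (j = 1; j <= count[fn]; j++)
          print lines[fn, j]
      }
    }
  ' "$@"
}

merge_ko "$tmp/file1" "$tmp/file2"
```

Unlike the one-liner, this buffers every unique line in memory and prints in the END block, so the entries of each pathway stay together and keep the order in which they were first seen. With the real data it would be called as merge_ko file[123].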