grep regex pattern for expression 1 followed by expression 2

2024-06-18

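I am trying to find, within a bundle of HTML files, the files that have the word "Agent" in a heading and, somewhere after that heading, the name of a particular agent.

So, typically, something like this: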

<h3>Agent</h3>
<p>Blah blah blah </p>
<p>Their agent is XYZ Corp.</p>

should be found.

However, I cannot guarantee any regularity in the markup or content between the heading and the XYZ Corp instance. In DOS or a similar environment I could search for "Agent * XYZ", meaning:

- match the string 'Agent'
- followed by anything
- followed by the string 'XYZ'

How do I write this using grep on Ubuntu? I have tried:

grep -lc 'Agent*XYZ' *.html
grep -lc 'Agent.*?XYZ' *.html

Neither of them worked. I can find the pattern by hand in several of the files, so I know it is there.

TIA

Best answer 1

Something like the following seems to do what you are after.

$ cat d2.txt
<h3>Agent</h3>
<p>Blah blah blah </p>
<p>Their agent is XYZ Corp.</p>

$ grep -i 'agent' d2.txt #-i = ignore case. By default grep returns lines containing agent followed by anything or even alone
<h3>Agent</h3>
<p>Their agent is XYZ Corp.</p>

$ grep -iE 'agent.*XYZ' d2.txt #match agent followed by XYZ
<p>Their agent is XYZ Corp.</p>
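
One caveat: grep matches line by line, so the command above only finds "agent" and "XYZ" when both sit on the same line. In the question's sample the heading and the agent name are on different lines. The following is a minimal sketch of one way to handle that, assuming it is acceptable to join each file's lines with `tr` before matching. The function name `find_agent_files` and the sample file names are hypothetical, made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every file in which 'Agent'
# is followed, possibly several lines later, by 'XYZ'.
find_agent_files() {
    for f in "$@"; do
        # Replace newlines with spaces so '.*' can span the original line breaks.
        if tr '\n' ' ' < "$f" | grep -q 'Agent.*XYZ'; then
            printf '%s\n' "$f"
        fi
    done
}

# Made-up sample files for the demonstration.
tmpdir=$(mktemp -d)
printf '<h3>Agent</h3>\n<p>Blah blah blah </p>\n<p>Their agent is XYZ Corp.</p>\n' > "$tmpdir/match.html"
printf '<h3>Agent</h3>\n<p>Their agent is ABC Ltd.</p>\n' > "$tmpdir/nomatch.html"
printf '<p>XYZ Corp.</p>\n<h3>Agent</h3>\n' > "$tmpdir/reversed.html"

matches=$(find_agent_files "$tmpdir"/*.html)
echo "$matches"

rm -rf "$tmpdir"
```

Only `match.html` is listed: `nomatch.html` never mentions XYZ, and in `reversed.html` XYZ comes before the heading. As for the original attempts, `*` in a regular expression is a quantifier rather than a shell-style wildcard, so `Agent*XYZ` means "Agen", then zero or more "t", then "XYZ" immediately after; "followed by anything" is written `.*`.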
