How do I extract div content using grep?

Question

Answer 1

Use grep -A:

$ grep -A 2 'class="col-6"' test.html | sed -n 2p
        <p>One of three columns</p>

From man grep:

-A NUM, --after-context=NUM
       Print NUM lines of trailing context after matching lines.

Or use awk:

$ awk '/class="col-6"/{getline; print $0}' test.html
        <p>One of three columns</p>

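Note: these approaches only work if the structure is exactly the same as in the test input. Generally speaking, I would always prefer a proper XML/HTML parser.

For example, Python with BeautifulSoup: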
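To make this caveat concrete, here is a minimal Python sketch that mimics the "print the line after the match" logic of the grep and awk commands and shows it going wrong once the layout changes. The helper name `line_after_match` and the three sample strings are assumptions made up for illustration; the real test.html is not shown in the question.

```python
from typing import Optional


def line_after_match(html: str, needle: str = 'class="col-6"') -> Optional[str]:
    """Return the stripped line that follows the first line containing needle."""
    lines = html.splitlines()
    for i, line in enumerate(lines):
        if needle in line:
            # Same idea as `grep -A 2 ... | sed -n 2p` or awk's getline.
            return lines[i + 1].strip() if i + 1 < len(lines) else None
    return None


# Assumed layout: the <p> sits on the line right after the <div>.
expected_layout = """<div class="col-6">
    <p>One of three columns</p>
</div>"""

# Same content on a single line: there is no "next line" to print.
one_line_layout = '<div class="col-6"><p>One of three columns</p></div>'

# Same content with one extra line in between: the wrong line is printed.
extra_line_layout = """<div class="col-6">
    <!-- comment -->
    <p>One of three columns</p>
</div>"""

for sample in (expected_layout, one_line_layout, extra_line_layout):
    print(line_after_match(sample))
```

Only the first sample yields the paragraph; the other two are equally valid HTML but defeat any line-based match.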

$ python3 -c '
from bs4 import BeautifulSoup
with open("test.html") as fp:
    soup = BeautifulSoup(fp, "html.parser")
print(soup.find_all("div", {"class":"col-6"})[0].find_all("p")[0])'
<p>One of three columns</p>
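If installing bs4 is not an option, roughly the same lookup can be sketched with Python's standard-library `html.parser` module. This is only an illustration under assumptions: the class name `FirstParagraphInDiv`, the helper `first_paragraph_text`, and the sample markup are hypothetical, and unlike the BeautifulSoup one-liner it returns the paragraph's text rather than the `<p>` element itself.

```python
from html.parser import HTMLParser


class FirstParagraphInDiv(HTMLParser):
    """Collect the text of the first <p> inside a <div> with the target class."""

    def __init__(self, target_class="col-6"):
        super().__init__()
        self.target_class = target_class
        self.div_depth = 0   # 0 = outside the target div, >0 = nesting depth inside it
        self.in_p = False    # currently inside the <p> being captured
        self.done = False    # set once the first matching <p> has been closed
        self.text = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.div_depth > 0:
                self.div_depth += 1          # nested div inside the target
            elif self.target_class in classes:
                self.div_depth = 1           # entered the target div
        elif tag == "p" and self.div_depth > 0:
            self.in_p = True

    def handle_endtag(self, tag):
        if self.done:
            return
        if tag == "p" and self.in_p:
            self.in_p = False
            self.done = True
        elif tag == "div" and self.div_depth > 0:
            self.div_depth -= 1

    def handle_data(self, data):
        if self.in_p:
            self.text.append(data)


def first_paragraph_text(html, target_class="col-6"):
    """Return the paragraph text, or None if no closed <p> was found in the div."""
    parser = FirstParagraphInDiv(target_class)
    parser.feed(html)
    parser.close()
    return "".join(parser.text).strip() if parser.done else None


# Hypothetical markup; the layout on disk does not matter to the parser.
sample = '<div class="row"><div class="col-6"><p>One of three columns</p></div></div>'
print(first_paragraph_text(sample))
```

Because the parser tracks tags instead of lines, the result stays the same whether the `<p>` is on the same line as the `<div>`, the next line, or several lines later.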

Or use xmlstarlet like this:

$ xmlstarlet sel -t -m '//div[@class="col-6"]' -c './p' -n test.html
<p>One of three columns</p>
