Join multiple lines based on a regular expression

2024-07-07

I have HTML output from pandoc that looks like this:

foo

bar

<blockquote>

That's one small step for man, one giant leap for mankind

A new line and another quote

</blockquote>

baz

I would like to turn it into this:

foo

bar<blockquote>That's one small step for man, one giant leap for mankind

A new line and another quote</blockquote>baz

(Blockquotes are rendered as separate blocks anyway, so the extra line breaks are not needed.)

I started experimenting with sed and eventually ended up with the following awk program:

'/./ {printf "%s%s", $0, ($1 ~ /^$/ && $2 ~ /<\/?blockquote>/) ? OFS : ORS}'

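It does some of what I want, but it is too advanced for me to work out how to fix it.

In other words, the rule I want is: if the next line is empty and the line after that matches /<\/?blockquote>/, print the current line, the next line, and the line after that with no separator between them, then continue.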
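The rule can be sketched as a small line-by-line state machine in POSIX awk. This is only an illustration under stated assumptions: the function name join_blockquotes is hypothetical, a tag is only recognized when it sits alone on its own line, and any number of blank lines around a tag is swallowed rather than exactly one.

```shell
# Hypothetical helper (not from the original post): join blockquote tags
# with the neighbouring text. Assumes each tag is alone on its line.
join_blockquotes() {
    awk '
        # A tag line: drop the newlines held back so far, print the tag
        # without a trailing newline, and glue the next text onto it.
        /^<\/?blockquote>$/ { printf "%s", $0; pending = ""; glue = 1; next }

        # A blank line: swallow it right after a tag, otherwise hold it back
        # until we know whether a tag follows.
        /^$/ { if (!glue) pending = pending "\n"; next }

        # Ordinary text: emit the held-back newlines, then the line itself.
        { printf "%s%s", pending, $0; pending = "\n"; glue = 0 }

        # Flush whatever newlines are still held back at end of input.
        END { printf "%s", pending }
    ' "$@"
}

# Usage (file name is a placeholder):
#   join_blockquotes file
```

Holding the newlines back in a variable, instead of printing them immediately, is what lets the script decide after the fact whether a separator belongs in the output.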

Best answer 1

Using GNU awk for multi-character RS, RT, gensub(), and \s, and without reading the whole file into memory at once:

$ awk -v RS='\\s*</?blockquote>\\s*' '{ORS=gensub(/\s+/,"","g",RT)} 1' file
foo

bar<blockquote>That's one small step for man, one giant leap for mankind

A new line and another quote</blockquote>baz
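To see why this works, it can help to print each record next to the separator that ended it. In GNU awk, RT holds the text that matched RS for the current record; the one-liner above strips the whitespace from it and uses the result as ORS, so each tag is printed without its surrounding newlines. The following is a minimal sketch for inspection only: the function name show_rt is hypothetical, and it assumes GNU awk is installed as gawk.

```shell
# Hypothetical helper (not from the original answer): show each record and
# the whitespace-stripped separator (RT) that terminated it.
# Requires GNU awk: RT, gensub() and \s are gawk extensions.
show_rt() {
    gawk -v RS='\\s*</?blockquote>\\s*' '
        { printf "record=[%s] RT=[%s]\n", $0, gensub(/\s+/, "", "g", RT) }
    ' "$@"
}

# Usage (file name is a placeholder):
#   show_rt file
```

The last record has an empty RT because no tag follows it, which is why the one-liner prints nothing extra after the final line.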
