Download and save PDF file with Python requests module

Question

You should use response.content in this case:

with open('/tmp/metadata.pdf', 'wb') as f:
    f.write(response.content)
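
For context, a minimal end-to-end sketch of this approach might look like the following. The function name download_pdf, the optional session parameter, and the timeout value are illustrative assumptions, not part of the original snippet. The raise_for_status() call is added so that an HTTP error page is not silently saved as a PDF.

```python
import requests


def download_pdf(url, path, session=None, timeout=30):
    """Fetch url in one go and write the raw bytes to path.

    download_pdf and its parameters are hypothetical names for this sketch.
    """
    # Use the caller's session if one is given, otherwise requests.get.
    get = session.get if session is not None else requests.get
    response = get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of saving an error page

    # response.content is bytes, so the file must be opened in binary mode.
    with open(path, 'wb') as f:
        f.write(response.content)
    return len(response.content)
```

It could then be called as download_pdf(url, '/tmp/metadata.pdf'), which returns the number of bytes written.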

From the documentation:

You can also access the response body as bytes, for non-text requests:
>>> r.content
b'[{"repository":{"open_issues":0,"url":"https://github.com/...

In other words, response.text returns the body as a string (str) object; use it when you're downloading a text file, such as an HTML file.

response.content returns the body as a bytes object; use it when you're downloading a binary file, such as a PDF, an audio file, or an image.
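
To illustrate why the distinction matters for binary files, the sketch below mimics what treating a body as text does to PDF-like bytes. The sample bytes and the helper name roundtrip_as_text are made up for the example, and UTF-8 decoding with errors='replace' is an assumption about how the text would be produced. No network access is involved.

```python
def roundtrip_as_text(data, encoding='utf-8'):
    """Decode bytes to str and encode back, as if a binary body were treated as text."""
    return data.decode(encoding, errors='replace').encode(encoding)


# Hypothetical samples: a PDF header followed by bytes that are not valid UTF-8,
# and a small piece of UTF-8 encoded HTML.
pdf_like = b'%PDF-1.4\n\xe2\xe3\xcf\xd3\n'
html_like = '<p>caf\u00e9</p>'.encode('utf-8')

# Valid text survives the round trip unchanged...
text_ok = roundtrip_as_text(html_like) == html_like
# ...but the binary payload is altered, which would corrupt a saved PDF.
binary_ok = roundtrip_as_text(pdf_like) == pdf_like

print(text_ok, binary_ok)  # prints: True False
```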

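You can also stream the body instead (via response.raw, or response.iter_content as shown here), but use this when the file you're about to download is large. Below is a basic example, which is also given in the documentation: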

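import requests

url = 'http://www.hrecos.org//images/Data/forweb/HRTVBSH.Metadata.pdf'
chunk_size = 2000  # maximum number of bytes to read per iteration

# stream=True defers downloading the body until it is iterated over
r = requests.get(url, stream=True)

with open('/tmp/metadata.pdf', 'wb') as fd:
    for chunk in r.iter_content(chunk_size):
        fd.write(chunk)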

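chunk_size is the chunk size you want to use. If you set it to 2000, requests will download the first 2000 bytes of the file, write them to the output file, and repeat this until the download finishes.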
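
The chunking behaviour described above can be wrapped in a small helper that also reports how many chunks were written. This is only a sketch: stream_to_file, the session parameter, and the timeout value are hypothetical names and choices, not something from the requests documentation.

```python
import requests


def stream_to_file(url, path, chunk_size=2000, session=None, timeout=30):
    """Download url in pieces of at most chunk_size bytes; return (chunks, bytes)."""
    # Use the caller's session if one is given, otherwise requests.get.
    get = session.get if session is not None else requests.get
    chunks = total = 0
    # The with-block releases the connection even if writing fails midway.
    with get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        with open(path, 'wb') as fd:
            for chunk in response.iter_content(chunk_size=chunk_size):
                fd.write(chunk)
                chunks += 1
                total += len(chunk)
    return chunks, total
```

With a 5000-byte body and chunk_size=2000, one would expect roughly three writes (2000, 2000, and 1000 bytes), though the exact chunk boundaries can vary, for example when the response is compressed.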

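This can save RAM. However, I'd prefer to use response.content in this case, since your file is small. As you can see, streaming the response (whether through iter_content or response.raw) is more complex.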
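
For comparison, here is a sketch of what using response.raw directly could look like, which shows some of the extra care it needs. The function name save_via_raw and its parameters are assumptions for illustration. Setting decode_content on the raw stream asks urllib3 to undo any gzip or deflate content encoding, something iter_content would otherwise handle for you.

```python
import shutil

import requests


def save_via_raw(url, path, session=None, timeout=30):
    """Copy the underlying raw stream of the response straight into a file."""
    # Use the caller's session if one is given, otherwise requests.get.
    get = session.get if session is not None else requests.get
    with get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        # raw is the underlying file-like stream; without this flag,
        # a compressed body would be written to disk still compressed.
        response.raw.decode_content = True
        with open(path, 'wb') as fd:
            shutil.copyfileobj(response.raw, fd)
```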


Answer 1