python – How to download a file over HTTP?
One more, using urlretrieve:
import urllib
urllib.urlretrieve("http://www.example.com/songs/mp3.mp3", "mp3.mp3")
(for Python 3+ use import urllib.request and urllib.request.urlretrieve)
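For reference, here is a minimal sketch of the Python 3 form mentioned above. The helper name fetch_to_file is made up for illustration; urllib.request.urlretrieve itself is the standard-library call, which the Python documentation lists as a legacy interface.

```python
import urllib.request
from pathlib import Path


def fetch_to_file(url, destination):
    """Download url to destination; return the saved path and the headers.

    fetch_to_file is a hypothetical helper name, not part of urllib.
    """
    # urlretrieve returns a (local_filename, headers) tuple.
    saved_path, headers = urllib.request.urlretrieve(url, destination)
    return Path(saved_path), headers


# Example call, using the same URL and file name as the answer:
# fetch_to_file("http://www.example.com/songs/mp3.mp3", "mp3.mp3")
```

The returned headers object can be queried like a dictionary, for example headers.get("Content-Length").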
Yet another one, with a progress bar:
# Python 2 only (urllib2 and the print statement)
import urllib2
url = "http://download.thinkbroadband.com/10MB.zip"
file_name = url.split('/')[-1]
u = urllib2.urlopen(url)
f = open(file_name, 'wb')
meta = u.info()
file_size = int(meta.getheaders("Content-Length")[0])
print "Downloading: %s Bytes: %s" % (file_name, file_size)
file_size_dl = 0
block_sz = 8192
while True:
    # Read one block at a time; an empty read means the download is done
    buffer = u.read(block_sz)
    if not buffer:
        break
    file_size_dl += len(buffer)
    f.write(buffer)
    status = r"%10d [%3.2f%%]" % (file_size_dl, file_size_dl * 100. / file_size)
    # Backspaces move the cursor back so the next status overwrites this one
    status = status + chr(8)*(len(status)+1)
    print status,
f.close()
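The same block-by-block loop can be sketched for Python 3 as follows. This is only an approximation of the snippet above: the function name download_with_progress is invented here, a carriage return replaces the backspace trick, and a missing Content-Length header is assumed to mean "size unknown".

```python
import sys
import urllib.request


def download_with_progress(url, file_name, block_size=8192):
    """Stream url to file_name in blocks, printing progress as it goes.

    Hypothetical helper; returns the number of bytes written.
    """
    with urllib.request.urlopen(url) as response, open(file_name, "wb") as out:
        # Content-Length is optional, so fall back to 0 when it is missing.
        file_size = int(response.headers.get("Content-Length", 0))
        downloaded = 0
        while True:
            block = response.read(block_size)
            if not block:
                break
            out.write(block)
            downloaded += len(block)
            if file_size:
                status = "\r%10d [%6.2f%%]" % (downloaded, downloaded * 100.0 / file_size)
            else:
                status = "\r%10d bytes" % downloaded
            # "\r" returns to the start of the line so each update overwrites the last.
            sys.stdout.write(status)
            sys.stdout.flush()
    sys.stdout.write("\n")
    return downloaded


# Example call, using the URL from the answer:
# download_with_progress("http://download.thinkbroadband.com/10MB.zip", "10MB.zip")
```

Using with for both the response and the output file closes them even if the download fails partway through.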
import urllib.request
with urllib.request.urlopen("http://www.example.com/") as f:
    html = f.read().decode("utf-8")
This is the most basic way to use the library, minus any error handling. You can also do more complex stuff such as changing headers.
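As a rough illustration of both points, changing headers and adding error handling, something like the following should work on Python 3. The helper names build_request and fetch_text and the User-Agent string are placeholders chosen for this sketch.

```python
import urllib.error
import urllib.request


def build_request(url, user_agent="example-downloader/0.1"):
    """Create a Request carrying a custom User-Agent header (placeholder value)."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_text(url, encoding="utf-8", timeout=10):
    """Return the decoded body of url, or None if the request fails."""
    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
            return response.read().decode(encoding)
    except urllib.error.URLError as exc:
        # HTTPError is a subclass of URLError, so HTTP status errors land here too.
        print("Download failed:", exc)
        return None


# Example call:
# html = fetch_text("http://www.example.com/")
```

Returning None on failure is just one possible design; re-raising the exception is equally reasonable if the caller should decide what to do.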
On Python 2, the method is in urllib2:
import urllib2
response = urllib2.urlopen("http://www.example.com/")
html = response.read()
In 2012, use the Python requests library:
>>> import requests
>>>
>>> url = "http://download.thinkbroadband.com/10MB.zip"
>>> r = requests.get(url)
>>> len(r.content)
10485760
You can run pip install requests to get it.
Requests has many advantages over the alternatives because the API is much simpler. This is especially true if you have to do authentication. urllib and urllib2 are pretty unintuitive and painful in this case.
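To make the authentication point concrete, here is a small sketch that saves a download to disk and optionally sends HTTP basic-auth credentials. The function name download, the chunk size, the timeout, and the credentials in the example call are all assumptions for illustration; auth, stream, raise_for_status and iter_content are regular requests features.

```python
import requests


def download(url, file_name, auth=None, chunk_size=8192):
    """Stream url to file_name, optionally with HTTP basic-auth credentials.

    Hypothetical helper; auth may be a (username, password) tuple.
    """
    with requests.get(url, auth=auth, stream=True, timeout=30) as response:
        # Raise on 4xx/5xx instead of silently saving an error page.
        response.raise_for_status()
        with open(file_name, "wb") as handle:
            # stream=True plus iter_content avoids holding the whole body in memory.
            for chunk in response.iter_content(chunk_size=chunk_size):
                handle.write(chunk)
    return file_name


# Example call with placeholder credentials:
# download("http://download.thinkbroadband.com/10MB.zip", "10MB.zip", auth=("user", "secret"))
```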
2015-12-30
People have expressed admiration for the progress bar. It's cool, sure. There are several off-the-shelf solutions now, including tqdm:
from tqdm import tqdm
import requests
url = "http://download.thinkbroadband.com/10MB.zip"
response = requests.get(url, stream=True)
with open("10MB", "wb") as handle:
    for data in tqdm(response.iter_content()):
        handle.write(data)
This is essentially the implementation @kvance described 30 months ago.