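Requests covers all the features of urllib. It supports HTTP keep-alive and connection pooling, cookie-based sessions, file uploads, automatic detection of the response encoding, and automatic encoding of internationalized URLs and POST data.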
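As a small illustration of the cookie-based sessions mentioned above, here is a minimal sketch. The cookie name `token` and its value are made up for the example, and the request is only prepared, not sent, so nothing touches the network.

```python
import requests

session = requests.Session()

# Hypothetical cookie; in practice a login response would usually set it.
session.cookies.set("token", "abc")

# Prepare (but do not send) a request through the session to see what it attaches.
request = requests.Request("GET", "http://www.baidu.com/")
prepared = session.prepare_request(request)

print(prepared.headers["Cookie"])  # the session's cookie travels with the request
```

Every later request made through the same `session` object carries the stored cookies, which is what keeps a login alive across calls.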
Installation
- pip install requests
- easy_install requests
GET
    import requests
    response = requests.get("http://www.baidu.com/")
POST
    data = {"key": "value"}  # hypothetical form fields
    response = requests.post("http://www.baidu.com/", data=data)
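To see what the `data` argument actually turns into, the sketch below builds a POST request without sending it and inspects the encoded body. The `data` dict here is a hypothetical example, not something the target site is known to accept.

```python
import requests

# Hypothetical form fields, used only for illustration.
data = {"kw": "python", "page": "1"}

# Build the request without sending it, so the encoded body can be inspected.
prepared = requests.Request("POST", "http://www.baidu.com/", data=data).prepare()

print(prepared.method)                   # POST
print(prepared.headers["Content-Type"])  # application/x-www-form-urlencoded
print(prepared.body)                     # kw=python&page=1
```

A dict passed as `data` is form-encoded, which is what an HTML form submission would send.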
Other methods (PUT, DELETE, HEAD and so on) can be looked up as needed.
Response content
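The most commonly used attributes of a response can be sketched as follows. So the example does not depend on any external site, it starts a throwaway local server; `DemoHandler` and the page it returns are assumptions made up for the demo, and a `Session` with `trust_env = False` is used only so that proxy environment variables do not interfere with the loopback call.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: always returns one small UTF-8 HTML page."""

    def do_GET(self):
        body = "<html><body>你好</body></html>".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for this local call
response = session.get(f"http://127.0.0.1:{server.server_port}/", timeout=5)

print(response.status_code)              # HTTP status code as an int
print(response.encoding)                 # encoding taken from the Content-Type header
print(response.text)                     # body decoded to str using that encoding
print(response.content)                  # raw body as bytes
print(response.headers["Content-Type"])  # header lookup is case-insensitive
print(response.url)                      # final URL after any redirects

server.shutdown()
server.server_close()
```

In short, `text` is for decoded pages, `content` is for binary data such as images, and `status_code` is worth checking before trusting either.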
Parameters
Just pass whichever ones you need:
    :param method: method for the new :class:`Request` object.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary or bytes to be sent in the query string for the :class:`Request`.
    :param data: (optional) Dictionary or list of tuples ``[(key, value)]`` (will be form-encoded), bytes, or file-like object to send in the body of the :class:`Request`.
    :param json: (optional) json data to send in the body of the :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
    :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
        ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
        or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
        defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
        to add for the file.
    :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How many seconds to wait for the server to send data
        before giving up, as a float, or a :ref:`(connect timeout, read
        timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
    :param verify: (optional) Either a boolean, in which case it controls whether we verify
        the server's TLS certificate, or a string, in which case it must be a path
        to a CA bundle to use. Defaults to ``True``.
    :param stream: (optional) if ``False``, the response content will be immediately downloaded.
    :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
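A few of these parameters in combination might look like the sketch below. The query string, the `User-Agent` value, the timeout numbers and the `fetch` helper are all assumptions chosen for illustration. The request is prepared rather than sent so the effect of `params` and `headers` is visible without network access.

```python
import requests

params = {"wd": "python"}                     # hypothetical query string
headers = {"User-Agent": "demo-crawler/0.1"}  # hypothetical User-Agent

# Prepare only: shows how params and headers end up in the outgoing request.
prepared = requests.Request(
    "GET", "http://www.baidu.com/s", params=params, headers=headers
).prepare()

print(prepared.url)                    # http://www.baidu.com/s?wd=python
print(prepared.headers["User-Agent"])  # demo-crawler/0.1


def fetch(url):
    """Send the request for real (hypothetical helper, not called here)."""
    return requests.get(
        url,
        params=params,
        headers=headers,
        timeout=(3.05, 10),    # (connect timeout, read timeout) in seconds
        allow_redirects=True,  # follow redirects, which is the default
        verify=True,           # verify the TLS certificate, also the default
    )
```

`timeout` is the one worth setting explicitly in a crawler: without it a request can wait indefinitely on an unresponsive server.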
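I roughly break writing a crawler down into three steps:
- Which tool to use for which kind of page
- How to locate the content within the page
- Which method to use to get the content into the form you want
After that it is all details. There is proper terminology for each of these, but I will only name the concepts: faster crawling, more stable crawling, partial crawling, real-time crawling, and so on.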
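The three steps above can be sketched as three small pieces of code. This is only one possible shape, under stated assumptions: Requests is the tool, the content of interest is the set of links on a page, and the parser is the standard library's `html.parser`. The names `fetch_html`, `LinkCollector`, `extract_links` and the `sample` HTML are made up for the example.

```python
from html.parser import HTMLParser

import requests


def fetch_html(url, timeout=5):
    """Step 1: pick a tool (here Requests) and download the page."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


class LinkCollector(HTMLParser):
    """Step 2: locate the content; here, every <a href=...> in the page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Step 3: return the content in the form you want (a list of URLs)."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Offline demo on a literal snippet; fetch_html(url) would supply real pages.
sample = '<p><a href="/a">A</a> <a href="/b">B</a> <a>no href</a></p>'
print(extract_links(sample))  # ['/a', '/b']
```

The later concerns (speed, stability, partial or real-time crawling) mostly change step 1, for example by adding timeouts, retries or concurrency around `fetch_html`.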