requests

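Requests is built on top of urllib3 and exposes its features through a simpler API. It supports HTTP keep-alive and connection pooling, cookie-based session persistence, file uploads, automatic detection of the response encoding, and automatic encoding of internationalized URLs and POST data.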
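As a rough illustration of the session features mentioned above, here is a minimal sketch using `requests.Session`. The header value, the cookie name `token`, and the helper `fetch_twice` are assumptions made up for this example. The first part only prepares a request, so nothing is sent over the network.

```python
import requests

# A Session keeps headers and cookies across requests and reuses connections.
session = requests.Session()
session.headers.update({"User-Agent": "my-crawler/0.1"})  # hypothetical value
session.cookies.set("token", "abc123")                    # hypothetical cookie

# Prepare (but do not send) a request to see what the session attaches to it.
prepared = session.prepare_request(
    requests.Request("GET", "http://www.baidu.com/")
)
print(prepared.headers["User-Agent"])
print(prepared.headers.get("Cookie"))


def fetch_twice(url):
    """Hypothetical helper: two requests sharing one session.

    Cookies set by the first response are sent with the second request,
    and the underlying connection can be reused from the pool.
    """
    with requests.Session() as s:
        first = s.get(url, timeout=10)
        second = s.get(url, timeout=10)
        return first, second
```

Calling `fetch_twice("http://www.baidu.com/")` would actually hit the network, so it is left uncalled here.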

Installation

  • pip install requests
  • easy_install requests (legacy; easy_install is deprecated, so prefer pip)

GET

import requests

response = requests.get("http://www.baidu.com/")
response = requests.request("get", "http://www.baidu.com/")  # same request via the generic entry point
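GET requests usually carry query parameters. The sketch below shows how the `params` argument ends up in the URL; the path `/s` and the keys `wd` and `pn` are assumptions chosen for illustration. The request is only prepared, not sent, so the encoding can be inspected without network access. The helper `fetch` is hypothetical.

```python
import requests

# Build (but do not send) a GET request to see how `params` is encoded.
req = requests.Request(
    "GET",
    "http://www.baidu.com/s",            # assumed path, for illustration only
    params={"wd": "python", "pn": 10},   # assumed query keys
)
prepared = req.prepare()
print(prepared.url)  # the query string is appended to the URL


def fetch(url, params=None):
    """Hypothetical helper: send the GET request with a timeout.

    raise_for_status() turns 4xx/5xx responses into exceptions.
    """
    response = requests.get(url, params=params, timeout=10)
    response.raise_for_status()
    return response
```

Setting a `timeout` is worth the extra argument: without one, a request can hang indefinitely if the server never answers.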

POST

data = {"key": "value"}  # form fields to send (placeholder values)
response = requests.post("http://www.baidu.com/", data=data)
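The difference between the `data` and `json` arguments is easiest to see by preparing a request and looking at its body and Content-Type. This is a small sketch with made-up payload fields (`user`, `page`); nothing is sent over the network.

```python
import json

import requests

payload = {"user": "alice", "page": 1}  # made-up fields for illustration

# data=dict is form-encoded, like an HTML form submission.
form = requests.Request("POST", "http://www.baidu.com/", data=payload).prepare()
print(form.headers["Content-Type"])
print(form.body)

# json=dict is serialized to JSON and the Content-Type is set accordingly.
as_json = requests.Request("POST", "http://www.baidu.com/", json=payload).prepare()
print(as_json.headers["Content-Type"])
print(json.loads(as_json.body))
```

Which one to use depends on what the server expects: classic form endpoints want `data`, while most JSON APIs want `json`.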

Other methods (PUT, DELETE, HEAD, OPTIONS, PATCH) work the same way; look them up as needed.

Response content

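response.url          # final URL of the request
response.headers      # response headers
response.cookies      # cookies set by the server
response.text         # page source as text (str)
response.content      # page source as raw bytes
response.status_code  # HTTP status code
# str is converted to bytes with encode();
# bytes is converted to str with decode().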
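The relationship between `response.text` (str) and `response.content` (bytes) comes down to encoding and decoding. The sketch below shows the round trip with plain Python values; the helper `decode_body` is a hypothetical function for the case where the encoding guessed for `response.text` is wrong and you decode `response.content` yourself.

```python
from typing import Optional

text = "百度"
raw = text.encode("utf-8")   # str -> bytes
back = raw.decode("utf-8")   # bytes -> str
print(type(raw).__name__, type(back).__name__)


def decode_body(content: bytes, encoding: Optional[str] = None) -> str:
    """Hypothetical helper: decode raw response bytes.

    Falls back to UTF-8 when no encoding is given, and replaces
    undecodable bytes instead of raising.
    """
    return content.decode(encoding or "utf-8", errors="replace")


print(decode_body(raw))
```

With a real response this would be called as `decode_body(response.content, response.encoding)`.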

Parameters

Pass only the parameters you actually need.

:param method: method for the new :class:`Request` object.
:param url: URL for the new :class:`Request` object.
:param params: (optional) Dictionary or bytes to be sent in the query string for the :class:`Request`.
:param data: (optional) Dictionary or list of tuples ``[(key, value)]`` (will be form-encoded), bytes, or file-like object to send in the body of the :class:`Request`.
:param json: (optional) json data to send in the body of the :class:`Request`.
:param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
:param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
:param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
to add for the file.
:param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
:param timeout: (optional) How many seconds to wait for the server to send data
before giving up, as a float, or a :ref:`(connect timeout, read
timeout) <timeouts>` tuple.
:type timeout: float or tuple
:param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.
:type allow_redirects: bool
:param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
:param verify: (optional) Either a boolean, in which case it controls whether we verify
the server's TLS certificate, or a string, in which case it must be a path
to a CA bundle to use. Defaults to ``True``.
:param stream: (optional) if ``False``, the response content will be immediately downloaded.
:param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
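To make a few of these parameters concrete, here is a sketch that combines `params`, `headers`, and `cookies` when building a request, and `timeout`, `allow_redirects`, and `verify` when sending it. All values (the `/s` path, the User-Agent string, the cookie, the timeout numbers) and the helper `send` are assumptions for illustration. The request is prepared offline; `send` is defined but not called.

```python
import requests

# Parameters that shape what is sent.
request = requests.Request(
    method="GET",
    url="http://www.baidu.com/s",               # assumed path
    params={"wd": "requests"},                  # goes into the query string
    headers={"User-Agent": "my-crawler/0.1"},   # hypothetical value
    cookies={"session_id": "abc123"},           # hypothetical cookie
)
prepared = request.prepare()
print(prepared.url)
print(prepared.headers["User-Agent"])
print(prepared.headers["Cookie"])


def send(prepared_request):
    """Hypothetical helper: parameters that control how the request is sent."""
    with requests.Session() as session:
        return session.send(
            prepared_request,
            timeout=(3.05, 10),     # (connect timeout, read timeout) in seconds
            allow_redirects=False,  # return the 3xx response instead of following it
            verify=True,            # verify the server's TLS certificate
        )
```

In everyday code the same keyword arguments can be passed directly, for example `requests.get(url, params=..., headers=..., timeout=...)`.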

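I roughly group the work of writing a crawler into three steps, plus a fourth item for the details:

  1. Which tool to use to crawl which pages
  2. How to locate the content within the page
  3. Which method to use to get the content into the form you want
  4. After that come the details. They all have proper technical names, but I will only mention the concepts: faster crawling, more stable crawling, partial crawling, real-time crawling, and so on.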
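The first three steps can be sketched as follows: fetch the page with requests, locate the content with a parser, and return it in a convenient shape. This is only one possible approach; it uses the standard library's `html.parser` as the locating tool, and the names `LinkCollector`, `fetch`, and `extract_links` are made up for this example. The demonstration at the end runs on an inline HTML string, so no network access is needed.

```python
from html.parser import HTMLParser

import requests


class LinkCollector(HTMLParser):
    """Step 2: locate content, here every <a href=...> and its text."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(
                {"href": self._href, "text": "".join(self._text).strip()}
            )
            self._href = None


def fetch(url):
    """Step 1: get the page source with requests."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def extract_links(html):
    """Step 3: return the located content as a list of dicts."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


sample = '<p><a href="/a">First</a> and <a href="/b">Second</a></p>'
print(extract_links(sample))
```

For a live page the two pieces combine as `extract_links(fetch("http://www.baidu.com/"))`.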

Copyright notice: This is an original article by the author. If you repost it, please credit the source: https://dword.top/requests.html
