Python Crawlers: GET and POST Requests in Scrapy


Scrapy request class hierarchy

Request
|-- FormRequest

The examples are tested against the following endpoints:
GET: https://httpbin.org/get
POST: https://httpbin.org/post

GET requests

Approach: send the request with Request

import json

from scrapy import Spider, Request, cmdline


class SpiderRequest(Spider):
    name = "spider_request"

    def start_requests(self):
        # name goes in the query string; age is placed in the request body
        url = "https://httpbin.org/get?name=tom"
        yield Request(url, body=json.dumps({"age": "23"}))

    def parse(self, response):
        print(response.text)


if __name__ == '__main__':
    # "scrapy crawl" must be run from inside a Scrapy project
    cmdline.execute("scrapy crawl spider_request".split())

The response echoes the name parameter from the URL query string under args, but the age value sent in the body does not appear:

"args": { "name": "tom" },

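Rather than concatenating the query string by hand, the URL for a GET request can be assembled with the standard library. The following is a minimal sketch and not part of the original spider; the helper name build_get_url is hypothetical. The resulting string can be passed to Request(url) exactly as above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_get_url(base_url, params):
    """Append URL-encoded query parameters to base_url (hypothetical helper)."""
    return f"{base_url}?{urlencode(params)}"


url = build_get_url("https://httpbin.org/get", {"name": "tom", "age": "23"})
print(url)

# Parsing the query string back shows the key/value pairs a server
# would find in the URL.
args = parse_qs(urlsplit(url).query)
print(args)
```

Putting age in the query string this way makes it visible to the server as a GET parameter, unlike the body used in the spider above.
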
POST requests

Approach 1: send the request with FormRequest

from scrapy import Spider, cmdline, FormRequest


class SpiderFormData(Spider):
    name = "spider_form_data"

    def start_requests(self):
        # formdata is URL-encoded and sent as the POST body
        url = "https://httpbin.org/post"
        yield FormRequest(url, formdata={"name": "Tom"})

    def parse(self, response):
        print(response.text)


if __name__ == '__main__':
    cmdline.execute("scrapy crawl spider_form_data".split())

The server receives the parameter as form data:

"form": { "name": "Tom" },

The request headers also contain this entry:

"headers": { "Content-Type": "application/x-www-form-urlencoded", },

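For a simple dictionary like this one, the body that FormRequest sends is an ordinary URL-encoded string. The sketch below reproduces that encoding with the standard library only, as an approximation for plain string values; it does not call Scrapy itself, and encode_form is a hypothetical name.

```python
from urllib.parse import urlencode, parse_qs


def encode_form(formdata):
    """URL-encode form fields into a bytes body (hypothetical helper)."""
    return urlencode(formdata).encode("utf-8")


body = encode_form({"name": "Tom"})
print(body)

# Decoding the body recovers the fields a server would parse as form data.
form = parse_qs(body.decode("utf-8"))
print(form)
```
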
Approach 2: send the request with Request

This requires the extra argument method="POST".

import json

from scrapy import Spider, Request, cmdline


class SpiderPost(Spider):
    name = "spider_post"

    def start_requests(self):
        # The JSON string is sent as the raw request body
        url = "https://httpbin.org/post"
        yield Request(url, method="POST", body=json.dumps({"name": "Tom"}))

    def parse(self, response):
        print(response.text)


if __name__ == '__main__':
    cmdline.execute("scrapy crawl spider_post".split())

1. If the POST request is sent as is, the server receives the payload under data and json:

"data": "{\"name\": \"Tom\"}", "form": {}, "json": { "name": "Tom" },

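2. If the following header is added:

"headers": { "Content-Type": "application/x-www-form-urlencoded", },

the server parses the body as form data, so the payload shows up under form, the same submission format that FormRequest uses:

"data": "", "form": { "{\"name\": \"Tom\"}": "" }, "json": null,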

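The odd-looking form entry follows from how URL-encoded bodies are parsed: the JSON string contains no = or &, so a form parser treats the whole string as a single field name with an empty value. A minimal sketch using the standard library's parser, which is assumed here to behave like the server's form parser for this input:

```python
import json
from urllib.parse import parse_qsl

body = json.dumps({"name": "Tom"})

# keep_blank_values=True keeps fields that have a name but no value.
form = dict(parse_qsl(body, keep_blank_values=True))
print(form)
```
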
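3. If the following header is added:

"headers": { "Content-Type": "application/json", },

the server receives data and json, exactly as in case 1. With some servers, however, the request fails if this header is omitted.

"data": "{\"name\": \"Tom\"}", "form": {}, "json": { "name": "Tom" },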

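Because some servers reject a JSON body that lacks the matching header, it can help to build the body and the header together. The following is a sketch under that assumption; json_post_kwargs is a hypothetical helper, and its result is meant to be unpacked into Scrapy's Request, for example Request(url, **json_post_kwargs({"name": "Tom"})).

```python
import json


def json_post_kwargs(payload):
    """Return keyword arguments for a JSON POST request (hypothetical helper)."""
    return {
        "method": "POST",
        "body": json.dumps(payload),
        "headers": {"Content-Type": "application/json"},
    }


kwargs = json_post_kwargs({"name": "Tom"})
print(kwargs)
```
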
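Summary

| Method | Class used | headers argument | Parameters | Fields the server receives |
|--------|------------|------------------|------------|----------------------------|
| GET | Request | (none) | ?name=tom | args |
| POST | FormRequest | set by default | formdata={"name": "Tom"} | form |
| POST | Request | (none) | body=json.dumps({"name": "Tom"}) | data, json |
| POST | Request | "Content-Type": "application/x-www-form-urlencoded" | body=json.dumps({"name": "Tom"}) | form |
| POST | Request | "Content-Type": "application/json" | body=json.dumps({"name": "Tom"}) | data, json |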



Copyright notice: published by Python教程 on 2022-10-25; 2,119 characters in total.