Request
Normal
Single Url
yield response.follow(ele.css('a'), cb);Batch Urls
yield* response.follow_all(ele.css('a'), cb);Start Urls
start_urls() {
return [{
link: 'https://target.com',
download: true,
options: {
type: 'zip',
},
direct: true,
}]
}Post data
yield response.from_request({
link: 'https://target.com',
method: 'POST',
form: {
key: value
},
headers,
cb,
});Download file
yield response.from_request({
link: 'https://target.com',
download: true,
options: {
type: 'jpg',
},
headers,
cb,
});another way
yield response.follow(imageEle.attr('src'), cb, {
download: true,
options: {
type: 'jpg',
},
extData: {
title: imageEle.attr('alt'),
}
});Capture snapshot
yield* response.follow_all(ele.css('a.title'), cb, {
splash: true,
download: true,
options: {
type: 'png',
},
render_all: 0,
wait: 0,
// engine: "chromium",
viewport: '1200x2000',
});Common options
option
default
comment
skipDuplicates
true
If duplicate skipping is true, avoid queueing entirely for URLs we already crawled
direct
false
If this option is true, use direct method instead of queue, it will cause system is always waiting for anothor request
Last updated