Request

Normal

Single Url

yield response.follow(ele.css('a'), cb);

Batch Urls

yield* response.follow_all(ele.css('a'), cb);

Start Urls

start_urls() {
    return [{
        link: 'https://target.com',
        download: true,
        options: {
            type: 'zip',
        },
        direct: true,
    }]
}

Post data

yield response.from_request({
    link: 'https://target.com',
    method: 'POST',
    form: {
        key: value
    },
    headers,
    cb,
});

Download file

yield response.from_request({
    link: 'https://target.com',
    download: true,
    options: {
        type: 'jpg',
    },
    headers,
    cb,
});

another way

yield response.follow(imageEle.attr('src'), cb, {
    download: true,
    options: {
        type: 'jpg',
    },
    extData: {
        title: imageEle.attr('alt'),
    }
});

Capture snapshot

yield* response.follow_all(ele.css('a.title'), cb, {
    splash: true,
    download: true,
    options: {
        type: 'png',
    },
    render_all: 0,
    wait: 0,
    // engine: "chromium",
    viewport: '1200x2000',
});

Common options

option

default

comment

skipDuplicates

true

If duplicate skipping is true, avoid queueing entirely for URLs we already crawled

direct

false

If this option is true, use direct method instead of queue, it will cause system is always waiting for anothor request

PreviousAPI References NextSelector

Last updated 4 years ago

hashtagNormal

hashtagPost data

hashtagDownload file

hashtagCapture snapshot

hashtagCommon options

Normal

Post data

Download file

Capture snapshot

Common options