Scrape JSON and HTML responses in different handle...
# crawlee-js
I don't know how to scrape a website that returns both JSON and HTML responses. My scraper needs to:
1. Send a request and parse a JSON response that contains a list of URLs, which I will enqueue.
2. Scrape those URLs as HTML, using cheerio or whatever is required to do so.
Hey,
For your task, I'd use 2 request handlers:
- The `JSON` handler will handle the JSON response: it'll parse it and enqueue HTML requests.
- The `HTML` handler will parse the HTML response as usual with cheerio's `$`.

`JSON` and `HTML` are request labels; you can read more about labels [here](https://crawlee.dev/docs/introduction/crawling#the-label-of-enqueuelinks). Basically, if you label a request with e.g. the `HTML` label, it will be handled by the `HTML` request handler.
```js
import { CheerioCrawler, createCheerioRouter } from 'crawlee';

const router = createCheerioRouter();

// add request handler for handling `JSON` labelled requests
router.addHandler('JSON', async ({ body, crawler }) => {
    // parse JSON response
    const json = JSON.parse(body.toString());

    // enqueue HTML requests (build the URLs from `json`)
    await crawler.addRequests([{ url: '...', userData: { label: 'HTML' } }]);
});

// add request handler for handling `HTML` labelled requests
router.addHandler('HTML', async ({ $ }) => {
    // parse HTML response
});

// `proxyConfiguration` and `maxRequestsPerCrawl` are assumed to be defined elsewhere
const crawler = new CheerioCrawler({
    proxyConfiguration,
    maxRequestsPerCrawl,
    requestHandler: router,
});

// start with the JSON endpoint, labelled so the `JSON` handler picks it up
await crawler.run([{ url: '...', userData: { label: 'JSON' } }]);
```
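The `url: '...'` placeholder in the `JSON` handler stands for the URLs taken from the parsed response. Here is a rough sketch of that mapping step, assuming the JSON looks like `{ "urls": ["https://...", ...] }`. Both that shape and the helper name `toHtmlRequests` are made up for illustration, so adjust them to whatever the real endpoint returns.

```javascript
// Hypothetical helper (not part of Crawlee): turns a parsed JSON payload
// into request objects labelled for the `HTML` handler.
// Assumption: the payload has a `urls` array of strings.
function toHtmlRequests(json) {
    const urls = json && Array.isArray(json.urls) ? json.urls : [];
    return urls
        // skip anything that is not a non-empty string
        .filter((url) => typeof url === 'string' && url.length > 0)
        // label each request so it is routed to the `HTML` handler
        .map((url) => ({ url, userData: { label: 'HTML' } }));
}

// Example with a made-up payload
const exampleRequests = toHtmlRequests({
    urls: ['https://example.com/a', 'https://example.com/b'],
});
console.log(exampleRequests);
```

Inside the `JSON` handler, the enqueue line would then become something like `await crawler.addRequests(toHtmlRequests(json));`.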
Let me know if you have any questions