AdaptivePlaywrightCrawler starts crawling the whol...
# crawlee-js
n
I want to use the AdaptivePlaywrightCrawler, but it seems like it wants to crawl the entire web. Here is my code.
Copy code
const crawler = new AdaptivePlaywrightCrawler({
      renderingTypeDetectionRatio: 0.1,
      maxRequestsPerCrawl: 50,
      async requestHandler({ request, enqueueLinks, parseWithCheerio, querySelector, log, urls }) {
        console.log(request.url, request.uniqueKey);
        await enqueueLinks();
      }
});

crawler.run(['https://crawlee.dev']);
h
Someone will reply to you shortly. In the meantime, this might help:
n
By the way, I've tried with strategy: 'same-domain', 'same-origin' but the result was the same.
e
Hi you have to restrict crawling to specific domains or patterns
Copy code
import { AdaptivePlaywrightCrawler } from 'crawlee';

const crawler = new AdaptivePlaywrightCrawler({
    renderingTypeDetectionRatio: 0.1,
    maxRequestsPerCrawl: 50,
    async requestHandler({ request, enqueueLinks, parseWithCheerio, querySelector, log, urls }) {
        
        await enqueueLinks({
            pseudoUrls: ['https://crawlee.dev[.*]'], });
    },
});

await crawler.run(['https://crawlee.dev']);
n
Hey, thanks! That kinda works, but I thought the default behavior of enqueueLinks was to stay on the same hostname. It's clearly mentioned in the documentation: https://crawlee.dev/docs/introduction/adding-urls