Rayrun

Error: "Target page, context or browser closed"

mlody_krol1 posted in #help-playwright

Hello fellow developers,

I'm facing a consistent issue with Playwright in the Crawlee library context. Every time I perform an async operation on a locator instance, the page unexpectedly closes.

Here's the simplified code where the issue is evident:

const doesContainAllParts: AlertsProximityAnalyzerComparator<
  Frame | Page
> = async (element) => {
  try {
    const test = element.locator('body');
    const result = await test.count();  // Page closes unexpectedly here

    return result > 0;
  } catch (error) {
    console.error('Error in doesContainAllParts:', error);
    throw error;
  }
};

The issue specifically happens at the line const result = await test.count(). Each time this line executes, the page closes, leading to the failure of the operation.

Some key points: The problem consistently occurs every time this code is executed. I'm using the latest versions of Playwright and Crawlee. The issue seems to be tied to the await operation on the locator instance.

I'm stumped as to why this is happening. Is this a known issue with Playwright or Crawlee, or could there be something wrong with my implementation? Any insights, suggestions, or similar experiences would be incredibly helpful.

Thanks a lot in advance for any assistance!

PS I'm adding a video recorded with headless: false to show what it looks like.

PPS Here is the discussion on GitHub with more details: https://github.com/apify/crawlee/discussions/2185

This thread is trying to answer the question "Why does the page close unexpectedly when an async operation is performed on a locator instance in the Crawlee context using Playwright?"

2 replies

Sounds like a page crasher. We need a repro before we can take a look at it. Please file it on GitHub once you have a repro that uses only Playwright.

I tried that and it did not work. As far as my testing goes, the problem is probably that I resolve my promise too early: operations like await page.title(), await page.content(), etc. work fine inside requestHandler, but my logic looks different:

private requestHandler: PlaywrightRequestHandler = async ({
    page,
    request,
    log,
  }) => {
    log.info(`Request to: ${request.url} ...`)
    await page.waitForLoadState('domcontentloaded')
    const title = await page.title() // works fine
    const content = await page.content() // works fine
    // Any other logic I put here works fine, but I cannot paste it because of my business logic and other dynamic data I provide here.

    await playwrightUtils.infiniteScroll(page, {
      scrollDownAndUp: true,
      waitForSecs: 2,
      timeoutSecs: 5,
    })
    // Hands the live page to the caller right before the handler returns
    this.resolvePromise(request.url, page)
  }

====
 private resolvePromise(url: string, result: Page): void {
    if (this.urlToPromiseResolver[url]) {
      this.urlToPromiseResolver[url].resolvePromise(result)
      delete this.urlToPromiseResolver[url]
    }
  }
====
  public resolve = async (urls: string[]): Promise<Page[]> => {
    const urlsWithUniqueKeys = urls.map((url) => ({
      url,
      uniqueKey: `${url}_${Math.random()}`,
    }))

    await this.crawler.addRequests(urlsWithUniqueKeys)

    const promises = urls.map((url) => {
      return new Promise<Page | null>((resolvePromise, rejectPromise) => {
        this.urlToPromiseResolver[url] = { resolvePromise, rejectPromise }
      })
    })

    const result = Promise.all(promises)
      .then(filterResults).catch(...)

    return result
  }
==== That's how I use it:
 const [page] = await playwrightCrawleePageResolver.resolve([url])

  const title = await page.title() //error

After debugging, it looks like the page closes as soon as I call the first await on the result of my resolve function.
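
The observation above fits a page lifecycle in which the crawler closes the page once requestHandler returns, so a Page handed out through a promise is already closed (or closing) by the time the caller awaits anything on it. The following is only a minimal sketch of that idea, not Crawlee or Playwright code: FakePage, runRequest, leakPage and extractInHandler are hypothetical names, and the assumption that the page is closed right after the handler finishes is modelled explicitly in runRequest.

```typescript
// Hypothetical stand-in for a Playwright Page: every call fails once closed.
class FakePage {
  private closed = false;
  private readonly pageTitle: string;

  constructor(pageTitle: string) {
    this.pageTitle = pageTitle;
  }

  async title(): Promise<string> {
    if (this.closed) {
      throw new Error("Target page, context or browser closed");
    }
    return this.pageTitle;
  }

  async close(): Promise<void> {
    this.closed = true;
  }
}

type Handler = (page: FakePage) => Promise<void>;

// Assumption: models a crawler that closes the page as soon as the handler returns.
async function runRequest(handler: Handler, pageTitle: string): Promise<void> {
  const page = new FakePage(pageTitle);
  try {
    await handler(page);
  } finally {
    await page.close();
  }
}

// Pattern from the thread: the handler resolves the promise with the live page.
async function leakPage(): Promise<string> {
  let resolvePage!: (page: FakePage) => void;
  const pagePromise = new Promise<FakePage>((resolve) => {
    resolvePage = resolve;
  });
  await runRequest(async (page) => {
    resolvePage(page);
  }, "Example");
  const page = await pagePromise;
  return page.title(); // rejects: the page was closed after the handler returned
}

// Alternative: read everything needed inside the handler and resolve with plain data.
async function extractInHandler(): Promise<string> {
  let resolveTitle!: (title: string) => void;
  const titlePromise = new Promise<string>((resolve) => {
    resolveTitle = resolve;
  });
  await runRequest(async (page) => {
    resolveTitle(await page.title());
  }, "Example");
  return titlePromise;
}
```

If this model matches what happens in the real crawler, one option is to have resolve() return extracted data (title, content, computed results) instead of Page objects, so nothing outside requestHandler touches the page.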
