Rayrun

I want to download a PDF file automatically when I click the button while scraping

_henryxie999 posted in #help-playwright

Hi, I want to download a PDF file automatically when I click a button, but the button is not linked to a *.pdf file. In the web app, when I click the button, a new Chrome browser window opens and shows the PDF content in the PDF viewer. To download the PDF, I have to click the download icon in the top-right corner.

How can I disable the PDF viewer and download the file automatically?

Thanks

screen.png

This thread is trying to answer the question "How to automatically download a PDF file when a button is clicked during web scraping, when the button is not directly linked to the PDF file?"

10 replies

You will have to launch a persistent context with user preferences: const context = await chromium.launchPersistentContext(profilePath,options)
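Since the asker is using Python, here is a rough sketch of what that suggestion could look like with Playwright's sync API. It rests on two assumptions that are not confirmed anywhere in this thread: that Chromium reads `Default/Preferences` inside the profile directory at startup, and that the `plugins.always_open_pdf_externally` preference makes it download PDFs instead of opening the viewer. The names `write_pdf_download_prefs`, `download_pdf`, `profile_dir`, `button_selector`, and `target` are made up for illustration.

```python
import json
from pathlib import Path


def write_pdf_download_prefs(profile_dir):
    """Seed a Chromium profile so PDFs are downloaded rather than previewed.

    Assumption: Chromium loads <profile_dir>/Default/Preferences on startup
    and honours the "plugins.always_open_pdf_externally" key.
    """
    prefs_path = Path(profile_dir) / "Default" / "Preferences"
    prefs_path.parent.mkdir(parents=True, exist_ok=True)
    # Merge into existing preferences, if any, instead of overwriting them.
    prefs = {}
    if prefs_path.exists():
        prefs = json.loads(prefs_path.read_text(encoding="utf-8"))
    prefs.setdefault("plugins", {})["always_open_pdf_externally"] = True
    prefs_path.write_text(json.dumps(prefs), encoding="utf-8")
    return prefs_path


def download_pdf(start_url, button_selector, profile_dir, target):
    """Click the button and save the resulting download to `target`."""
    # Imported lazily so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    write_pdf_download_prefs(profile_dir)
    with sync_playwright() as p:
        context = p.chromium.launch_persistent_context(
            profile_dir, headless=False, accept_downloads=True
        )
        page = context.pages[0] if context.pages else context.new_page()
        page.goto(start_url)
        # Assumption: a download started by the newly opened window is
        # reported on the opener page, so we wait for it there.
        with page.expect_download() as download_info:
            page.click(button_selector)
        download_info.value.save_as(target)
        context.close()
```

A call might look like `download_pdf("https://example.com/app", "button#print", "./pw-profile", "report.pdf")`, where every argument is a placeholder. If the download event never fires, the preference probably was not picked up, and that is the first thing to check.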

    from urllib.request import urlretrieve

    # `page` is the Playwright page that contains the button
    link_url = page.locator("button a").get_attribute("href")
    urlretrieve(link_url, 'path_to_your_folder/give_me_a_name.pdf')

Hi @vipinphogat, thanks for your help. Would you please provide a Python code snippet? I am using Python. Thanks a mil.

@_henryxie999 why are you ignoring my reply? Do you need any more explanation about the code?

Ok, let me add some hints for you:

  • Clicking the button opens the PDF file, which should be reachable at the URL in the link's href. (To confirm this, please share the href you see for the element in the inspector.)
  • The code does the same thing, but without the click action.
  • link_url gets the href value of the link (inside the button).
  • urlretrieve requests the file at link_url and saves it as path_to_your_folder/give_me_a_name.pdf.

Result: the PDF is saved in your project folder under the name give_me_a_name.pdf. Done. Last hint: you won't find any other solution....
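The hints above can be sketched a little more defensively. This version assumes the element really has an href and that the file can be fetched without the browser's cookies, since `urlretrieve` does not share the Playwright session. `resolve_pdf_url` and `save_linked_pdf` are illustrative names, not part of any library.

```python
from pathlib import Path
from urllib.parse import urljoin
from urllib.request import urlretrieve


def resolve_pdf_url(page_url, href):
    """Turn a possibly relative href into an absolute URL."""
    if not href:
        # No href means the href-based approach does not apply to this element.
        raise ValueError("element has no href attribute")
    return urljoin(page_url, href)


def save_linked_pdf(page, selector, target):
    """Read the href of `selector` on a Playwright page and save the file."""
    href = page.locator(selector).get_attribute("href")
    pdf_url = resolve_pdf_url(page.url, href)
    Path(target).parent.mkdir(parents=True, exist_ok=True)
    # Plain HTTP fetch: no cookies or login state from the browser are sent.
    urlretrieve(pdf_url, target)
    return pdf_url
```

For example, `save_linked_pdf(page, "button a", "path_to_your_folder/give_me_a_name.pdf")` mirrors the snippet from the earlier reply. If the site requires a login, the request will likely return an HTML login page instead of the PDF.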
_henryxie999

Hi @bandito9274, thanks for your help. Would you please check my attached image again?

_henryxie999

That element has no "href" attribute. If I click the button, it opens a new browser window with this link: "************.jsf". How do I handle this?
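When there is no href, one option is to let the click happen, capture the window it opens, and fetch that window's URL through the browser context so the session cookies are reused. The sketch below assumes the ".jsf" URL returns the PDF bytes for a plain GET, which JSF pages often do not (they may depend on a POST with view state). In that case the download-based route suggested earlier in the thread, a persistent context with the PDF viewer disabled, is more likely to work. `looks_like_pdf` and `save_pdf_from_popup` are hypothetical helper names.

```python
from pathlib import Path


def looks_like_pdf(data):
    """Simple sanity check: PDF files normally begin with the bytes %PDF-."""
    return data.startswith(b"%PDF-")


def save_pdf_from_popup(page, button_selector, target):
    """Click the button, grab the URL of the window it opens, and save the PDF."""
    with page.expect_popup() as popup_info:
        page.click(button_selector)
    popup = popup_info.value
    popup.wait_for_load_state()
    pdf_url = popup.url
    # context.request shares cookies with the browser context, unlike urllib.
    response = page.context.request.get(pdf_url)
    body = response.body()
    if not response.ok or not looks_like_pdf(body):
        raise RuntimeError(f"{pdf_url} did not return a PDF (status {response.status})")
    Path(target).write_bytes(body)
    popup.close()
    return pdf_url
```

Usage would be along the lines of `save_pdf_from_popup(page, "button#print", "report.pdf")`, with the selector and file name as placeholders. The `RuntimeError` is there to make the GET assumption fail loudly rather than silently saving an HTML page.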

_henryxie999

I am using Python

Can you share an image to show the HTML element of the button?

_henryxie999

Hi, thanks for your effort. I sent you DMs.
