Blog Ask AI Tools Videos QA Wiki Discord

how to get content from a specific div or span

Hi, for example, I want to get text from this:

<div class="pv-text-details__left-panel mt2">
  <span class="text-body-small inline t-black--light break-words"> ***TEXT I WANT TO GET*** </span>
  <span class="pv-text-details__separator t-black--light"> Other text that I don’t need* </span>
</div>

I can't figure out what code to write to get the text. Is it page.locator, page.getByRole (if so, what role), or something else? Any help would be greatly appreciated! Thank you so much.

This thread is trying to answer the question "How can I extract specific text from a div or span in HTML?"

4 replies

vertigo448, July 20 4:05 PM

In the case you described above, this will work: page.locator('[class*="text-body-small"]').textContent(). That is not the best selector, but there is not much to work with in this example, and the locator strategy depends on the content. You cannot use getByRole in this case. I would suggest adding a custom data-testid so you don't rely on something like a partial class name.
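
The data-testid suggestion above might look like the following. This is a minimal sketch that assumes you control the markup and can add the attribute; the test id `profile-headline` and the helper name `readHeadlineByTestId` are hypothetical, and `page` is assumed to be a Playwright Page that has already loaded the document.

```javascript
// Hypothetical test id; this only works if the markup is changed to
// <span data-testid="profile-headline">...</span>.
const HEADLINE_TEST_ID = 'profile-headline';

// `page` is assumed to be a Playwright Page that has loaded that markup.
async function readHeadlineByTestId(page) {
  // getByTestId() matches on the data-testid attribute by default.
  const raw = await page.getByTestId(HEADLINE_TEST_ID).textContent();
  // textContent() can resolve to null and keeps surrounding whitespace.
  return raw === null ? null : raw.trim();
}
```

A test id does not change when styling classes are renamed, which is the reason for preferring it over a partial class match.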

kombucha199, July 20 5:02 PM

This didn't end up working; I got a timeout exceeded error.

moshedror, July 20 7:37 PM

Maybe this can help you: await page.locator('div.pv-text-details__left-panel.mt2 span.text-body-small').textContent()
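
The selector in this reply can be wrapped in a small helper. A minimal sketch, assuming `page` is a Playwright Page that has already loaded the markup from the question; the name `readLeftPanelText` is hypothetical, and the trim is an addition because textContent() returns the text with the whitespace that surrounds it in the markup.

```javascript
// Hypothetical helper; `page` is assumed to be a Playwright Page that has
// already navigated to the document containing the markup from the question.
async function readLeftPanelText(page) {
  // Descendant selector from the reply: the span with class
  // text-body-small inside the left-panel div.
  const span = page.locator(
    'div.pv-text-details__left-panel.mt2 span.text-body-small'
  );
  // textContent() resolves to the raw text (or null), so trim it.
  const raw = await span.textContent();
  return raw === null ? null : raw.trim();
}
```

If the selector matches more than one element, the locator call throws a strictness error instead of silently picking one, so a failure here usually means the selector needs narrowing.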

dand33, July 20 11:44 PM

You could try good old XPath:

//div[contains(@class,'mt2')]/span[contains(@class,'inline')]

This will match only the desired node.
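
One way to pass such an expression to Playwright is sketched below. Note that contains(@class,'mt2') is a substring test, so it would also match a class such as mt20; the sketch swaps in the common XPath idiom for matching a whole class token instead. The names `hasClassToken`, `LEFT_PANEL_XPATH`, and `readByXPath` are hypothetical, and `page` is assumed to be a Playwright Page that has loaded the markup from the question.

```javascript
// Builds an XPath predicate that matches a whole class token rather than
// a substring of the class attribute.
function hasClassToken(token) {
  return `contains(concat(' ', normalize-space(@class), ' '), ' ${token} ')`;
}

// Same shape as the expression in the reply: a span with class "inline"
// that is a direct child of a div with class "mt2".
const LEFT_PANEL_XPATH =
  `//div[${hasClassToken('mt2')}]/span[${hasClassToken('inline')}]`;

// `page` is assumed to be a Playwright Page.
async function readByXPath(page) {
  // The explicit xpath= prefix marks the selector as XPath.
  const raw = await page.locator(`xpath=${LEFT_PANEL_XPATH}`).textContent();
  return raw === null ? null : raw.trim();
}
```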


Answer

Extracting Text from HTML Elements with Playwright

You can use Playwright's page.getByText() method to locate a specific HTML element, such as a <div> or <span>, by the text it contains, and then call textContent() on the resulting locator to read its text. You can match the text by a substring, an exact string, or a regular expression.

Here's an example:

const divText = page.getByText('This is some text inside a div.');
console.log(await divText.textContent());

For an exact match, pass { exact: true } as an option to page.getByText().

const exactMatch = page.getByText('exact match', { exact: true });
console.log(await exactMatch.textContent());

You can also use a regular expression to match and extract text that follows a specific pattern.

const regexMatch = page.getByText(/some [A-Za-z]+/i);
console.log(await regexMatch.textContent());

Remember, page.getByText() always normalizes whitespace, even with an exact match. It turns multiple spaces into one, turns line breaks into spaces, and ignores leading and trailing whitespace.
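
To make the effect of that normalization concrete, here is an approximation in plain JavaScript. It is a sketch of the behavior described above, not Playwright's actual implementation, and `normalizeWhitespace` is a hypothetical name.

```javascript
// Approximates the described normalization: every run of whitespace
// (spaces, tabs, line breaks) becomes a single space, and leading and
// trailing whitespace is dropped.
function normalizeWhitespace(text) {
  return text.replace(/\s+/g, ' ').trim();
}

// Markup text such as " TEXT I WANT\n   TO GET " would be compared as
// "TEXT I WANT TO GET".
console.log(normalizeWhitespace(' TEXT I WANT\n   TO GET '));
```

The practical consequence is that the string passed to getByText() does not need to reproduce the indentation or line breaks of the HTML source.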

Playwright's getByText() method is a powerful tool for locating elements based on their text content. Whether you're dealing with substrings, exact strings, or regular expressions, it's got you covered.

