how to get content from a specific div or span

kombucha199 posted in #help-playwright

Hi, for example, I want to get text from this:

<div class="pv-text-details__left-panel mt2">
  <span class="text-body-small inline t-black--light break-words"> ***TEXT I WANT TO GET*** </span>
  <span class="pv-text-details__separator t-black--light"> Other text that I don’t need* </span>
</div>

I can't figure out what code to use to get the text: is it page.locator, page.getByRole (and if so, which role), or something else? Any help would be greatly appreciated! Thank you so much.

This thread is trying to answer the question "How can I extract specific text from a div or span in HTML?"

4 replies

In the case you described above, this will work: page.locator('[class*="text-body-small"]').textContent(). That is not the best selector, but there is not much to work with in this example, and the locator strategy depends on the content. In this case you cannot use getByRole. I would suggest adding a custom data-testid so you don't rely on something like a partial class name.


This didn't end up working; I got a timeout exceeded error.

Maybe this can help you: await page.locator('div.pv-text-details__left-panel.mt2 span.text-body-small').textContent()
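
As a rough sketch of the scoped selector suggested above, the helper below assumes `page` is a Playwright Page that has already loaded the markup from the question. The function name readLeftPanelText is hypothetical and does not come from the thread. It adds .first() in case more than one span matches, and trims the surrounding spaces that textContent() returns as-is.

```javascript
// Sketch only: `page` is assumed to be a Playwright Page that has already
// navigated to a document containing the markup from the question.
async function readLeftPanelText(page) {
  const span = page
    .locator('div.pv-text-details__left-panel span.text-body-small')
    .first(); // pick the first match so several matching spans do not cause a strict-mode error

  const raw = await span.textContent(); // resolves to a string, or null if the node has no text content
  return raw === null ? null : raw.trim(); // drop the leading and trailing spaces from the markup
}
```

It would be called as `const text = await readLeftPanelText(page);` from inside a test or script that already has a page object.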

You could try a good old XPath


This will match only the desired node.
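
One possible cause of the timeout reported earlier in the thread is that the selector matched nothing before the wait expired, for example because the content had not loaded or lives in a different frame. The sketch below is one way to check for that, assuming `page` is a Playwright Page. The name textOrNull and the timeoutMs parameter are hypothetical. It waits a bounded time for a match and returns null instead of throwing.

```javascript
// Sketch only: `page` is assumed to be a Playwright Page.
// `textOrNull` and `timeoutMs` are illustrative names, not part of Playwright.
async function textOrNull(page, selector, timeoutMs = 5000) {
  const target = page.locator(selector).first();
  try {
    // waitFor rejects if no matching element is attached to the DOM in time.
    await target.waitFor({ state: 'attached', timeout: timeoutMs });
  } catch (err) {
    // Simplification: any failure here is treated as "no match".
    // Possible causes: wrong selector, content not yet loaded, element inside an iframe.
    return null;
  }
  const raw = await target.textContent();
  return raw === null ? null : raw.trim();
}
```

A null result from `await textOrNull(page, 'span.text-body-small')` suggests the selector or the page state is the problem, rather than the textContent() call itself.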


Extracting Text from HTML Elements with Playwright

You can use Playwright's page.getByText() method to locate specific HTML elements such as <div> or <span> by their text content, and then read the text with textContent(). You can match the text by a substring, an exact string, or a regular expression.

Here's an example:

// getByText() returns a Locator synchronously; only textContent() needs await.
const divText = page.getByText('This is some text inside a div.');
console.log(await divText.textContent());

For an exact match, pass { exact: true } as an option to page.getByText().

const exactMatch = page.getByText('exact match', { exact: true });
console.log(await exactMatch.textContent());

You can also use regular expressions to match and extract specific text patterns.

const regexMatch = page.getByText(/some [A-Za-z]+/i);
console.log(await regexMatch.textContent());

Remember, page.getByText() always normalizes whitespace. It turns multiple spaces into one and ignores leading and trailing whitespace.
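
To make the normalization described above concrete, here is a small approximation in plain JavaScript. It is not Playwright's internal code. normalizeWhitespace is a hypothetical helper that collapses runs of whitespace and trims the ends, which can be handy when comparing a raw textContent() result against an expected string.

```javascript
// Approximation of the whitespace handling described above, not Playwright's own implementation.
function normalizeWhitespace(text) {
  return text
    .replace(/\s+/g, ' ') // collapse any run of spaces, tabs, or newlines into one space
    .trim();              // ignore leading and trailing whitespace
}

// textContent() returns the raw text, so normalize both sides before comparing.
function sameText(actual, expected) {
  return normalizeWhitespace(actual) === normalizeWhitespace(expected);
}
```

For example, `sameText(' TEXT  I WANT\n TO GET ', 'TEXT I WANT TO GET')` evaluates to true under this sketch.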

In short, Playwright's getByText() method locates elements by their text content, whether you match by substring, exact string, or regular expression.
