Hi, for example, I want to get text from this:
<div class="pv-text-details__left-panel mt2">
<span class="text-body-small inline t-black--light break-words">
***TEXT I WANT TO GET***
</span>
<span class="pv-text-details__separator t-black--light">
Other text that I don’t need*
</span>
</div>
I'm not able to figure out what code to use to get the text: is it page.locator, page.getByRole (if so, what role), etc.? Any help would be greatly appreciated! Thank you so much.
This thread is trying to answer the question "How can I extract specific text from a div or span in HTML?"
In the case you described above, this will work: await page.locator('[class*="text-body-small"]').textContent()
That is not the best selector, but there is not much to work with in this example, and the right locator strategy depends on the content. In this case you cannot use getByRole. I would suggest adding a custom data-testid attribute so that you do not rely on something like a partial class name.
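As a slightly tighter variant of the selector above, here is a minimal sketch that scopes the lookup to the left panel first and then matches the full class name instead of a partial one. The helper name getPanelText and the two small interfaces are assumptions made for illustration. They only declare the Locator methods the sketch uses, so a real Playwright page can be passed in as-is.

```typescript
// Minimal structural types covering only the methods used below.
// They are declared here so the sketch needs no imports; a real
// Playwright Page/Locator satisfies them.
interface LocatorLike {
  locator(selector: string): LocatorLike;
  first(): LocatorLike;
  textContent(): Promise<string | null>;
}
interface PageLike {
  locator(selector: string): LocatorLike;
}

// getPanelText is a hypothetical helper name, not part of Playwright.
async function getPanelText(page: PageLike): Promise<string> {
  const text = await page
    .locator('div.pv-text-details__left-panel') // scope to the container first
    .locator('span.text-body-small')            // full class name, not a substring
    .first()                                    // guard against several matches
    .textContent();                             // returns a Promise, so it must be awaited
  // textContent() keeps the newlines from the markup, so trim them.
  return (text ?? '').trim();
}

// Usage inside a test: console.log(await getPanelText(page));
```

If a data-testid attribute can be added to the span, the two chained locators could be replaced by a single test-id lookup, which would not break when the class names change.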
Related Ask AI answer for "What is the method to locate and extract text from a specific HTML element using Playwright?".
You can use Playwright's page.getByText() method to locate specific HTML elements like <div> or <span> and then read their text with textContent(). This method locates an element based on its text content. You can match the text by a substring, exact string, or even a regular expression.
Here's an example:
// getByText() returns a Locator synchronously; only textContent() needs await.
const divText = page.getByText('This is some text inside a div.');
console.log(await divText.textContent());
For an exact match, pass { exact: true } as an option to page.getByText().
const exactMatch = page.getByText('exact match', { exact: true });
console.log(await exactMatch.textContent());
You can also use regular expressions to match and extract specific text patterns.
const regexMatch = page.getByText(/some [A-Za-z]+/i);
console.log(await regexMatch.textContent());
Remember, page.getByText() always normalizes whitespace. It turns multiple spaces into one and ignores leading and trailing whitespace.
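The normalization described above applies to matching. The string returned by textContent() still contains the raw whitespace from the markup. The sketch below approximates the same rule on an extracted string. normalizeWhitespace is a hypothetical helper written for illustration, not Playwright's internal implementation.

```typescript
// Hypothetical helper: collapse each run of whitespace into one space,
// then drop leading and trailing whitespace.
function normalizeWhitespace(text: string): string {
  return text.replace(/\s+/g, ' ').trim();
}

// Raw text as it might come back from textContent() for markup
// that puts the text on its own line.
const raw = '\n   TEXT   I WANT\tTO GET \n';
console.log(normalizeWhitespace(raw)); // prints "TEXT I WANT TO GET"
```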
Playwright's getByText() method is a powerful tool for locating elements based on their text content. Whether you're dealing with substrings, exact strings, or regular expressions, it's got you covered.