What is the method to locate and extract text from a specific HTML element using Playwright?


Extracting Text from HTML Elements with Playwright

You can use Playwright's page.getByText() method to locate a specific HTML element, such as a <div> or <span>, by its text content, and then call textContent() on the resulting locator to extract the text. The text can be matched by substring (the default, case-insensitive), by exact string, or by regular expression.

Here's an example:

// getByText() returns a Locator synchronously; only textContent() needs await.
const divLocator = page.getByText('This is some text inside a div.');
console.log(await divLocator.textContent());

For an exact match, which is case-sensitive and compares the whole string, pass { exact: true } as an option to page.getByText().

const exactMatch = page.getByText('exact match', { exact: true });
console.log(await exactMatch.textContent());

You can also pass a regular expression to match text that follows a specific pattern.

const regexMatch = page.getByText(/some [A-Za-z]+/i);
console.log(await regexMatch.textContent());

Remember, page.getByText() always normalizes whitespace, even for an exact match. It turns multiple spaces into one, turns line breaks into spaces, and ignores leading and trailing whitespace.
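
To make the matching rules concrete, here is a rough sketch in plain JavaScript that mimics how the text is compared. The helpers normalizeWhitespace() and matchesText() are hypothetical names made up for this illustration; they are not part of Playwright and are not how Playwright is implemented internally. They only imitate the documented behavior.

```javascript
// Illustrative only: hypothetical helpers that imitate the documented
// matching rules of getByText(). This is not Playwright's own code.
function normalizeWhitespace(text) {
  // Collapse each run of whitespace (spaces, tabs, line breaks) into a
  // single space, then drop leading and trailing whitespace.
  return text.replace(/\s+/g, ' ').trim();
}

function matchesText(elementText, pattern, options = {}) {
  const normalized = normalizeWhitespace(elementText);
  if (pattern instanceof RegExp) {
    // The exact option is ignored when matching by regular expression.
    return pattern.test(normalized);
  }
  const wanted = normalizeWhitespace(pattern);
  if (options.exact) {
    // Exact match: whole string, case-sensitive.
    return normalized === wanted;
  }
  // Default: case-insensitive substring match.
  return normalized.toLowerCase().includes(wanted.toLowerCase());
}

const raw = '  This is   some\n text inside a div.  ';
console.log(normalizeWhitespace(raw)); // "This is some text inside a div."
console.log(matchesText(raw, 'SOME TEXT')); // true
console.log(matchesText(raw, 'some text', { exact: true })); // false
console.log(matchesText(raw, /some [a-z]+/i)); // true
```

In this sketch, the exact match fails because 'some text' is only part of the element's text, while the default substring match succeeds despite the extra spaces, the line break, and the different casing.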

Playwright's getByText() method is a powerful tool for locating elements based on their text content. Whether you're dealing with substrings, exact strings, or regular expressions, it's got you covered.

