Rayrun

`Page.ContentAsync()` escapes '>' character in style tag

the_tandyman_can posted in #help-playwright

I'm adding a style tag using `Page.AddStyleTagAsync`. That style tag contains a `>` character, as in `.selector > *`. When I call `Page.ContentAsync()` to save the HTML, the style tag in the HTML string looks like `.selector &gt; *`, which breaks the application styles.

The HTML string is quite large, and I'm hesitant to just call `html.Replace("&gt;", ">")` because it could break legitimately escaped characters outside the head.

Any ideas?

This thread is trying to answer question "How can I prevent `Page.ContentAsync()` from escaping the '>' character in a style tag without breaking legitimately escaped characters?"

3 replies

This is by design. I'd suggest reading up on HTML encoding and decoding. It also seems you are using .NET, so check out: https://learn.microsoft.com/en-us/dotnet/api/system.web.httputility.htmlencode?view=net-7.0

the_tandyman_can
@dand33: Thanks for the tip! I wasn't sure if I could just `HttpUtility.HtmlDecode()` the whole string but it seems like that's actually the best approach


Glad to help. The main issue is how to include characters that are part of the very syntax that makes up the document, be it HTML or XML. In XML you could possibly try CDATA, but that will usually change the semantics. Don't worry about speed: encoding and decoding should be a fast operation even for large documents.
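
For anyone who shares the original worry about touching escaped characters outside the head: decoding the whole string also decodes entities in the body (for example, `&lt;` in text becomes `<`). A minimal sketch of a narrower alternative is to decode only the contents of `<style>` elements. The class and method names below (`StyleTagDecoder`, `DecodeStyleBlocks`) are hypothetical, and the regex assumes simple, non-nested `<style>` elements; it is not a full HTML parser.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Web;

// Hypothetical helper, not part of Playwright or .NET.
public static class StyleTagDecoder
{
    // Matches each <style ...>...</style> element; Singleline lets '.' span newlines.
    private static readonly Regex StyleBlock = new Regex(
        @"(<style\b[^>]*>)(.*?)(</style>)",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Decodes HTML entities only inside <style> elements; everything else is left as-is.
    public static string DecodeStyleBlocks(string html) =>
        StyleBlock.Replace(html, m =>
            m.Groups[1].Value
            + HttpUtility.HtmlDecode(m.Groups[2].Value)
            + m.Groups[3].Value);
}

public static class Program
{
    public static void Main()
    {
        // Stand-in for the string returned by Page.ContentAsync().
        string html = "<head><style>.selector &gt; * { color: red; }</style></head>"
                    + "<body><p>1 &lt; 2</p></body>";

        Console.WriteLine(StyleTagDecoder.DecodeStyleBlocks(html));
        // prints: <head><style>.selector > * { color: red; }</style></head><body><p>1 &lt; 2</p></body>
    }
}
```

In this sketch the `&lt;` in the paragraph stays escaped while the selector inside the style tag is restored. If the page has no escaped text that matters outside the style tags, the whole-string `HttpUtility.HtmlDecode()` call discussed above is simpler.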

