6.1
lxml-html-clean allows attackers to hijack relative links
GHSA-xvp8-3mhv-424c
CVE-2026-28350
Summary
The default Cleaner configuration in lxml-html-clean does not remove <base> tags. An attacker who can submit HTML for cleaning can inject one and change how every relative URL on the resulting page resolves, redirecting users to malicious sites or capturing sensitive information. To fix this, update lxml-html-clean to version 0.4.4 or later, or configure the Cleaner to remove <base> tags explicitly.
What to do
- Update lxml-html-clean to version 0.4.4.
Affected software
| Vendor | Product | Affected versions | Fix available |
|---|---|---|---|
| – | lxml-html-clean | <= 0.4.3 | 0.4.4 |
| fedoralovespython | lxml_html_clean | < 0.4.4 | 0.4.4 |
Original title
lxml-html-clean has <base> tag injection through default Cleaner configuration
Original description
### Summary
The `<base>` tag passes through the default `Cleaner` configuration. While `page_structure=True` removes `html`, `head`, and `title` tags, there is no specific handling for `<base>`, allowing an attacker to inject it and hijack relative links on the page.
### Details
The `<base>` tag is not currently in the `page_structure` kill set. Even though the specification says `<base>` must be inside `<head>`, browsers accept `<base>` tags outside of the head.
If an attacker injects a `<base>` tag, it changes the base URL for all relative URLs on the page (links, images, scripts) to a domain controlled by the attacker.
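
To make the effect concrete, the following sketch uses Python's `urllib.parse.urljoin` to approximate how a browser resolves relative references against a base URL. The page URL `https://app.example/profile/settings` and the helper name `resolve_all` are hypothetical, and `urljoin` is only a stand-in for the browser's own resolution algorithm.

```python
from urllib.parse import urljoin

# Hypothetical URL of the legitimate page that embeds the cleaned HTML.
PAGE_URL = "https://app.example/profile/settings"
# The href carried by the injected <base> tag in the advisory's PoC.
ATTACKER_BASE = "http://evil.com/"


def resolve_all(base, references):
    """Resolve each reference against the given base URL."""
    return {ref: urljoin(base, ref) for ref in references}


references = ["/account", "assets/app.js", "https://cdn.example/lib.js"]

without_base = resolve_all(PAGE_URL, references)
with_injected_base = resolve_all(ATTACKER_BASE, references)

for ref in references:
    print(f"{ref!r}: {without_base[ref]} -> {with_injected_base[ref]}")
```

Only relative references change origin. A reference that is already an absolute URL resolves to the same place with or without the injected tag.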
### PoC
```python
from lxml_html_clean import clean_html
# The base tag is preserved in the output
result = clean_html('<base href="http://evil.com/"><a href="/account">Account</a>')
print(result)
# Output: <div><base href="http://evil.com/">...<a href="/account">Account</a></div>
```
### Impact
The injection of a `<base>` tag allows an attacker to hijack the resolution of **all** relative URLs on the page. This results in three critical attack vectors:
1. **Phishing & Redirection:** Attackers can redirect user navigation (e.g., `<a href="/login">`) and form submissions (e.g., `<form action="/auth">`) to an attacker-controlled domain, effectively stealing credentials or sensitive data without the user realizing they have left the legitimate site.
2. **Cross-Site Scripting (XSS):** If the victim application loads JavaScript files using relative paths (e.g., `<script src="assets/app.js">`), the browser will attempt to fetch the script from the attacker's domain. This upgrades the vulnerability from HTML injection to full Stored XSS.
3. **Defacement:** Relative references to images (`<img>`) and stylesheets (`<link>`) will be loaded from the attacker's server, allowing for UI redressing or defacement.
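
All three vectors depend on a `<base>` element surviving sanitization, so one defense-in-depth option is to check cleaned output for such elements before storing or rendering it. The sketch below uses only the standard library's `html.parser`; the names `BaseTagDetector` and `find_base_hrefs` are hypothetical helpers and are not part of lxml-html-clean.

```python
from html.parser import HTMLParser


class BaseTagDetector(HTMLParser):
    """Hypothetical helper: records the href of every <base> element seen."""

    def __init__(self):
        super().__init__()
        self.base_hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this,
        # so <BASE HREF=...> is caught as well.
        if tag == "base":
            self.base_hrefs.append(dict(attrs).get("href"))


def find_base_hrefs(html_text):
    """Return the href values of all <base> elements found in html_text."""
    detector = BaseTagDetector()
    detector.feed(html_text)
    detector.close()
    return detector.base_hrefs


# Output shape taken from the advisory's PoC, with the elided part omitted.
suspicious = '<div><base href="http://evil.com/"><a href="/account">Account</a></div>'
print(find_base_hrefs(suspicious))
```

A non-empty result means the cleaned markup can still rebind relative URLs, and an application could reject or re-clean it.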
CVSS 3.1 score (GHSA): 6.1
Vulnerability type
CWE-116
References
- https://github.com/fedora-python/lxml_html_clean/security/advisories/GHSA-xvp8-3...
- https://github.com/fedora-python/lxml_html_clean/commit/9c5612ca33b941eec4178abf...
- https://nvd.nist.gov/vuln/detail/CVE-2026-28350
- https://github.com/advisories/GHSA-xvp8-3mhv-424c
- https://github.com/fedora-python/lxml_html_clean (product repository)
Published: 2 Mar 2026 · Updated: 11 Mar 2026 · First seen: 6 Mar 2026