lxml-html-clean allows attackers to hijack relative links

6.1

GHSA-xvp8-3mhv-424c CVE-2026-28350

Summary

The lxml-html-clean library does not remove <base> tags in its default Cleaner configuration, so an attacker can inject one and redirect a page's relative links to a malicious site or capture sensitive information. To fix this, update lxml-html-clean to version 0.4.4 or configure the Cleaner to remove <base> tags explicitly.

What to do

Update lxml-html-clean to version 0.4.4.
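
A quick way to see whether an environment still needs the update is to compare the installed release against 0.4.4. The sketch below uses only the standard library; the helper names are hypothetical, the distribution name passed to `importlib.metadata` is an assumption, and the parser only handles plain dotted releases such as `0.4.3` (a suffix like `.post1` would need a real version parser).

```python
from importlib.metadata import PackageNotFoundError, version

# First fixed release according to the "What to do" section of this advisory.
FIXED_RELEASE = (0, 4, 4)


def parse_release(version_string):
    """Turn a plain dotted release such as '0.4.3' into a tuple of ints."""
    return tuple(int(part) for part in version_string.split("."))


def is_fixed(version_string):
    """Return True if the release is at or above the fixed release."""
    return parse_release(version_string) >= FIXED_RELEASE


def installed_status(dist_name="lxml_html_clean"):
    """Report 'fixed', 'vulnerable', or 'not installed' for a distribution."""
    try:
        installed = version(dist_name)
    except PackageNotFoundError:
        return "not installed"
    return "fixed" if is_fixed(installed) else "vulnerable"


print(installed_status())
```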

Affected software

Vendor	Product	Affected versions	Fix available
–	lxml-html-clean	<= 0.4.3	0.4.4
fedoralovespython	lxml_html_clean	<= 0.4.4	–
–	lxml-html-clean	<= 0.4.4	0.4.4

Original title

lxml-html-clean has <base> tag injection through default Cleaner configuration

Original description

### Summary
The `<base>` tag passes through the default `Cleaner` configuration. While `page_structure=True` removes `html`, `head`, and `title` tags, there is no specific handling for `<base>`, allowing an attacker to inject it and hijack relative links on the page.

### Details
The `<base>` tag is not currently in the `page_structure` kill set. Even though the specification says `<base>` must be inside `<head>`, browsers accept `<base>` tags outside of the head.

If an attacker injects a `<base>` tag, it changes the base URL for all relative URLs on the page (links, images, scripts) to a domain controlled by the attacker.
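
The effect on URL resolution can be illustrated with the standard library's `urljoin`. This is only an approximation of what a browser does (browsers follow the WHATWG URL rules), and `https://example.com/dashboard` is a hypothetical page address chosen for the example.

```python
from urllib.parse import urljoin

# Hypothetical URLs for illustration only.
page_url = "https://example.com/dashboard"
injected_base = "http://evil.com/"

relative_refs = ["/account", "assets/app.js", "img/logo.png"]


def resolve_all(base, refs):
    """Resolve each relative reference against the given base URL."""
    return {ref: urljoin(base, ref) for ref in refs}


# Without <base>, references resolve against the page's own URL.
without_base = resolve_all(page_url, relative_refs)
# With the injected <base>, the same references resolve against its href.
with_base = resolve_all(injected_base, relative_refs)

for ref in relative_refs:
    print(f"{ref!r}: {without_base[ref]} -> {with_base[ref]}")
```

The markup of the links themselves is unchanged; only the base they are resolved against differs, which is why the redirection is hard to spot in the page source.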

### PoC
```python
from lxml_html_clean import clean_html

# The base tag is preserved in the output
result = clean_html('<base href="http://evil.com/"><a href="/account">Account</a>')
print(result)
# Output: <div><base href="http://evil.com/">...<a href="/account">Account</a></div>
```
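
To check whether sanitized output in your own application still carries a `<base>` element, a small standard-library scanner along these lines could serve as a regression test. `BaseTagFinder` and `find_base_hrefs` are hypothetical names introduced for this sketch, not part of lxml-html-clean.

```python
from html.parser import HTMLParser


class BaseTagFinder(HTMLParser):
    """Collects the href of every <base> element seen while parsing."""

    def __init__(self):
        super().__init__()
        self.base_hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <BASE HREF=...>
        # is caught as well.
        if tag == "base":
            self.base_hrefs.append(dict(attrs).get("href"))


def find_base_hrefs(html):
    """Return the href values of all <base> elements in an HTML string."""
    finder = BaseTagFinder()
    finder.feed(html)
    finder.close()
    return finder.base_hrefs


sanitized = '<div><base href="http://evil.com/"><a href="/account">Account</a></div>'
print(find_base_hrefs(sanitized))
```

An empty list means no `<base>` element was found in the string being checked.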

### Impact
The injection of a `<base>` tag allows an attacker to hijack the resolution of **all** relative URLs on the page. This results in three critical attack vectors:

1. **Phishing & Redirection:** Attackers can redirect user navigation (e.g., `<a href="/login">`) and form submissions (e.g., `<form action="/auth">`) to an attacker-controlled domain, effectively stealing credentials or sensitive data without the user realizing they have left the legitimate site.
2. **Cross-Site Scripting (XSS):** If the victim application loads JavaScript files using relative paths (e.g., `<script src="assets/app.js">`), the browser will attempt to fetch the script from the attacker's domain. This upgrades the vulnerability from HTML injection to full Stored XSS.
3. **Defacement:** Relative references to images (`<img>`) and stylesheets (`<link>`) will be loaded from the attacker's server, allowing for UI redressing or defacement.

ghsa CVSS3.1 6.1

Vulnerability type

CWE-116

Published: 2 Mar 2026 · Updated: 11 Mar 2026 · First seen: 6 Mar 2026