Java provides the java.util.zip package for ZIP-compatible data compression. It includes classes that enable you to read, create, and modify ZIP and GZIP file formats.
A number of security concerns must be considered when extracting file entries from a ZIP file using java.util.zip.ZipInputStream. File names may contain path traversal information that may cause them to be extracted outside of the intended directory, frequently with the purpose of overwriting existing system files. Directory traversal or path equivalence vulnerabilities can be eliminated by canonicalizing the path name, in accordance with FIO16-J. Canonicalize path names before validating them, and then validating the location before extraction.
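As a minimal sketch of the canonicalize-then-validate step (the class and method names here are illustrative assumptions, not part of this rule's code):

```java
import java.io.File;
import java.io.IOException;

public class PathCheck {
  // Hypothetical helper: returns true only if entryName, resolved against
  // intendedDir, canonicalizes to a location inside intendedDir.
  static boolean isWithin(String entryName, String intendedDir)
      throws IOException {
    String canonicalDir = new File(intendedDir).getCanonicalPath();
    String canonicalTarget =
        new File(intendedDir, entryName).getCanonicalPath();
    // Compare against the directory plus a separator so that a sibling
    // directory such as "/tmp/extract-evil" cannot pass for "/tmp/extract".
    return canonicalTarget.equals(canonicalDir)
        || canonicalTarget.startsWith(canonicalDir + File.separator);
  }

  public static void main(String[] args) throws IOException {
    System.out.println(isWithin("docs/readme.txt", "."));  // inside target
    System.out.println(isWithin("../escape.txt", "."));    // traversal attempt
  }
}
```

Canonicalization resolves `.` and `..` segments before the comparison, so a crafted entry name cannot smuggle itself past a purely textual prefix check.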
A second issue is that the extraction process can cause excessive consumption of system resources, possibly resulting in a denial-of-service attack when resource usage is disproportionately large compared to the input data. The zip algorithm can produce very large compression ratios [Mahmoud 2002]. For example, a file consisting of alternating lines of a characters and b characters can achieve a compression ratio of more than 200 to 1. Even higher compression ratios can be obtained using input data that is targeted to the compression algorithm. This permits the existence of zip bombs, in which a small ZIP or GZIP file consumes excessive resources when uncompressed. An example of a zip bomb is the file 42.zip, a zip file consisting of 42 kilobytes of compressed data, containing five layers of nested zip files in sets of 16, each bottom-layer archive containing a 4.3 gigabyte (4 294 967 295 bytes; ~3.99 GiB) file, for a total of 4.5 petabytes (4 503 599 626 321 920 bytes; ~3.99 PiB) of uncompressed data.
Zip bombs often rely on repetition of identical files to achieve their extreme compression ratios. Programs must either limit the traversal of such files or refuse to extract data beyond a certain limit. The actual limit depends on the capabilities of the platform and expected usage.
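The scale of these ratios is easy to reproduce with a small sketch (not part of this rule's code): compressing one mebibyte of a single repeated character with GZIP yields output orders of magnitude smaller.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

public class RatioDemo {
  // Compresses n copies of the byte 'a' and returns the compressed size.
  static int compressedSize(int n) throws IOException {
    byte[] input = new byte[n];
    Arrays.fill(input, (byte) 'a');
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gzos = new GZIPOutputStream(bos)) {
      gzos.write(input);
    }
    return bos.size();
  }

  public static void main(String[] args) throws IOException {
    int n = 1024 * 1024;  // 1 MiB of highly repetitive input
    int c = compressedSize(n);
    System.out.println("Compressed " + n + " bytes down to " + c
        + " (ratio " + (n / c) + ":1)");
  }
}
```

Nesting such archives inside one another, as 42.zip does, multiplies the ratio at every layer.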
This noncompliant code fails to validate the name of the file that is being unzipped. It passes the name directly to the constructor of FileOutputStream. It also fails to check the resource consumption of the file that is being unzipped. It permits the operation to run to completion or until local resources are exhausted.
static final int BUFFER = 512;
// ...

public final void unzip(String filename) throws java.io.IOException {
  FileInputStream fis = new FileInputStream(filename);
  ZipInputStream zis = new ZipInputStream(new BufferedInputStream(fis));
  ZipEntry entry;
  try {
    while ((entry = zis.getNextEntry()) != null) {
      System.out.println("Extracting: " + entry);
      int count;
      byte data[] = new byte[BUFFER];
      // Write the files to the disk
      FileOutputStream fos = new FileOutputStream(entry.getName());
      BufferedOutputStream dest = new BufferedOutputStream(fos, BUFFER);
      while ((count = zis.read(data, 0, BUFFER)) != -1) {
        dest.write(data, 0, count);
      }
      dest.flush();
      dest.close();
      zis.closeEntry();
    }
  } finally {
    zis.close();
  }
}
This noncompliant code attempts to overcome the problem by calling the method ZipEntry.getSize() to check the uncompressed file size before uncompressing it. Unfortunately, a malicious attacker can forge the field in the ZIP file that purports to show the uncompressed size of the file, so the value returned by getSize() is unreliable, and local resources may still be exhausted.
static final int BUFFER = 512;
static final int TOOBIG = 0x6400000; // 100MB
// ...

public final void unzip(String filename) throws java.io.IOException {
  FileInputStream fis = new FileInputStream(filename);
  ZipInputStream zis = new ZipInputStream(new BufferedInputStream(fis));
  ZipEntry entry;
  try {
    while ((entry = zis.getNextEntry()) != null) {
      System.out.println("Extracting: " + entry);
      int count;
      byte data[] = new byte[BUFFER];
      // Write the files to the disk, but only if the file is not insanely big
      if (entry.getSize() > TOOBIG) {
        throw new IllegalStateException("File to be unzipped is huge.");
      }
      if (entry.getSize() == -1) {
        throw new IllegalStateException("File to be unzipped might be huge.");
      }
      FileOutputStream fos = new FileOutputStream(entry.getName());
      BufferedOutputStream dest = new BufferedOutputStream(fos, BUFFER);
      while ((count = zis.read(data, 0, BUFFER)) != -1) {
        dest.write(data, 0, count);
      }
      dest.flush();
      dest.close();
      zis.closeEntry();
    }
  } finally {
    zis.close();
  }
}
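The unreliability of getSize() is easy to demonstrate without any forgery at all: a zip written as a stream records entry sizes only in a trailing data descriptor, so ZipInputStream reports -1 before the entry's data has been read. A self-contained sketch:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class GetSizeDemo {
  // Builds a zip in memory via ZipOutputStream (which defers sizes to the
  // data descriptor), then reads it back and returns what getSize() claims.
  static long reportedSize() throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (ZipOutputStream zos = new ZipOutputStream(bos)) {
      zos.putNextEntry(new ZipEntry("payload.txt"));
      zos.write(new byte[10_000]);  // 10,000 bytes of actual data
      zos.closeEntry();
    }
    try (ZipInputStream zis =
        new ZipInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
      ZipEntry entry = zis.getNextEntry();
      return entry.getSize();  // -1: size unknown before the data is read
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println("getSize() reports: " + reportedSize());
  }
}
```

So even the honest case defeats the size-before-extraction check; the forged case is strictly worse.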
Acknowledgement: The vulnerability in this code was pointed out by Giancarlo Pellegrino, a researcher at the Technical University of Darmstadt in Germany, and Davide Balzarotti, a faculty member of EURECOM in France.
In this compliant solution, the code validates the name of each entry before extracting the entry. If the name is invalid, the entire extraction is aborted. However, a compliant solution could also skip only that entry and continue the extraction process, or it could even extract the entry to some safe location.

Furthermore, the code inside the while loop tracks the uncompressed file size of each entry in a zip archive while extracting the entry. It throws an exception if the entry being extracted is too large—about 100 MB in this case. We do not use the ZipEntry.getSize() method because the value it reports is not reliable. Finally, the code also counts the number of file entries in the archive and throws an exception if there are more than 1024 entries.
static final int BUFFER = 512;
static final long TOOBIG = 0x6400000; // Max size of unzipped data, 100MB
static final int TOOMANY = 1024;      // Max number of files
// ...

private String validateFilename(String filename, String intendedDir)
    throws java.io.IOException {
  File f = new File(filename);
  String canonicalPath = f.getCanonicalPath();

  File iD = new File(intendedDir);
  String canonicalID = iD.getCanonicalPath();

  if (canonicalPath.startsWith(canonicalID)) {
    return canonicalPath;
  } else {
    throw new IllegalStateException("File is outside extraction target directory.");
  }
}

public final void unzip(String filename) throws java.io.IOException {
  FileInputStream fis = new FileInputStream(filename);
  ZipInputStream zis = new ZipInputStream(new BufferedInputStream(fis));
  ZipEntry entry;
  int entries = 0;
  long total = 0;
  try {
    while ((entry = zis.getNextEntry()) != null) {
      System.out.println("Extracting: " + entry);
      int count;
      byte data[] = new byte[BUFFER];
      // Write the files to the disk, but ensure that the filename is valid,
      // and that the file is not insanely big
      String name = validateFilename(entry.getName(), ".");
      if (entry.isDirectory()) {
        System.out.println("Creating directory " + name);
        new File(name).mkdir();
        continue;
      }
      FileOutputStream fos = new FileOutputStream(name);
      BufferedOutputStream dest = new BufferedOutputStream(fos, BUFFER);
      while (total + BUFFER <= TOOBIG && (count = zis.read(data, 0, BUFFER)) != -1) {
        dest.write(data, 0, count);
        total += count;
      }
      dest.flush();
      dest.close();
      zis.closeEntry();
      entries++;
      if (entries > TOOMANY) {
        throw new IllegalStateException("Too many files to unzip.");
      }
      if (total + BUFFER > TOOBIG) {
        throw new IllegalStateException("File being unzipped is too big.");
      }
    }
  } finally {
    zis.close();
  }
}
Rule    | Severity | Likelihood | Remediation Cost | Priority | Level
IDS04-J | Low      | Probable   | High             | P2       | L3
Vulnerability | Description
Zip Slip | Zip Slip is a form of directory traversal that can be exploited by extracting files from an archive. It is caused by a failure to validate the path names of the files within an archive, which can lead to files being extracted outside of the intended directory and overwriting existing system files. An attacker can exploit this vulnerability to overwrite executable files and achieve remote command execution on a victim's machine. Snyk responsibly disclosed the vulnerability before public disclosure on June 5, 2018. Their blog post and technical paper detailing the vulnerability can be found at https://snyk.io/blog/zip-slip-vulnerability/.
Although not directly a violation of this rule, the Android Master Key vulnerability (insecure use of ZipEntry) is related to this rule. Another attack vector, found by a researcher in China, is also related to this rule.
There are a few problems with the compliant solution.

Firstly, can we trust ZipEntry.getSize? No, not really. Currently ZipInputStream checks the decompressed size (and others) at the end of reading the data, which is a bit late. Who knows what ZipFile does. I strongly suggest metering the amount of data actually read. Probably don't bother with the lying getSize. Probably also worth metering the size of the raw zip file itself.

The example is also subject to directory traversal and similar attacks (suggest running away from using untrusted data in file paths). And it doesn't close the output stream.

(100 MB is probably better written as 100*1024*1024 than 0x6400000.)
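A sketch of the metering idea suggested above (the class name and limit are illustrative assumptions, not an existing java.util.zip API): wrap the raw input in a FilterInputStream that counts bytes as they are actually read and fails hard past a limit, regardless of what any header claims.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical metering wrapper: counts bytes actually read from the
// underlying stream and throws once a hard limit is exceeded.
class BoundedInputStream extends FilterInputStream {
  private final long limit;
  private long consumed = 0;

  BoundedInputStream(InputStream in, long limit) {
    super(in);
    this.limit = limit;
  }

  @Override
  public int read() throws IOException {
    int b = super.read();
    if (b != -1 && ++consumed > limit) {
      throw new IOException("Read limit of " + limit + " bytes exceeded");
    }
    return b;
  }

  @Override
  public int read(byte[] b, int off, int len) throws IOException {
    int n = super.read(b, off, len);
    if (n > 0 && (consumed += n) > limit) {
      throw new IOException("Read limit of " + limit + " bytes exceeded");
    }
    return n;
  }
}

public class MeterDemo {
  public static void main(String[] args) {
    // Try to drain 100 bytes through a 50-byte limit: the wrapper objects.
    try (InputStream in =
        new BoundedInputStream(new ByteArrayInputStream(new byte[100]), 50)) {
      byte[] buf = new byte[16];
      while (in.read(buf, 0, buf.length) != -1) {
        // drain
      }
      System.out.println("within limit");
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The same wrapper could be placed either around the raw zip file or around the ZipInputStream, metering compressed and decompressed bytes respectively.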
Fair enough. I think we could generalize your point to say that this rule should be just an instance of IDS00-J, where the "untrusted data" includes any metadata coming from the zip file. I guess the only wrinkle is that, in this case, the flow of untrusted input (entry size) to trusted output (uncompressed file size) is implicit, whereas IDS00-J seems to be more about explicit data.
Do we need a rule (or extension of IDS00-J) for implicit flows?
I don't think we need a new rule...we can fix this one.
The getSize() API doc also allows it to return -1 if the size is 'not known', so I wouldn't trust it in Java 1.8 even if it worked now. Still, I would trust getSize() more than the size of the raw ZIP file itself. Best suggestion is to amend the CS to terminate the while loop after reading TOOBIG bytes, if it gets that far.
We do discuss directory traversal attacks in this rule: FIO00-J. Do not operate on files in shared directories, but that deals with checking the path for a preexisting file, not for creating a new file. Probably could be solved by making sure that none of the directory names in the path are . or .. The other hazards only apply to preexisting directories, which is not an issue when doing uncompression.
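The per-segment check suggested here could look like this minimal sketch (class and method names assumed for illustration):

```java
public class SegmentCheck {
  // Rejects any entry name containing a "." or ".." path segment.
  // Zip entry names use '/' as the separator per the zip format.
  static boolean hasDotSegments(String entryName) {
    for (String segment : entryName.split("/")) {
      if (segment.equals(".") || segment.equals("..")) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(hasDotSegments("../../etc/passwd"));  // traversal
    System.out.println(hasDotSegments("docs/readme.txt"));   // clean name
  }
}
```

This textual check is cheap, but canonicalization (as in the compliant solution) remains the more robust defense because it also catches absolute paths and platform-specific equivalences.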
The compliant solution didn't close anything because it is intended to replace a subset of the noncompliant solution. Prob we should explicitly do the closing just to avoid confusion.
I was only suggesting metering the raw file as an addition to metering the output. As far as I can see, an entry can only get as far as allocating 16-bit sized chunks of memory, which it will then attempt to read. At least in ZipInputStream. ZipFile is altogether a dodgier area.
There are all sorts of issues with user-supplied file names: directory traversal, file extensions, nulls, translation of odd characters.

The non-compliant solution doesn't release resources in the event of an exception.
Two other items not taken into account by this sample:
- Total disk space used is not tracked. Should add an int grandTotal and also quit if that exceeds TOOBIG.
- Total number of entries. You could also exhaust inodes by zipping up a file with thousands of tiny entries. Should add an int entryCount and quit if that exceeds TOOMANY.
Of course, in a real implementation, these would be methods that dynamically set limits rather than hard-coded values. It's always a bug to allow a user to even accidentally exhaust the machine's resources (memory, disk, inodes, etc.).
In a real implementation, too, you should log these errors as a security exception.
I've fixed the NCCE to use a finally clause to ensure that the ZIP file is closed even if an exception is thrown. (The CS already did this.)
I've modified the CS to also count the number of files extracted and throw an exception if this exceeds a constant...1024 in the example.
I've also modified the CS to not extract more than total bytes. This should address issues of running out of disk space, assuming it has space for total bytes. Actually the maximum amount of space required will be something like total + BUFFER + TOOMANY * BLOCK_SIZE. Exhausting disk space is a nasty problem, partially because many systems give no indication when it happens, and many other programs can crash when their attempts to write to disk fail.
BTW see rule ERR00-J. Do not suppress or ignore checked exceptions for more details on how to handle exceptions securely.
Probably total should be a long. Consider changing while (total <= TOOBIG .. to while (total < TOOBIG .., combined with an (overflow-aware) check that total + count is smaller than TOOBIG before the write; this would prevent the total allocated size from ever growing beyond TOOBIG. Otherwise a large chunk exceeding the total limit might be written in the last iteration.
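An overflow-aware version of that bound can be phrased as a subtraction rather than an addition. A small sketch (names assumed): the naive total + count can wrap for pathological long values, while limit - total cannot once total <= limit is an invariant.

```java
public class LimitCheck {
  // True if adding count more bytes would stay within limit.
  // Assumes 0 <= total <= limit, so limit - total cannot overflow,
  // whereas total + count might.
  static boolean withinLimit(long total, long count, long limit) {
    return count <= limit - total;
  }

  public static void main(String[] args) {
    System.out.println(withinLimit(90, 10, 100));  // true: exactly at limit
    System.out.println(withinLimit(90, 11, 100));  // false: one byte over
    // A naive total + count <= limit would overflow to a negative value
    // here and wrongly pass the check:
    System.out.println(withinLimit(Long.MAX_VALUE - 5, 512, Long.MAX_VALUE));
  }
}
```

Applying this inside the extraction loop, with the check performed before each write, keeps the final chunk from pushing the on-disk total past the limit.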
For the compliant solution:
- validateFileName needs to throw IOException
- methods should be static
Adding the CS to a test based on 42.zip from http://en.wikipedia.org/wiki/Zip_bomb failed because ZipInputStream doesn't handle encrypted entries. Does this rule out the exploits mentioned?
Stub follows:
@Test
public void checkZipBomb() {
  try {
    unzip("42.zip");
    fail();
  } catch (FileNotFoundException x) {
    fail(x.getMessage());
  } catch (IOException e) {
    System.out.println(e);
    assertEquals("Too many files to unzip.", e.getMessage());
  }
}
result:
org.junit.ComparisonFailure:
Expected :Too many files to unzip.
Actual :encrypted ZIP entry not supported
Modifying unzip like so:

public static void unzip(String filename, String password)
    throws java.io.IOException {
  FileInputStream fis = new FileInputStream(filename);
  ZipInputStream zis = new ZipInputStream(new BufferedInputStream(fis));
  if (password != null) {
    ZipDecryptInputStream zdis = new ZipDecryptInputStream(fis, password);
    zis = new ZipInputStream(new BufferedInputStream(zdis));
  }
and using info from http://blog.alutam.com/2009/10/31/reading-password-protected-zip-files-in-java/, the test now fails with "invalid compression method".
A few options I can see:
- unzip the 42.zip and then zip it up yourself using Windows (so it isn't using AES).
- create your own smaller encrypted test zips and change the CS constants so you don't need so many entries or such large files to prove it is working -- don't forget to create some files that go into places you don't want them to go (doesn't have to be anything sensitive, of course).
Just to add a couple other things this doesn't currently handle.
- subdirectories. Currently it will take a subdirectory entry and incorrectly create a file rather than a directory. You'll then get IOExceptions when it tries to put files into the 'subdirectory'. It's fixed simply with

name = validateFilename(entry.getName(), ".");
if (entry.isDirectory()) {
  System.out.println("Creating directory " + name);
  new File(name).mkdir();
  continue;
}

right above where you currently create a FileOutputStream fos.
- incomplete files. Currently it will leave the final incomplete file on disk, assuming the total data exceeds the limit. I have a meeting at the moment, but I'll post an example that solves these.
I've incorporated A. Bishop's fixes to the compliant solution.
I suspect there are more features we could add to enhance our 'secure unzip', but the code is intended to be illustrative rather than feature-complete. So I'm not going to worry about incomplete files.
I suspect the exploit would work on an unencrypted 42.zip file, or any zip file that java.util.zip could handle.
This is sort of an odd rule. We normally focus on one problem, such as "resource exhaustion." A rule like that might include extracting the contents of a zip file or even just reading a single file that is too large. Another rule may focus on the path traversal problem, which IDS02-J. Canonicalize path names before validating them sort of does. The title of this rule is very sweeping: IDS04-J. Safely extract files from ZipInputStream. It sort of suggests that the compliant solution will address every security problem. Given this scope, concerns like the ones expressed by Ron Craig are completely valid. On the other hand, it is probably a reasonable idea to have ZipInputStream in the title of a rule/guideline, so that folks will discover this rule when attempting to use the java.util.zip package.