My code uses some new feature, compiles fine but fails when live with a "MethodNotFoundException" or "IncompatibleClassChangeError"
You almost certainly have an older version of Apache POI
on your classpath. Quite a few runtimes and other packages
will ship older version of Apache POI, so this is an easy problem
to hit without your realising. Some will ship just one old jar,
some may ship a full set of old POI jars.
The best way to identify the offending earlier jar files is
with a few lines of java. These will load a Core POI class, an
OOXML class and a Scratchpad class, and report where they all came
from.
ClassLoader classloader =
org.apache.poi.poifs.filesystem.POIFSFileSystem.class.getClassLoader();
URL res = classloader.getResource(
"org/apache/poi/poifs/filesystem/POIFSFileSystem.class");
String path = res.getPath();
System.out.println("POI Core came from " + path);
classloader = org.apache.poi.ooxml.POIXMLDocument.class.getClassLoader();
res = classloader.getResource("org/apache/poi/ooxml/POIXMLDocument.class");
path = res.getPath();
System.out.println("POI OOXML came from " + path);
classloader = org.apache.poi.hslf.usermodel.HSLFSlideShow.class.getClassLoader();
res = classloader.getResource("org/apache/poi/hslf/usermodel/HSLFSlideShow.class");
path = res.getPath();
System.out.println("POI Scratchpad came from " + path);
My code uses the scratchpad, compiles fine but fails to run with a "MethodNotFoundException"
You almost certainly have an older version earlier on your
classpath. See the prior answer.
I'm using the poi-ooxml-lite (previously known as poi-ooxml-schemas) jar, but my code is failing with "java.lang.NoClassDefFoundError: org/openxmlformats/schemas/*something*"
To use the new OOXML file formats, POI requires a jar containing
the file format XSDs, as compiled by
XMLBeans
. These
XSDs, once compiled into Java classes, live in the
org.openxmlformats.schemas
namespace.
There are two jar files available, as described in
the components overview section
.
The
full jar of all of the schemas is poi-ooxml-full-XXX.jar (previously known as ooxml-schemas)
(lower versions for older releases, see table below)
,
and it is currently around 16mb. The
smaller poi-ooxml-lite (previously known as poi-ooxml-schemas)
jar
is only about 6mb. This latter jar file only contains the
typically used parts though.
Many users choose to use the smaller poi-ooxml-lite jar to save
space. However, the poi-ooxml-lite jar only contains the XSDs and
classes that are typically used, as identified by the unit tests.
Every so often, you may try to use part of the file format which
isn't included in the minimal poi-ooxml-lite jar. In this case,
you should switch to the full poi-ooxml-full jar. Longer term,
you may also wish to submit a new unit test which uses the extra
parts of the XSDs, so that a future poi-ooxml-lite jar will
include them.
There are a number of ways to get the full poi-ooxml-full jar.
If you are a maven user, see the
the components overview section
for the artifact details to have maven download it for you.
If you download the source release of POI, and/or checkout the
source code from
subversion
,
then you can run the ant task "compile-ooxml-xsds" to have the
OOXML schemas downloaded and compiled for you (This will also
give you the XMLBeans generated source code, in case you wish to
look at this). Finally, you can download the jar by hand from the
Maven Repository.
Note that historically, different versions of poi-ooxml-full / ooxml-schemas were
You've probably enabled logging. Logging is intended only for
autopsy style debugging. Having it enabled will reduce performance
by a factor of at least 100. Logging is helpful for understanding
why POI can't read some file or developing POI itself. Important
errors are thrown as exceptions, which means you probably don't need
logging.
What is the HSSF "eventmodel"?
The SS eventmodel package is an API for reading Excel files without loading the whole spreadsheet into memory. It does
require more knowledge on the part of the user, but reduces memory consumption by more than
tenfold. It is based on the AWT event model in combination with SAX. If you need read-only
access, this is the best way to do it.
Why can't read the document I created using Star Office 5.1?
Star Office 5.1 writes some records using the older BIFF standard. This causes some problems
with POI which supports only BIFF8.
Why am I getting an exception each time I attempt to read my spreadsheet?
It's possible your spreadsheet contains a feature that is not currently supported by POI.
If you encounter this then please create the simplest file that demonstrates the trouble and submit it to
Bugzilla.
How do you tell if a spreadsheet cell contains a date?
Excel stores dates as numbers therefore the only way to determine if a cell is
actually stored as a date is to look at the formatting. There is a helper method
in HSSFDateUtil that checks for this.
Thanks to Jason Hoffman for providing the solution.
case HSSFCell.CELL_TYPE_NUMERIC:
double d = cell.getNumericCellValue();
// test if a date!
if (HSSFDateUtil.isCellDateFormatted(cell)) {
// format in form of M/D/YY
cal.setTime(HSSFDateUtil.getJavaDate(d));
cellText =
(String.valueOf(cal.get(Calendar.YEAR))).substring(2);
cellText = cal.get(Calendar.MONTH)+1 + "/" +
cal.get(Calendar.DAY_OF_MONTH) + "/" +
cellText;
I'm trying to stream an XLS file from a servlet and I'm having some trouble. What's the problem?
The problem usually manifests itself as the junk characters being shown on
screen. The problem persists even though you have set the correct mime type.
The short answer is, don't depend on IE to display a binary file type properly if you stream it via a
servlet. Every minor version of IE has different bugs on this issue.
The problem in most versions of IE is that it does not use the mime type on
the HTTP response to determine the file type; rather it uses the file extension
on the request. Thus you might want to add a
.xls
to your request
string. For example
http://yourserver.com/myServelet.xls?param1=xx
. This is
easily accomplished through URL mapping in any servlet container. Sometimes
a request like
http://yourserver.com/myServelet?param1=xx&dummy=file.xls
is also
known to work.
To guarantee opening the file properly in Excel from IE, write out your file to a
temporary file under your web root from your servlet. Then send an http response
to the browser to do a client side redirection to your temp file. (Note that using a
server side redirect using RequestDispatcher will not be effective in this case)
Note also that when you request a document that is opened with an
external handler, IE sometimes makes two requests to the webserver. So if your
generating process is heavy, it makes sense to write out to a temporary file, so that multiple
requests happen for a static file.
None of this is particular to Excel. The same problem arises when you try to
generate any binary file dynamically to an IE client. For example, if you generate
pdf files using
FOP
, you will come across many of the same issues.
I want to set a cell format (Data format of a cell) of an excel sheet as ###,###,###.#### or ###,###,###.0000. Is it possible using POI ?
Yes. You first need to get a DataFormat object from the workbook and call getFormat with the desired format. Some examples are
here
.
I want to set a cell format (Data format of a cell) of an excel sheet as text. Is it possible using POI ?
Yes. This is a built-in format for excel that you can get from DataFormat object using the format string "@". Also, the string "text" will alias this format.
How do I add a border around a merged cell?
Add blank cells around where the cells normally would have been and set the borders individually for each cell.
We will probably enhance HSSF in the future to make this process easier.
I am using styles when creating a workbook in POI, but Excel refuses to open the file, complaining about "Too Many Styles".
You just create the styles OUTSIDE of the loop in which you create cells.
GOOD:
HSSFWorkbook wb = new HSSFWorkbook();
HSSFSheet sheet = wb.createSheet("new sheet");
HSSFRow row = null;
// Aqua background
HSSFCellStyle style = wb.createCellStyle();
style.setFillBackgroundColor(HSSFColor.AQUA.index);
style.setFillPattern(HSSFCellStyle.BIG_SPOTS);
HSSFCell cell = row.createCell((short) 1);
cell.setCellValue("X");
cell.setCellStyle(style);
// Orange "foreground",
// foreground being the fill foreground not the font color.
style = wb.createCellStyle();
style.setFillForegroundColor(HSSFColor.ORANGE.index);
style.setFillPattern(HSSFCellStyle.SOLID_FOREGROUND);
for (int x = 0; x < 1000; x++) {
// Create a row and put some cells in it. Rows are 0 based.
row = sheet.createRow((short) k);
for (int y = 0; y < 100; y++) {
cell = row.createCell((short) k);
cell.setCellValue("X");
cell.setCellStyle(style);
// Write the output to a file
FileOutputStream fileOut = new FileOutputStream("workbook.xls");
wb.write(fileOut);
fileOut.close();
</source>
<p>BAD:</p>
<source>
HSSFWorkbook wb = new HSSFWorkbook();
HSSFSheet sheet = wb.createSheet("new sheet");
HSSFRow row = null;
for (int x = 0; x < 1000; x++) {
// Aqua background
HSSFCellStyle style = wb.createCellStyle();
style.setFillBackgroundColor(HSSFColor.AQUA.index);
style.setFillPattern(HSSFCellStyle.BIG_SPOTS);
HSSFCell cell = row.createCell((short) 1);
cell.setCellValue("X");
cell.setCellStyle(style);
// Orange "foreground",
// foreground being the fill foreground not the font color.
style = wb.createCellStyle();
style.setFillForegroundColor(HSSFColor.ORANGE.index);
style.setFillPattern(HSSFCellStyle.SOLID_FOREGROUND);
// Create a row and put some cells in it. Rows are 0 based.
row = sheet.createRow((short) k);
for (int y = 0; y < 100; y++) {
cell = row.createCell((short) k);
cell.setCellValue("X");
cell.setCellStyle(style);
// Write the output to a file
FileOutputStream fileOut = new FileOutputStream("workbook.xls");
wb.write(fileOut);
fileOut.close();
I think POI is using too much memory! What can I do?
This one comes up quite a lot, but often the reason isn't what
you might initially think. So, the first thing to check is - what's
the source of the problem? Your file? Your code? Your environment?
Or Apache POI?
(If you're here, you probably think it's Apache POI. However, it
often isn't! A moderate laptop, with a decent but not excessive heap
size, from a standing start, can normally read or write a file with
100 columns and 100,000 rows in under a couple of seconds, including
the time to start the JVM).
Apache POI ships with a few programs and a few example programs,
which can be used to do some basic performance checks. For testing
file generation, the class to use is in the examples package,
SSPerformanceTest
(
viewvc
).
Run SSPerformanceTest with arguments of the writing type (HSSF, XSSF
or SXSSF), the number rows, the number of columns, and if the file
should be saved. If you can't run that with 50,000 rows and 50 columns
in HSSF and SXSSF in under 3 seconds, and XSSF in under 20 seconds
(and ideally all 3 in less than that!), then the problem is with
your environment.
Next, use the example program
ToCSV
(
viewvc
)
to try reading the file in with HSSF or XSSF. Related is
XLSX2CSV
(
viewvc
),
which uses SAX parsing for .xlsx. Run this against both your problem file,
and a simple one generated by SSPerformanceTest of the same size. If this is
slow, then there could be an Apache POI problem with how the file is being
processed (POI makes some assumptions that might not always be right on all
files). If these tests are fast, then performance problems likely are in your
code.
I can't seem to find the source for the OOXML CT.. classes, where do they
come from?
The OOXML support in Apache POI is built on top of the file format
XML Schemas, as compiled into Java using
XMLBeans
. Currently,
the compilation is done with XMLBeans 5.x, for maximum compatibility
with installations.
All of the
org.openxmlformats.schemas.spreadsheetml.x2006
CT...
classes are auto-generated by XMLBeans. The resulting generated Java goes
in the
poi-ooxml-full-*-sources
jar, and the compiled version into the
poi-ooxml-full
jar.
The full
poi-ooxml-full
jar is distributed with Apache POI,
along with the cut-down
poi-ooxml-lite
jar containing just
the common parts. Use the sources off
poi-ooxml-full
for the lite version,
which is available from Maven Central - ask your favourite Maven
mirror for the
poi-ooxml-full-*-sources
jar. Alternately, if you download
the POI source distribution (or checkout from SVN) and build, Ant will
automatically compile it for you to generate the source and binary poi-ooxml-full jars.
An OLE2 ("binary") file is giving me problems, but I can't share it. How can I investigate the problem on my own?
The first thing to try is running the
Binary File Format Validator
from Microsoft against the file, which will report if the file
complies with the specification. If your input file doesn't, then this
may well explain why POI isn't able to process it correctly. You
should probably in this case speak to whoever is generating the file,
and have them fix it there. If your POI generated file is identified
as having an issue, and you're on the
latest codebase
, report a new
POI bug and include the details of the validation failure.
Another thing to try, especially if the file is valid but POI isn't
behaving as expected, are the POI Dev Tools for the component you're
using. For example, HSSF has
org.apache.poi.hssf.dev.BiffViewer
which will allow you to view the file as POI does. This will often
allow you to check that things are being read as you expect, and
narrow in on problem records and structures.
An OOXML ("xml") file is giving me problems, but I can't share it. How can I investigate the problem on my own?
There's not currently a simple validator tool as there is for the
OLE2 based (binary) file formats, but checking the basics of a file
is generally much easier.
Files such as .xlsx, .docx and .pptx are actually a zip file of XML
files, with a special structure. Your first step in diagnosing the
issues with the input or output file will likely be to unzip the
file, and look at the XML of it. Newer versions of Office will
normally tell you which area of the file is problematic, so
narrow in on there. Looking at the XML, does it look correct?
When reporting bugs, ideally include the whole file, but if you're
unable to then include the snippet of XML for the problem area, and
reference the OOXML standard for what it should contain.
Why do I get a java.lang.NoClassDefFoundError: javax/xml/stream/XMLEventFactory.newFactory()
Applies to versions <= 3.17 (Java 6):
This error indicates that the class XMLEventFactory does not provide
functionality which POI is depending upon. There can be a number of
different reasons for this:
Outdated xml-apis.jar, stax-apis.jar or xercesImpl.jar:
These libraries were required with Java 5 and lower, but are not actually
required with spec-compliant Java 6 implementations, so try removing those
libraries from your classpath. If this is not possible, try upgrading to a
newer version of those jar files.
Running IBM Java 6 (potentially as part of WebSphere Application Server):
IBM Java 6 does not provide all the interfaces required by the XML standards,
only IBM Java 7 seems to provide the correct interfaces, so try upgrading
your JDK.
Sun/Oracle Java 6 with outdated patchlevel:
Some of the interfaces were only included/fixed in some of the patchlevels for
Java 6. Try running with the latest available patchlevel or even better use
Java 7/8 where this functionality should be available in all cases.
No. This is not supported.
All POI jars in use must come from the same version. A combination
such as
poi-3.11.jar
and
poi-ooxml-3.9.jar
is not
supported, and will fail to work in unpredictable ways.
If you're not sure which POI jars you're using at runtime, and/or
you suspect it might not be the one you intended, see
this FAQ entry
for details on
diagnosing it. If you aren't sure what POI jars you need, see the
Components Overview
for details
Can I access/modify workbooks/documents/slideshows in multiple threads?
What are the multi-threading guarantees that Apache POI makes
In short:
Handling different document-objects in different threads will
work. Accessing the same document in multiple threads will not work.
This means the workbook/document/slideshow objects are not checked for
thread safety, but any globally held object like global caches or other
data structures are guarded against multi threaded access accordingly.
There have been
discussions
about accessing different Workbook-sheets
in different threads concurrently. While this may work to some degree, it may lead
to very hard to track errors as multi-threading issues typically only
manifest after long runtime when many threads are active and the system
is under high load, i.e. in production use! Also it might break in future
versions of Apache POI as we do not specifically test using the library
this way.
What are the advantages and disadvantages of the different constructor and
write methods?
Across most of the UserModel classes (
POIDocument
POIXMLDocument
),
you can open the document from a read-only
File
, a read-write
File
or an
InputStream
. You can always write out to an
OutputStream
,
and increasing also to a
File
.
Opening your document from a
File
is suggested wherever possible.
This will always be quicker and lower memory then using an
InputStream
,
as the latter has to buffer things in memory.
When writing, you can use an
OutputStream
to write to a new file, or
overwrite an existing one (provided it isn't already open!). On slow links / disks,
wrapping with a
BufferedOutputStream
is suggested. To write like this, use
write(OutputStream)
.
To write to the currently open file (an in-place write / replace), you need to
have opened your document from a
File
, not an
InputStream
. In
addition, you need to have opened from the
File
in read-write mode, not
read-only mode. To write to the currently open file, on formats that support it
(not all do), use
write()
.
You can also write out to a new
File
. This is available no matter how
you opened the document, and will create/replace a new file. It is faster and lower
memory than writing to an
OutputStream
. However, you can't use this to
replace the currently open file, only files not currently open. To write to a
new / different file, use
write(File)
More information is also available in the
HSSF and XSSF documentation
,
which largely applies to the other formats too.
Note that currenly (POI 3.15 beta 3), not all of the write methods are available
for the OOXML formats yet.
Can POI be used with OSGI?
Starting with POI 3.16 there's a workaround for OSGIs context classloader handling,
i.e. it replaces the threads current context classloader with an implementation of
limited class view. This will lead to IllegalStateExceptions, as xmlbeans can't find
the xml schema definitions in this reduced view. The workaround is to initialize
the classloader delegate of
POIXMLTypeLoader
, which defaults to the current
thread context classloader. The initialization should take place before any other
OOXML related calls. The class in the example could be any class, which is
part of the poi-ooxml-schema or ooxml-schema:
POIXMLTypeLoader.setClassLoader(CTTable.class.getClassLoader());
POI is successfully tested with many different versions of Java. It is
recommended that you use Java versions that have Long Term Support (Java 11, 17 and 21).
Including the existing binaries as normal jar-files
should work when using recent versions of Apache POI. You may see
some warnings about illegal reflective access, but it should work fine
despite those. We are working on getting the code changed so we avoid
discouraged accesses in the future.
NOTE: Apache POI tries to support the Java module system but it is more complicated
because Apache POI is still supporting Java 8 and the module system
cannot be fully supported while maintaining such support.
FYI, jaxb in current versions also causes some warnings about reflective access,
we cannot fix those until jaxb >= 2.4.0 is available, see
https://stackoverflow.com/a/50251510/411846 for details, you can set a system
property "com.sun.xml.bind.v2.bytecode.ClassTailor.noOptimize" to avoid this warning.
For compiling Apache POI, you should use at least version 4.1.0 when it becomes available
or a recent trunk checkout until then.
If you are building POI yourself from source files, use an up to date version of Gradle.
If you use Ant, again check the Ant version supports the version of Java you are using.
Can Apache POI be compiled/used with Java 9 or Java 10?
Apache POI does not actively support Java 9 or Java 10 any longer as those versions were
obsoleted by Oracle already. See the previous FAQ entry for information about support for
Java LTS versions.
Anything to consider when using IBM JDK?
The IBM Java runtime is using a JIT compiler which doesn't behave sometimes. ;)
Especially when rendering slideshows it throws errors, which don't occur when debugging the code.
E.g. an ArrayIndexOutOfBoundsException is thrown in TexturePaintContext when the image contains
textures - see
#62999
for more
details on how to detected JIT errors.
To prevent the JIT errors, the affected methods need be excluded from JIT compiling.
Currently (tested with IBM JDK 1.8.0_144 and _191) the following should be added to the VM parameters:
-Xjit:exclude={sun/java2d/pipe/AAShapePipe.renderTiles(Lsun/java2d/SunGraphics2D;Ljava/awt/Shape;Lsun/java2d/pipe/AATileGenerator;[I)V},exclude={sun/java2d/pipe/AlphaPaintPipe.renderPathTile(Ljava/lang/Object;[BIIIIII)V},exclude={java/awt/TexturePaintContext.getRaster(IIII)Ljava/awt/image/Raster;}
Tomcat is reporting memory leaks caused by some class in Apache POI which uses ThreadLocal
Apache POI uses Java
ThreadLocals
in order to cache some data when Apache POI is used in a multi-threading environment (see also the FAQ about thread-safety above!)
WebServers like Tomcat use thread-pooling to re-use threads to avoid the cost of frequent thread-startup and shutdown.
In order to guard against memory-leaks, Tomcat performs checks on allocated memory in ThreadLocals and reports them as warnings.
In order to get rid of these warnings, Apache POI, starting with version 5.2.4, provides a utility ThreadLocalUtils which can
be used to clear all objects held in thread-local objects before returning the thread back to the global pool.
org.apache.poi.util.ThreadLocalUtil.clearAllThreadLocals();
// if you use poi-ooxml, also clear thread-locals in XMLBeans
org.apache.xmlbeans.ThreadLocalUtil.clearAllThreadLocals();
How can I demand fixes or features in Apache POI to be done with urgency?
Apache POI is an open source project developed by a very small group of volunteers.
Currently no-one is paid to work on new features or bug-fixes.
So it is considered fairly rude to "demand" things, especially "ASAP" is quite frowned
upon and may even reduce the likelihood that your issue is picked up and worked on.
If you would like to increase chances that your problem is tackled, you can do a number of things
as follows, sorted by the amount of effort which may be required from you:
Ensure your bug-report is complete and contains instructions/samples which allow to reproduce the problem.
Ideally a self-sufficient test-case which does not need lots of manual setup.
Provide a summary of research of the root-cause of your problem.
Provide a patch which fixes the problem. We usually like to have unit-tests accompanying changes to
have high code-coverage and good confidence that issues are fixed and few regressions are introduced
over time.
Become a contributor! The entry threshold is actually not too high as soon as you provided your
first successful bugfix. If you think you can spare the time to contribute for some longer time,
becoming an official committer should not be too hard.
There are two angles to reproducibility: building reproducible jars for Apache POI itself and making Apache POI
produce byte-for-byte identical files when it is used to create documents.
The build of jars for Apache POI should be reproducible since version 5.2.4 by removing the build-timestamp
from the generated Version.java. Make sure the exact same combination of build-tools is used,
especially the version of the JDK.
Producing reproducible output files will be supported in the future (after version 5.3.0), initial support is available in
nightly builds.
Note: Files are only written without timestamps if the environment variable SOURCE_DATE_EPOCH is set to a
non-empty value.
Please create a bug entry if you find things which break reproducibility, both for building and output files.
Please provide exact steps how to reproduce your issue!
See
https://reproducible-builds.org/
for general information about why reproducible builds
and output may be important.
2001-2024
The Apache Software Foundation
Apache, Apache POI, the Apache feather logo, and the Apache POI
logos are trademarks of The Apache Software Foundation.
Send feedback about the website to: