Describe the bug

Version 3.0.0 of PyPDF2 was released today (23 Dec 2022). It includes a breaking change that removes PdfFileReader (see changelog). As a result, every new install of camelot-py raises the following exception when used:

Traceback (most recent call last):
  File "test.py", line 9, in <module>
    camelot.read_pdf(PDF_FILE_PATH)
  File ".venv/py37/lib/python3.7/site-packages/camelot/io.py", line 117, in read_pdf
    **kwargs
  File ".venv/py37/lib/python3.7/site-packages/camelot/handlers.py", line 172, in parse
    self._save_page(self.filepath, p, tempdir)
  File ".venv/py37/lib/python3.7/site-packages/camelot/handlers.py", line 111, in _save_page
    infile = PdfFileReader(fileobj, strict=False)
  File ".venv/py37/lib/python3.7/site-packages/PyPDF2/_reader.py", line 1974, in __init__
    deprecation_with_replacement("PdfFileReader", "PdfReader", "3.0.0")
  File ".venv/py37/lib/python3.7/site-packages/PyPDF2/_utils.py", line 369, in deprecation_with_replacement
    deprecation(DEPR_MSG_HAPPENED.format(old_name, removed_in, new_name))
  File ".venv/py37/lib/python3.7/site-packages/PyPDF2/_utils.py", line 351, in deprecation
    raise DeprecationError(msg)
PyPDF2.errors.DeprecationError: PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead.

Steps to reproduce the bug

  • Create a new virtualenv
  • Install camelot-py:
    pip install camelot-py[base]
    
  • Run the following code:
    import camelot
    # replace with a valid path on your local filesystem
    PDF_FILE_PATH = "/path/to/file.pdf"
    # raises an exception from PyPDF2
    camelot.read_pdf(PDF_FILE_PATH)

Expected behavior

The code above should execute without any exceptions.

Environment

  • OS: macOS 12.3.1
  • Python version: 3.7
  • Numpy version: 1.24.0
  • OpenCV version: 4.6.0.66
  • Ghostscript version: 0.7
  • Camelot version: 0.10.1

    As a workaround, I've added this line to my requirements.txt for the time being:

    PyPDF2~=2.0
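
To see what this pin allows, here is a minimal sketch (not part of camelot or PyPDF2) that mimics the `~=2.0` compatible-release rule, which is equivalent to `>=2.0, ==2.*`. The helper names `parse_version` and `allowed_by_pin` are made up for illustration, and the parser only handles plain dotted numbers such as `2.12.1`.

```python
def parse_version(version: str) -> tuple:
    """Turn a plain dotted version such as '2.12.1' into (2, 12, 1)."""
    return tuple(int(part) for part in version.split("."))


def allowed_by_pin(version: str) -> bool:
    """Approximate the pin `PyPDF2~=2.0`, i.e. `>=2.0, ==2.*`."""
    parsed = parse_version(version)
    return parsed >= (2, 0) and parsed[0] == 2


# Show which candidate versions the pin would accept.
for candidate in ("1.26.0", "2.0.0", "2.12.1", "3.0.0"):
    print(candidate, allowed_by_pin(candidate))
```

The point of the pin is that any 2.x release is accepted while 3.0.0, which removed PdfFileReader, is rejected.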
    

    Thank you for the workaround fix

    Hey @saidakyuz, I found a workaround and camelot is now working fine for me. The steps are below for reference.

  • Set up an Anaconda environment (preferably with Python 3.7)
  • Install camelot:
    pip install camelot-py[base]
  • This pulls in PyPDF2 3.0.0 by default, so you need to explicitly downgrade it:
    pip install 'PyPDF2<3.0'
  • I use PyCharm for my script, so I selected this environment in the settings, and then it worked fine.

    Environment packages:
    camelot-py 0.10.1
    ghostscript 0.7
    pypdf2 2.0.0

    Note: You might sometimes get an error saying that Ghostscript is not installed. You can install it explicitly from https://ghostscript.com/releases/gsdnld.html
    and then add it to your computer's environment variables and restart.

    Then the issue should be resolved.

    Hope it helps.

    @vinayak-mehta if you want, I'm available to submit a PR to fix this issue.

    I'm so close to getting my program finished. When I run the code in PyCharm it runs fine, but when I run the exe I get this error:
    "PyPDF2.errors.DeprecationError: PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead.[17036] Failed to execute script 'Werks' due to unhandled exception!"

    The suggested workarounds aren't working for me, probably because this is my first program.
    I would LOVE a fix or a workaround that I can apply.
    Please advise!

    Another option is to downgrade the installed version of PyPDF2:

    pip install --upgrade PyPDF2==2.12.1

    I tried this and also downgraded to PyPDF2 2.0, but I get the same result, the deprecation error. I'm sure I'm missing something dumb.
    I am running the camelot-py[cv] version of camelot. Would that have anything to do with it?

    @RhacklefordGPT most likely you downgraded it in the wrong environment, so PyPDF2 is installed in two places. You need to make sure you downgrade it in the one your program actually uses.

    For example, you might need pip3, or you might need to activate a virtual environment first.

    To verify, you can add the following before you import camelot:

    import PyPDF2
    print("PyPDF2==" + PyPDF2.__version__)
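
If you also want to see which interpreter is in use, or PyPDF2 may not be importable at all, a slightly longer check is possible. This is only a sketch: it assumes Python 3.8 or newer (for `importlib.metadata`), and the helper name `installed_version` is made up for illustration.

```python
import sys
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# The interpreter that is actually running this script.
print("Interpreter:", sys.executable)
# None here means PyPDF2 is not installed for this interpreter.
print("PyPDF2 version:", installed_version("PyPDF2"))
```

Run it with the same interpreter that runs camelot. If the version shown is 3.0.0 or higher, the downgrade went to a different environment.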
    

    Looks like that did the trick.
    My project is in a venv and I had been using the PyCharm terminal for every pip install,
    but it looks like I had installed PyPDF2 directly from CMD before I started using the venv.
    I'm still not clear on why PyInstaller, run through PyCharm, picks up packages installed from CMD. I would think those are separated somehow, but I only have a loose grasp of how this all works. Could you recommend some further reading so I can avoid this? Should I delete every installation of everything outside my venv?
    PS: thank you very much, now I can bring a finished product into work and blow some minds with it!!! :)

    I have already done that, but now I get another error:
    OSError: Ghostscript is not installed. You can install it using the instructions here: https://camelot-py.readthedocs.io/en/master/user/install-deps.html
    even though it is already installed.

    Hey @au3m,
    Installing Ghostscript is sometimes not enough on its own. You may also need to add it to your computer's environment variables so that it can be found.
    So try adding the Ghostscript install path to your environment variables.

    STEPS FOR REFERENCE:

  • Copy the path where you installed Ghostscript.
  • If you are using Windows, search for "Edit the system environment variables".
  • The System Properties dialog should open. Click the "Environment Variables" button.
  • Under the "System variables" section, double-click "Path".
  • Click on an empty row and paste the copied Ghostscript path.
  • Click OK and, as a precaution, restart your device.

    After this your program should run fine without giving the ghostscript related error.
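
To confirm that the PATH change took effect, a quick check from Python can help. This is a sketch, not camelot's own detection logic: it looks on PATH for the usual Ghostscript executable names (`gs` on Linux/macOS, `gswin64c` and `gswin32c` on Windows) and also asks `ctypes.util.find_library` for the shared library. The function name `find_ghostscript` is made up for illustration.

```python
import shutil
from ctypes.util import find_library


def find_ghostscript():
    """Report where a Ghostscript executable and library are found, if at all."""
    executable = None
    for name in ("gs", "gswin64c", "gswin32c"):
        executable = shutil.which(name)
        if executable:
            break
    return {"executable": executable, "library": find_library("gs")}


# Both values are None when nothing is found in the current environment.
print(find_ghostscript())
```

Run it from a freshly opened terminal or after the restart, since a process that was already running keeps the old PATH.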

    @KshitizPandya
    Thanks bro it works now ☺️

    I think the problem comes from a missed migration to the new names in PyPDF2/pypdf. See the following doc: https://pypdf2.readthedocs.io/en/stable/user/migration-1-to-2.html

    Following the deprecation process of PyPDF2/pypdf, the old names are no longer tolerated.

    I replaced the handlers.py file with the one from the PR below, and the CLI is working again for me.
    PR from @MartinThoma can be found here: #307
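
For illustration only, here is a sketch of how code could cope with the rename without pinning an old version. It is not the content of the PR above: `resolve_reader_class` is a hypothetical helper, and the stand-in module object exists only so the example runs without PyPDF2 installed. It prefers the new name `PdfReader` and falls back to `PdfFileReader` only when the new name is absent.

```python
from types import SimpleNamespace


def resolve_reader_class(pdf_module):
    """Return the reader class, preferring the new name over the removed one."""
    for name in ("PdfReader", "PdfFileReader"):
        reader_class = getattr(pdf_module, name, None)
        if reader_class is not None:
            return reader_class
    raise ImportError("no PDF reader class found in the given module")


# Stand-in for an old release that only offers the old name (illustrative only).
class OldReader:
    pass


fake_old_module = SimpleNamespace(PdfFileReader=OldReader)
print(resolve_reader_class(fake_old_module).__name__)
```

With the real library you would pass the imported module instead, for example `import PyPDF2` followed by `resolve_reader_class(PyPDF2)`.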

    If anyone is trying to do this on Colab, run the following steps:

    !pip install ghostscript
    !pip install camelot-py[cv]
    !pip install excalibur-py
    !apt install ghostscript python3-tk

    After that, check whether Ghostscript is installed:

    from ctypes.util import find_library
    # It will display `libgs.so.9` if installed or will print `None` if not
    print(find_library("gs")) 

    If it still doesn't work:

    !excalibur initdb

    Source: here

    Thanks! This helped me a lot.

    I set all the libs to exactly the same versions as yours, yet the error remains...
    Environment packages: camelot-py 0.10.1 ghostscript 0.7 pypdf2 2.0.0
