Reload PDF without losing scroll position? · Issue #11359 · mozilla/pdf.js

Question 1: Is it possible to make PDFViewerApplication reload the PDF without losing its scroll position?

I was able to trigger a reload by using PDFViewerApplication.open(), but the scroll position returns to the top.

If there is no such function, I could save the scrollTop property of #viewerContainer and then set it again after the PDF has been reloaded, but for that I would need to know when the PDF has been rendered.

Question 2: Is there a render-finished callback that I can register?

I realized that PDFViewerApplication.open() returns a Promise, so I tried the following:

const scrollTop = document.getElementById('viewerContainer').scrollTop
PDFViewerApplication.open(pdfname).then(
    () => {
      document.getElementById('viewerContainer').scrollTop = scrollTop
    }
)
Unfortunately, it doesn't work: after the reload, the scroll position returns to the top.
> Question 1: Is it possible to make PDFViewerApplication reload the PDF without losing its scroll position?
That's already the default behaviour, unless you've manually changed preferences/options, when you simply reload the viewer demo at https://mozilla.github.io/pdf.js/web/viewer.html
> I was able to trigger a reload by using PDFViewerApplication.open(), but the scroll position returns to the top.
Assuming that you're actually re-opening the exact same file, then I cannot imagine why directly calling PDFViewerApplication.open() would produce a different behaviour (than the demo above).
> Question 2: Is there a render-finished callback that I can register?
Given that the default viewer will only render visible pages, in order to save resources, there's generally no point where all pages are rendered (except obviously for one-page documents).

However, you could use the 'documentinit' event to know when enough of the viewer/document has been loaded in order to be able to interact with it.
> That's already the default behaviour, unless you've manually changed preferences/options, when you simply reload the viewer demo at https://mozilla.github.io/pdf.js/web/viewer.html
Right. The difference in my case is that I have the viewer.html in an iframe, see #11358. But that means I can trigger a reload using .contentWindow.location.reload(), correct?
> Assuming that you're actually re-opening the exact same file, then I cannot imagine why directly calling PDFViewerApplication.open() would produce a different behaviour (than the demo above).
No, it's not the exact same file, otherwise the reload wouldn't change anything. The context is that slightly modified versions of the PDF are generated, based on a source document (Markdown processed by Pandoc). Because the PDF is different, reloading makes sense, and because the changes are almost always very small, keeping the scroll position makes sense.
> No, it's not the exact same file, otherwise the reload wouldn't change anything. The context is that slightly modified versions of the PDF are generated,
The functionality in question relies on the document fingerprint, which is defined in the PDF document itself. Hence if the document is modified, the PDF generator will create a new fingerprint since it's then a "new" PDF document. (There's generally no good/simple way of supporting exactly what you seem to want in this case, with the exception of a full browser-reload, without issues elsewhere.)
> [...] and because the changes are almost always very small, keeping the scroll position makes sense.
In the general case, the "almost always very small" part cannot really be assumed to hold.
Would the documentinit event notify me when all the pages are represented by not-yet-rendered page placeholders, so that scrolling to the same position will work?
Generally, yes.
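For anyone who wants to try the save-and-restore approach discussed above, here is a minimal sketch. It assumes the viewer exposes an `eventBus` with `on`/`off` methods and that 'documentinit' fires after the initial view has been set; `reopenPreservingScroll` and its parameters are hypothetical names, not part of pdf.js. The viewer object and the scrolling element are passed in so the function can be exercised with stand-ins.

```javascript
// Sketch: reopen a document and restore the previous scroll offset once the
// viewer reports 'documentinit'.
// `app` is assumed to look like PDFViewerApplication (an `open` method and an
// `eventBus` with on/off); `container` is the scrolling element, for example
// document.getElementById('viewerContainer').
function reopenPreservingScroll(app, container, file) {
  // Remember the offset before the viewer resets it.
  const savedScrollTop = container.scrollTop;

  const restore = () => {
    // Run only once per reopen, then put the old offset back.
    app.eventBus.off('documentinit', restore);
    container.scrollTop = savedScrollTop;
  };

  // Register before calling open() so the event cannot be missed.
  app.eventBus.on('documentinit', restore);
  return app.open(file);
}
```

In the viewer page this would be called roughly as `reopenPreservingScroll(PDFViewerApplication, document.getElementById('viewerContainer'), pdfname)`. If the regenerated PDF has a different number of pages or a different layout, the restored offset may of course point at different content.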
Alright, I finally figured it out, and I describe the solution here in case someone else needs this.
The default reload behavior of the pdf.js viewer is indeed to preserve the scroll position (and some other GUI states) if the file is exactly the same. However, it doesn't check the complete file contents (via a hash or fingerprint) but relies on the document ID that is typically (but not necessarily) embedded in a PDF file (see 14.4 in the specification).
In my case, the (contentwise slightly different) PDFs are generated by pdflatex / xelatex / lualatex, and apparently they generate the PDF ID based on the current time and the pathname of the document. Therefore, even if I don't change anything in the LaTeX document, just process it again to generate a PDF, this PDF has a different ID, and the pdf.js viewer does not treat it as the same document (preserving the scroll position), but as a new one (starting with the scroll position at the top).
Fortunately, there is a way around this. If the environment variable SOURCE_DATE_EPOCH is set to a Unix time stamp (number of seconds since 1 Jan 1970 00:00 UTC), then this time will be used for the PDF ID instead of the current time. Since the pathname of the document doesn't change either, this ensures an identical ID across regenerations of the PDF, and therefore the viewer shows its scroll-preserving behavior on reload.
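If the PDF is regenerated from a Node.js build script, pinning the variable could look like the sketch below. The helper names `withFixedSourceDate` and `buildPdf` are hypothetical, and the time stamp is arbitrary; what matters is only that it stays the same between runs.

```javascript
// Returns a copy of `baseEnv` with SOURCE_DATE_EPOCH pinned to a fixed
// Unix time stamp (seconds since 1 Jan 1970 00:00 UTC).
function withFixedSourceDate(baseEnv, epochSeconds) {
  if (!Number.isInteger(epochSeconds) || epochSeconds < 0) {
    throw new RangeError('epochSeconds must be a non-negative integer');
  }
  return { ...baseEnv, SOURCE_DATE_EPOCH: String(epochSeconds) };
}

// Hypothetical build step: runs pdflatex on `texFile` with the pinned value.
// Assumes pdflatex is on the PATH.
function buildPdf(texFile, epochSeconds) {
  // Loaded here so the helper above stays usable without the Node.js API.
  const { spawnSync } = require('child_process');
  return spawnSync('pdflatex', ['-interaction=nonstopmode', texFile], {
    env: withFixedSourceDate(process.env, epochSeconds),
    stdio: 'inherit',
  });
}
```

A call such as `buildPdf('paper.tex', 1577836800)` would then reuse the same time stamp on every run, so the ID should only change if the output path changes.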
I found another solution by overriding the fingerprint inside the PDFDocumentProxy:
// Binary pdf contents stored in data
let doc = await pdfjsLib.getDocument({data}).promise;
// Override the fingerprint after parsing but before passing to the view layer.
// This means that the view state (page, scroll offset, etc.) is preserved.
doc._pdfInfo.fingerprint = 'constantFingerprint';
PDFViewerApplication.load(doc);
Note we have to assign to doc._pdfInfo.fingerprint because doc.fingerprint is a getter. This doesn't appear to have ill effects despite accessing the presumably private _pdfInfo.
This is similar to the solution in #11496, but doesn't require patching pdf.worker.js. You could get fancier here and use a fingerprint based on the e.g. basename or path of the file.
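As a rough illustration of the path-based variant, the sketch below derives a stable string from a file path. `fingerprintFromPath` and the `path-` prefix are made up for this example; the hash is an FNV-1a style 32-bit hash applied to UTF-16 code units, chosen only because it is short, and any stable string function would do.

```javascript
// Sketch: derive a stable fingerprint string from a file path, so that
// regenerated versions of the same file share their stored view state.
function fingerprintFromPath(path) {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < path.length; i++) {
    hash ^= path.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, mod 2^32
  }
  // Unsigned hex, zero-padded to 8 digits, with an arbitrary prefix.
  return 'path-' + (hash >>> 0).toString(16).padStart(8, '0');
}
```

It would replace the constant in the snippet above, for example `doc._pdfInfo.fingerprint = fingerprintFromPath(pdfPath)`, where `pdfPath` is whatever path or name your application uses for the file. A 32-bit hash can collide, so using the full path string directly is a safer choice when many documents are involved.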
When using frameworks like ng-pdfjs-viewer, this doesn't work as pdf.js is somewhat hidden.

In our use case, we have a table of scanned PDFs on the left and the PDF of the selected scan on the right. Each scan is the same type of document, and the user needs to manually verify the scanned data, e.g. on page 2, zoomed in to easily check values. The user then traverses the rows and verifies that the data is correct at the exact same position in all scans.
Our hack solution that works in combination with ng-pdfjs-viewer is:
Copy the pdfjs folder into /assets/ (this is happening either way, so remove it from angular.json since you are now doing it yourself).
Modify /assets/pdfjs/pdf.js to check for a global variable:
key: "fingerprint",
    get: function get() {
      const currentPdfFingerPrintOverride = window.parent.currentPdfFingerPrintOverride;
      if (currentPdfFingerPrintOverride) {
        console.warn('PDF UI State Saving feature in pdf.js: Overriding PDF fingerprint to ', currentPdfFingerPrintOverride);
        return currentPdfFingerPrintOverride;
      } else {
        return this._pdfInfo.fingerprint;
      }
    }
Then, somewhere in your Angular code (or anywhere you want in plain JavaScript), set the value:
window.currentPdfFingerPrintOverride = 'some-ui-state-saving-ui';