-PyMuPDF (current version 1.18.18) is a Python binding with support for `MuPDF <http://mupdf.com/>`_ (current version 1.18.*), a lightweight PDF, XPS, and E-book viewer, renderer and toolkit, which is maintained and developed by Artifex Software, Inc.
+PyMuPDF (current version 1.18.19) is a Python binding with support for `MuPDF <http://mupdf.com/>`_ (current version 1.18.*), a lightweight PDF, XPS, and E-book viewer, renderer and toolkit, which is maintained and developed by Artifex Software, Inc.

 MuPDF can access files in PDF, XPS, OpenXPS, CBZ, EPUB and FB2 (e-books) formats, and it is known for its top performance and high rendering quality.
@@ -11,7 +11,7 @@ On **[PyPI](https://pypi.org/project/PyMuPDF)** since August 2016: [![Downloads]

 # Introduction

-PyMuPDF (current version 1.18.18) is a Python binding with support for [MuPDF](https://mupdf.com/) (current version 1.18.*), a lightweight PDF, XPS, and E-book viewer, renderer, and toolkit, which is maintained and developed by Artifex Software, Inc.
+PyMuPDF (current version 1.18.19) is a Python binding with support for [MuPDF](https://mupdf.com/) (current version 1.18.*), a lightweight PDF, XPS, and E-book viewer, renderer, and toolkit, which is maintained and developed by Artifex Software, Inc.

 MuPDF can access files in PDF, XPS, OpenXPS, CBZ, EPUB and FB2 (e-books) formats, and it is known for its top performance and high rendering quality.
docs/changes.rst: 16 additions, 0 deletions
@@ -3,6 +3,22 @@ Change Logs

 ------

+**Changes in Version 1.18.18**
+
+* **Fixed** issue `#1257 <https://github.com/pymupdf/PyMuPDF/issues/1257>`_. Removing the read-only flag from PDF fields is now possible.
+
+* **Fixed** issue `#1252 <https://github.com/pymupdf/PyMuPDF/issues/1252>`_. Now correctly specifying the ``zoom`` value for PDF link annotations.
+
+* **Fixed** issue `#1244 <https://github.com/pymupdf/PyMuPDF/issues/1244>`_. Now correctly computing the transform matrix in :meth:`Page.get_image_bbox`.
+
+* **Fixed** issue `#1241 <https://github.com/pymupdf/PyMuPDF/issues/1241>`_. Prevent returning artifact characters in :meth:`Page.get_textbox`, which happened under certain conditions.
docs/faq.rst: 13 additions, 14 deletions
@@ -191,14 +191,14 @@ and `extract-imgb.py <https://github.com/JorjMcKie/PyMuPDF-Utilities/blob/master

 ----------

-How to Handle Stencil Masks
+How to Handle Image Masks
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Some images in PDFs are accompanied by **stencil masks**. In their simplest form stencil masks represent alpha (transparency) bytes stored as separate images. In order to reconstruct the original of an image, which has a stencil mask, it must be "enriched" with transparency bytes taken from its stencil mask.
+Some images in PDFs are accompanied by **image masks**. In their simplest form, masks represent alpha (transparency) bytes stored as separate images. To reconstruct the original of an image that has a mask, the image must be "enriched" with transparency bytes taken from its mask.

-Whether an image does have such a stencil mask can be recognized in one of two ways in PyMuPDF:
+Whether an image has such a mask can be recognized in one of two ways in PyMuPDF:

-1. An item of :meth:`Document.get_page_images` has the general format *[xref, smask, ...]*, where *xref* is the image's :data:`xref` and *smask*, if positive, is the :data:`xref` of a stencil mask.
-2. The (dictionary) results of :meth:`Document.extract_image` have a key *"smask"*, which also contains any stencil mask's :data:`xref` if positive.
+1. An item of :meth:`Document.get_page_images` has the general format ``(xref, smask, ...)``, where *xref* is the image's :data:`xref` and *smask*, if positive, is the :data:`xref` of a mask.
+2. The (dictionary) results of :meth:`Document.extract_image` have a key *"smask"*, which also contains any mask's :data:`xref` if positive.

 If *smask == 0* then the image encountered via :data:`xref` can be processed as it is.

@@ -207,12 +207,11 @@ To recover the original image using PyMuPDF, the procedure depicted as follows m
-Step (1) creates a pixmap of the "netto" image. Step (2) does the same with the stencil mask. Please note that the :attr:`Pixmap.samples` attribute of *pix2* contains the alpha bytes that must be stored in the final pixmap. This is what happens in step (3) and (4).
+Step (1) creates a pixmap of the basic image. Step (2) does the same with the image mask. Step (3) adds an alpha channel and fills it with transparency information.

 The scripts `extract-imga.py <https://github.com/JorjMcKie/PyMuPDF-Utilities/blob/master/extract-imga.py>`_, and `extract-imgb.py <https://github.com/JorjMcKie/PyMuPDF-Utilities/blob/master/extract-imgb.py>`_ above also contain this logic.

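The idea behind steps (1) to (3) in the hunk above can be illustrated without PyMuPDF. The following is a conceptual sketch only, not the library's implementation: the hypothetical helper `add_alpha` interleaves the alpha bytes taken from a mask with the packed RGB samples of an image, assuming both describe the same number of pixels.

```python
def add_alpha(rgb_samples: bytes, alpha_samples: bytes) -> bytes:
    """Interleave one alpha byte per pixel into packed RGB samples.

    Hypothetical helper for illustration; not part of PyMuPDF.
    """
    if len(rgb_samples) != 3 * len(alpha_samples):
        raise ValueError("need exactly one alpha byte per RGB pixel")
    rgba = bytearray()
    for i, alpha in enumerate(alpha_samples):
        rgba += rgb_samples[3 * i : 3 * i + 3]  # copy R, G, B of pixel i
        rgba.append(alpha)                      # append its transparency byte
    return bytes(rgba)


# Two pixels: the first fully opaque, the second fully transparent.
rgba = add_alpha(bytes([10, 20, 30, 40, 50, 60]), bytes([255, 0]))
```

In PyMuPDF itself, the image and mask bytes would come from pixmaps rather than from hand-written byte strings.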
@@ -2108,10 +2107,10 @@ If it is *False* or if you want to be on the safe side, pick one of the followin

 Missing or Unreadable Extracted Text
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-This can be a number of different problems.
+Fairly often, text extraction does not work as you would expect: text may be missing altogether, may not appear in the reading sequence visible on your screen, or may contain garbled characters (like a ? or a "TOFU" symbol), etc. This can be caused by a number of different problems.

-Problem: no text
-^^^^^^^^^^^^^^^^
+Problem: no text is extracted
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 Your PDF viewer does display text, but you cannot select it with your cursor, and text extraction delivers nothing.

 Cause
@@ -2130,7 +2129,7 @@ Text extraction does not deliver the text in readable order, duplicates some tex
 Cause
 ^^^^^^
 1. The single characters are readable as such (no "<?>" symbols), but the sequence in which the text is **coded in the file** deviates from the reading order. The motivation behind this may be technical, or it may be to protect data against unwanted copying.
-2. Many "<?>" symbols occur indicating MuPDF could not interpret these characters. The PDF creator may haved used a font that displays readable text, but obfuscates the unicode character that leads to the readable symbol (glyph).
+2. Many "<?>" symbols occur, indicating MuPDF could not interpret these characters. The font may indeed be unsupported by MuPDF, or the PDF creator may have used a font that displays readable text, but deliberately obfuscates the corresponding Unicode character.
**Copy and scale:** Copy *source* pixmap, scaling new width and height values -- the image will appear stretched or shrunk accordingly. Supports partial copying. The source colorspace may be *None*.
@@ -96,7 +108,7 @@ Have a look at the :ref:`FAQ` section to see some pixmap usage "at work".

 :arg irect_like clip: restrict the resulting pixmap to this region of the **scaled** pixmap.

-.. note:: If width or height are not *de facto* integers (i.e. ``value.is_integer() != True``), then the resulting pixmap **will have an alpha channel**.
+.. note:: If width or height do not *represent* integers (i.e. ``value.is_integer() != True``), then the resulting pixmap **will have an alpha channel**.

 .. method:: __init__(self, source, alpha=1)

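The condition in the note on non-integer width or height can be tried in plain Python. This is a minimal sketch, assuming the check is equivalent to calling `is_integer()` on the value converted to `float`; the helper name `represents_integer` is hypothetical and not part of PyMuPDF.

```python
def represents_integer(value) -> bool:
    """Return True if value has no fractional part (e.g. 100 or 100.0)."""
    # float() makes the check work for ints on any Python 3 version.
    return float(value).is_integer()


# 100 and 100.0 represent integers, 100.5 does not.
checks = [represents_integer(v) for v in (100, 100.0, 100.5)]
```

Under the note's rule, only the last value would lead to a pixmap with an alpha channel.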
@@ -255,7 +267,7 @@ Have a look at the :ref:`FAQ` section to see some pixmap usage "at work".

 :arg bytes,bytearray,BytesIO alphavalues: the new alpha values. If provided, its length must be at least *width * height*. If omitted (``None``), all alpha values are set to 255 (no transparency). *Changed in version 1.14.13:* *io.BytesIO* is now also accepted.
 :arg bool premultiply: *New in v1.18.13:* whether to premultiply color components with the alpha value.
-:arg list,tuple opaque: specify a color that should be fully transparent -- ignoring the alpha value of the parameter. A sequence of integers in ``range(256)`` with a length of :attr:`Pixmap.n`. Default is *None*. E.g. in the RGB case a typical choice would be ``opaque=(255, 255, 255)`` for white.
+:arg list,tuple opaque: ignore the alpha value and set this color to fully transparent. A sequence of integers in ``range(256)`` with a length of :attr:`Pixmap.n`. Default is *None*. For example, a typical choice for RGB would be ``opaque=(255, 255, 255)`` (white).
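How the *alphavalues* and *opaque* arguments described above interact can be sketched in plain Python. This is an illustration under stated assumptions, not PyMuPDF's implementation: the hypothetical function `alpha_with_opaque` works on packed color bytes without an alpha channel, takes the number of color components per pixel as `n`, and ignores premultiplication.

```python
def alpha_with_opaque(samples, n, alphavalues=None, opaque=None):
    """Compute one alpha byte per pixel (hypothetical helper).

    samples: packed color bytes without alpha, n components per pixel.
    alphavalues: optional alpha bytes; defaults to 255 for every pixel.
    opaque: optional color whose pixels become fully transparent.
    """
    pixel_count = len(samples) // n
    if alphavalues is None:
        alphavalues = bytes([255]) * pixel_count  # no transparency
    if len(alphavalues) < pixel_count:
        raise ValueError("alphavalues must cover every pixel")
    result = bytearray()
    for i in range(pixel_count):
        color = tuple(samples[n * i : n * i + n])
        if opaque is not None and color == tuple(opaque):
            result.append(0)               # matching color: fully transparent
        else:
            result.append(alphavalues[i])  # otherwise keep the given alpha
    return bytes(result)


# One white and one black RGB pixel; white is made transparent.
alphas = alpha_with_opaque(bytes([255, 255, 255, 0, 0, 0]), 3,
                           opaque=(255, 255, 255))
```

The white pixel receives alpha 0 regardless of any supplied alpha value, while the black pixel keeps the default of 255.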
-This documentation covers PyMuPDF v1.18.17 features as of **2021-08-23 00:00:01**.
+This documentation covers PyMuPDF v1.18.19 features as of **2021-09-16 16:45:29**.

 .. note:: The major and minor versions of **PyMuPDF** and **MuPDF** will always be the same. Only the third qualifier (patch level) may deviate from that of MuPDF.