pymupdf
diff --git a/README.md b/README.md
Lines changed: 9 additions & 9 deletions
diff --git a/changes.rst b/changes.rst
Lines changed: 31 additions & 3 deletions
diff --git a/fitz/__main__.py b/fitz/__main__.py
Lines changed: 3 additions & 3 deletions
@@ -1,8 +1,8 @@
-# PyMuPDF 1.19.1
+# PyMuPDF 1.19.2
 
 ![logo](https://github.com/pymupdf/PyMuPDF/blob/master/demo/pymupdf.jpg)
 
-Release date: October 23, 2021
+Release date: November 20, 2021
 
 On **[PyPI](https://pypi.org/project/PyMuPDF)** since August 2016: [![Downloads](https://static.pepy.tech/personalized-badge/pymupdf?period=total&units=international_system&left_color=black&right_color=orange&left_text=Downloads)](https://pepy.tech/project/pymupdf)
 
@@ -11,7 +11,7 @@ On **[PyPI](https://pypi.org/project/PyMuPDF)** since August 2016: [![Downloads]
 
 # Introduction
 
-PyMuPDF (current version 1.19.1) is a Python binding with support for [MuPDF](https://mupdf.com/) (current version 1.19.*), a lightweight PDF, XPS, and E-book viewer, renderer, and toolkit, which is maintained and developed by Artifex Software, Inc.
+PyMuPDF (current version 1.19.2) is a Python binding with support for [MuPDF](https://mupdf.com/) (current version 1.19.*), a lightweight PDF, XPS, and E-book viewer, renderer, and toolkit, which is maintained and developed by Artifex Software, Inc.
 
 MuPDF can access files in PDF, XPS, OpenXPS, CBZ, EPUB and FB2 (e-books) formats, and it is known for its top performance and high rendering quality.
 
@@ -27,9 +27,9 @@ For all supported document types (i.e. **_including images_**) you can
 * search for text
 * extract text and images
 * convert to other formats: PDF, (X)HTML, XML, JSON, text
-* perform Optical Character Recognition if Tesseract is installed
+* do OCR (Optical Character Recognition) if Tesseract is installed
 
-> To some degree, PyMuPDF can therefore be used as an [image converter](https://github.com/pymupdf/PyMuPDF/wiki/How-to-Convert-Images): it can read a range of input formats and can produce **Portable Network Graphics (PNG)**, **Portable Anymaps** (**PNM**, etc.), **Portable Arbitrary Maps (PAM)**, **Adobe Postscript** and **Adobe Photoshop** documents, making the use of other graphics packages obselete in these cases. But interfacing with e.g. PIL/Pillow for image input and output is easy as well.
+> To some degree, PyMuPDF can also be used as an [image converter](https://github.com/pymupdf/PyMuPDF/wiki/How-to-Convert-Images): it can read a range of input formats and can produce **Portable Network Graphics (PNG)**, **Portable Anymaps** (**PNM**, etc.), **Portable Arbitrary Maps (PAM)**, **Adobe Postscript** and **Adobe Photoshop** documents, making the use of other graphics packages obsolete in these cases. But interfacing with e.g. PIL/Pillow for image input and output is easy as well.
 
 For **PDF documents,** there exists a plethora of additional features: they can be created, joined or split up. Pages can be inserted, deleted, re-arranged or modified in many ways (including annotations and form fields).
 
@@ -52,12 +52,12 @@ For **PDF documents,** there exists a plethora of additional features: they can
     - **_layout-preserving text extraction_** (all documents)
 
 
-Have a look at the basic [demos](https://github.com/pymupdf/PyMuPDF-Utilities/tree/master/demo), the [examples](https://github.com/pymupdf/PyMuPDF-Utilities/tree/master/examples) (which contain complete, working programs), and the **recipes** section of our [Wiki](https://github.com/pymupdf/PyMuPDF/wiki) sidebar, which contains more than a dozen of guides in How-To-style.
+Have a look at the basic [demos](https://github.com/pymupdf/PyMuPDF-Utilities/tree/master/demo), the [examples](https://github.com/pymupdf/PyMuPDF-Utilities/tree/master/examples) (which contain complete, working programs), and [notebooks](https://github.com/pymupdf/PyMuPDF-Utilities/tree/master/jupyter-notebooks).
 
 
 # Documentation
 
-Our documentation, written using Sphinx, is available in various formats from the following sources. It currently is a combination of reference guide and user manual. For a **quick start** look at the [tutorial](https://pymupdf.readthedocs.io/en/latest/tutorial.html) and the [recipes](https://pymupdf.readthedocs.io/en/latest/faq.html) chapters.
+Documentation is written using Sphinx and is available in various formats from the following sources. It currently is a combination of reference guide and user manual. For a **quick start** look at the [tutorial](https://pymupdf.readthedocs.io/en/latest/tutorial.html) and the [recipes](https://pymupdf.readthedocs.io/en/latest/faq.html) chapters.
 
 * You can view it online at [Read the Docs](https://readthedocs.org/projects/pymupdf/). This site also provides download options for PDF.
 * The search function on Read the Docs does not work for me currently. If you want a working searchable local version, please download a zipped HTML from [here](https://github.com/pymupdf/PyMuPDF-optional-material/tree/master/doc/pymupdf.zip).
@@ -68,7 +68,7 @@ The latest changelog can be viewed [here](https://pymupdf.readthedocs.io/en/late
 
 # Installation
 
-PyMuPDF requires **Python 3.6 or later**.
+PyMuPDF **requires Python 3.6 or later**.
 
 Python wheels exist for **Windows** (32bit and 64bit), **Linux** (64bit, Intel and ARM) and **Mac OSX** (64bit, Intel only), so it can be installed from [PyPI](https://pypi.org/search/?q=pymupdf) in the usual way:
 
@@ -77,7 +77,7 @@ python -m pip install --upgrade pip
 python -m pip install --upgrade pymupdf
 ```
 
-There are **no mandatory** external dependencies. However, a some **optional features** become available if additional packages are installed:
+There are **no mandatory** external dependencies. However, some **optional features** become available if additional packages are installed:
 
 * [Pillow](https://pypi.org/project/Pillow/) for using pillow image output directly from PyMuPDF
 * [fontTools](https://pypi.org/project/fonttools/) for creating font subsets
 
@@ -3,15 +3,43 @@ Change Log
 
 ------
 
+**Changes in Version 1.19.2**
+
+This patch version implements minor improvements for :meth:`Page.get_drawings` and also some important fixes.
+
+* **Fixed** `#1388 <https://github.com/pymupdf/PyMuPDF/discussions/1388>`_. Fixed intermittent memory corruption when inserting or updating annotations.
+
+* **Fixed** `#1375 <https://github.com/pymupdf/PyMuPDF/discussions/1375>`_. Inconsistencies between line numbers as returned by the "words" and the "dict" options of :meth:`Page.get_text` have been corrected.
+
+* **Fixed** `#1364 <https://github.com/pymupdf/PyMuPDF/issues/1364>`_. The check for being a ``"rawdict"`` span in :meth:`recover_span_quad` now works correctly.
+
+* **Fixed** `#1342 <https://github.com/pymupdf/PyMuPDF/issues/1342>`_. Corrected the check for rectangle infiniteness in :meth:`Page.show_pdf_page`.
+
+* **Changed** :meth:`Page.get_drawings`, :meth:`Page.get_cdrawings` to return an indicator on the area orientation covered by a rectangle. This implements `#1355 <https://github.com/pymupdf/PyMuPDF/issues/1355>`_. Also, the recognition rate for rectangles and quads has been significantly improved.
+
+* **Changed** all text search and extraction methods to set the new ``flags`` option ``TEXT_MEDIABOX_CLIP`` to ON by default. That bit causes the automatic suppression of all characters that are completely outside a page's mediabox (in as far as that notion is supported for a document type). This eliminates the need for using ``clip=page.rect`` or similar for omitting text outside the visible area.
+
+* **Added** parameter ``"dpi"`` to :meth:`Page.get_pixmap` and :meth:`Annot.get_pixmap`. When given, parameter ``"matrix"`` is ignored, and a :ref:`Pixmap` with the desired dots per inch is created.
+
+* **Added** attributes :attr:`Pixmap.is_monochrome` and :attr:`Pixmap.is_unicolor` allowing fast checks of pixmap properties. Addresses `#1397 <https://github.com/pymupdf/PyMuPDF/discussions/1397>`_.
+
+* **Added** method :meth:`Pixmap.color_count` to determine the unique colors in the pixmap.
+
+* **Added** boolean parameter ``"compress"`` to PDF document method :meth:`Document.update_stream`. Addresses / enables solution for `#1408 <https://github.com/pymupdf/PyMuPDF/discussions/1408>`_.
+
+------
+
 **Changes in Version 1.19.1**
 
-* **Fixed** `#1328 <https://github.com/pymupdf/PyMuPDF/issues/1328>`_. "words" text extraction again returns correct coordinates.
+This is the first patch version to support MuPDF v1.19.0. Apart from one bug fix, it includes important improvements for OCR support and the option to **sort extracted text** to the standard reading order "from top-left to bottom-right".
+
+* **Fixed** `#1328 <https://github.com/pymupdf/PyMuPDF/issues/1328>`_. "words" text extraction again returns correct ``(x0, y0)`` coordinates.
 
-* **Changed** :meth:`Page.get_textpage_ocr` -- support specifying the desired OCR quality via parameter ``dpi``, support choice between full page OCR versus only OCRing displayed images.
+* **Changed** :meth:`Page.get_textpage_ocr`: it now supports parameter ``dpi`` to control OCR quality. It is also possible to choose whether the **full page** should be OCRed or **only the images displayed** by the page.
 
 * **Changed** :meth:`Page.get_drawings` and :meth:`Page.get_cdrawings` to automatically convert colors to RGB color tuples. Implements `#1332 <https://github.com/pymupdf/PyMuPDF/discussions/1332>`_. Similar change was applied to :meth:`Page.get_texttrace`.
 
-* **Changed** :meth:`Page.get_text` to support a new parameter ``sort``. If set to ``True`` the output is conveniently sorted.
+* **Changed** :meth:`Page.get_text` to support a parameter ``sort``. If set to ``True`` the output is conveniently sorted.
 
 
 ------
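
Note on the new ``dpi`` parameter of `Page.get_pixmap` listed above: the sketch below only illustrates the arithmetic, assuming that a dpi value corresponds to a uniform zoom of `dpi / 72` (PDF user space uses 72 points per inch). The helper names `zoom_for_dpi` and `approx_pixel_size` are hypothetical and not part of PyMuPDF, and the exact rounding PyMuPDF applies to pixmap dimensions may differ.

```python
# Assumption: dpi maps to a uniform zoom factor of dpi / 72,
# because one PDF point is 1/72 inch. Helper names are hypothetical.
POINTS_PER_INCH = 72


def zoom_for_dpi(dpi):
    """Return the zoom factor that would yield the given resolution."""
    return dpi / POINTS_PER_INCH


def approx_pixel_size(width_pt, height_pt, dpi):
    """Estimate pixel dimensions of a page given its size in points."""
    zoom = zoom_for_dpi(dpi)
    return round(width_pt * zoom), round(height_pt * zoom)


# With PyMuPDF 1.19.2 or later installed, the changelog entry suggests
# the call itself would simply be:
#     pix = page.get_pixmap(dpi=300)   # "matrix" is ignored when dpi is given
```

For an A4 page of 595 x 842 points, 144 dpi is a zoom of 2, so the estimate is roughly 1190 x 1684 pixels.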
 
@@ -555,7 +555,7 @@ def page_simple(page, textout, GRID, fontsize, noformfeed, skip_empty, flags):
         if not skip_empty:
             textout.write(eop)  # write formfeed
         return
-    textout.write(text.encode("utf8"))
+    textout.write(text.encode("utf8", errors="surrogatepass"))
     textout.write(eop)
     return
 
@@ -569,7 +569,7 @@ def page_blocksort(page, textout, GRID, fontsize, noformfeed, skip_empty, flags)
         return
     blocks.sort(key=lambda b: (b[3], b[0]))
     for b in blocks:
-        textout.write(b[4].encode("utf8"))
+        textout.write(b[4].encode("utf8", errors="surrogatepass"))
     textout.write(eop)
     return
 
@@ -793,7 +793,7 @@ def make_textline(left, slot, minslot, lchars):
             textout.write(b"\n")
             rowpos += rowheight
         text = make_textline(left, slot, minslots[k], lines[k])
-        textout.write((text + "\n").encode("utf8"))
+        textout.write((text + "\n").encode("utf8", errors="surrogatepass"))
         rowpos = k + rowheight
 
     textout.write(eop)  # write formfeed
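
Note on the three `encode` changes in `fitz/__main__.py`: the following standard-library sketch shows what `errors="surrogatepass"` changes. The sample string and the function name `encode_page_text` are made up for illustration; the assumption is that extracted text can occasionally contain a lone surrogate code point, which the default strict handler rejects.

```python
def encode_page_text(text):
    """Encode text the way the patched writer does.

    Returns (strict_ok, encoded): strict_ok reports whether the default
    strict UTF-8 encoding would have succeeded; encoded is the bytes
    produced with the "surrogatepass" error handler.
    """
    try:
        text.encode("utf8")
        strict_ok = True
    except UnicodeEncodeError:
        # Lone surrogates are not valid in strict UTF-8.
        strict_ok = False
    encoded = text.encode("utf8", errors="surrogatepass")
    return strict_ok, encoded


# Hypothetical sample: a lone high surrogate embedded in ordinary text.
sample = "page text \ud83d end"
strict_ok, encoded = encode_page_text(sample)
# Decoding with the same handler restores the original string.
roundtrip = encoded.decode("utf8", errors="surrogatepass")
```

The resulting bytes are not strictly valid UTF-8, so a reader of the output file may need the same error handler (or a lenient one such as `"replace"`) when decoding.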