Commit 60d5ad1

more update fr v1.18.7
1 parent 394bf7c commit 60d5ad1

40 files changed: +2304 -1751 lines changed

docs/annot.rst

Lines changed: 4 additions & 4 deletions
@@ -195,7 +195,7 @@ There is a parent-child relationship between an annotation and its page. If the
 
 Three overlapping 'Circle' annotations with each opacity set to 0.5:
 
-.. image:: images/img-opacity.jpg
+.. image:: images/img-opacity.*
 
 .. attribute:: blendmode
 
@@ -322,7 +322,7 @@ There is a parent-child relationship between an annotation and its page. If the
 * 'Line', 'Polyline', 'Polygon' annotations: use it to give applicable line end symbols a fill color other than that of the annotation *(changed in v1.16.16)*.
 
 :arg bool cross_out: *(new in v1.17.2)* add two diagonal lines to the annotation rectangle. 'Redact' annotations only. If not desired, *False* must be specified even if the annotation was created with *False*.
-:arg int rotate: new rotation value. Default (-1) means no change. Supports 'FreeText' and several other annotation types (see :meth:`Annot.setRotation`), [#f1]_. Only choose 0, 90, 180, or 270 degrees for 'FreeText'. Otherwise any integer is acceptable.
+:arg int rotate: new rotation value. Default (-1) means no change. Supports 'FreeText' and several other annotation types (see :meth:`Annot.set_rotation`), [#f1]_. Only choose 0, 90, 180, or 270 degrees for 'FreeText'. Otherwise any integer is acceptable.
 
 :rtype: bool
 
@@ -515,7 +515,7 @@ Annotation Icons in MuPDF
 -------------------------
 This is a list of icons referencable by name for annotation types 'Text' and 'FileAttachment'. You can use them via the *icon* parameter when adding an annotation, or use the as argument in :meth:`Annot.setName`. It is left to your discretion which item to choose when -- no mechanism will keep you from using e.g. the "Speaker" icon for a 'FileAttachment'.
 
-.. image:: images/mupdf-icons.jpg
+.. image:: images/mupdf-icons.*
 
 
 Example
@@ -547,7 +547,7 @@ This is how the circle annotation looks like before and after the change (pop-up
 
 |circle|
 
-.. |circle| image:: images/img-circle.png
+.. |circle| image:: images/img-circle.*
 
 
 .. rubric:: Footnotes

docs/app1.rst

Lines changed: 6 additions & 6 deletions
@@ -12,7 +12,7 @@ Following are three sections that deal with different aspects of performance:
 
 In each section, the same fixed set of PDF files is being processed by a set of tools. The set of tools varies -- for reasons we will explain in the section.
 
-.. |fsizes| image:: images/img-filesizes.png
+.. |fsizes| image:: images/img-filesizes.*
 
 Here is the list of files we are using. Each file name is accompanied by further information: **size** in bytes, number of **pages**, number of bookmarks (**toc** entries), number of **links**, **text** size as a percentage of file size, **KB** per page, PDF **version** and remarks. **text %** and **KB index** are indicators for whether a file is text or graphics oriented.
 |fsizes|
@@ -72,8 +72,8 @@ This is how each of the tools was used:
 
 **Observations**
 
-.. |cpyspeed1| image:: images/img-copy-speed-1.png
-.. |cpyspeed2| image:: images/img-copy-speed-2.png
+.. |cpyspeed1| image:: images/img-copy-speed-1.*
+.. |cpyspeed2| image:: images/img-copy-speed-2.*
 
 These are our run time findings (in **seconds**, please note the European number convention: meaning of decimal point and comma is reversed):
 
@@ -115,7 +115,7 @@ All tools have been used with their most basic, fanciless functionality -- no la
 
 For demonstration purposes, we have included a version of *GetText(doc, output = "json")*, that also re-arranges the output according to occurrence on the page.
 
-.. |textperf| image:: images/img-textperformance.png
+.. |textperf| image:: images/img-textperformance.*
 
 Here are the results using the same test files as above (again: decimal point and comma reversed):
 
@@ -141,7 +141,7 @@ We have tested rendering speed of MuPDF against the *pdftopng.exe*, a command li
 print "processing:", datei
 doc=fitz.open(datei)
 for p in fitz.Pages(doc):
-pix = p.getPixmap(matrix=mat, alpha = False)
+pix = p.get_pixmap(matrix=mat, alpha = False)
 pix.writePNG("t-%s.png" % p.number)
 pix = None
 doc.close()
@@ -151,7 +151,7 @@ We have tested rendering speed of MuPDF against the *pdftopng.exe*, a command li
 ::
 pdftopng.exe file.pdf ./
 
-.. |renderspeed| image:: images/img-render-speed.png
+.. |renderspeed| image:: images/img-render-speed.*
 
 The resulting runtimes can be found here (again: meaning of decimal point and comma reversed):
 

docs/app2.rst

Lines changed: 16 additions & 16 deletions
@@ -33,18 +33,18 @@ A **span** consists of adjacent characters with identical font properties: name,
 Plain Text
 ~~~~~~~~~~
 
-Function :meth:`TextPage.extractText` (or *Page.getText("text")*) extracts a page's plain **text in original order** as specified by the creator of the document (which may not equal a natural reading order).
+Function :meth:`TextPage.extractText` (or *Page.get_text("text")*) extracts a page's plain **text in original order** as specified by the creator of the document (which may not equal a natural reading order).
 
 An example output::
 
->>> print(page.getText("text"))
+>>> print(page.get_text("text"))
 Some text on first page.
 
 
 BLOCKS
 ~~~~~~~~~~
 
-Function :meth:`TextPage.extractBLOCKS` (or *Page.getText("blocks")*) extracts a page's text blocks as a list of items like::
+Function :meth:`TextPage.extractBLOCKS` (or *Page.get_text("blocks")*) extracts a page's text blocks as a list of items like::
 
 (x0, y0, x1, y1, "lines in block", block_type, block_no)
 
@@ -54,15 +54,15 @@ This is a high-speed method with enough information to re-arrange the page's tex
 
 Example output::
 
->>> print(page.getText("blocks"))
+>>> print(page.get_text("blocks"))
 [(50.0, 88.17500305175781, 166.1709747314453, 103.28900146484375,
 'Some text on first page.', 0, 0)]
 
 
 WORDS
 ~~~~~~~~~~
 
-Function :meth:`TextPage.extractWORDS` (or *Page.getText("words")*) extracts a page's text **words** as a list of items like::
+Function :meth:`TextPage.extractWORDS` (or *Page.get_text("words")*) extracts a page's text **words** as a list of items like::
 
 (x0, y0, x1, y1, "word", block_no, line_no, word_no)
 
@@ -72,7 +72,7 @@ This is a high-speed method with enough information to extract text contained in
 
 Example output::
 
->>> for word in page.getText("words"):
+>>> for word in page.get_text("words"):
 print(word)
 (50.0, 88.17500305175781, 78.73200225830078, 103.28900146484375,
 'Some', 0, 0, 0)
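
Reviewer sketch (not part of the commit): the word tuples shown in this hunk carry enough information to pick out the text inside a rectangle. The following minimal example works on plain tuples shaped like `(x0, y0, x1, y1, "word", block_no, line_no, word_no)` and does not call PyMuPDF at all. The helper name `words_in_rect`, the rectangle values and the second sample word are assumptions made up for illustration.

```python
def words_in_rect(words, rect):
    """Join the words whose bounding box lies fully inside rect.

    words: iterable of (x0, y0, x1, y1, "word", block_no, line_no, word_no)
    rect:  (x0, y0, x1, y1) in the same coordinate system
    """
    rx0, ry0, rx1, ry1 = rect
    inside = [
        w for w in words
        if w[0] >= rx0 and w[1] >= ry0 and w[2] <= rx1 and w[3] <= ry1
    ]
    # Keep the extraction order: block number, then line, then word number.
    inside.sort(key=lambda w: (w[5], w[6], w[7]))
    return " ".join(w[4] for w in inside)


sample_words = [
    # First tuple is taken from the example output in the diff.
    (50.0, 88.17500305175781, 78.73200225830078, 103.28900146484375, "Some", 0, 0, 0),
    # Second tuple is a made-up neighbour, for illustration only.
    (82.0, 88.17500305175781, 99.0, 103.28900146484375, "text", 0, 0, 1),
]

both = words_in_rect(sample_words, (40.0, 80.0, 100.0, 110.0))
first_only = words_in_rect(sample_words, (40.0, 80.0, 80.0, 110.0))
```

With real data, `sample_words` would be replaced by the list that `page.get_text("words")` returns.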
@@ -88,9 +88,9 @@ Example output::
 HTML
 ~~~~
 
-:meth:`TextPage.extractHTML` (or *Page.getText("html")* output fully reflects the structure of the page's *TextPage* -- much like DICT / JSON below. This includes images, font information and text positions. If wrapped in HTML header and trailer code, it can readily be displayed by an internet browser. Our above example::
+:meth:`TextPage.extractHTML` (or *Page.get_text("html")* output fully reflects the structure of the page's *TextPage* -- much like DICT / JSON below. This includes images, font information and text positions. If wrapped in HTML header and trailer code, it can readily be displayed by an internet browser. Our above example::
 
->>> for line in page.getText("html").splitlines():
+>>> for line in page.get_text("html").splitlines():
 print(line)
 
 <div id="page0" style="position:relative;width:300pt;height:350pt;
@@ -153,7 +153,7 @@ To address the font issue, you can use a simple utility script to scan through t
 DICT (or JSON)
 ~~~~~~~~~~~~~~~~
 
-:meth:`TextPage.extractDICT` (or *Page.getText("dict")*) output fully reflects the structure of a *TextPage* and provides image content and position details (*bbox* -- boundary boxes in pixel units) for every block and line. This information can be used to present text in another reading order if required (e.g. from top-left to bottom-right). Images are stored as *bytes* (*bytearray* in Python 2) for DICT output and base64 encoded strings for JSON output.
+:meth:`TextPage.extractDICT` (or *Page.get_text("dict")*) output fully reflects the structure of a *TextPage* and provides image content and position details (*bbox* -- boundary boxes in pixel units) for every block and line. This information can be used to present text in another reading order if required (e.g. from top-left to bottom-right). Images are stored as *bytes* (*bytearray* in Python 2) for DICT output and base64 encoded strings for JSON output.
 
 For a visuallization of the dictionary structure have a look at :ref:`textpagedict`.
 
@@ -183,7 +183,7 @@ Here is how this looks like::
 
 RAWDICT
 ~~~~~~~~~~~~~~~~
-:meth:`TextPage.extractRAWDICT` (or *Page.getText("rawdict")*) is an **information superset of DICT** and takes the detail level one step deeper. It looks exactly like the above, except that the *"text"* items (*string*) are replaced by *"chars"* items (*list*). Each *"chars"* entry is a character *dict*. For example, here is what you would see in place of item *"text": "Text in black color."* above::
+:meth:`TextPage.extractRAWDICT` (or *Page.get_text("rawdict")*) is an **information superset of DICT** and takes the detail level one step deeper. It looks exactly like the above, except that the *"text"* items (*string*) are replaced by *"chars"* items (*list*). Each *"chars"* entry is a character *dict*. For example, here is what you would see in place of item *"text": "Text in black color."* above::
 
 "chars": [{
 "origin": [50.0, 100.0],
@@ -216,9 +216,9 @@ RAWDICT
 XML
 ~~~
 
-The :meth:`TextPage.extractXML` (or *Page.getText("xml")*) version extracts text (no images) with the detail level of RAWDICT::
+The :meth:`TextPage.extractXML` (or *Page.get_text("xml")*) version extracts text (no images) with the detail level of RAWDICT::
 
->>> for line in page.getText("xml").splitlines():
+>>> for line in page.get_text("xml").splitlines():
 print(line)
 
 <page id="page0" width="300" height="350">
@@ -249,7 +249,7 @@ The :meth:`TextPage.extractXML` (or *Page.getText("xml")*) version extracts text
 
 XHTML
 ~~~~~
-:meth:`TextPage.extractXHTML` (or *Page.getText("xhtml")*) is a variation of TEXT but in HTML format, containing the bare text and images ("semantic" output)::
+:meth:`TextPage.extractXHTML` (or *Page.get_text("xhtml")*) is a variation of TEXT but in HTML format, containing the bare text and images ("semantic" output)::
 
 <div id="page0">
 <p>Some text on first page.</p>
@@ -259,7 +259,7 @@ XHTML
 
 Text Extraction Flags Defaults
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-*(New in version 1.16.2)* Method :meth:`Page.getText` supports a keyword parameter *flags* *(int)* to control the amount and the quality of extracted data. The following table shows the defaults settings (flags parameter omitted or None) for each extraction variant. If you specify flags with a value other than *None*, be aware that you must set **all desired** options. A description of the respective bit settings can be found in :ref:`TextPreserve`.
+*(New in version 1.16.2)* Method :meth:`Page.get_text` supports a keyword parameter *flags* *(int)* to control the amount and the quality of extracted data. The following table shows the defaults settings (flags parameter omitted or None) for each extraction variant. If you specify flags with a value other than *None*, be aware that you must set **all desired** options. A description of the respective bit settings can be found in :ref:`TextPreserve`.
 
 =================== ==== ==== ===== === ==== ======= ===== ======
 Indicator           text html xhtml xml dict rawdict words blocks
@@ -277,14 +277,14 @@ dehyphenate 0 0 0 0 0 0 0 0
 
 To show the effect of *TEXT_INHIBIT_SPACES* have a look at this example::
 
->>> print(page.getText("text"))
+>>> print(page.get_text("text"))
 H a l l o !
 Mo r e t e x t
 i s f o l l o w i n g
 i n E n g l i s h
 . . . l e t ' s s e e
 w h a t h a p p e n s .
->>> print(page.getText("text", flags=fitz.TEXT_INHIBIT_SPACES))
+>>> print(page.get_text("text", flags=fitz.TEXT_INHIBIT_SPACES))
 Hallo!
 More text
 is following
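
Reviewer sketch (not part of the commit): the BLOCKS section of this file describes items of the form `(x0, y0, x1, y1, "lines in block", block_type, block_no)` and notes that they carry enough information to re-arrange a page's text. A minimal way to do that, assuming a simple top-to-bottom, then left-to-right reading order, could look like this. The helper name `reading_order` and the second sample block are made up for illustration; only plain tuples are used, no PyMuPDF call.

```python
def reading_order(blocks):
    """Return the blocks sorted top-to-bottom, then left-to-right.

    blocks: iterable of (x0, y0, x1, y1, "lines in block", block_type, block_no)
    """
    # Sort by the top edge (y0) first, then by the left edge (x0).
    return sorted(blocks, key=lambda b: (b[1], b[0]))


sample_blocks = [
    # Made-up block placed lower on the page, listed first on purpose.
    (50.0, 200.0, 160.0, 215.0, "Second paragraph (made-up sample).", 0, 1),
    # Block taken from the example output in the diff.
    (50.0, 88.17500305175781, 166.1709747314453, 103.28900146484375,
     "Some text on first page.", 0, 0),
]

ordered_text = [b[4] for b in reading_order(sample_blocks)]
```

This ordering is only a heuristic: multi-column layouts would need a smarter key than `(y0, x0)`.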

docs/app3.rst

Lines changed: 1 addition & 1 deletion
@@ -29,4 +29,4 @@ PyMuPDF Support
 ------------------
 We continue to support the full old API with respect to embedded files -- with only minor, cosmetic changes.
 
-There even also is a new function, which delivers a list of all names under which embedded data are resgistered in a PDF, :meth:`Document.embeddedFileNames`.
+There even also is a new function, which delivers a list of all names under which embedded data are resgistered in a PDF, :meth:`Document.embfile_names`.

docs/app4.rst

Lines changed: 1 addition & 1 deletion
@@ -113,7 +113,7 @@ Python on the other hand implements the OO-model in a very clean way. The interf
 
 When you use one of PyMuPDF's objects or methods, this will result in excution of some code in *fitz.py*, which in turn will call some C code compiled with *fitz_wrap.c*.
 
-Because SWIG goes a long way to keep the Python and the C level in sync, everything works fine, if a certain set of rules is being strictly followed. For example: **never access** a :ref:`Page` object, after you have closed (or deleted or set to *None*) the owning :ref:`Document`. Or, less obvious: **never access** a page or any of its children (links or annotations) after you have executed one of the document methods *select()*, *deletePage()*, *insert_page()* ... and more.
+Because SWIG goes a long way to keep the Python and the C level in sync, everything works fine, if a certain set of rules is being strictly followed. For example: **never access** a :ref:`Page` object, after you have closed (or deleted or set to *None*) the owning :ref:`Document`. Or, less obvious: **never access** a page or any of its children (links or annotations) after you have executed one of the document methods *select()*, *delete_page()*, *insert_page()* ... and more.
 
 But just no longer accessing invalidated objects is actually not enough: They should rather be actively deleted entirely, to also free C-level resources (meaning allocated memory).
 

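
Reviewer sketch (not part of the commit): the hunks above rename several camelCase method names to snake_case in the documentation. Code that still uses the old spellings could be updated mechanically along these lines. The mapping contains only the renames visible in this diff; the helper name `modernize` is made up, and the regex approach is an assumption that suits simple attribute calls, not a full Python refactoring tool.

```python
import re

# Old -> new method names, taken only from the hunks shown in this diff.
RENAMES = {
    "getText": "get_text",
    "getPixmap": "get_pixmap",
    "setRotation": "set_rotation",
    "deletePage": "delete_page",
    "embeddedFileNames": "embfile_names",
}

# Match an old name only as a whole attribute name, e.g. ".getText(".
_PATTERN = re.compile(r"\.(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def modernize(source: str) -> str:
    """Replace the old method names in source with their new spellings."""
    return _PATTERN.sub(lambda m: "." + RENAMES[m.group(1)], source)


old_line = 'pix = p.getPixmap(matrix=mat, alpha = False)'
new_line = modernize(old_line)
```

Longer names that merely start with an old name (for example `.getTextbox`) are left alone because of the trailing `\b`.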