Skip to content

Commit ce95278

Browse files
authored
Merge pull request #1182 from danofsteel32/patch-1
fix broken link layout preserving text extraction
2 parents 279028a + 44edc1a commit ce95278

File tree

1 file changed

+3
-1
lines changed

1 file changed

+3
-1
lines changed

README.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -55,7 +55,9 @@ Have a look at the basic [demos](https://github.com/pymupdf/PyMuPDF-Utilities/tr
5555

5656
**_New: Layout preserving text extraction!_**
5757

58-
> Script [textlayout.py](https://github.com/pymupdf/PyMuPDF-Utilities/blob/master/text-extraction/textlayout.py) extracts document text in a **_layout-preserving_** manner. So text in tables or multi-column pages should deliver quite faithful text output. Works for all supported document types, not just PDF.
58+
> Via its subcommand "gettext", script [fitzcli.py](https://github.com/pymupdf/PyMuPDF-Utilities/blob/master/text-extraction/fitzcli.py) offers text extraction in different formats. Of special interest surely is layout preservation, which produces text as close to the original physical layout as possible, surrounding areas where there are images, or reproducing text in tables and multi-column text.
59+
60+
See [here](https://github.com/pymupdf/PyMuPDF-Utilities/tree/master/text-extraction#layout-preserving-text-extraction) for more information on layout preserving text extraction.
5961

6062
Our **documentation**, written using Sphinx, is available in various formats from the following sources. It currently is a combination of a reference guide and a user manual. For a **quick start** look at the [tutorial](https://pymupdf.readthedocs.io/en/latest/tutorial/) and the [recipes](https://pymupdf.readthedocs.io/en/latest/faq/) chapters.
6163

0 commit comments

Comments
 (0)