tm or cm matrix with a visitor for 'extract_text'? #3377

I was testing [this example](https://pypdf.readthedocs.io/en/latest/user/extract-text.html#example-1-ignore-header-and-footer) from the documentation, but `print(text_body)` printed an empty string. I then changed the visitor function to this:

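```
def visitor_body(text, cm, tm, font_dict, font_size):
    # y = cm[5]  # as in the documentation; cm[5] was always 0.0 for this PDF
    y = tm[5]  # take the vertical offset from the text matrix instead
    print(cm, tm)  # debug output: both matrices for every text fragment
    if 50 < y < 720:
        parts.append(text)  # `parts` is the list defined in the documentation example
```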

and ran it again. This produced the expected outcome: `print(text_body)` printed the text of the 4th page of the PDF without header and footer. All printed cm matrices looked like this: `[1.0, 0.0, 0.0, 1.0, 0.0, 0.0]`, so `cm[5]` is always `0.0` and the `50 < y < 720` check never passes. I didn't look under the hood of pypdf itself to check where this might be used "incorrectly", but I think the [second example on that page](https://pypdf.readthedocs.io/en/latest/user/extract-text.html#example-2-extract-rectangles-and-texts-into-a-svg-file) has the same problem.
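For anyone who wants to compare both variants without editing the function each time, here is a minimal sketch. The names `make_body_visitor`, `extract_body` and the `use_tm` switch are my own and not part of pypdf; only `PdfReader` and `extract_text(visitor_text=...)` come from the library. The synthetic matrices at the bottom are made up to mimic what I saw printed (identity cm, position in tm), they are not taken from the actual PDF.

```python
def make_body_visitor(parts, y_min=50, y_max=720, use_tm=True):
    """Return a visitor that keeps text whose y offset lies in (y_min, y_max)."""

    def visitor_body(text, cm, tm, font_dict, font_size):
        # Index 5 of a 6-element PDF matrix is the vertical translation.
        y = tm[5] if use_tm else cm[5]
        if y_min < y < y_max:
            parts.append(text)

    return visitor_body


def extract_body(pdf_path, page_index=3):
    """Extract one page's text without header and footer (needs pypdf)."""
    # Imported lazily so the visitor factory above has no dependency.
    from pypdf import PdfReader

    parts = []
    page = PdfReader(pdf_path).pages[page_index]
    page.extract_text(visitor_text=make_body_visitor(parts))
    return "".join(parts)


# Synthetic fragments (hypothetical values): cm is the identity,
# tm carries the position of each piece of text.
identity = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
fragments = [
    ("header", identity, [1.0, 0.0, 0.0, 1.0, 72.0, 760.0]),
    ("body", identity, [1.0, 0.0, 0.0, 1.0, 72.0, 400.0]),
    ("footer", identity, [1.0, 0.0, 0.0, 1.0, 72.0, 30.0]),
]

with_tm, with_cm = [], []
for text, cm, tm in fragments:
    make_body_visitor(with_tm, use_tm=True)(text, cm, tm, None, 12)
    make_body_visitor(with_cm, use_tm=False)(text, cm, tm, None, 12)
```

With these inputs, the tm-based visitor keeps only the body fragment, while the cm-based one keeps nothing, which matches the empty string I got from the documentation example.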

See the attached SVG files, created with the code from the second example. The first one uses the cm matrix in `visitor_svg_text` and puts all of the text in the top left corner. The second one uses the tm matrix, and while it won't win any beauty contests, it looks a lot better:

![Image](https://github.com/user-attachments/assets/1309cdd4-3f72-4a65-a9d1-1d7a642c119a)
![Image](https://github.com/user-attachments/assets/cb53024a-3944-46f1-9bd4-64ca07848190)
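A possible way to avoid choosing between the two matrices is to combine them. This is only a sketch under the assumption that the visitor receives cm and tm as the PDF specification defines them (text space is mapped by the text matrix first, then by the current transformation matrix); I have not verified that pypdf guarantees this. `multiply_matrices` and `text_origin` are hypothetical helper names, not pypdf functions.

```python
def multiply_matrices(m, n):
    """Multiply two 6-element PDF matrices [a, b, c, d, e, f].

    Uses the row-vector convention of the PDF specification, so the
    result applies m first and n second.
    """
    return [
        m[0] * n[0] + m[1] * n[2],
        m[0] * n[1] + m[1] * n[3],
        m[2] * n[0] + m[3] * n[2],
        m[2] * n[1] + m[3] * n[3],
        m[4] * n[0] + m[5] * n[2] + n[4],
        m[4] * n[1] + m[5] * n[3] + n[5],
    ]


def text_origin(cm, tm):
    """Return the (x, y) origin of a text fragment after applying tm, then cm."""
    combined = multiply_matrices(tm, cm)
    return combined[4], combined[5]


# With an identity cm (as printed for this PDF) the result is just tm's offset.
identity = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
origin_plain = text_origin(identity, [1.0, 0.0, 0.0, 1.0, 72.0, 400.0])

# With a made-up cm that scales by 2 and translates by (10, 20),
# the offset from tm is scaled and shifted accordingly.
scaled_cm = [2.0, 0.0, 0.0, 2.0, 10.0, 20.0]
origin_scaled = text_origin(scaled_cm, [1.0, 0.0, 0.0, 1.0, 72.0, 400.0])
```

In a visitor such as `visitor_svg_text`, `x, y = text_origin(cm, tm)` could then replace reading index 4 and 5 from a single matrix.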

## Environment

Python 3.10.0, pypdf 5.7.0, Windows 10

## Code + PDF

See the links to the examples in the documentation above.

## Traceback

N/A

