When to do the cell rasterization? #8
Comments
In my WIP PoC implementation the only part that actually stores the image data is the upload command, which puts the image into the storage pool. When a render-image command is received, the image is rasterized based on the properties associated with that render-image command (number of columns / lines, alignment/resize policy). There is no need for the actual screen cell to know anything about the image's data contents except a pointer (reference) to the rasterized image (holding the above line/col/align/resize properties) and a coordinate into the rasterized image that this screen cell wants to draw. No image data needs to be extracted or held. When the terminal renderer kicks in, iterates over each screen cell and hits a cell that contains an image fragment, it starts fetching the actual data from the rasterized image, and only then does the actual rasterization take place. I think that is not too wasteful of resources. In fact, on the host you keep the image in memory only once, and you rasterize and GPU-upload only once per displayed cell; everything else is index operations.
That is interesting. I think that is exactly what people wanted: splitting upload from render, and being able to reuse the uploaded image. Forcing the upload to also know how to rasterize might be possible protocol-wise, but would be a layer violation if you ask me, and would also limit the possibilities of reusing the uploaded image in different scenarios.
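To make the flow above concrete, here is a minimal sketch of lazy, per-cell rasterization. It is not the PoC's actual code: the names `ImagePool`, `RasterizedImage` and `ImageCell` are hypothetical, pixels are modelled as a plain list of rows, and only a simple "stretch to fit" policy with nearest-neighbour sampling is shown (alignment and the other resize policies are left out).

```python
from dataclasses import dataclass, field


@dataclass
class RasterizedImage:
    """Result of a render-image command: grid properties plus a reference
    to the uploaded pixels. No pixel data is copied here."""
    pixels: list                 # shared reference into the storage pool
    columns: int
    lines: int
    cache: dict = field(default_factory=dict)   # (col, line) -> fragment
    rasterize_calls: int = 0

    def fragment(self, col, line, cell_w, cell_h):
        """Return the pixels for one cell, rasterizing on first use only.
        The cache assumes a constant cell size; a font change must flush it."""
        key = (col, line)
        if key not in self.cache:
            self.rasterize_calls += 1
            self.cache[key] = self._rasterize(col, line, cell_w, cell_h)
        return self.cache[key]

    def _rasterize(self, col, line, cell_w, cell_h):
        # Assumption: stretch the image over columns x lines cells and
        # sample with nearest neighbour.
        src_h, src_w = len(self.pixels), len(self.pixels[0])
        dst_w, dst_h = self.columns * cell_w, self.lines * cell_h
        tile = []
        for y in range(line * cell_h, (line + 1) * cell_h):
            sy = y * src_h // dst_h
            tile.append([self.pixels[sy][x * src_w // dst_w]
                         for x in range(col * cell_w, (col + 1) * cell_w)])
        return tile


@dataclass
class ImageCell:
    """What a screen cell holds: a reference and a coordinate, no pixels."""
    image: RasterizedImage
    col: int
    line: int


class ImagePool:
    """The only place that stores uploaded image data."""
    def __init__(self):
        self._images = {}

    def upload(self, name, pixels):
        self._images[name] = pixels

    def render(self, name, columns, lines):
        # Only records the grid properties; rasterization happens later.
        return RasterizedImage(self._images[name], columns, lines)
```

The renderer would call `cell.image.fragment(cell.col, cell.line, cell_w, cell_h)` when it reaches an image cell, so the work is done once per displayed cell and everything before that is bookkeeping.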
It's only done on the first operation (as mentioned above), I don't think that makes it more expensive.
When you refer to image alignment and image resize policy here, I actually agree. However, the client app does not necessarily know how to crop/resize to fill every pixel, as it does not necessarily know the aspect ratio and pixel dimensions of the grid cells. I've also taken that part from the original forum post that explains the "good image protocol" formalized here. :)
Maybe I do not understand your point here exactly, but I was attempting very hard to not have any notion of pixels for the client app side. The other good point should be, a proxying terminal emulator does not even need to understand the image format, it can simply treat it as a blob, which hopefully should satisfy the tmux author - I hope :)
👍 (I remember I have read that, will do so again and see what we can incorporate)
I mentioned above why I am against that (and I think the original GIP post didn't see it that way either, I have to reread while getting into the subject again though)
In the current protocol draft, when a font change / font resize takes place (at least in my implementation), the GPU cache is flushed and therefore the rasterized image has to be refreshed, which would then realign/resize it based on the given rasterization properties.
I took special care of that in the spec to make it as simple as possible for tmux-like proxy terminals (as mentioned above). If I have missed something, I of course have to adapt then :) P.S.: sorry for the late reply (you know), I had busy times in other areas that I had to close down first before resuming my work on this draft spec and PoC impl. :)
Ah sorry, my post is kinda old and I have to refresh my memory first as to why I wrote it that way. I think we are not as far from each other as it might sound above. Lemme quickly go over the steps I do in my playground branch, which I found to be efficient in the browser env:
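A small sketch of that flush-on-font-change behaviour, under the assumption that rasterized fragments sit in a cache keyed by cell position. `FragmentCache` and its `rasterize` callback are hypothetical names, not part of the draft or the PoC.

```python
class FragmentCache:
    """Caches rasterized cell fragments for one fixed cell size."""

    def __init__(self, rasterize, cell_size):
        self._rasterize = rasterize      # callable(key, cell_size) -> fragment
        self.cell_size = cell_size       # (width, height) of a cell in pixels
        self._entries = {}

    def get(self, key):
        # Rasterize lazily, using the rasterization properties behind `key`.
        if key not in self._entries:
            self._entries[key] = self._rasterize(key, self.cell_size)
        return self._entries[key]

    def on_font_change(self, new_cell_size):
        # Every cached fragment was produced for the old cell size, so drop
        # them all; the next get() re-rasterizes against the new size.
        self.cell_size = new_cell_size
        self._entries.clear()
```

Because the uploaded image and its rasterization properties live outside the cache, flushing loses nothing that cannot be regenerated.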
That's basically all I have to do. There is just one size transformation needed during upload, and later another one if the user changes the font/cell size. The reason why I do the size transformation up front is simple: it is a single action per image and speeds up later rendering in step 4 by roughly a factor of two (the real rendering gets reduced to a simple 1:1 bitmap clipping). But for that to work, I need the later ref-clip-drawing sequence to be transformation-free. That's basically the idea behind the post. This is directly linked to the GPU here. I see where you are coming from: having full access to OpenGL primitives gives you much more transformation freedom. But that's not the case for many TEs (framebuffer TEs, kernel consoles), and thus it should not influence the shape of the sequences. I think pretagging a fixed grid size to the image resource and clipping tiles 1:1 is the better general-purpose interface. Or to put it more simply:
Ofc this is all implementation detail; I could also do late transformation all the time (I actually started with that). But by reshaping the sequences as above, a TE without OpenGL access can decide to optimize image resource handling the way I did in xterm.js. That's not possible if the size transformation is hard-tagged to the later ref-drawing sequence.
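The early-transformation approach described in this comment could look roughly like the sketch below. It is not the xterm.js code: `GridImage` and `resize_nearest` are made-up names, pixels are a list of rows, and nearest-neighbour scaling stands in for whatever resampling a real TE would use. The point is that the image is resized once to a fixed grid size, so drawing a cell is a plain 1:1 slice.

```python
def resize_nearest(pixels, dst_w, dst_h):
    """Nearest-neighbour resize of a list-of-rows bitmap."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [[pixels[y * src_h // dst_h][x * src_w // dst_w]
             for x in range(dst_w)]
            for y in range(dst_h)]


class GridImage:
    """Image resource pretagged with a fixed grid size (cols x rows cells)."""

    def __init__(self, pixels, cols, rows, cell_w, cell_h):
        self.source = pixels          # kept so a font change can re-resize
        self.cols, self.rows = cols, rows
        self.resize(cell_w, cell_h)

    def resize(self, cell_w, cell_h):
        # The only size transformation: done at upload and on font change.
        self.cell_w, self.cell_h = cell_w, cell_h
        self.bitmap = resize_nearest(self.source,
                                     self.cols * cell_w,
                                     self.rows * cell_h)

    def tile(self, col, row):
        # Transformation-free drawing: clip one cell-sized tile 1:1.
        x0, y0 = col * self.cell_w, row * self.cell_h
        return [r[x0:x0 + self.cell_w]
                for r in self.bitmap[y0:y0 + self.cell_h]]
```

The trade-off is that one image resource has exactly one grid size while it lives in storage; a second size means a second upload or an explicit resize.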
Okay, this is where the major difference lies. However, your rasterization step (creating the "actual" image) in GIP is currently done in the wrt OpenGL, you are absolutely right, that is one of the reasons for me to be strict about introducing a Z-axis (see kitty protocol - we should have a separate discussion on that).
For me that's the crucial point. With your sequence an app can freely resize on every clip-drawing sequence (thus I need to store several resized versions), while I think we should limit it to one grid size per image ref resource for as long as it lives in the storage.
That's basically the idea behind my layering issue #11. Guess we should start over from scratch; my issue is convoluted with image layering/blending and whether a layer should follow text grid mechanics and where it should appear relative to other TE layers. Those are in fact multiple issues in one.
To be clear, I am not against that, I just want to be super sure about it.
Currently in GIP points 2 and 3 are merged. In your request, points 1 and 2 are merged. I wouldn't try to propose yet another solution (even more VT GIP commands to satisfy everyone), because then you'd still need to implement the currently proposed commands. I think I'm fine with moving point 2 up into the upload command then (maybe there isn't that much of a use case for it after all, and if there is, it could be added in an improved protocol version (1.1 or so) that builds upon the existing one and adds more commands if necessary).
I'll post my reply on that matter in #11 then.
👍 Yes, that's basically what I had in mind (besides some more params like the real byte length, to do at least a basic sanity check in the sequence parser).
Interesting that you bring that up now. I have also thought about that (as mentioned earlier in the xterm.js image PR), up to generalizing it further as a universal blob upload sequence. As I wrote over there, I think we will see more upload demand for other data types as well. The next hottest candidates are vector graphics (we don't even have a good format for that yet) and some reduced PDF things. With broader image support we basically open a door for terminals: limited document support. Not sure how far that could be generalized in the end, I haven't gone further down that road yet.
Note that I am already displaying many PDFs in Notcurses via sixel/kitty as a result of ffmpeg cheerfully decoding them into RGBA. IMHO this kind of thing ought to be client-side, rather than adding formats to decode to the terminal (and how will that work -- will you have mime-type discovery available to the client?).
@dankamongmen Nah, nothing concrete yet in the document direction, and in general I agree with you: we should not try to reshape terminals into a text-stream-based browser/rich client. It is beyond their scope as an IO "extension" of cmdline apps. Still, the line between just providing capable basic output interfaces and going document style is a thin one, and we have already crossed it here with image formats. Same gonna happen with vector graphics, as there is some demand for that from typical plotting REPLs. (If I ever find time for that I prolly will strip down SVG-tiny into a terminal-secure variant as SVG-T.)
As already indicated in our talk I am a bit concerned about applying image transformations late in the draw sequence, as your current draft implies. There are several reasons I find this an unfortunate choice, mainly:
Ofc I see the benefits in doing it this way: it would open the terminal to much finer-grained image output from just one upload. Still, the reasons above make me wonder if we should go with a much simpler abstraction and leave complicated image transformations to the app side. Mainly the pixel notion in the draw command bugs me. To use it successfully the app side always needs to know the correct image dimensions, which is not always the case (e.g. from curl <some_image_url>).
My proposal would basically separate the image-to-terminal-cells-clipping (cell rasterization) from the later cropping:
CellRange(width, height)
CellRange
The interim state of this upload command would be an image reference, that has a width and a height in terminal cells, not pixels anymore. It makes any follow-up drawing rather simple - just draw CellRange(width, height) starting at Cell(sx, sy) of the image to Cell(dx, dy) of the display buffer. (We basically replaced pixels with cells as metrics, thus do the cropping within that new metrics.)
Furthermore changing font size in a terminal and thus different pixel cell coverage can easily be implemented by resizing the image resource behind the reference. (Imho it is important not to allow an image to change cell coverage from different font sizes.)
This would also work easily with multiplexers - they can just forward the upload and the drawing commands and expect them to work if supported.
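As an illustration of the cell-based draw described above, here is a minimal sketch. `Cell` and `CellRange` follow the names used in the proposal; `ImageRef`, `draw`, the buffer layout and the clipping behaviour are assumptions made for the example, not part of any spec.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    x: int
    y: int


class CellRange(NamedTuple):
    width: int
    height: int


class ImageRef(NamedTuple):
    """Interim state after upload: an id plus a size in cells, not pixels."""
    image_id: int
    size: CellRange


def draw(buffer, image, src, extent, dst):
    """Copy `extent` cells of `image`, starting at image cell `src`, into the
    display buffer at `dst`. Each buffer cell only stores a reference
    (image_id, image_x, image_y). Out-of-range cells are skipped (assumed)."""
    rows = len(buffer)
    cols = len(buffer[0]) if rows else 0
    for j in range(extent.height):
        for i in range(extent.width):
            ix, iy = src.x + i, src.y + j
            bx, by = dst.x + i, dst.y + j
            if not (0 <= ix < image.size.width and 0 <= iy < image.size.height):
                continue   # outside the image's cell grid
            if not (0 <= bx < cols and 0 <= by < rows):
                continue   # outside the display buffer
            buffer[by][bx] = (image.image_id, ix, iy)
```

Nothing in `draw` depends on pixel sizes or on the image format, which is why a multiplexer could forward both commands without decoding the image.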