Compressing an entire level of mip levels #136
Comments
It is compressing an entire level. As with the OpenGL API for uploading textures, what is stored in the mip level of a cubemap is all 6 faces at that level, of an array texture is the images of all layers at that level. KTX v2 is just like KTX v1 in this regard except that the level order is reversed and there are no level size fields mixed in with the image data. Please point me at the confusing language in the spec. Compressing the mip tail in one go would break the idea of being able to stream the file and display a low resolution image right away. It could be done by adding a new supercompression scheme but, if we were to add a new scheme, I think we would create one that used a dictionary shared between the mip levels. In conjunction with the zstd API that uses a decompression context I think the additional overhead for each mip level after the first would be very small. ETC1S/BasisLZ already uses a shared dictionary (a.k.a codebook). |
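To make the "a level holds every face and layer" point concrete, here is a minimal sketch of how the byte length of one mip level could be computed. The function name and the default 4x4 block with 16 bytes per block (as in BC7 or ASTC 4x4) are assumptions for illustration, not taken from the spec or from libktx.

```python
def level_byte_length(width, height, depth, layer_count, face_count,
                      level, block_w=4, block_h=4, block_bytes=16):
    """Bytes in one mip level: all layers and all faces at that level.

    Hypothetical helper. A count of 0 for depth or layers is treated as 1,
    mirroring how KTX2 uses 0 to mean "not 3D" / "not an array".
    """
    # Round-down mip dimensions, clamped to 1 texel.
    w = max(1, width >> level)
    h = max(1, height >> level)
    d = max(1, depth >> level)
    # Block-compressed formats store whole blocks, so round up to the block grid.
    blocks_x = (w + block_w - 1) // block_w
    blocks_y = (h + block_h - 1) // block_h
    image_bytes = blocks_x * blocks_y * d * block_bytes
    # One level is the images of every layer and every face at this size.
    return image_bytes * max(1, layer_count) * face_count
```

For a 256x256 cubemap, level 0 comes to six 64 KiB images, and the 1x1 level is still six single blocks, which is the unit that gets supercompressed.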
Snippets below. I named these chunks in my KTX encoder/decoder to distinguish them from mip levels. An individual element (array, slice, or face) then represents a chunk. I know when I first implemented KTX support, and especially with storing face size and level size in the same field, I got that wrong on import/export. Maybe calling these an "aggregate level" in the spec would help. One other part I couldn't find was a clarification of the formula for mip level calculation. All the hardware uses round-down, but it's not ideal for mipgen. There's round-up and round-down, but since DX/GL did round-down the other APIs followed suit. We have a lot of 2D textures, but I guess we can aggregate them into a 2D array instead of dealing with the packed miptail. I appreciate that I only need one array of mip sizes, and then the chunks are just offsets once unpacked. This is something that could be done with a slightly modified KTX file that strips the size and stores the zstd compressed levels. But that's mostly what KTX2 already does. The main idea is to supercompress the mips as KTX2 to get them to the player, decode/transcode a level to memory as a shared MTLBuffer or mmap larger decompressed mips directly as shared MTLBuffer, and then copy/twiddle that to a private MTLTexture or sparse texture. I have a viewer and encoder I wanted to share at https://github.com/alecazam/kram. It at least starts to open up KTX creation and visualization on macOS. Next I'd like to add KTX2 support. ** Here's one snippet where levelCount could mean the array holds one level per mip level as in a simple 2D texture which is where most start out with these texture formats. 3.7. levelCount: max = ⌊log2(max(pixelWidth, pixelHeight, pixelDepth))⌋. levelCount = 0 is allowed, except for block-compressed formats, and means that a file contains only the base level and consumers, particularly loaders, should generate other levels if needed.
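As a worked illustration of the formula quoted above and of the round-down versus round-up question, here is a small sketch. The function names are made up for this example. The full chain length is the floor-log2 value plus one, and round-down is what the comment says DX/GL hardware uses.

```python
def full_level_count(width, height=1, depth=1):
    """Levels in a complete mip chain: floor(log2(max extent)) + 1."""
    largest = max(width, height, depth, 1)
    # bit_length() - 1 is an exact integer floor(log2(n)) for n >= 1.
    return (largest.bit_length() - 1) + 1

def mip_extent_round_down(size, level):
    """Extent of a level with the round-down rule, clamped to 1."""
    return max(1, size >> level)

def mip_extent_round_up(size, level):
    """Extent of a level with the round-up rule, shown only for contrast."""
    return max(1, (size + (1 << level) - 1) >> level)
```

For a width of 5, round-down yields 5, 2, 1 while round-up yields 5, 3, 2, so the two rules disagree on non-power-of-two sizes, which is why the choice matters for mip generation.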
** And this one: is this a level of mip levels or a single mip level? Should KTX support level sizes > 4GB? Discussion: Users have reported having base levels > 4GB for 3D textures. For this the imageSize field needs to be 64-bits. Loaders on 32-bit systems will have to ensure correct handling of this and check that imageSize <= 4GB, before loading. Resolved: Be future proof and make all image-size related fields 64 bits. ** And this made me think the individual mips were compressed, so you could offset into them across the array. Should the supercompression scheme be applied per-mip-level? Discussion: Should each mip level be supercompressed independently or should the scheme, zlib, zstd, etc., be applied to all levels as a unit? The latter may result in slightly smaller size though that is unclear. However it would also mean levels could not be streamed or randomly accessed. Resolved: Yes. The benefits of streaming and random access outweigh what is expected to be a small increase in size. |
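The trade-off in that resolved issue can be sketched in a few lines. This example uses Python's standard `zlib` module purely as a stand-in for zstd, and the helper names are hypothetical. It only shows why per-level compression keeps random access, not how libktx does it.

```python
import zlib

def compress_per_level(levels):
    """Supercompress each mip level independently (what KTX2 resolved to do)."""
    return [zlib.compress(level) for level in levels]

def compress_as_unit(levels):
    """Alternative: one compressed stream covering every level."""
    return zlib.compress(b"".join(levels))

def read_level(compressed_levels, index):
    """Random access: only the requested level is decompressed."""
    return zlib.decompress(compressed_levels[index])
```

With the per-level layout a loader can call `read_level(..., index)` for just the small levels first. With the single-unit layout it has to decompress the stream from the start to reach any given level.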
I finished KTX and KTX2 support in my viewer. It's working well, and converting KTX -> KTX2 and then using ETC/BC/ASTC + zstd supercompression really smashes them down. Also seems good for storing HDR 16f/32f source images in KTX2 files for source control. I added an "any" path to test out BasisLZ, but see results on my github page since the archive was 10x bigger and 10x slower to generate using UASTC. If further encoding of the UASTC file is needed, then that defeats the purpose of having each mip level available to decompress. |
Sounds great. I want to try out kramv but need to finish integrating the latest Basis Universal code into KTX-Software first. See below.
I am not too surprised by the 10x bigger, after all UASTC is 2x larger c/f ETC1S, but am by the 10x slower. There have been recent encoder improvements in the Basis Universal code that may help. Hence my desire to get that code integrated.
For UASTC, Zstd supercompression is needed to begin closing in on BasisLZ, which has built in supercompression. With UASTC + Zstd, each miplevel is supercompressed independently so you can still decompress individual levels. I do not understand your comment. |
I think I was looking at just UASTC and RDO. So with zlib on the KTX2 file, that brought it down. I thought the intent was then the overall KTX2 file needed to be compressed, but I was hoping individual mips could be. Rich is working on the perf, so I'm sure it will improve.
From the ktxsc usage, I wasn't sure if zstd and uastc could both be specified together. I thought those might be exclusive of one another due to the supercompression setting being either BasisLZ or Zstandard. I just tried them both, and that worked. Also I found out 1D and 1DArray textures are pretty limited on Metal. 1D can't have compression, and can't have mips. It feels like this texture type could be replaced with 2D and 2DArray. I adjusted my scripts and kept these types. Also I have a 4x4 checkerboard texture that's failing with "out of memory" in ktxsc. I'll file another issue on that. |
BasisLZ is supercompressed ETC1S universal format. UASTC is another universal format which can be supercompressed with Zstd. So as you have discovered you can use both uastc and zstd in |
Seems that vkFormat = 0 and supercompression == 2 (Zstd) in the above scenario. So now I check for that, and reject the file until I can get UASTC decode. Seems that to get compressed transcodable files, there's another stage and temporary memory involved.
Also with some formats (BC6 and ASTC HDR and ASTC5x5+), one will still need platform-specific encoding formats. If an app doesn't use those, then I suppose they can all be transcoded from UASTC. I could also see decode and skip for various array or cube faces that aren't needed. I'm trying to avoid writing out decompressed data to disk for mmap, since it wastes space on the mobile devices. But mmap avoids jetsam, so there are tradeoffs with compressed mips. |
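The "vkFormat = 0 and supercompression == 2" check mentioned in this comment could look roughly like the sketch below. It reads only the fixed KTX2 header fields (12-byte identifier followed by nine little-endian uint32 values). The function name is made up, and note the assumption: this is the heuristic from the comment, and a loader that wants certainty about UASTC would need to inspect the data format descriptor as well.

```python
import struct

# KTX2 file identifier: «KTX 20»\r\n\x1A\n
KTX2_IDENTIFIER = bytes([0xAB, 0x4B, 0x54, 0x58, 0x20, 0x32,
                         0x30, 0xBB, 0x0D, 0x0A, 0x1A, 0x0A])
VK_FORMAT_UNDEFINED = 0
SUPERCOMPRESSION_ZSTD = 2

def is_undefined_format_with_zstd(header_bytes):
    """True when vkFormat is 0 and the supercompression scheme is Zstandard."""
    if header_bytes[:12] != KTX2_IDENTIFIER:
        raise ValueError("not a KTX2 file")
    # vkFormat, typeSize, pixelWidth, pixelHeight, pixelDepth,
    # layerCount, faceCount, levelCount, supercompressionScheme
    fields = struct.unpack_from("<9I", header_bytes, 12)
    vk_format, supercompression = fields[0], fields[8]
    return (vk_format == VK_FORMAT_UNDEFINED
            and supercompression == SUPERCOMPRESSION_ZSTD)
```

A viewer without a UASTC decoder could use this to reject such files early, before touching any level data.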
The
How is that? |
Those sentences are still not clear to me, but what you have is a little better. Removing the technical references helps me read the definition better. A diagram would probably also help. See what you think of the following: When streaming a KTX file, the smallest images of the mip level arrays can be decoded as received, transcoded if needed, then uploaded to a buffer or texture to display a low-resolution mip chain while the remaining larger mip level arrays finish streaming. Use of lod clamping and calls to copy mips into larger textures may be needed.
Some thoughts about my use of KTX2 in the wild. Is there documentation about supercompression preventing per image access? Formats like JPG had Huffman reset markers that let you process the compressed stream across multiple threads or skip chunks, but KTX2 doesn't have that. I can see for large 2d array atlases and sparse textures, where individual access to larger mips might be of value. Also supercompressed Basis UASTC requires a further transcode requiring additional memory, where supercompressed BC/ETC/ASTC can decode direct to staging buffer to be twiddled to the private texture format. Note, that on consoles, the twiddling would likely be stored directly into the KTX2 blocks to avoid staging but that would have to be conveyed in a prop. I think in general, transmitting the entire KTX2 files/bundle, mmap-ing that as compressed backing store, and then decoding mips as needed to staging buffer, then blit twiddle to private textures is ideal. Maybe gltf2 can benefit in the browser from progressive download, but it really complicates the texture loader and memory and gpu resource handling. A loader that progressively loads/drops the larger mips to conserve memory once the full KTX2 or bundle of KTX2 is available is more common for games and works to avoid jetsam on mobile. We can only supply textures in signed bundles on mobile and console, not as individual textures. One can flush the entire GPU copy, since the KTX2 is the backing store in compressed form similar to a PNG. Also just wanted to say thanks for all the great work on KTX and KTX2. These formats are such a joy compared to all the formats I've worked with prior. |
The big downside to reversed mips in KTX2 is that I have to seek backwards to write in-place mips to the file and special case code vs. KTX. With KTX, I could write mips in-order for a 2d texture. The single texture streaming isn’t applicable to any of my use cases. And KTX2 aren’t stored compressed in my archives, only the mips are. |
No. Perhaps I should add a note. Only zstd supercompression prevents per-image access. BasisLZ has an index of the offsets and sizes of the data for each image in the
You have individual access to any mip level. Do you mean individual access to the images of a mip level? I'm not sure that is useful. For example you have to have all of a cube map's face images at a particular level size before you can use that mip level. |
Yes, I was specific about 2d and 2d array atlases (and sparse), but the spec does have partial cubes, and there are cube arrays which are often locationally dependent. For example, there are many problems with combining atlas entries into a single 2d texture (f.e. mip, alignment, block bleed and no wrap support) but I see many sparse textures built this way. Also Substance uses charts which break all hope of mips. So I'm moving more towards storing atlas/flipbook data in 2d array textures. These are a fixed dimension, but make it easy for artists to build, but limited to 2048 elements. I could see load and grow the array strategies to only load atlas entries that are referenced. These are ES3 level now, so supported by all hardware of import. |
Seek backwards in what? What I do in the libktx writer is have a calculation of the offset of a mip level with the data and I write the data for a level to the calculated offset. The only thing that differs between KTX and KTX2 is the calculation.
I don't understand what you mean by this. |
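The "calculate the offset, then write the level there" approach can be sketched as follows. This is not the libktx code. The function name and the `alignment` parameter are assumptions for illustration, and the only KTX2-specific fact relied on is that the smallest level is stored first.

```python
def ktx2_level_offsets(level_sizes, data_start, alignment=1):
    """File offset of each mip level when the smallest level is stored first.

    level_sizes[0] is the base (largest) level. The returned list is indexed
    the same way, so offsets[0] is where the base level goes.
    """
    offsets = [0] * len(level_sizes)
    cursor = data_start
    # Walk from the smallest level to the base level, as laid out in the file.
    for level in reversed(range(len(level_sizes))):
        cursor = (cursor + alignment - 1) // alignment * alignment
        offsets[level] = cursor
        cursor += level_sizes[level]
    return offsets
```

With all offsets known up front, a writer can generate levels in any order (for example base level first during mip generation) and seek to `offsets[level]` for each one. Only the offset calculation differs from a KTX1 writer.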
@alecazam if per-image access in zstd compressed mip levels is important to you I suggest you propose a new supercompression scheme that permits it. Basically it would have supercompression global data with an index of the images within the compressed data. If it's going to have global data, it's worth considering having a global dictionary as well. |
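A rough sketch of what such a scheme might look like: each image in a level is compressed separately against a shared dictionary, and an index of (offset, length) pairs plays the role of the global data. Everything here is hypothetical. It uses Python's `zlib` preset-dictionary support as a stand-in for a zstd dictionary and does not correspond to any existing KTX2 supercompression scheme.

```python
import zlib

def pack_level(images, dictionary):
    """Compress each image on its own; return (index, blob).

    index[i] is the (offset, length) of image i inside blob.
    """
    index, blob = [], bytearray()
    for image in images:
        compressor = zlib.compressobj(zdict=dictionary)
        data = compressor.compress(image) + compressor.flush()
        index.append((len(blob), len(data)))
        blob += data
    return index, bytes(blob)

def unpack_image(index, blob, dictionary, i):
    """Decompress only image i, using the same shared dictionary."""
    offset, length = index[i]
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob[offset:offset + length]) + decompressor.flush()
```

The index costs a few bytes per image, and the shared dictionary is meant to offset the loss from compressing small images in isolation. Whether that pays off for real texture data would need measuring.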
Yes, I do something similar, but originally I tried to minimize memory use by writing mips directly to file, and then mmap-ing them back in read-only. I should probably decouple the file system from mip encoding. But currently I fseek to the offsets that I have. It just means the file system zeros a bunch of pages, and then as I seek back, then they get filled in with the mip data generated from the largest level.
Yes, that's reasonable. I'm still building out the atlasing commands, and have some info on my kram page about the idea for using 2d arrays instead of charts. I likely don't have any atlases yet large enough to justify per image decode, but trying to think ahead. Mostly my arrays are small particle textures. |
Thanks for the kind words @alecazam. Khronos will soon be announcing KTX 2.0 & universal textures support and we kindly ask if we may use this quote in the press materials. If you are okay with that, please tell me the company name and title we should use for attribution. I'm sorry for asking in this forum but I don't have any direct contact info for you and GitHub doesn't seem to have a way to send private messages. |
@alecazam I received your private reply to my question about using your quote. I sent several responses from 2 different e-mail addresses asking for some additional info. I have not received any further reply from you. The announcement will be happening r.s.n so please contact me again with the info I requested. |
Hey Mark, I sent you a private reply just so you had my email address from that. I didn't get any responses to the message that I sent you on that email address. [email protected] is my email. Happy to confirm anything you need, and I also confirmed with my company that attributing my name and company are okay. |
I sent messages to that address on Mar 6th, 13th and 17th. The last was from a different address than you sent your message to. Strange you never got them. In the email you sent me you did not identify your company or position which we would like for the attribution. That is what I was asking for in my e-mails. |
Responded in private email. Company was in the original, but not position so I added that. |
Thank you. I got your message. Sorry I missed the company name in your first e-mail. Strange my other messages were never delivered. |
KTX and KTX2 store mips at levels (reversed from one another). For arrays, cubes, etc. the spec language of a mip level and a level of mip levels gets a bit conflated.
Supercompression of individual mips seems overkill at the smaller mip levels, and necessary at the larger mip levels. Is there any possibility to have compression of an entire level of mip levels? For a cube or cube array, I'd want to decompress 6 faces at a time, since the texture is useless without all the data. For a 3D volume, I need an entire level before it can be displayed.
For 1D arrays, there are no mips, but I may want to supercompress all levels in one compressor and then copy out the results from a single decompress. I know Basis can also optimize blocks across mip levels, and maybe across a level of mips.
Even for the basic 2D with mips case, I'm thinking of wanting to unpack a compressed packed mip tail for sparse textures, but not wanting to hit the decompressor so much. With hardware decompressors, I could see repeatedly sending small mips as performance prohibitive. I also thought with KTX1, the idea was to upload the entire level in one upload call (or copy to a buffer).
Also if a file indicated that only levels were supercompressed, you'd basically just need the same mip count setup as KTX1 but with the compressed sizes vs. uncompressed. Once uncompressed, the offset into the level is the same as with KTX1. That would save storing the compressed/uncompressed sizes for every mip level.