Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposing changes to drawing backend #8

Open
M2-TE opened this issue Nov 22, 2024 · 10 comments · May be fixed by #9
Open

Proposing changes to drawing backend #8

M2-TE opened this issue Nov 22, 2024 · 10 comments · May be fixed by #9
Labels
enhancement New feature or request

Comments

@M2-TE
Copy link
Contributor

M2-TE commented Nov 22, 2024

Disclaimer: I am mostly talking about the drawing of graphs, as that is what I focused on for now in #6.

The current drawing method relies on the << operator of std::cout in combination of '\n' for line breaks.
This may result in uncontrolled flushes and somewhat suboptimal console writes.

Since you put "performance" as one of the TODOs, I would propose a change to the way graphs are drawn to the console via a Temporary Buffer:
The size is mostly known beforehand, where enough memory should be allocated to fit
height*width*(unicode_size+ANSI_escape_size)
which assumes that every unicode character may be accompanied by an ANSI escape code for color. The color here would be fixed width and may be 4-/8- or 24-bit color.
The entire buffer or the separate rows can be written to the console more efficiently using std::cout.write(row, row_bytes) followed by manual flushing, which will be much more efficient than using std::cout with <<.

Some side effects of using such a buffer will be the freedom of what parts of the graph can be written to and in what order; e.g. it would allow first drawing the plot/graph itself, followed by potentially overwriting via axis drawing in a second pass. Not only may this be more optimal, it may also allow some better encapsulation of the code used to draw stuff.

If not a full temporary buffer, it could instead be a good idea to simply create a new buffer for every row during the iteration in the current drawing method, at the end of which std::cout.write() can be called to write that entire row, while not really changing any of the existing code apart from replacing std::cout << instances.

Additionally, we could provide a manual buffer for std::cout itself via rdbuf and disable automatic buffer flush with sync_with_stdio (may be of interest to do especially when you want to keep using std::cout<<), since we know pretty well when flushes should occur.

@tdulcet tdulcet added enhancement New feature or request help wanted Extra attention is needed labels Nov 22, 2024
@tdulcet
Copy link
Owner

tdulcet commented Nov 22, 2024

Thanks for the detailed proposal. I was not as familiar with some of these low level C++ APIs.

One change I have been considering is replacing cout with an instance of ostringstream and then doing a just single cout at the end. I do not expect that this would improve performance, but it would at least solve the flushing problem. Doing this on a per row basis and/or with cout.write could also work as well. The libraries used to manually flush the buffer at the end by using endl for the final new line, but I recently removed it, as it was unnecessary. I have also considered making the output destination configurable in the options structs by setting an instance of basic_ostream, so the user could for example output it to a file instead, but that may be more applicable to the tables library.

Using a string and preallocating the size would work for the graphs library where everything is already a string or char array, but we would still need to use ostringstream for the tables library, as it could be outputting arbitrary data types if they overloaded the << operator. (Note that this would also be more similar to the Python port of the libraries, where they build up a str and then have just a single print call at the end.)

Your proposal to generate the graphs in layers would be amazing, but I suspect it would be extremely difficult to implement in practice, as the ANSI escape codes are not applied to every character and are also not fixed width. For example, the tables library has to use a regular expression to temporarily strip any ANSI escape codes from the provided strings, but that only supports a small subset of the available codes. In addition, due to the Unicode characters, the string length even without the ANSI escapes is not the same as the column size, so it would not be obvious what indexes in the buffer to replace.

@M2-TE
Copy link
Contributor Author

M2-TE commented Nov 22, 2024

One change I have been considering is replacing cout with an instance of ostringstream and then doing a just single cout at the end. I do not expect that this would improve performance, but it would at least solve the flushing problem.

I believe using ostringstream would make a difference as every std::cout<< is synchronized to the standard C streams at the moment, though setting sync_with_stdio to false would achieve the same result.

I have also considered making the output destination configurable in the options structs by setting an instance of basic_ostream, so the user could for example output it to a file instead, but that may be more applicable to the tables library.

That actually sounds like a really good idea, people may want to store even some graphs to a file (or their own basic_ostream).

ANSI escape codes are not applied to every character and are also not fixed width

My basic idea was to make stuff fixed width by padding with unrendered characters and simply supplying color for every single unicode character.. but I see now that it might result in a little too much overhead. I'll fiddle around with this idea a bit and see if we can do something to have obvious indices without too much overhead

@tdulcet tdulcet removed the help wanted Extra attention is needed label Nov 23, 2024
@tdulcet
Copy link
Owner

tdulcet commented Nov 23, 2024

That actually sounds like a really good idea, people may want to store even some graphs to a file (or their own basic_ostream).

I updated the libraries to replace cout with ostringstream, as well as added a new output stream option to change the output destination. Note that displaying the tables and graphs requires a fixed width font and the ANSI escapes of course require a terminal, so outputting them to a file would likely not be useful. I also made some improvements to how the formatting/alignment are specified for the tables library. See ed5d06d for the full changes. Feedback is welcome.

PRs are also welcome if you would like to make any additional improvements.

@M2-TE
Copy link
Contributor Author

M2-TE commented Nov 24, 2024

I updated the libraries to replace cout with ostringstream, as well as added a new output stream option to change the output destination.

That looks pretty good!

Note that displaying the tables and graphs requires a fixed width font and the ANSI escapes of course require a terminal, so outputting them to a file would likely not be useful.

Using the right font would be entirely up to the user, but yea I forgot about the ANSI escapes.

As for the idea of multi-staged drawing the graphs/tables, here's something that might work really well:
To write in multiple stages, using the actual ANSI escapes and characters does not seem to make much sense, so a lightweight intermediate representation would be much better

struct Backbuffer {
    using Row = std::vector<Pixel>;
    std::vector<Row> rows;
    // (or just use a single std::vector<Pixel> for the entire thing)
};

where every Pixel can either be

struct Pixel_A {
    uint16_t col_index; // index into custom color palette
    uint16_t char_index;
}; // 4 bytes, 2 alignment

or

struct Pixel_B {
    union {
        struct { // 4-/8-bit ANSI color index
            uint16_t col_index;
            // uint16_t automatic_padding;
        };
        struct { // 24-bit true color
            uint8_t r;
            uint8_t g;
            uint8_t b;
            // uint8_t automatic_padding;
        };
    };
    uint16_t char_index;
}; // 6 bytes, 2 alignment

The Pixel_A contains a col_index into a temporary color palette for that specific graph/table, which should be very small in most cases; 2 bytes in size simply for alignment.
For an approach where you want to store color directly, Pixel_B would allow storing either an ANSI or 24-bit true color.

This approach carries two benefits:

  1. The drawing can occur in various stages (would also allow for live-updating?) that will help code readability and makes things easier to extend down the line.
  2. All rows can be drawn independently, even across stages, so multi-threading would become an option. Of course, thread creation time needs to be weighed against the potential benefit, but for larger plots with e.g. large width, this might be pretty neat.

After the intermediate representation is built, it can be drawn to the requested output stream rather easily. In the case of multi-threading, each thread could build its own ostringstream that would be trivial to write onto the output stream.

What are your thoughts on this?

@tdulcet
Copy link
Owner

tdulcet commented Nov 24, 2024

What are your thoughts on this?

Nice, I like this proposal a lot! As you know probably know, the current intermediate representation is just: vector<vector<unsigned short>>, where only the color is stored, so either of your pixel structs would be a good improvement on top of that.

I actually have been considering doing something like this in order to allow the user to alternatively specify 8 and 24-bit color in the options struct:

using color = variant<color_type, unsigned char, array<unsigned char, 3>>;

This is very similar to your Pixel_B struct, just using a C++ variant instead of a union. I just have not yet had time to figure out all of the details of how this might work in practice, but it looks like you have much of it done.

Note that char_index would not work, as there are multiple arrays it is indexing into depending on the configured options and not all of the strings that are used are stored in an array. We would also need some way to handle strings like the axis labels that could span multiple columns. Maybe rather than trying to combine both layers/stages into a single struct, we could just have two separate intermediate representations, one for the actual graph (the color) and another one for the strings (the borders, axis, axis labels, etc.) that are optionally displayed on top.

Live updating would be great. I would be interested in your thoughts about how we could potenchally support it. I have some commented out code to simulate that here using more ANSI escapes, although it of course redraws the entire graph each time:

for (unsigned k = 10; k < 300; ++k)
{
cout << "\e[1;1H"
<< "\e[2J";
graphs::functions(k, k, xmin, xmax, ymin, ymax, size(functions), functions, aoptions);
usleep(200000);
} */
I suspect the graphs would need to be extremely large in order to get a performance improvement from multi-threading. May I ask what your use case for the libraries is? BTW, if you have not already, you might want to look at my command line programs that wrap these libraries, as they make it a lot easier to try the various features, as one can just change the command line options instead of having to recompile. They also allow one to add an optional legend to the graphs.

@M2-TE
Copy link
Contributor Author

M2-TE commented Nov 24, 2024

Note that char_index would not work, as there are multiple arrays it is indexing into depending on the configured options and not all of the strings that are used are stored in an array

The adressing should work, could e.g. use the first n bits as an index into the right array of characters (or use it as a switch input to not have to change the current character lookup arrays and to handle strings not stored in arrays).

struct Pixel_A {
    uint16_t col_index: 16; // index into custom color palette
    uint16_t char_arr_index: 6; // index or switch input for choosing right array of characters
    uint16_t char_index: 10; // index into character array
}; // 4 bytes, 2 alignment

Maybe rather than trying to combine both layers/stages into a single struct, we could just have two separate intermediate representations, one for the actual graph (the color) and another one for the strings (the borders, axis, axis labels, etc.) that are optionally displayed on top

That should solve the issue of multi-column labels and such.

Live updating would be great. I would be interested in your thoughts about how we could potenchally support it.

I think there are two decent solutions:

  1. Return some sort of handle to the user that retains the intermediate representations and can optionally be provided to the next draw as a starting canvas to draw the new data onto. If the user ignores the returned handle, the data safely destroys itself.
  2. To overwrite the previously drawn graph, one might combine the 1. idea with ncurses, which seems to be the standard to update specific parts of a given terminal output. If not this, some sort of "distance to the start of previous graph" should be stored, which your ANSI escape solution could walk the cursor back to.

I suspect the graphs would need to be extremely large in order to get a performance improvement from multi-threading.

Yea that is very likely true, hence my idea to at least benchmark things before going wild on multithreading. It was just an idea as the drawing process seems very trivially parallelizable.

May I ask what your use case for the libraries is?

I draw the outputs of FFTs, so I like to use the entire width of a fullscreen terminal to inspect the output of a e.g. 4096-point FFT.

BTW, if you have not already, you might want to look at my command line programs that wrap these libraries, as they make it a lot easier to try the various features, as one can just change the command line options instead of having to recompile

Thanks for the pointer! It seems like recompiling works just fine for me, as compilation + output are basically instant at the press of a button

@tdulcet
Copy link
Owner

tdulcet commented Nov 25, 2024

That should solve the issue of multi-column labels and such.

To avoid overwhelming ourselves by attempting to make too many changes at once, maybe we should start with just adding support for 8 and 24-bit color. Then, we could add a second intermediate representation for the strings. For that, I do not think we would need anything too complex, so maybe just store the sting itself and the number of columns:

struct character // a single character, may overflow multiple columns
{
    string str; // the character/string
    size_t columns; // number of terminal columns needed to output the string
};

Note that I am referring to a single terminal character, which could show multiple data points depending on the value of the type option. For example, the default Braille characters can hold 2 × 4 = 8 "pixels" per character. That makes the coloring more complicated when graphing multiple arrays, as one then needs to blend the colors.

@M2-TE
Copy link
Contributor Author

M2-TE commented Nov 26, 2024

Which Pixel representation do you prefer?
I am personally a fan of the Pixel_A struct with the color palette, as those will be very flexible and can easily fit all kinds of color types without needing to store the information in the pixel directly.

Also, the name Pixel does not really fit all that well I believe.. since it may often cover more than one data point, aka 8 pixels in the case of Braille. In computer graphics, this kind of thing is often referred to as a "Fragment" instead of "Pixel". Do you have a better name idea?

@tdulcet
Copy link
Owner

tdulcet commented Nov 26, 2024

Which Pixel representation do you prefer?

Do you mean you prefer Pixel_A from #8 (comment). With that one, I am not sure what we would be indexing into. What did you mean by "custom color palette"? All 16 of the 4-bit colors are hard coded into an enum, but with the 8 and 24-bit colors, the user could of course choose any arbitrary values.

As mentioned above, when plotting multiple arrays or functions, we should also blend the colors when points for different ones end up on the same character. Currently with the 4-bit colors, it just uses the color white when this happens, as there are not enough available colors, but supporting 8 and 24-bit colors should allow better support for this, so we would likely need to use more colors in the resulting graph than just those explicitly selected by the user.

Anyway, I was thinking about using something similar to your Pixel_B, but I am of course open to other solutions:

// Structure for 24-bit true colors
struct true_color
{
    uint8_t red;
    uint8_t green;
    uint8_t blue;
};
// Using the existing color_type enum for the 4-bit colors
using color = variant<color_type, uint8_t, true_color>;

Do you have a better name idea?

No, either "fragment" or "pixel" would be fine with me. Thanks for the ideas.

@M2-TE
Copy link
Contributor Author

M2-TE commented Nov 26, 2024

What did you mean by "custom color palette"?

To explain, it is unlikely that we will have a large number of different colors for a single graph, even when blending colors. The color palette would be a simple array of colors (e.g. vector of your color variant type) present in the current graph that is indexed into. Even an 8-bit index would likely be sufficient for this, but no need to over-complicate it.

Either way, this can easily be changed later. I will start by using something like Pixel_B for now

@M2-TE M2-TE linked a pull request Dec 2, 2024 that will close this issue
@tdulcet tdulcet linked a pull request Dec 3, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants