Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for #678 and #1739 #1743

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

zkxvh
Copy link

@zkxvh zkxvh commented Jan 21, 2025

Fixes for #678 and #1739.
Initially I wanted to fix #678, but after the fix the problems described in #1739 got worse, so I fixed them too.

Fixes #678
Fixes #1739

@zkxvh
Copy link
Author

zkxvh commented Jan 21, 2025

Fix for #678

This is the 1st commit in this branch.

Reproducing the crash:

The cause of the crash is described here:
#678 (comment) In short, the crash happens due to read accesses to an already disposed renderer.
The sequence of action leading to the crash was:

  • in a Tree with SWT.VIRTUAL a TreeItem is rendered for the first time or after clear()
    • Tree.cellDataProc() is executed for the item and one of the renderers
      • Tree.checkData(item) is called for the item
        • SWT.SetData event is created and sent for the item
          • TreeItem.setImage() is executed by the event handler for SWT.SetData``
          • Tree.createRenderers() executes and disposes the current renderer
      • further actions in Tree.cellDataProc() that access the already-disposed renderer (they think it's alive)

How it's fixed: in Tree.cellDataProc() wrap Tree.checkData(item) into Display.asyncExec().

Why fixed this way:

  1. on one hand, Tree.cellDataProc() is a cell data function which is not supposed to change tree structure. Violation of this leads to C memory errors.
  2. On the other hand, SWT.SetData event handlers are written by swt users and therefore can contain any code.

Using Display.asyncExec() to postpone SWT.SetData event handlers until Tree.cellDataProc() is finished seems like the most simple and bullet-proof solution to #678 and all similar bugs.

@zkxvh
Copy link
Author

zkxvh commented Jan 21, 2025

Fix for #1739

This is the 2nd and 3rd commits in this branch.

The 2nd commit adds native method gtk_tree_view_column_queue_resize(...) from GTK.
I'm not sure whether something else should be done when new native methods are added, so I made as a separate commit.

The 3rd commit is the fix for image display for Tree and Table (both for normal and SWT.VIRTUAL modes).
To test that images are displayed correctly now I used the following snippet-programs: snippets.zip.
Of course, it would be great to create unit tests for that, but I haven't found a way to get the size of the image shown in a tree/table, so manual tests is the best I can offer.

All the snippets are similar, here is what SetImage_TreeVirtual (it's a snippet for Tree with SWT.VIRTUAL) looks like:
pic1

At the top there are 4 rows of buttons with images.
Pressing any of the buttons sets its image to the tree cell at the row and column given in the button text.
Pressing the same button again removes the image from the cell.

Buttons -col,+col,col-,col+ can be used to add/delete columns.

This how Tree and Table should work (or at least this is how it seems to me after digging into the source code):

  1. all images of a Tree/Table are of the same size, which is the size of the 1st image set to the Tree/Table. All the images added after the 1st one are automatically scaled to that fixed size.
  2. for each column: initially no width for images is shown in the cells, but when the 1st image is set for any of the cell, then the width for images gets shown for all the cells of that column.
  3. row height: for Trees and Tables without SWT.VIRTUAL the height of each row is changed automatically to fit all images in the row cells, for Trees and Tables with SWT.VIRTUAL fixed-height is used and the height of all rows is set to >=fixed_image_heigth when the 1st image is set for the Tree/Table

I used the following tests with SetImage_TreeVirtual to check that everything works:

  1. restart snippet, press buttons [1,0], then [2, 1], then [3, 2].
    Expected result: all 3 images are shown with size of [1,0]
    1. same as (1) but press [2, 1] first, all images in the tree should have size of [2, 1]
    2. same as (1) but press [3, 2] first, all images in the tree should have size of [3, 2]
  2. restart snippet, press button [32,1], then scroll down until cell [32,1] is visible, then press buttons [31,0], [33,2]
    Expected result: all 3 images are shown with size of [32,1], the height of all rows grows to fit the image when you scroll to [32,1]
    1. same as (2) but press [33,2] first, all images in the tree should have size of [33,2]
    2. same as (2) but press [31,0] first, all images in the tree should have size of [31,0]
  3. same as (1) and (2) but for nested rows (use the 3d and 4nd rows of the buttons with images)
  4. special test: Tree have special logic for a case when Tree.getColumns().length==0 - it shows some default column in this case. Using buttons -col,+col,col-,col+ make sure that the default column is copied to/from the 1st column when Tree.getColumns().length changes between 1 and 0.

The same manual tests I performed with snippet SetImage_Tree (Tree without SWT.VIRTUAL) and for Table (snippets SetImage_TableVirtual and SetImage_Table).

@iloveeclipse
Copy link
Member

There is one test failing:

java.lang.AssertionError: SetData callback count not in range: 10001
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.assertTrue(Assert.java:42)
	at org.eclipse.swt.tests.junit.Test_org_eclipse_swt_widgets_Tree.test_Virtual(Test_org_eclipse_swt_widgets_Tree.java:966)

I haven't checked if this is related to this PR or not, could you please check?

@zkxvh zkxvh mentioned this pull request Jan 21, 2025
Copy link
Contributor

github-actions bot commented Jan 21, 2025

Test Results

   502 files  ±0     502 suites  ±0   9m 40s ⏱️ -7s
 4 334 tests ±0   4 320 ✅ ±0   14 💤 ±0  0 ❌ ±0 
16 575 runs  ±0  16 466 ✅ ±0  109 💤 ±0  0 ❌ ±0 

Results for commit 064e80f. ± Comparison against base commit 7d718fe.

♻️ This comment has been updated with latest results.

@zkxvh
Copy link
Author

zkxvh commented Jan 21, 2025

There is one test failing:

java.lang.AssertionError: SetData callback count not in range: 10001
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.assertTrue(Assert.java:42)
	at org.eclipse.swt.tests.junit.Test_org_eclipse_swt_widgets_Tree.test_Virtual(Test_org_eclipse_swt_widgets_Tree.java:966)

I haven't checked if this is related to this PR or not, could you please check?

Sure, I'll check this test.
It looks like this test checks that for virtual trees SWT.SetData callbacks should only execute for visible rows, but now they seem to execute for all rows (which is bad).
This may be caused by the changes in the PR.

@zkxvh
Copy link
Author

zkxvh commented Jan 21, 2025

Test_org_eclipse_swt_widgets_Tree.test_Virtual() is fixed in commit 62b1a13.

The problem was in setScrollWidth() which is invoked at the end of cellDataProc(...).
As I understand, setScrollWidth() works when Tree.getColumnCount()==0, i.e. when the default column is shown.
For each rendered cell the code sets the width of the default column to be max(currentWidth, width_of_current_cell_content).

The fix removes recursive visiting of the nested cells for the current item - this should work because during rendering cellDataProc(...) should be invoked for every updated visible cell anyway.

I checked this case in SetImage_TreeVirtual snippet and the scroll width seems to increase when an image is inserted - i.e. it seems to work correctly.

@deepika-u
Copy link
Contributor

I tried applying your pr. It works. But if anything else needs to be tested for this, please do let me know.

image

image

@zkxvh
Copy link
Author

zkxvh commented Jan 28, 2025

Hi @deepika-u

You've checked that the fix for #1739 works.

Could you please also check if the fix for #678 works on your machine?
For that please run snippet Issue678_JvmCrashOnTreeItemSetImage.java.
This snippet should crash jvm or throw an exception on SWT without this PR.
And it should work without errors on SWT with this PR applied.

BTW it's possible that you will see no errors even on SWT without this PR (for example @iloveeclipse wasn't able to reproduce the jvm crash) - that's fine, this is how such bugs work.
But if you will reproduce the jvm crash, then it will confirm that this PR fixes #678.

@deepika-u
Copy link
Contributor

deepika-u commented Feb 3, 2025

Hi, Sorry for a delayed response. I thought i couldnt make it work so tried multiple times so took abit time on it.

  • When pr is not applied, I am not seeing any error in .log file but seeing below errors in console next to problems view ->
    Console_output_1743.txt

  • When the pr is applied, I am not seeing any such errors atall nor a crash.

  • .log file in both the cases is still clean for me.

  • I tried on below environment
    Eclipse SDK
    Version: 2025-03 (4.35)
    Build id: I20250201-1800
    OS: Linux, v.6.8.0-52-generic, x86_64 / gtk 3.24.41, WebKit 2.46.5
    Java vendor: Oracle Corporation
    Java runtime version: 23+36-2368
    Java version: 23
    Description: Ubuntu 24.04 LTS
    XDG_SESSION_TYPE=wayland

@zkxvh
Copy link
Author

zkxvh commented Feb 10, 2025

I'm sorry for the delayed response too.

Thank you for checking this PR.

The errors in the console output is another way how the bug can show itself.

Basically the error is a so called use-after-free, i.e. an object's memory in native code was at first freed, then later that freed memory was accessed again via a dangling pointer.
It between the memory freeing and the later access, the memory can be allocated and modified by some other native code.
What happens next (an app crash, an error message in the console, some other buggy behavior, or even nothing at all) depends on what was changed in the memory.
It's difficult to reproduce reliably some specific behavior (like app crash) - because the native code of the jvm is extremely complex.

Basically, assertion messages in the console output indicate that the code is trying to do something with some GTK3 objects, but the provided pointers to GTK3 objects point to malformed GTK3 objects (GTK3 performs some basic checks almost in every method).
In other words, on your machine the changes to the freed memory are such, that they are caught by GTK3 basic argument checks, while on my machine the changes at some different place in the memory, which result in SIGSEGV.

@iloveeclipse
Copy link
Member

@zkxvh : this branch has merge conflicts with master. Could you please rebase the changes on latest master state?

@iloveeclipse
Copy link
Member

Also note, you are not supposed to check-in so files!

@zkxvh
Copy link
Author

zkxvh commented Feb 13, 2025

@zkxvh : this branch has merge conflicts with master. Could you please rebase the changes on latest master state?

OK.
I will rebase the changes and check again that everything works by next Wednesday.
Sorry for not being able to do that sooner, but right now I'm busy with another urgent task which I need to finish ASAP.

@zkxvh
Copy link
Author

zkxvh commented Feb 13, 2025

Also note, you are not supposed to check-in so files!

OK. I'll remove *.so from the changes when I'll make the rebase.

BTW I saw in the history that *.so files are usually updated as separate commits, but I couldn't find anything describing how exactly changes to *.so should be included in the PR.
So I also put libswt-pi3-gtk-4968r4.so in a separate commit, and included it in the PR, so that the people who are trying this PR would not have to build libswt-pi3-gtk-4968r4.so themselves.

@deepika-u
Copy link
Contributor

Basically, assertion messages in the console output indicate that the code is trying to do something with some GTK3 objects, but the provided pointers to GTK3 objects point to malformed GTK3 objects (GTK3 performs some basic checks almost in every method). In other words, on your machine the changes to the freed memory are such, that they are caught by GTK3 basic argument checks, while on my machine the changes at some different place in the memory, which result in SIGSEGV.

Thanks for the detailed explanation.

Fixes image display problems (cropping, not showing) for Tree and Table
(both normal and virtual variants).

Also fixes eclipse-platform#678

Fixes eclipse-platform#678
Fixes eclipse-platform#1739
@zkxvh zkxvh force-pushed the tree-image-bugfix branch from 62b1a13 to 064e80f Compare February 18, 2025 13:43
@zkxvh
Copy link
Author

zkxvh commented Feb 18, 2025

Hi, @iloveeclipse

I rebased the changes to the latest master + removed *.so from the PR.

I also checked that everything works again - just like I did here.

BTW, I removed the commit (it was 1st commit before the rebase) that wrapped execution of SWT.SetData callbacks inside Tree.cellDataProc() into Display.asyncExec().
The reason:

  1. that commit fixed the problem for Tree.cellDataProc() and made the code more complicated
  2. that commit fixed the problem ONLY for Tree.cellDataProc(). There are other places in the code of Tree that have the exact same problem and that are left unfixed - i.e. the execution of SWT.SetData callbacks becomes inconsistent.
    In the end, that commit adds complexity and doesn't fix the problem fully - so I decided it's not worth it.

To illustrate, here is the same problem in Tree.sendMeasureEvent():

void sendMeasureEvent (long cell, long width, long height) {
			...
			// here inside Tree.checkData()->sendEvent(SWT.SetData)->callback->TreeItem.setImage()->Tree.createRenderers() is called, which frees memory for GtkCellRendererText pointed to by cell.
			Image image = item.getImage (columnIndex); 
			...
			// here's the access to already freed memory of GtkCellRendererText, which can cause SIGSEGV and app crash
			GTK.gtk_cell_renderer_set_fixed_size (cell, -1, contentHeight [0]);
} 

#678 got fixed anyway, because parent.createRenderers(...) (which caused #678) is replaced by OS.g_object_set (handle, OS.fixed_height_mode, ...).

Also just FYI there is an alternative PR that fixes #678 : it's #1612 by @basilevs
That PR has fewer changes, but doesn't fix #1739 like this one.

Have a nice day.

@basilevs
Copy link
Contributor

SWT.SetData callbacks inside Tree.cellDataProc() into Display.asyncExec()

Is there any hope to remove other asyncExecs from this PR? They were done for tangentially same reason, are not they?
Please do not answer, if investigation requires a major effort.

@zkxvh
Copy link
Author

zkxvh commented Feb 19, 2025

SWT.SetData callbacks inside Tree.cellDataProc() into Display.asyncExec()

Is there any hope to remove other asyncExecs from this PR? They were done for tangentially same reason, are not they? Please do not answer, if investigation requires a major effort.

The reasons are different:

  1. I wanted to run SWT.SetData callbacks in Display.asyncExec() because these callbacks are written by users and therefore they can do anything, which sometimes causes native code errors.
    The problem is that SWT.SetData callbacks sometimes executed inside GTK3 code, where GTK3 doesn't expect changes like addition/removal of columns/renderers - otherwise you might get a process crash with SIGSEGV or some other weird undefined C behavior in native code of GTK3.
    One example of this is Tree.cellDataProc(): now if a user creates SWT.SetData callback that adds/removes columns/renderers, then it might get native code problem on tree rendering (this is when Tree.cellDataProc() is invoked).
    Another example is Tree.sendMeasureEvent(...) - there it's very similar to Tree.cellDataProc().
    If the changes to tree columns/renderers in written-by-users SWT.SetData callbacks caused some Exception, then it would be fine. But native code bugs are always a huge problem for java developers.
    Executing SWT.SetData callbacks in Display.asyncExec() removes the possibility of the native code problems: any tree changes are safe in Display.asyncExec().
    Unfortunately, even though I can wrap SWT.SetData callbacks into Display.asyncExec() inside Tree.cellDataProc(), I cannot do the same for every other place where SWT.SetData callbacks are executed. (e.g. in Tree.sendMeasureEvent(...)).
    As a result, I decided that it's better to leave the code as is (i.e. the problem stays, but the code is simple), then to fix the problem in just one place out of many and make the code more complicated.

  2. in [GTK3] Do not crash in SetData event #1612 you wrap Tree.createRenderers(...) into Display.asyncExec() because it removes the current column renderers.
    When SWT.VIRTUAL is set for a Tree and the written-by-the-user SWT.SetData callback contains TreeItem.setImage(...), then TreeItem.setImage(...) will be frequently invoked inside Tree.cellDataProc()/Tree.sendMeasureEvent(...) on every tree rendering.
    This causes the same native code errors as in (1).
    Wrapping into Display.asyncExec() moves the execution of Tree.createRenderers(...) outside of Tree.cellDataProc()/Tree.sendMeasureEvent(...) - this is how the native code errors are avoided.

  3. in this PR Tree.createRenderers(...) is replaced by reseting fixed_height_mode property of GtkTreeView and triggering height recomputation for existing rows.
    Thus, no columns/renderers are created/removed in this PR, so there is no need to use Display.asyncExec() to avoid native call problems.
    Display.asyncExec() is used for another reason here: we need row_changed GTK3 signal to trigger height recomputation for existing rows, but this signal is blocked inside Tree.cellDataProc(...).

    tree rendering
     cell rendering
      Tree.cellDataProc(...)
       Tree.checkData(item)
        OS.g_signal_handlers_block_matched (...);    <== THIS BLOCKS 'row_changed' SIGNAL
        sendEvent(SWT.SetData, event)
         SWT.SetData callback
          TreeItem.setImage()
            if it's the 1st image, then update row height
              reset `fixed_height_mode` property of `GtkTreeView`
              recompute height for existing rows     <== THIS REQUIRES 'row_changed' TO BE UNBLOCKED
        OS.g_signal_handlers_unblock_matched (...);  <== THIS UNBLOCKS 'row_changed' SIGNAL

    That is why height recomputation for existing rows is wrapped into Display.asyncExec() in this PR.

BTW gtk3 itself very actively uses asynchronous execution (similar to what Display.asyncExec() does).
For example rendering happens asynchronously on ticks of the timer, that ticks with framerate frequency.
That's why I don't see the usage of Display.asyncExec() as problematic (of course, unless it's overused).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants