Segmentation fault with Vte.Terminal.spawn_async #262

Open
sodomon2 opened this issue Mar 6, 2021 · 11 comments

Comments

@sodomon2

sodomon2 commented Mar 6, 2021

Hello,

I have written a terminal in Lua and wanted to add tabs, but I cannot, because Vte's spawn_async does not work with LGI. I ran some tests and found that calling spawn_async makes the program crash with a segmentation fault.

I just wanted to know whether there is any way to solve it :)

Thanks for your work on LGI ❤️
Thank you!

@psychon
Collaborator

psychon commented Mar 6, 2021

Since you apparently already understand the vte API: Can you provide some self-contained example reproducing the crash? Otherwise, I'd have to write such an example myself and I never used vte before.

Edit: A quick look at vte's API docs says you could be referring to either https://developer.gnome.org/vte/unstable/vte-Vte-PTY.html#vte-pty-spawn-async or https://developer.gnome.org/vte/unstable/VteTerminal.html#vte-terminal-spawn-async. I'll definitely need more information.

@sodomon2
Author

sodomon2 commented Mar 6, 2021

@psychon I am referring to this one:
https://developer.gnome.org/vte/unstable/VteTerminal.html#vte-terminal-spawn-async

Here is the example:

#!/usr/bin/env lua
local lgi   = require("lgi")
local Gtk   = lgi.require('Gtk', '3.0')
local Gdk   = lgi.require('Gdk', '3.0')
local Vte   = lgi.require('Vte','2.91')
local GLib  = lgi.require('GLib', '2.0')

local app   = Gtk.Application()
local term  = Vte.Terminal()

main_window = Gtk.Window {
	width_request	= 600,
	height_request	= 400,
	Gtk.ScrolledWindow{ id = 'scroll' }
}

function app:on_activate()
	font = term:get_font()
	font:set_size(font:get_size() * 1.1)
	term:spawn_async(
		Vte.PtyFlags.DEFAULT,                  		-- pty flag
		nil,                  				-- working directory
		{ '/bin/bash' },	  			-- argv
		nil,               				-- envv
		GLib.SpawnFlags.DEFAULT,                        -- spawn_flags
		1,                 				-- child_setup
		nil,               				-- child_setup_data
		nil,                 				-- child_setup_data_destroy
		1000,                  		                -- timeout
		nil,                 				-- cancellable
		function() print('Hello World!') end
	)
	main_window.child.scroll:add(term)
	main_window:show_all()
	self:add_window(main_window)
end

app:run()

@psychon
Collaborator

psychon commented Mar 6, 2021

Hm. According to gdb, the arguments are somehow not the expected ones. You specify a timeout of 1000, but the function is called with 1:

(gdb) frame 5
#5  0x00007ffff580ef71 in vte_terminal_spawn_async (terminal=<optimized out>, pty_flags=<optimized out>, 
    working_directory=<optimized out>, argv=<optimized out>, envv=<optimized out>, spawn_flags=<optimized out>, 
    child_setup=0x7ffff7fb4160, child_setup_data=0x7ffff7fb41d0, child_setup_data_destroy=0x5555555892a0, timeout=1, 
    cancellable=0x0, callback=0x7ffff7fb4240, user_data=0x7ffff7fb42b0) at ../src/vtegtk.cc:3513
3513	../src/vtegtk.cc: No such file or directory.

I will need to figure out what the actual API of this function is in gobject-introspection and Lua. It might have fewer arguments than the C function (e.g. child_setup and child_setup_data, and perhaps also child_setup_data_destroy, could be magically turned into one callback function).

Edit:
This one has fewer "optimized out" values:

Thread 2.1 "lua" hit Breakpoint 2, vte_terminal_spawn_async (terminal=0x555555758720, pty_flags=VTE_PTY_DEFAULT, 
    working_directory=0x0, argv=0x555555938c00, envv=0x0, spawn_flags=G_SPAWN_DEFAULT, child_setup=0x7ffff7fb4160, 
    child_setup_data=0x7ffff7fb41d0, child_setup_data_destroy=0x5555555892a0, timeout=1, cancellable=0x0, 
    callback=0x7ffff7fb4240, user_data=0x7ffff7fb42b0) at ../src/vtegtk.cc:3513

Edit: From the same run as the above:

Thread 2.1 "lua" received signal SIGSEGV, Segmentation fault.
0x00007ffff7fb4270 in ?? ()

This address (0x7ffff7fb4270) is a seemingly random location 0x30 bytes after callback (0x7ffff7fb4240).

Edit: Another run:

Thread 1 "lua" hit Breakpoint 1, vte_terminal_spawn_async (terminal=0x555555759720, pty_flags=VTE_PTY_DEFAULT, 
    working_directory=0x0, argv=0x5555558cffe0, envv=0x0, spawn_flags=G_SPAWN_DEFAULT, child_setup=0x7ffff7fb4160, 
    child_setup_data=0x7ffff7fb41d0, child_setup_data_destroy=0x5555555892a0, timeout=1000, cancellable=0x0, 
    callback=0x7ffff7fb4240, user_data=0x7ffff7fb42b0) at ../src/vtegtk.cc:3513
(gdb) break *0x7ffff7fb4160
Breakpoint 2 at 0x7ffff7fb4160
(gdb) break *0x7ffff7fb4240
Breakpoint 3 at 0x7ffff7fb4240
(gdb) c
Continuing.
[Detaching after fork from child process 6833]
[New Thread 0x7ffff20bb700 (LWP 6834)]
[New Thread 0x7ffff1607700 (LWP 6835)]

Thread 1 "lua" hit Breakpoint 3, 0x00007ffff7fb4240 in ?? ()
(gdb) disassemble/r 0x00007ffff7fb4240,+20
Dump of assembler code from 0x7ffff7fb4240 to 0x7ffff7fb4254:
=> 0x00007ffff7fb4240:	00 00	add    %al,(%rax)
   0x00007ffff7fb4242:	00 00	add    %al,(%rax)
   0x00007ffff7fb4244:	00 00	add    %al,(%rax)
   0x00007ffff7fb4246:	00 00	add    %al,(%rax)
   0x00007ffff7fb4248:	00 00	add    %al,(%rax)
   0x00007ffff7fb424a:	00 00	add    %al,(%rax)
   0x00007ffff7fb424c:	00 00	add    %al,(%rax)
   0x00007ffff7fb424e:	00 00	add    %al,(%rax)
   0x00007ffff7fb4250:	00 00	add    %al,(%rax)
   0x00007ffff7fb4252:	00 00	add    %al,(%rax)
End of assembler dump.

Is this executing all-zero memory? I guess the crash then happens once execution runs into something it should not run into.

@psychon
Collaborator

psychon commented Mar 6, 2021

I now think that this is a bug in Vte. The callback argument of vte_terminal_spawn_async() has no annotations. According to https://wiki.gnome.org/Projects/GObjectIntrospection/Annotations, scopes are:

Scope types:

  • call (default) - Only valid for the duration of the call. Can be called multiple times during the call.
  • async - Only valid for the duration of the first callback invocation. Can only be called once.
  • notified - valid until the GDestroyNotify argument is called. Can be called multiple times before the GDestroyNotify is called.

Since no scope is given, the scope is call. Thus, Vte may only call this callback before vte_terminal_spawn_async returns. This is clearly incorrect. I guess it should be async instead...?
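The difference between the scopes can be sketched with a toy model in plain Lua. This is not LGI's or Vte's actual code: make_closure, spawn_async, run_pending and the alive flag are hypothetical names invented for the illustration, and a Lua error stands in for what would be a jump into freed memory in C.

```lua
-- Toy model of GObject-Introspection callback scopes (not LGI's real code).
-- A binding wraps a Lua function in a closure record and frees that record
-- according to the declared scope.
local function make_closure(fn, scope)
	local closure = { fn = fn, scope = scope, alive = true }
	-- Stand-in for the C function pointer handed to the library.
	function closure.invoke(...)
		if not closure.alive then
			-- In C this would be a jump into freed memory.
			error("callback invoked after its closure was freed")
		end
		if closure.scope == "async" then
			-- async: valid only until the first invocation.
			closure.alive = false
		end
		return closure.fn(...)
	end
	return closure
end

-- Hypothetical asynchronous API: stores the callback and returns at once.
local pending = {}
local function spawn_async(fn, scope)
	local closure = make_closure(fn, scope)
	pending[#pending + 1] = closure.invoke
	if scope == "call" then
		-- call: the binding may free the closure as soon as the call returns.
		closure.alive = false
	end
end

-- Stand-in for the main loop delivering the results later.
local function run_pending()
	local results = {}
	for i, invoke in ipairs(pending) do
		results[i] = { pcall(invoke, "done") }
	end
	pending = {}
	return results
end

spawn_async(function(msg) return "call-scoped got " .. msg end, "call")
spawn_async(function(msg) return "async-scoped got " .. msg end, "async")

local results = run_pending()
print(results[1][1], results[1][2]) -- the call-scoped callback fails
print(results[2][1], results[2][2]) -- the async-scoped callback succeeds
```

In this model the callback registered with scope call is already invalid when the simulated main loop invokes it, which matches the reasoning above: a binding that trusts the default scope may release the closure before Vte ever calls it.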

Could you open a bug report at https://gitlab.gnome.org/GNOME/vte/-/issues? Perhaps the people there will conclude that my reasoning here is wrong. We will see.

Edit: However, there is still something wrong going on. I do not understand what is going on with the child_setup argument, but it also seems to point to all-zeros:

Thread 1 "lua" hit Breakpoint 1, vte_terminal_spawn_async (terminal=0x555555757720, pty_flags=VTE_PTY_DEFAULT, 
    working_directory=0x0, argv=0x5555558e4090, envv=0x0, spawn_flags=G_SPAWN_DEFAULT, child_setup=0x7ffff7fb4160, 
    child_setup_data=0x7ffff7fb41d0, child_setup_data_destroy=0x5555555892a0, timeout=1000, cancellable=0x0, 
    callback=0x7ffff7fb4240, user_data=0x7ffff7fb42b0) at ../src/vtegtk.cc:3513
3513	../src/vtegtk.cc: No such file or directory.
(gdb) break *0x7ffff7fb4160
Breakpoint 2 at 0x7ffff7fb4160
(gdb) break *0x7ffff7fb4240
Breakpoint 3 at 0x7ffff7fb4240
(gdb) set follow-fork-mode child
(gdb) c
Continuing.
[Attaching after Thread 0x7ffff7c1c2c0 (LWP 7320) fork to child process 7333]
[New inferior 2 (process 7333)]
[Detaching after fork from parent process 7320]
[Inferior 1 (process 7320) detached]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Switching to Thread 0x7ffff7c1c2c0 (LWP 7333)]

Thread 2.1 "lua" hit Breakpoint 2, 0x00007ffff7fb4160 in ?? ()
(gdb) disassemble/r 0x00007ffff7fb4160,+20
Dump of assembler code from 0x7ffff7fb4160 to 0x7ffff7fb4174:
=> 0x00007ffff7fb4160:	00 00	add    %al,(%rax)
   0x00007ffff7fb4162:	00 00	add    %al,(%rax)
   0x00007ffff7fb4164:	00 00	add    %al,(%rax)
   0x00007ffff7fb4166:	00 00	add    %al,(%rax)
   0x00007ffff7fb4168:	00 00	add    %al,(%rax)
   0x00007ffff7fb416a:	00 00	add    %al,(%rax)
   0x00007ffff7fb416c:	00 00	add    %al,(%rax)
   0x00007ffff7fb416e:	00 00	add    %al,(%rax)
   0x00007ffff7fb4170:	00 00	add    %al,(%rax)
   0x00007ffff7fb4172:	00 00	add    %al,(%rax)
End of assembler dump.

@sodomon2
Author

sodomon2 commented Mar 6, 2021

You mean the problem is with vte and not LGI?

@sodomon2
Author

sodomon2 commented Mar 6, 2021

I don't quite understand

Do you mean that the whole spawn_async is being called wrong?

@psychon
Collaborator

psychon commented Mar 6, 2021

You mean the problem is with vte and not LGI?

Well... I'd be more careful: I think there is at least one problem with vte. :-)

According to the annotation on its arguments, vte_terminal_spawn_async() may only call its callback argument before returning. However, it is obviously meant to be called some time later. Thus, a scope async annotation is missing (I am not sure if scope async is correct, but I guess so).

I don't quite understand

Neither do I.

@sodomon2
Author

sodomon2 commented Mar 6, 2021

According to the annotation on its arguments, vte_terminal_spawn_async() may only call its callback argument before returning. However, it is obviously meant to be called some time later. Thus, a scope async annotation is missing (I am not sure if scope async is correct, but I guess so).

The error may be in Vte, since the async scope may be null.

@sodomon2
Author

Hi @psychon

I have been investigating, and the error affects not only Vte but all GTK async methods.

See #241.

ntd added a commit to ntd/lgi that referenced this issue May 23, 2022
Data bound to callbacks must be flagged as "internal". The problem is
also the callback itself data is marked as internal because of the
following issue:

https://gitlab.gnome.org/GNOME/gobject-introspection/-/issues/430

Avoid the problem by checking the scope: if it is invalid, it is likely
real data (and not a callback), hence it should be marked.

Fixes lgi-devs#241, lgi-devs#262 and lgi-devs#285.
ntd added a commit to ntd/lgi that referenced this issue May 24, 2022
Data bound to a callback must be flagged as "internal". The problem is
also the callback itself is marked as such because of this issue:

https://gitlab.gnome.org/GNOME/gobject-introspection/-/issues/430

Avoid the problem by checking the scope: if it is invalid, it is likely
real data (and not a callback), hence it can be marked.

Fixes lgi-devs#241, lgi-devs#262 and lgi-devs#285.
@psychon
Collaborator

psychon commented May 25, 2022

Fixed via #285. I assume.

@psychon psychon closed this as completed May 25, 2022
@sodomon2
Author

sodomon2 commented Apr 8, 2024

Fixed via #285. I assume.

Not at all; the error still occurs today.

@psychon psychon reopened this Apr 8, 2024