robustify starting xfwm4 in Desktop #61
|
Flagging @saroj-lbl for any thoughts. I didn't see this problem in limited testing on LRC OOD, but given the apps seem identical (or nearly so), I wouldn't expect it to be Savio-only. |
|
Hi Chris, We are not aware of any such problem on LRC OOD as of yet. I will try to keep a desktop app open for several hours today to test. Is there a particular application or set of applications for which the problem shows up? Looking at the LRC OOD apps' script.sh.erb, it looks quite similar to BRC OOD: At first I noticed the difference being |
|
Also, in case this is useful: Taking a look at the OOD docs at https://osc.github.io/ood-documentation/latest/tutorials/tutorials-interactive-apps/add-matlab/edit-script-sh.html#use-xfce-for-the-window-manager, there is an example of how xfwm4 is set up in the ../template/script.sh.erb file for the OOD MATLAB interactive app. The example is as follows: "Use XFCE for the Window Manager: XFCE is OSC's preferred desktop environment for launching VNC applications. The code for starting XFCE in the background looks like this (see highlighted lines 1-20):" `# Launch Xfce Window Manager and Panel`, `( cd "$HOME"`, `# Start MATLAB`, `# Load the required environment`, `module load xalt/latest <%= context.version %>`, `# Launch MATLAB`, `# Switch the implementation on if the user requested a visualization GPU node`, `<%- if context.node_type.include?("vis") -%>`, `# When not using a GPU node`, `module list # List loaded modules for debugging purposes` |
|
Taking a look at the file /global/home/users/myashar/ondemand/dev/brc_desktop/template/script.sh.erb, for example, we do see that it includes the following: `# Launch Xfce Window Manager and Panel`, `export SEND_256_COLORS_TO_REMOTE=1`, `cd "$HOME"` |
|
@markyashar I'm not seeing any difference between the OSC script and ours that might help explain our Desktop mis-behavior. Was there something in particular you were pointing out? |
|
@saroj-lbl regarding keeping the app open for hours, my current understanding is that the issue happens immediately upon starting the app, so I don't think you'll see anything by keeping it open. Just an FYI -- given this is mysterious, any diagnostic effort is welcome! |
|
@paciorek I think the only basic differences between the OSC script and ours are that their script has the line "module restore" before the line "set -x", which ours does not, and that our script has the line "sleep 5" after the line "xfwm4 --compositor=off --sm-client-disable &", whereas the OSC script has neither the "&" nor the "sleep 5". There are similar differences between our script and the LRC OOD apps' script.sh.erb that Saroj sent. These may be minor, but it could be worth experimenting with removing them, e.g., taking out the ampersand in our script, removing the "sleep 5" line, and adding the "module restore" line. Also, our script has "xfce4-panel --sm-client-disable &" whereas the others just have "xfce4-panel --sm-client-disable". Or rather, the other scripts have the & outside of the parentheses, but I'm not sure how much of a difference this ultimately makes. |
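To make the "&" question concrete, here is a minimal sketch of the two patterns being compared: backgrounding each command individually versus backgrounding the whole parenthesized block. The functions `fake_wm` and `fake_panel` are hypothetical stand-ins (not xfwm4 or xfce4-panel themselves) so that the sketch runs anywhere; the sleep durations are arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the window manager and panel commands.
fake_wm()    { sleep 0.5; echo "wm done"; }
fake_panel() { sleep 0.1; echo "panel done"; }

# Pattern A: each command is backgrounded on its own, so both start at once.
pattern_a() {
  fake_wm &
  fake_panel &
  wait
}

# Pattern B: the whole block is backgrounded as one subshell; inside the
# block the commands run one after the other.
pattern_b() {
  (
    fake_wm
    fake_panel
  ) &
  wait
}

# Record the order in which the stand-ins finish under each pattern.
order_a=$(pattern_a | tr '\n' ' ')
order_b=$(pattern_b | tr '\n' ' ')
echo "A: $order_a"
echo "B: $order_b"
```

In pattern A the commands overlap, while in pattern B the second command does not start until the first returns. Whether that matters for the real apps depends on whether xfwm4 returns promptly or stays in the foreground, which this sketch does not model.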
|
Yeah, as far as I myself introduced the I'm going to experiment more with the "sleep"s to see if reducing the amount of time would be ok, as having 5x3=15 seconds of delay seems like a bit much from a user experience perspective. |
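As one possible alternative to fixed `sleep` delays (an assumption for discussion, not something tested in this PR), the script could poll for a process and stop waiting as soon as it appears. The helper name `wait_for_proc` and the 0.2-second polling interval are hypothetical.

```shell
#!/usr/bin/env bash
# wait_for_proc NAME TIMEOUT: poll until a process whose command name is
# exactly NAME exists, giving up after TIMEOUT seconds.
# Returns 0 if the process was seen, 1 on timeout.
wait_for_proc() {
  local name=$1 timeout=$2
  local deadline=$((SECONDS + timeout))
  while (( SECONDS < deadline )); do
    if ps -e -o comm= | grep -qx -- "$name"; then
      return 0
    fi
    sleep 0.2
  done
  return 1
}
```

A call such as `wait_for_proc xfwm4 5` would then wait at most five seconds, and usually less, before the script moves on.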
|
Ok, so one very confusing thing is that the LRC script and the Savio script that is currently live at However, somehow the Savio script that is live at /var/www/... is not what we have in this GitHub repository. In particular it does not reflect the changes in commit 1b23744 from PR #48 (in which the whole block of commands is not backgrounded). I'm not sure how that could be the case. So that makes this all the more confusing. I'm going to have to check with @wfeinstein to understand why |
Remove a `sleep` that is probably not needed.
|
Now that I've tested my robustification efforts, I realized that However, in testing today, I can't reproduce the failures that user Jeff Lockhart and I were seeing last week. So it's hard to pin down what changes we want. Here's a summary of my current thinking:
|
|
In case there's anything useful here, here is how Bearborg AI Chatbot (Gemini Pro) responds to all of this after I fed this thread and all of the background and context into it (note that one thing it suggests is adding the line "pkill -f xfwm4" to the script): "Let's break down this complex issue and create a plan to fix the BRC OOD Desktop problems. Problem Summary:
Root Causes & Potential Solutions:
Action Plan:
By systematically addressing these points, you should be able to create a more robust and reliable Xfce desktop experience on Savio. Clear communication with @wfeinstein about the deployment discrepancy is essential. " |
|
Well, I still don't understand what is going on, and now I am seeing over the last few days that the Desktop app seems to be behaving fine, so it's hard to experiment more. One other thing I've realized is that with the current Savio and LRC Desktop apps, with So that is odd. I'll also note that with this PR (as well as with the code from commit 1b23744) it looks like XFCE puts up some pop-up windows (related to the "notification area" and "XFCE4 Policy Kit agent") perhaps because I'm going to wait more and try to reproduce the behavior that Jeff and I saw a few weeks ago. If I can, I will follow up here, possibly with a proposal to modify the current backgrounded block approach to add |
|
It looks like Wei's merge of PR #62 and copying to the live app removed the discrepancy between the repo and the live apps versions of Desktop that I noted in a previous comment. We'll have to decide whether to also merge in this PR, which might fix some cases where there are problems with xfwm4 dying. But I don't feel like I have a good handle on to what extent this PR would address those problems. And in contradiction of my claim earlier in this discussion, XFCE is starting even though |
This is an attempt to start OOD Desktop (in particular xfwm4) in a way that prevents current problems. It's not ready to merge yet; rather, it's meant to start some discussion and allow some testing.
The current problems are that users and I have noticed that (sometimes) the Desktop is almost unusable because one can't resize or move app windows. Also the borders around the windows (and around the entire Desktop) disappear.
I believe I have diagnosed this as occurring because the xfwm4 window manager process dies shortly after the Desktop starts. I've noticed that very briefly (about one second) when the Desktop appears, the usual border around the Desktop is there (i.e., the top 'management' bar and the small app bar at the bottom). Then it disappears. Then when I look using `ps`, `xfwm4` is not running.
Looking at `output.log`, I see the following messages:
Following some hints from Claude, I tried to see if xfwm4 was being started multiple times, but I didn't see any indications of that happening.
So I don't know why another xfwm4 is already running.
Ideally it would be nice to figure that out so as to come up with a robust solution.
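For anyone trying to reproduce this, a small check along the following lines could be run inside the Desktop session to see whether xfwm4 is alive and whether more than one instance exists. This is a sketch only; the helper name `count_procs` is hypothetical.

```shell
#!/usr/bin/env bash
# count_procs NAME: print how many processes have exactly this command name.
# "|| true" keeps the exit status at 0 when grep finds no match.
count_procs() {
  ps -e -o comm= | grep -cx -- "$1" || true
}

# "xfwm4" is the window manager process name discussed in this PR.
n=$(count_procs xfwm4)
echo "xfwm4 processes: $n"
```

A count of 0 shortly after startup would match the failure described above, and a count above 1 would point to the window manager being started more than once.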
Even if we can't, my thought with this PR is that either sleeping before starting xfwm4 (perhaps helping if something needs to finish starting before xfwm4 can start robustly) or using `--replace` (to replace whatever problematic xfwm4 has already started) might help.
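The two ideas can be combined as in the sketch below. The helper `start_wm` and the 2-second delay are hypothetical, chosen only for illustration; the `--compositor=off` and `--sm-client-disable` flags are the ones already used in the script, and the guard means nothing is launched on a machine without xfwm4.

```shell
#!/usr/bin/env bash
# start_wm DELAY CMD [ARGS...]: wait DELAY seconds, then launch CMD in the
# background and print its PID. Hypothetical helper, not part of the app.
start_wm() {
  local delay=$1
  shift
  sleep "$delay"
  "$@" >/dev/null 2>&1 &
  echo $!
}

# Possible use in script.sh.erb; skipped if xfwm4 is not installed.
if command -v xfwm4 >/dev/null 2>&1; then
  start_wm 2 xfwm4 --replace --compositor=off --sm-client-disable
fi
```

Keeping the delay as a parameter makes it easy to test whether a shorter wait, or none at all, is enough once `--replace` is in place.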