Fix memory leaks. #2703

Merged
merged 2 commits into from
Apr 29, 2025
drgrice1
Member

@drgrice1 drgrice1 commented Apr 14, 2025

It turns out that none of the ContentGenerator controller objects are being destroyed when a request finishes. So each hypnotoad process (or morbo in development) has every ContentGenerator controller object for every request it renders saved in memory until the process ends. That means that everything it had a reference to is also saved in memory. That includes a WeBWorK::CourseEnvironment instance, a WeBWorK::Authen instance, a WeBWorK::Authz instance, and a WeBWorK::DB instance (and everything it has a reference to).

Furthermore, even if the controller objects are destroyed and the WeBWorK::DB instance with it, none of the WeBWorK::DB::Schema instances (one for each table) are ever destroyed.

There are two things that cause these references to be kept when they shouldn't be.

The first is the more obvious one: circular references.

A WeBWorK::Controller object (from which the WeBWorK::ContentGenerator modules derive) keeps a reference to a WeBWorK::Authz instance, and that instance keeps a reference back to the controller. However, the WeBWorK::Authz instance doesn't weaken the reference back to the controller. That was my fault: in the conversion to Mojolicious I commented out the weaken statement that prevented this circular reference. I did that because in the initial conversion the controller didn't have a reference to the WeBWorK::Authz instance, and so the instance was going out of scope and causing problems. However, when the reference to that instance was added, the weaken statement should have been uncommented.

Another case of this is that the WeBWorK::Authen::LTIAdvanced and WeBWorK::Authen::LTIAdvantage packages were keeping a circular reference to the controller as well. The new methods in those packages were simply deleted so that they use the WeBWorK::Authen new method, which already does the right thing.

A third case occurs with the WeBWorK::DB instance and the WeBWorK::DB::Schema instances, which hold references to each other.
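The circular-reference cases above can be illustrated with a minimal, self-contained sketch. The Demo::Controller and Demo::Authz packages and the handle_request function are hypothetical stand-ins, not the WeBWorK classes; the only real API used is weaken from Scalar::Util. It shows that a strong back-reference keeps both objects alive after the request ends, while a weakened back-reference lets both be destroyed.

```perl
use strict;
use warnings;
use Scalar::Util ();

my @destroyed;

package Demo::Authz {
    sub new {
        my ($class, $c, %opts) = @_;
        my $self = bless { c => $c }, $class;
        # Weakening the back-reference to the controller breaks the cycle.
        Scalar::Util::weaken($self->{c}) if $opts{weaken};
        return $self;
    }
    # Record destruction, ignoring the final cleanup when the process exits.
    sub DESTROY { push @destroyed, ref($_[0]) unless ${^GLOBAL_PHASE} eq 'DESTRUCT' }
}

package Demo::Controller {
    sub new { return bless {}, $_[0] }
    sub DESTROY { push @destroyed, ref($_[0]) unless ${^GLOBAL_PHASE} eq 'DESTRUCT' }
}

package main;

# Hypothetical request handler: the controller holds the authz object,
# and the authz object holds a reference back to the controller.
sub handle_request {
    my (%opts) = @_;
    my $c = Demo::Controller->new;
    $c->{authz} = Demo::Authz->new($c, %opts);
    return;    # the lexical $c goes out of scope here
}

handle_request(weaken => 0);
my $after_strong = @destroyed;    # nothing destroyed: the cycle keeps both alive

handle_request(weaken => 1);
my $after_weak = @destroyed;      # controller and authz from this request destroyed

print "strong: $after_strong destroyed, weak: $after_weak destroyed\n";
```

The pair created by the first call is only released when the process exits, which mirrors the "whole slew of objects destroyed" behavior seen at shutdown on the develop branch.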

The other thing that causes an extra reference to be kept is an anonymous subroutine (or closure) using an instance. In this case Perl forces the instance to be kept in scope for usage in the closure.

The global $SIG{__WARN__} handler defined in Mojolicious::WeBWorK uses the $c controller instance, and that is what prevents the WeBWorK::ContentGenerator modules from going out of scope. So that instance in the around_action hook needs to be weakened.

Edit: Instead of weakening the controller in the around_action hook, just make sure that the $SIG{__WARN__} handler is reset in the after_dispatch hook which removes the code reference to the handler defined in the around_action hook, and thus releases its reference on the controller.
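As a rough sketch of the closure problem and of the reset approach, the following stand-alone script uses hypothetical around_action and after_dispatch functions in place of the real Mojolicious hooks, and a hypothetical Demo::Controller class. It assumes nothing about how Mojolicious::WeBWorK actually installs its handler; it only shows that a $SIG{__WARN__} closure keeps the controller alive until the handler is replaced.

```perl
use strict;
use warnings;

my @events;

package Demo::Controller {
    sub new { return bless { path => $_[1] }, $_[0] }
    sub DESTROY { push @events, "destroyed $_[0]{path}" }
}

package main;

# Stand-in for an around_action hook: the handler closes over $c,
# so the handler holds a strong reference to the controller.
sub around_action {
    my ($c) = @_;
    $SIG{__WARN__} = sub { push @events, "warning on $c->{path}: $_[0]" };
    return;
}

# Stand-in for an after_dispatch hook: replacing the handler frees the
# closure, and with it the closure's reference to the controller.
sub after_dispatch {
    $SIG{__WARN__} = 'DEFAULT';
    return;
}

sub dispatch {
    my ($path) = @_;
    my $c = Demo::Controller->new($path);
    around_action($c);
    warn "something odd\n";
    return;    # the lexical $c is gone, but the handler still refers to it
}

dispatch('/a');
my $destroyed_before_reset = grep {/^destroyed/} @events;

after_dispatch();
my $destroyed_after_reset = grep {/^destroyed/} @events;

print "before reset: $destroyed_before_reset, after reset: $destroyed_after_reset\n";
```

Resetting the handler avoids having to weaken the controller reference inside the closure, which is the trade-off described in the edit note above.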

For the WeBWorK::DB::Schema::NewSQL::Std and WeBWorK::DB::Schema::NewSQL::Merge objects the issue is the transform_table and transform_all closures for the sql abstract instances. Those prevent the schema objects from going out of scope and so the $self in the sql_init methods where those closures are defined needs to be weakened as well.
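The schema case can be sketched the same way. Here Demo::Schema, its table field, and the body of the transform_table closure are all hypothetical; only the shape matters: an object stores a closure that refers back to the object itself, which is a cycle unless the captured $self is weakened.

```perl
use strict;
use warnings;
use Scalar::Util ();

my $destroyed = 0;

package Demo::Schema {
    sub new {
        my ($class, $table, %opts) = @_;
        my $self = bless { table => $table }, $class;
        $self->sql_init(%opts);
        return $self;
    }

    sub sql_init {
        my ($self, %opts) = @_;
        # The caller still holds a strong reference, so weakening this
        # lexical copy is safe and only affects what the closure captures.
        Scalar::Util::weaken($self) if $opts{weaken};
        $self->{sql} = {
            # The schema owns this closure and the closure uses the schema.
            transform_table => sub { my ($name) = @_; return "$self->{table}.$name"; },
        };
        return;
    }

    # Count destructions, ignoring the final cleanup when the process exits.
    sub DESTROY { $destroyed++ unless ${^GLOBAL_PHASE} eq 'DESTRUCT' }
}

package main;

{ my $schema = Demo::Schema->new('course_user'); }
my $after_strong = $destroyed;    # closure cycle keeps the schema alive

my $column;
{
    my $schema = Demo::Schema->new('course_user', weaken => 1);
    $column = $schema->{sql}{transform_table}->('user_id');
}
my $after_weak = $destroyed;      # weakened capture lets the schema go

print "strong: $after_strong, weak: $after_weak, column: $column\n";
```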

@drgrice1
Member Author

drgrice1 commented Apr 14, 2025

To facilitate testing this I have created two patches. One for the develop branch and one for the branch for this pull request. They are in the attached file.
patches.zip

Edit: Here is an updated patch for this pull request. The one in the zip file has an offset after another change to this pull request: destroy-test-pr.patch.txt

To apply the patches, run patch -p1 < path/to/destroy-test-develop.patch from the webwork2 root directory (usually /opt/webwork/webwork2). Note that both patches will actually work for both branches, but the one intended for the other branch will succeed at an offset for some files and leave a .orig file behind.

After applying the patch run the webwork2 app either via morbo or hypnotoad. Note that to see the output with hypnotoad you will not be able to use the systemd service. Instead from the webwork2 root directory run:

  • sudo systemctl stop webwork2
  • sudo mkdir /run/webwork2
  • sudo chown your_server_user /run/webwork2 (where your_server_user will be www-data or apache in most cases)
  • sudo hypnotoad -f bin/webwork2 (add MOJO_REVERSE_PROXY=1 between sudo and hypnotoad if using a proxy like apache2 or nginx)

Then open webwork in the browser, and navigate to several different pages.

If you are on the develop branch, you will see that most of the modules in question are not destroyed. Note that if you have multiple authentication modules (like LTI 1.1 and LTI 1.3), then you will see the unused authentication modules destroyed when they are skipped to move on to the next authentication module, because the controller releases its reference to those. However, the authen module that is actually used will not be destroyed. After visiting numerous pages, stop the webwork2 app (using Ctrl-C with morbo, or with hypnotoad either sudo hypnotoad -s bin/webwork2 from another terminal or just Ctrl-C). Then you will see a whole slew of objects destroyed as Perl does its final cleanup.

If you are using this pull request, then you will see all of these objects destroyed after each request completes as you move from page to page in the browser.
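The attached patches are not reproduced in this thread. As an assumption about the general technique, output of the form "WeBWorK::CourseEnvironment destroyed" can be produced by giving each class of interest a DESTROY method. The sketch below uses a hypothetical trace_destroy helper and hypothetical Demo:: packages, and collects the messages in an array rather than printing them as objects are destroyed.

```perl
use strict;
use warnings;

my @log;

# Hypothetical helper: install a DESTROY method in each named package that
# records a "<class> destroyed" line when an instance is destroyed.
sub trace_destroy {
    my (@packages) = @_;
    for my $package (@packages) {
        no strict 'refs';
        *{"${package}::DESTROY"} = sub { push @log, ref($_[0]) . ' destroyed' };
    }
    return;
}

trace_destroy('Demo::CourseEnvironment', 'Demo::DB');

{
    my $ce = bless {}, 'Demo::CourseEnvironment';
    my $db = bless { ce => $ce }, 'Demo::DB';
}    # both objects go out of scope here and are destroyed

print "$_\n" for @log;
```

If a class leaks, its line only shows up when the process exits, which is the difference testers are asked to look for between the two branches.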

@drgrice1
Member Author

Note that this will have a minor merge conflict with #2702 in the lib/WeBWorK/DB/Schema/NewSQL/Std.pm and lib/WeBWorK/DB/Schema/NewSQL/Merge.pm files.

@drgrice1 drgrice1 force-pushed the memory-leaks branch 3 times, most recently from be832e1 to 2ffeaf6 on April 16, 2025 17:54
@Alex-Jordan
Contributor

I checked out develop and applied the patch. I stopped webwork2 with sudo systemctl stop webwork2. I followed the steps sudo mkdir /run/webwork2 and sudo chown alex.jordan /run/webwork2 (note the arguments are reversed in the instructions), and then ran hypnotoad -f bin/webwork2. That gives me this:

WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
Can't load application from file "/opt/webwork/webwork2/bin/webwork2": Can't open file "/opt/webwork/webwork2/logs/webwork2.log": Permission denied at /usr/local/share/perl5/Mojo/Log.pm line 19.

The log file is owned by apache with group apache. It is writable by owner and group. I added my user alex.jordan to the apache group (see note at the end) and tried again, but I got the same result.

Then I ran sudo hypnotoad -f bin/webwork2 and got

WeBWorK::CourseEnvironment destroyed
Web application available at http://127.0.0.1:80

However, using a web browser, I could not reach WeBWorK, neither at its usual address nor at http://127.0.0.1:80. I used Ctrl-C to exit. To start regular business back up, I git stashed the patch changes and ran sudo systemctl start webwork2, and it seemed to work as far as terminal output goes. But now I cannot get to the application from a web browser. Firefox gives me the "Hmm. We’re having trouble finding that site" page.


When I added myself to the apache group, I got:

alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $ sudo usermod -a -G apache alex.jordan
[sudo] password for alex.jordan: 
[sss_cache] [sysdb_domain_cache_connect] (0x0010): DB version too old [0.23], expected [0.24] for domain implicit_files!
Higher version of database is expected!
In order to upgrade the database, you must run SSSD.
Removing cache files in /var/lib/sss/db should fix the issue, but note that removing cache files will also remove all of your cached credentials.
Could not open available domains
[sss_cache] [sysdb_domain_cache_connect] (0x0010): DB version too old [0.23], expected [0.24] for domain implicit_files!
Higher version of database is expected!
In order to upgrade the database, you must run SSSD.
Removing cache files in /var/lib/sss/db should fix the issue, but note that removing cache files will also remove all of your cached credentials.
Could not open available domains

I then checked to see that I was indeed in the apache group using getent group apache. But I am wondering if the error messages above indicate I messed something up in a way that is relevant to no longer being able to run the app.

@drgrice1
Member Author

I guess my instructions for running hypnotoad without the service are not quite correct for most deployments. I should have instead told you to run sudo -u www-data hypnotoad -f bin/webwork2, or maybe even just sudo hypnotoad -f bin/webwork2 as you eventually did. Although, that still might not be enough.

As to your issue in general, you might need to reboot to get the usermod change to take full effect. Usually changes to the group require a reboot to take effect in all cases.

@drgrice1
Member Author

drgrice1 commented Apr 20, 2025

I updated the instructions above. If you deploy webwork2 to serve directly, say with the server user www-data, then you should run

  • sudo systemctl stop webwork2
  • sudo mkdir /run/webwork2
  • sudo chown www-data /run/webwork2
  • sudo hypnotoad -f bin/webwork2

Note that it is also important that the last command be run from the webwork2 root directory.

@drgrice1
Member Author

By the way, it is odd that you got errors running sudo usermod -a -G apache alex.jordan. That may be an indication of a separate problem with your system.

@Alex-Jordan
Contributor

OK, I'm not sure I'm completely up and back yet, but I did get this far:

alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $ sudo MOJO_REVERSE_PROXY=1 hypnotoad -f bin/webwork2 
WeBWorK::CourseEnvironment destroyed
Starting hot deployment for Hypnotoad server 2502.
WeBWorK::CourseEnvironment destroyed
alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $

and I can enter a course from a web browser. However I am unsure where I should be looking to see modules destroyed. As you can see above, I was taken back to the usual terminal prompt. There is no feed. I navigated to many pages and there was no change in the terminal. Then I ran:

alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $ hypnotoad -s bin/webwork2
WeBWorK::CourseEnvironment destroyed
Can't remove file "/run/webwork2/webwork2.pid": Permission denied at /usr/local/share/perl5/Mojo/Server/Prefork.pm line 31.
WeBWorK::CourseEnvironment destroyed
alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $

So I'm not seeing everything get destroyed at any stage. Am I doing this wrong?

@Alex-Jordan
Contributor

Wait I tried again and now I'm seeing a feed in the terminal.

@Alex-Jordan
Contributor

OK, I ran

alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $ sudo MOJO_REVERSE_PROXY=1 hypnotoad -f bin/webwork2 
WeBWorK::CourseEnvironment destroyed
Web application available at http://127.0.0.1:80
Can't create process id file "/run/webwork2/webwork2.pid": Can't open file "/run/webwork2/webwork2.pid": No such file or directory at /usr/local/share/perl5/Mojo/Server/Prefork.pm line 42.
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $ WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed

But now I can't get to a course from a web browser. Either at its usual address or at http://127.0.0.1:80. Firefox just gives me the "Unable to connect" page.

@Alex-Jordan
Contributor

This is happening on a remote server and I'm not sure what to do to access its localhost (127.0.0.1) from a web browser running on my laptop.

@drgrice1
Member Author

alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $ sudo MOJO_REVERSE_PROXY=1 hypnotoad -f bin/webwork2 
Starting hot deployment for Hypnotoad server 2502.
alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $

If you see that, it means that you already had webwork2 running. I guess I need to add another step: make sure you run sudo systemctl stop webwork2 first. What is happening is that if the app is already running via the systemd service, then running sudo MOJO_REVERSE_PROXY=1 hypnotoad -f bin/webwork2 hot reloads the app. This is the equivalent of running sudo systemctl reload webwork2. In this case the -f flag is ignored. You might still see some of the output because it is still connected to the terminal.

and I can enter a course from a web browser. However I am unsure where I should be looking to see modules destroyed.

You will see all of the output in the terminal once we get it running in the foreground properly.

alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $ hypnotoad -s bin/webwork2
Can't remove file "/run/webwork2/webwork2.pid": Permission denied at /usr/local/share/perl5/Mojo/Server/Prefork.pm line 31.
alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $

You will also need to use sudo to stop the app (run sudo hypnotoad -s bin/webwork2).

I apologize for not giving the best instructions for testing this with hypnotoad. On my local machine I don't use root for running the webwork2 app, so things are simpler.

@Alex-Jordan
Contributor

I think my issue now is my most recent post. If localhost on a remote server is where the web application is running, how do I actually access the application from a web browser on my local computer?

@drgrice1
Member Author

You should be able to access the application from the web browser the same way that you usually do when you are using the systemd service. The commands that I have given are just doing what the service does to run hypnotoad. The rest will work the same. Of course you will need to view the terminal on the server to see the output. That won't be in the browser.

@drgrice1
Member Author

Ahh, wait. Remove the MOJO_REVERSE_PROXY=1 from the command. That is only needed if you are serving via a proxy like apache2 or nginx.

@drgrice1
Member Author

Although, I tested this on a server that is serving directly in a production like environment and had MOJO_REVERSE_PROXY=1 in it, and it still worked.

@Alex-Jordan
Contributor

No luck.

I'm doing this on the development server my school set up for me. To get there, I must turn on my school VPN. Normally, I then go to http://vmwebworkdevw02.pcc.edu/webwork2/ and proceed to a course.

Just now I ran:

alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $ sudo systemctl stop webwork2
[sudo] password for alex.jordan: 
alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $ sudo hypnotoad -f bin/webwork2 
WeBWorK::CourseEnvironment destroyed
Web application available at http://127.0.0.1:80
Can't create process id file "/run/webwork2/webwork2.pid": Can't open file "/run/webwork2/webwork2.pid": No such file or directory at /usr/local/share/perl5/Mojo/Server/Prefork.pm line 42.
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
alex.jordan@vmwebworkdevw02:/opt/webwork/webwork2 $ WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed
WeBWorK::CourseEnvironment destroyed

and then I refresh the browser, still at the same URL, and I get the "unable to connect" page in Firefox. I quit with command-C and then run sudo systemctl start webwork2 and it's back to what I would expect.

@drgrice1
Member Author

Can you still access the server in the browser when you are using the systemd service?

@Alex-Jordan
Contributor

Yes. I just noticed my /run/webwork2 folder probably has the wrong owner. Let's see if that fixes this.

@Alex-Jordan
Contributor

OK, each time I start the systemd service back up I see that it is clobbering the /run/webwork2 folder. So I stopped the systemd service. Then created the /run/webwork2 folder and set apache to be its owner.

Then running hypnotoad, it seems like I'm good to continue testing.

@drgrice1
Member Author

If you can access the server with the systemd service, then you will also be able to access it doing it this way. We just have to do everything the way the systemd service does.

One thing to note is that when you run sudo systemctl start webwork2 and then sudo systemctl stop webwork2, that will create the /run/webwork2 directory (if it does not exist) and then delete it. So to run without the systemd service, you have to start over by manually creating the directory and changing its ownership again.

An alternative approach is to change pid_file: /run/webwork2/webwork2.pid to pid_file: /path/to/server_writable_and_existing_dir/webwork2.pid in conf/webwork2.mojolicious.yml. For example, you could change it to pid_file: /opt/webwork/webwork2/tmp/webwork2.pid, since the server should already be able to write to /opt/webwork/webwork2/tmp.

Contributor

@Alex-Jordan Alex-Jordan left a comment

I was able to test using the posted patches and see objects being destroyed as described.

Member

@pstaabp pstaabp left a comment

Running on my Mac with morbo, I'm seeing everything destroyed on each page request.

If needed, I will try to find another Dev server to test.

@drgrice1
Member Author

Do you mean that you are seeing everything destroyed on each request with this pull request, or with develop? With the develop branch that won't happen. You might see some things destroyed, but certainly not everything.

@pstaabp
Member

pstaabp commented Apr 27, 2025

To clarify, I was seeing everything destroyed on the PR, but not on develop. As I read it, this is what should be happening.

@drgrice1
Member Author

Yes, that is correct. I was just seeking clarification. Thanks.

@drgrice1 drgrice1 changed the base branch from develop to WeBWorK-2.20 April 29, 2025 12:24
@dlglin dlglin merged commit 0399c10 into openwebwork:WeBWorK-2.20 Apr 29, 2025
2 checks passed
@drgrice1 drgrice1 deleted the memory-leaks branch April 29, 2025 19:01
4 participants