Pangolin hangs requests for some services? #29

Closed
oschwartz10612 opened this issue Jan 11, 2025 Discussed in #25 · 35 comments

Comments

@oschwartz10612
Member

Discussed in https://github.com/orgs/fosrl/discussions/25

Originally posted by r3nor January 11, 2025
With services like Immich or Nextcloud, some requests hang, mostly API calls or XHR requests that happen in the background, but sometimes the whole service stops responding (if I visit it locally, bypassing Pangolin, it works!).

I'd like to be able to debug this and find out where it is failing, and why... I have Pangolin on a VPS, I am the only user of it, and the VPS has 2 GB of RAM and 2 vCPUs...

@oschwartz10612
Member Author

I think we should try to pin down if this is an issue with traefik or the WireGuard tunnels.

Are you using Newt or another WireGuard client?

I think it could be interesting to run a tcpdump such as `tcpdump -i any -n port <your target port>` on the same host as the Newt/WireGuard client and try to reproduce the hang. If you can see packets arriving at the far end of Newt, then WireGuard is probably fine.
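
While the capture is running, it can help to generate repeatable traffic so that hung requests are easy to spot. The script below is only a sketch and is not part of Pangolin or Newt: `probe` is a hypothetical helper, and the URLs in the usage comment are placeholders for the proxied address and the direct local address of the same service.

```python
import time
import urllib.error
import urllib.request

# Ignore any http_proxy/https_proxy environment settings so that the
# request goes straight to the URL under test.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url, attempts=20, timeout=10.0, pause=0.5):
    """Fetch `url` repeatedly; return a list of (outcome, seconds) tuples."""
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            with _opener.open(url, timeout=timeout) as resp:
                resp.read()
                outcome = f"HTTP {resp.status}"
        except urllib.error.HTTPError as exc:
            # The server answered, just not with a 2xx status.
            outcome = f"HTTP {exc.code}"
        except OSError as exc:
            # Covers timeouts, refused connections and resets.
            outcome = f"failed: {exc}"
        results.append((outcome, time.monotonic() - start))
        time.sleep(pause)
    return results


# Usage (placeholder addresses, replace with your own):
#   for outcome, secs in probe("https://immich.example.com/"):
#       print(f"{secs:6.2f}s  {outcome}")
#   for outcome, secs in probe("http://192.0.2.10:2283/"):
#       print(f"{secs:6.2f}s  {outcome}")
```

If the proxied URL shows timeouts while the direct URL does not, and the tcpdump on the Newt host shows no packets for the failed attempts, that would point at the tunnel rather than at the service.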

Also, are you on the latest builds? I added some pinging to the Newt WireGuard client to make sure it keeps the session alive in the latest build. I am not sure if that will help or not.

@vayan

vayan commented Jan 12, 2025

I do have a similar issue sometimes, on 1.0.0-beta.4 using Newt

@r3nor

r3nor commented Jan 13, 2025

I am currently using Newt (dockerized), and I seem to face this issue only with particular services: Immich and Nextcloud. It does not seem to happen with other services, although these two are the ones I use most often in my homelab.

@oschwartz10612
Member Author

Interesting. How often does it happen? Is there any sort of pattern? Does the page partially load or just sometimes not load at all?

These two services I would guess have more frequent HTTP requests than others so I wonder if that's involved.

We are going to try to look into it this week.

@ironicbadger

ironicbadger commented Jan 13, 2025

Audiobookshelf exhibits similar behavior. It loads for about 1 minute and then (I assume) the tunnel dies and it takes 1-2 mins to reestablish and the cycle repeats. I see "socket disconnected" errors in the webUI when this occurs if I was lucky enough to get logged in fast enough.

Topology is:

Hetzner VM (running pangolin) -> LXC running Newt forwarding 10.42.1.10:2284 -> container running on the same VLAN as Newt LXC

@oschwartz10612
Member Author

I wonder if this is actually MTU related. We are using 1420 - the default for WireGuard - but I wonder if packet fragmentation is occurring and causing requests to be dropped.
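
For context, the 1420 figure follows from WireGuard's per-packet overhead on a standard 1500-byte link. The arithmetic can be sketched as below; the function names are illustrative, and the numbers assume plain IP headers with no options and no extra encapsulation (such as PPPoE) on the path.

```python
# Per-packet overhead of a WireGuard data message, in bytes.
WG_HEADER = 16    # type + reserved (4), receiver index (4), counter (8)
WG_AUTH_TAG = 16  # Poly1305 authentication tag
UDP_HEADER = 8
IP_HEADER = {4: 20, 6: 40}  # outer IP header, without options


def tunnel_mtu(link_mtu, outer_ip_version=6):
    """Largest inner packet that fits in one outer packet on the link."""
    overhead = IP_HEADER[outer_ip_version] + UDP_HEADER + WG_HEADER + WG_AUTH_TAG
    return link_mtu - overhead


def required_link_mtu(inner_mtu, outer_ip_version=6):
    """Smallest link MTU that carries `inner_mtu` without fragmentation."""
    return inner_mtu + (link_mtu_overhead := 1500 - tunnel_mtu(1500, outer_ip_version))


print(tunnel_mtu(1500, 6))         # 1420: safe whether the outer header is IPv4 or IPv6
print(tunnel_mtu(1500, 4))         # 1440: the maximum if the outer header is always IPv4
print(required_link_mtu(1280, 6))  # 1360: link MTU needed for a 1280-byte tunnel MTU
```

If any hop between the two WireGuard peers has an MTU below 1500, a tunnel MTU of 1420 can be too large, which is why lowering it is a reasonable first experiment.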

I am going to try to make this configurable, test it myself and ping you guys here to help me test and see if it improves your experience.

@oschwartz10612
Member Author

@r3nor and/or @ironicbadger would you be able to test a Newt and Gerbil build for me that has the MTU set to 1280? I am wondering if this will help or do nothing at all...

I have pushed some test containers to our ECR repo just for testing. Could you replace these images in your compose file and let me know how it goes? I would recommend using the new builds of both rather than on just one side.

public.ecr.aws/g5j5p7x0/newt:latest
public.ecr.aws/g5j5p7x0/gerbil:latest

These are the changes I made:
fosrl/newt@e90e55d
fosrl/gerbil@b44d320

@r3nor

r3nor commented Jan 14, 2025

Hey, I will try to test later today and get back

@cmulter

cmulter commented Jan 14, 2025

The problem is still present with these images.

@oschwartz10612
Member Author

Hum... okay so it is not MTU I guess. I will need to think on this more.

@r3nor

r3nor commented Jan 15, 2025

Same for me, the problem seems to persist with the new images you shared.

@danhusan

Might still be MTU related. Just theorizing, but if Newt receives a packet from, for example, Immich with a size that exceeds the tunnel MTU, how does it handle it? Assuming Newt is more of a proxy than a network-forwarding device, would it tell Immich that the packet is too large, or would it fragment?

One way to debug this might be to increase the WireGuard MTU to 1500 so that WireGuard will fragment, and see if that solves the issue. If it does, something has to be done in Newt so that it handles large packet sizes gracefully.

@oschwartz10612
Member Author

This is a good point. We are using gVisor's netstack with WireGuard.

Maybe it needs a little more configuration to handle this correctly with the additional proxy method I have implemented. Will continue to dig.

@manilx

manilx commented Jan 16, 2025

> I do have a similar issue sometimes, on 1.0.0-beta.4 using Newt

Same here. Using Newt on Docker.
The Immich site loads only partially, even after several refreshes, and gives an error message (problem with server...).
I restarted the Immich Docker containers and it went away.

@m0nji

m0nji commented Jan 16, 2025

I see the same behavior with Plex, for example. Sometimes requests hang completely; then I have to wait 20-30 seconds and refresh the page, after which I normally land back at the Plex Home user authentication screen.

edit: using v1.0.0-beta.6
pangolin (hetzner vps) --> newt (proxmox lxc) --> plex (proxmox lxc)

@CalHenze

I have Pangolin (the latest version, installed yesterday) working perfectly (using Newt on Unraid in Docker) and proxying everything except for the one thing I actually need to proxy: Overseerr.

When I access the overseerr.mydomain.xyz URL, I see the Overseerr "login with Plex" screen.

I can enter the login/password, click the login button and then everything just hangs.

This seems to be the same issue as above. Is there anything I can offer to help you test?

@oschwartz10612
Member Author

Thanks for the feedback everyone. The MTU is set to 1280 on the latest releases of Newt and Gerbil, but clearly there is more to investigate here.

I think this weekend I will add MSS clamping to Gerbil and we can test that out. I need to also look into packet fragmenting in Newt or MSS clamping on the Newt internal TCP proxy.
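
As background on what MSS clamping does: the MSS option in a TCP SYN is lowered so that a full segment plus its headers still fits inside the tunnel MTU, which avoids relying on fragmentation or path MTU discovery. The thread does not show how Gerbil implements it, so the following is only a sketch of the arithmetic, with illustrative function names and assuming no IP or TCP options.

```python
TCP_HEADER = 20             # without options
IP_HEADER = {4: 20, 6: 40}  # inner IP header, without options


def max_mss(tunnel_mtu, ip_version=4):
    """Largest TCP payload per segment that fits in one tunnel packet."""
    return tunnel_mtu - IP_HEADER[ip_version] - TCP_HEADER


def clamp_mss(advertised_mss, tunnel_mtu, ip_version=4):
    """Lower the MSS advertised in a SYN if it would overflow the tunnel."""
    return min(advertised_mss, max_mss(tunnel_mtu, ip_version))


print(max_mss(1280, 4))          # 1240
print(max_mss(1280, 6))          # 1220
print(clamp_mss(1460, 1280, 4))  # 1240: a typical Ethernet MSS gets reduced
print(clamp_mss(536, 1280, 4))   # 536: already small enough, left unchanged
```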

@oschwartz10612
Member Author

Hi Everyone,

I just released 1.0.0-beta.3 of Gerbil with some MSS clamping support. If anyone has a chance, could you pull the latest image and let me know if it improves anything for you?

@ironicbadger

Unfortunately no change here.

@oschwartz10612
Member Author

> Unfortunately no change here.

Alright, thanks. I am going to try to set up Audiobookshelf myself and see if I can reproduce it.

@oschwartz10612
Member Author

Good news! I was able to track this down to an issue with the TCP proxy in Newt - though I don't think the MTU and MSS work will hurt anything. With these changes I was able to eliminate the hanging on Audiobookshelf and Overseerr as a test: fosrl/newt@7597805
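
For readers unfamiliar with what a TCP proxy of this kind has to do, here is a minimal sketch in Python. It is not Newt's code (Newt is written in Go, and the actual fix is in the commit linked above), and the thread does not say what the specific bug was; `pipe`, `handle_client` and `start_proxy` are illustrative names. The sketch relays each direction independently, applies backpressure, and passes end-of-stream through to the other side.

```python
import asyncio


async def pipe(reader, writer):
    """Copy bytes one way until EOF, then pass the EOF along (half-close)."""
    try:
        while chunk := await reader.read(65536):
            writer.write(chunk)
            await writer.drain()  # wait if the receiving side is slow
        if writer.can_write_eof():
            writer.write_eof()    # tell the other side no more data is coming
    except OSError:
        writer.close()            # broken connection: drop this side entirely


async def handle_client(client_r, client_w, target_host, target_port):
    """Connect to the target and relay both directions concurrently."""
    try:
        target_r, target_w = await asyncio.open_connection(target_host, target_port)
    except OSError:
        client_w.close()
        return
    await asyncio.gather(
        pipe(client_r, target_w),  # client -> target
        pipe(target_r, client_w),  # target -> client
    )
    client_w.close()
    target_w.close()


async def start_proxy(listen_host, listen_port, target_host, target_port):
    """Listen locally and forward every connection to the target."""
    async def on_client(reader, writer):
        await handle_client(reader, writer, target_host, target_port)

    return await asyncio.start_server(on_client, listen_host, listen_port)


# Usage (placeholder addresses):
#   async def main():
#       server = await start_proxy("0.0.0.0", 8080, "192.0.2.10", 2283)
#       async with server:
#           await server.serve_forever()
#   asyncio.run(main())
```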

I am working on resolving some small bugs and I will release a new beta version of newt tomorrow for you guys to test. I am crossing my fingers that this is the final fix.

@pchimbolo

Brilliant project! I have several successful tunnels set up into my home lab. I came here to share that I am seeing the same "hanging" behavior using "paperless-ngx". It seems almost certain to be the same issue everyone else describes above.

pangolin (racknerd vps) --> Linux VM (proxmox lxc / Ubuntu 24.04.1 LTS) --> newt (docker) --> paperless-ngx (docker)

Will watch closely and try any suggestions others may have.

@r3nor

r3nor commented Jan 20, 2025

@oschwartz10612 that's awesome! I can't wait to try it out 😃

@abiteman

Just wanted to toss in I'm also running into hanging on Paperless.

Linveo VPS (pangolin) > Newt on Unraid

Login screen is accessible but hangs once credentials are entered.

@oschwartz10612
Member Author

Hi Everyone,

Let's try this one more time. I completely rewrote the proxy manager in Newt. Can you try updating Newt to 1.0.0-beta.4 and let me know if it fixes the hanging? I would also update Pangolin to the latest beta.8, because I changed how internal ports are chosen.

@ironicbadger I specifically tested with Audiobookshelf so I hope you have some success there!

I am crossing my fingers this is the fix!

@pchimbolo

BINGO! No more hanging with my paperless-ngx instance. All services and pages load without issue. Well done @oschwartz10612 !!! THANK YOU! Looking forward to seeing this project continue to grow. Will also keep an eye out for your potential hosted service.

@danhusan

Confirmed issue solved with Immich. Well done!

@abiteman

Confirmed fixed with Paperless on Unraid as well, and this also fixed a Jellyfin login issue on a smart TV.

@r3nor

r3nor commented Jan 21, 2025

I can also confirm the issue has been solved with the latest newt update. Thank you very much for looking into this and for the fast solution. I think this issue can now be closed!

@m0nji

m0nji commented Jan 21, 2025

The Plex playback issue is now solved. Thank you. The only remaining problem with Plex is the login issue: if you log in through Pangolin, you are redirected to the Plex login, and if you log in to Plex, you are redirected back to the Pangolin login. It's a loop. The workaround is: after logging in to Plex and being redirected to Pangolin, refresh the page, and you will be redirected to the Plex dashboard. But this could be a Plex-specific problem.

@vayan

vayan commented Jan 21, 2025

Smooth now! Thanks for the fix

@nstomasi

Did some testing with n8n this morning and things seem to be working great.
Thank you for the update!

@ironicbadger

ironicbadger commented Jan 21, 2025

Can confirm the new update working here. 🚀

@oschwartz10612
Member Author

Great thank you so much everyone! I am going to close this now but please post back here if something is still broken.

@duderuud

Working for me too! I can log in to qBittorrent, Jellystat and Immich now. Thanks @oschwartz10612 !
