@terrcin terrcin commented Jul 4, 2025

I've had a sudden need to add mounts to Fly FLAME servers so we can shift infrequent, long-running, CPU-intensive data exports off our main app machines and into the background where they won't bother anybody. So I'm picking up the torch for this work.

To achieve this, I initially forked from #22 to maintain @benbot's commit credit, as they did the initial heavy lifting for this work. I then updated to the latest, resolving all merge conflicts. The result didn't compile, because the way HTTP requests are made has changed.

My contribution to this PR is resolving the compile errors and then making a bunch of improvements to add resilience:

  • refactored FlyBackend.http_post! into FlyBackend.http_request! to add GET support
  • updated FlyBackend.Mount to be parsed the same way that FlyBackend is
  • additionally filter Fly volumes by region, so they match the machine spec when a region is set
  • also check that the volume's "host_status" is "ok"
  • re-fetch the list of volumes each time a machine create request fails; to support this, the body opt is now a function
  • shuffle the volume list to avoid always picking the same volume, as a machine create failure might be due to lack of capacity on that volume's host
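
The volume selection and retry behaviour in the list above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the Elixir code in this PR: the function names `eligible_volumes` and `create_with_retries`, the `"region"` key, the `RuntimeError` failure type, and the default of three attempts are all assumptions made for the example. Only the `"host_status"` check against `"ok"`, the region filter, the shuffle, and the "body is a function" idea come from the description.

```python
import random


def eligible_volumes(volumes, region=None, rng=random):
    """Return usable volumes in random order.

    A volume is kept when its "host_status" is "ok" and, if a region
    is given, its "region" matches (key name assumed for this sketch).
    """
    candidates = [
        v for v in volumes
        if v.get("host_status") == "ok"
        and (region is None or v.get("region") == region)
    ]
    # Shuffle a copy so repeated attempts do not always pick the same
    # volume, whose host may be the one that is out of capacity.
    return rng.sample(candidates, len(candidates))


def create_with_retries(build_body, send, attempts=3):
    """Call send(build_body()) until it succeeds or attempts run out.

    build_body is a function rather than a fixed value, so each retry
    can re-fetch the volume list and pick a different volume.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        body = build_body()  # rebuilt on every attempt
        try:
            return send(body)
        except RuntimeError as err:  # hypothetical failure type
            last_error = err
    raise last_error
```

In this sketch, `build_body` would fetch the current volumes, call `eligible_volumes`, and place the first result into the machine create request, so a failed attempt naturally moves on to a freshly chosen volume.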

Note: we currently have this running on a QA server without issue. I plan future work to allow creating volumes when none are available, to limit the number of big volumes hanging around unused.

terrcin commented Aug 28, 2025

Update: we've had this in our production and multiple feature environments for the past month and it's been working great so far.

venkatd commented Sep 11, 2025

Hi, would love to get this reviewed by someone so it's official!

We also have a sudden need for this. The ephemeral filesystem on fly.io is extremely slow, which would defeat the purpose of FLAME because we have to read/write artifacts to disk.

For example, a 200 MB document takes ~25 seconds to write to "/tmp" and 4 seconds to an actual volume.

Will report back on how this works for us in production.
