Description
You might have noticed that for the last few months we haven't been pushing many updates to Swiftwave.
Why no updates?
Any PaaS built on top of Docker Swarm runs into some fundamental issues.
Issues -
- If you scale beyond 3-4 servers, you start seeing out-of-sync issues and overlay networking problems.
- Docker doesn't offer enough support for tweaking overlay networking, so we ended up using a TCP/UDP proxy, which wastes resources on packet processing that iptables could handle.
- With Swarm overlay networking, it's tough to implement a firewall between containers (in k8s this is much easier and comes by default).
- Cluster-based volume support is flaky, and it doesn't look like a fix is coming soon (ref).
- There is no support for tasks (containerd has this).
In my opinion and experience, Swarm is not ready for real production load (especially in cluster mode).
Solution?
So my idea is to throw out Swarm and build our own orchestrator.
For networking, we will use a WireGuard mesh, with iptables for routing and firewalling.
For storage, we will have NFS-based remote storage plus an S3-based option that can be synced across the cluster (this needs a FUSE-based filesystem driver to be written).
I have already written the agent for it.
I have also started implementing it in Swiftwave (PR #1226).
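To make the networking idea a bit more concrete, here is a minimal sketch of how an agent could render a full-mesh WireGuard config for each node. The `Node` fields, the `10.90.x.x` addresses, port 51820, the placeholder keys, and the function name `render_wg_config` are all assumptions for illustration and are not taken from the existing agent; only the `[Interface]` / `[Peer]` keys follow the standard `wg-quick` config format.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical description of one server in the mesh."""
    name: str
    public_key: str   # WireGuard public key (placeholder strings below)
    endpoint: str     # publicly reachable host:port of the node
    overlay_ip: str   # address of the node inside the mesh
    subnet: str       # container subnet that is routed through this node


def render_wg_config(me: Node, nodes: list[Node], listen_port: int = 51820) -> str:
    """Render a wg-quick style config in which `me` peers with every other node."""
    lines = [
        "[Interface]",
        f"Address = {me.overlay_ip}/32",
        f"ListenPort = {listen_port}",
        # The private key never leaves the node; it is filled in locally.
        "PrivateKey = <private key of this node>",
    ]
    for peer in nodes:
        if peer.name == me.name:
            continue  # a node does not peer with itself
        lines += [
            "",
            "[Peer]",
            f"# {peer.name}",
            f"PublicKey = {peer.public_key}",
            f"Endpoint = {peer.endpoint}",
            # Traffic for the peer itself and for its containers goes to this peer.
            f"AllowedIPs = {peer.overlay_ip}/32, {peer.subnet}",
            "PersistentKeepalive = 25",
        ]
    return "\n".join(lines) + "\n"


# Example cluster with made-up keys and addresses.
NODES = [
    Node("node-a", "PUBKEY_A", "203.0.113.1:51820", "10.90.0.1", "10.90.1.0/24"),
    Node("node-b", "PUBKEY_B", "203.0.113.2:51820", "10.90.0.2", "10.90.2.0/24"),
    Node("node-c", "PUBKEY_C", "203.0.113.3:51820", "10.90.0.3", "10.90.3.0/24"),
]
```

Calling `render_wg_config(NODES[0], NODES)` gives the config for `node-a` with one `[Peer]` section per other node. Since every node lists every other node, the number of peer entries grows quadratically with cluster size, which is something to keep in mind for larger clusters.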
What's the problem then?
It's getting way too complicated to keep everything in a single project.
Proposal -
I'm thinking of separating out the orchestrator, so that anyone can build things on top of it as well (without needing to install Swiftwave).
Then we can wire Swiftwave to that orchestrator and eventually drop Swarm (or keep it as an option).
If we go down this route, I would like to replace the Docker runtime with Podman.
Some pointers about the orchestrator -
- Single-master design (but ensure that nothing breaks if the master goes down; we can think about multi-master later)
- Use Podman to run containers in user space
- Built-in WireGuard mesh for overlay networking
- Built-in TCP/UDP ingress support
- Integrated DNS server for service discovery
- Firewall support for containers in a single network
- Zero-downtime deployment (made possible by managed service discovery)
- Support for tasks
  - A task is like a container that executes something and exits (useful for backup/restore and many other sysadmin purposes)
- Support for cron in containers
  - The agent will run something periodically, as configured
- Cluster volumes
  - Managed NFS: the orchestrator can have built-in support for automatically provisioning NFS from the master or a remote node to the containers
  - S3 sync: needs a better version of s3fs that can be used in a distributed manner for cluster-wide support
- CLI + stable API to build something on top of it
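
As a rough illustration of why managed service discovery makes zero-downtime deployment possible, here is a minimal in-memory sketch. `ServiceRegistry`, `deploy`, and the `is_healthy` callback are hypothetical names and do not describe the actual orchestrator design; the only idea shown is that the registry switches to the new containers once they pass a health check, so the old ones can be stopped afterwards.

```python
from __future__ import annotations

import threading
from typing import Callable


class ServiceRegistry:
    """Hypothetical in-memory map of service name -> backend addresses.

    An integrated DNS server could answer lookups from a structure like this.
    """

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._backends: dict[str, list[str]] = {}

    def resolve(self, service: str) -> list[str]:
        """Return a copy of the backends currently serving `service`."""
        with self._lock:
            return list(self._backends.get(service, []))

    def swap(self, service: str, new_backends: list[str]) -> list[str]:
        """Atomically replace the backends of `service` and return the old ones."""
        with self._lock:
            old = self._backends.get(service, [])
            self._backends[service] = list(new_backends)
            return old


def deploy(
    registry: ServiceRegistry,
    service: str,
    new_backends: list[str],
    is_healthy: Callable[[str], bool],
) -> list[str]:
    """Point `service` at `new_backends` only if all of them are healthy.

    Returns the old backends, which are now safe to stop. If any new backend
    is unhealthy, the registry is left untouched and an error is raised.
    """
    unhealthy = [b for b in new_backends if not is_healthy(b)]
    if unhealthy:
        raise RuntimeError(f"refusing to switch {service}: unhealthy {unhealthy}")
    return registry.swap(service, new_backends)
```

A deployment would then start the new containers, call `deploy(...)`, and stop whatever it returns. Clients that resolve the service name always see either the old set or the new set, never an empty one.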