I wish there was something like this for Docker rather than Lambda functions.
I'm new to all of it, but the security groups, route tables, internet gateways and other implementation details of AWS left me feeling overwhelmed and insecure (literally, because roles and permissions are nearly impossible for humans to reason about). AWS also suffers from the syndrome of: if you want to use some of it, you have to learn all of it.
Basically what I need is a sandbox for running Docker containers with any reasonable scale (under 100? what's big these days?). Then I just want to be able to expose incoming port 443 and one or two others for a WebSocket or an SSL port so admins can get to the database and filesystem (maybe). Why is something so conceptually trivial not offered by more hosting providers?
I researched Heroku a bit but am not really sure what I'm looking at without actually doing the steps. I'm also not entirely certain why CI/CD has been made so complicated. I mean conceptually it's:
1) Run a web hook to watch for changes at GitHub and elsewhere
2) Optionally run a bunch of unit tests and if they pass, go to step 3
3) Run a command like "docker-compose --some-option-to-make-this-happen-remotely up"
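The three steps above really can be sketched in a few dozen lines. Here's a toy webhook receiver doing exactly that — everything in it is illustrative (the deploy host, the test command, the port), and `DOCKER_HOST=ssh://…` stands in for the "make-this-happen-remotely" option:

```python
# Toy CI/CD: GitHub webhook -> run tests -> remote docker compose up.
# All names here (DEPLOY_HOST, port 8080, pytest) are assumptions.
import json
import os
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

DEPLOY_HOST = "ssh://deploy@prod.example.com"  # hypothetical host

def should_deploy(payload: dict, branch: str = "main") -> bool:
    """Step 1: only act on pushes to the deploy branch."""
    return payload.get("ref") == f"refs/heads/{branch}"

def run_pipeline() -> bool:
    """Steps 2-3: run the tests, then deploy remotely."""
    if subprocess.run(["pytest"]).returncode != 0:
        return False  # tests failed; skip step 3
    env = {**os.environ, "DOCKER_HOST": DEPLOY_HOST}
    return subprocess.run(
        ["docker", "compose", "up", "-d", "--build"], env=env
    ).returncode == 0

class Hook(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        if should_deploy(payload):
            run_pipeline()
        self.send_response(204)
        self.end_headers()

# To run: HTTPServer(("", 8080), Hook).serve_forever()
```

Obviously a real setup needs webhook signature verification and error reporting, but that's the whole conceptual shape.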
So why is a 3 step thing a 3000 step thing? Full disclosure: I did the 3000 steps with Terraform, and while I learned a lot from the experience, I can't say that I see the point of most of it. I would not recommend the bare-hands way on any cloud provider to anyone, ever (unless they're a big company or something).
I guess what I'm asking is, could you adapt what you've done here to work with other AWS services like ECS? It's all of the same configuration and monitoring stuff. I've already hit several bugs in ECS where you have to manually run docker prune and other commands on the EC2 instance, because instance lifetimes are measured in hours and they haven't smoothed out the rough edges in their cleanup commands. So I've hit problems where, even though I've spun down the cluster, the new one won't spin up because it says the Nginx container is still using the port. I can't tell you how infuriating it is to have to work around issues like that, which ECS was supposed to handle in the first place. And I've hit similar gotchas on the other AWS services too, to the point where I'm having trouble seeing the value in what they're offering, or even understanding why a service exists in the first place, when I might have done it a different way if I was designing it.
TL;DR: if you could make deploying Docker as "easy" as Lambda, you'd quickly run out of places to store the money.
We run some ECS clusters internally and have run into some of the issues you mentioned. We use Seed to deploy them but the speed and reliability bit that I talked about in the post mainly applies to Lambda. So Seed can do the CI/CD part but it can't really help with the issues you mentioned.
Ah that's cool, makes sense. We may eventually move to Fargate, but the project has some legacy stuff that somewhat relies on having a host machine because of its shared directory. I've set up a roadmap to gradually remove the restrictions that prevent us from transitioning from EC2 to Fargate.
I've learned a lot more implementation details in this project than I expected. For example, I think stuff like awsvpc network mode is a code smell. I did appreciate some of the work AWS did, though, for mounting an EFS filesystem like any other path in the ecs-params.yml file.
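For anyone curious, that EFS bit is pleasantly small. A rough sketch of what it looks like in ecs-params.yml (the volume name and filesystem ID are placeholders):

```yaml
version: 1
task_definition:
  efs_volumes:
    - name: shared-storage                  # referenced from docker-compose.yml
      filesystem_id: fs-0123456789abcdef0   # placeholder EFS ID
      transit_encryption: ENABLED
```

The compose file can then mount `shared-storage:/var/www/storage` like any local volume.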
I did try it, but EFS latency is too high to run a whole server (at least for PHP). It does work for a storage folder though. Specifically, PHP Composer feels like it will never finish if the whole project directory is on EFS. But if I changed the build system to pre-build all of the Docker images, it might be ok.
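The pre-build idea would look roughly like this: a multi-stage Dockerfile that bakes vendor/ into the image at build time, so Composer never touches EFS and only the storage folder stays on it (paths and image tags are assumptions):

```dockerfile
# Stage 1: resolve dependencies at build time, on fast local disk
FROM composer:2 AS deps
COPY composer.json composer.lock ./
RUN composer install --no-dev --prefer-dist

# Stage 2: the runtime image with vendor/ already baked in
FROM php:8.2-apache
COPY --from=deps /app/vendor /var/www/html/vendor
COPY . /var/www/html
# storage/ is then the only path mounted from EFS at runtime
```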
To me, Amazon doing their job would look like: no distinction between EC2 and Fargate. They should have provided a host filesystem out-of-the-box (that uses EFS internally) enabled by default with the option to disable it. But that's not the AWS way. In AWS, each service gives you 90% of a typical use case. The other 10% comes from the 10 other services that you must learn in unison.
But hey, this pain could easily be someone else's meal ticket if they automate the worst parts!
We're building something like what you describe (YC S20) - https://layerci.com - it's similar to OP but meant for standard containers instead of serverless.
Thank you! I remember in the 90s, if I thought of a website or invention, I figured I had about 2-3 years to make it (certainly less than 5) before someone else did. That number dropped to maybe 6 months by 2010, and today most things are either about to be released or were released 2 weeks ago (minimum). So I'm not sure if I manifested what you made by needing it months ago, is what I'm saying.
Anyway, the value proposition of LayerCI may not exactly be in the CI/CD stuff. What caught my eye was the 12 staging servers with high power CPUs and the layer caching like Docker (which takes multi-minute build times down to seconds). I think if you manage to include backups and monitoring from the start, you'll really have something. And if you've already done them, good job manifesting that.
Have you tried Cloud Run on GCP? It sits in the niche you're describing, between a serverless platform and a managed container orchestration platform like Kubernetes (GKE or EKS).
Does that use Cloud Run? I haven't tried Google Cloud yet because I thought I'd have to learn Kubernetes. I have an aversion to learning Kubernetes because I still can't figure out what problem it's trying to solve. Admittedly, I probably haven't gotten far enough with cloud hosting to know what limitations I'll hit yet. Some ok answers here:
The computer science part of me just looks at a Docker swarm as a big graph. We should be able to balance a load if we just know the remaining CPU capacity of each container. But I look at the astonishing complexity of all this stuff (not to pick on k8s too much) and my first thought is: never have I seen so much code do so little!
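As a toy illustration of that graph view: the core "scheduling" decision can be as small as picking the node with the most spare CPU. All the numbers below are made up:

```python
# Least-loaded scheduling sketch: route the next request/container to
# whichever node reports the most remaining CPU capacity.
def pick_container(spare_cpu: dict[str, float]) -> str:
    """Return the id with the most remaining CPU (fraction of a core)."""
    return max(spare_cpu, key=spare_cpu.get)

spare = {"web-1": 0.25, "web-2": 0.60, "web-3": 0.10}
print(pick_container(spare))  # → web-2
```

The hard parts in real systems are stale capacity data, failures mid-placement, and constraints (affinity, memory, ports) — that's where most of the k8s complexity lives, fairly or not.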
K8s on DigitalOcean might be a solution. K8s can be pretty complex but for a single tenant/single app you can probably skip some of the complexity.
Even at 100 containers you're probably going to want health checks (some load balancer integration), rolling deploys, metrics, and aggregated logging.
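Some of that is expressible even in a plain compose file. A sketch of a health check plus a start-first update policy (the endpoint and intervals are illustrative; `deploy.update_config` applies in swarm mode):

```yaml
services:
  web:
    image: myapp:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost/healthz"]
      interval: 30s
      timeout: 5s
      retries: 3
    deploy:
      update_config:
        order: start-first   # bring the new task up before stopping the old
```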
Amazon also added support for Docker containers to Lambda. You need to make sure your container implements the correct runtime interface so Lambda can start it; the details are in their docs.
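The easy path is one of the AWS-provided base images, which already implement that runtime interface. A minimal image looks roughly like this (the file and handler names are placeholders):

```dockerfile
# AWS base image that implements the Lambda Runtime API for you
FROM public.ecr.aws/lambda/python:3.12
# Copy the function code into the task root the base image expects
COPY app.py ${LAMBDA_TASK_ROOT}
# "module.function" that Lambda will invoke (placeholder handler name)
CMD ["app.handler"]
```

For arbitrary images you'd instead bundle a Runtime Interface Client yourself, which is the "correct interface" part of the docs.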
I think you could check Moncc https://docs.moncc.io/ - you can wrap all of the above in a template (provisioning and orchestration) and run locally or on gcp/aws
Hey Zack, we have a prototype of this that we'd love to have you (and anyone else) try out. We just helped a couple of customers migrate their Docker code repos from DigitalOcean to AWS and save $2K a month with our template. It gives you a CI/CD pipeline and deploys on ECS/Fargate.