[FEATURE] Serverless Framework Support #37

Open
necevil opened this issue Dec 21, 2019 · 2 comments
Labels
enhancement New feature or request

Comments

@necevil

necevil commented Dec 21, 2019

Is your feature request related to a problem? Please describe.
Many of instamancer's API endpoints could, in theory, be ported to a serverless function running on either AWS Lambda (with a Puppeteer layer) or Google Cloud Functions (which has access to Puppeteer by default). This would make the solution more scalable and would also let entry-level users take advantage of their free monthly Lambda / function executions.

Describe the solution you'd like
Add Serverless Framework as a dependency and create a serverless config file to handle configuration when deploying.

Describe alternatives you've considered
Serverless Framework would help abstract the differences between platforms for anyone who wants to run this serverlessly. I have not considered alternatives.

Additional context
The biggest issue will be data persistence (where to deposit photos / which db to insert records into).
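One way the persistence question could be kept separate from the scraping itself is sketched below. This is only an illustration under stated assumptions: `Post`, `PostSink`, `MemorySink`, `persistPosts`, and `fakeScraper` are hypothetical names that do not exist in instamancer, and the stand-in scraper replaces the real browser-driven one so the sketch runs on its own.

```typescript
// Hypothetical shape of a scraped post; real scraper output would carry more fields.
interface Post {
    id: string;
    imageUrl: string;
}

// Hypothetical storage abstraction: one implementation per backend
// (object storage for photos, a database for records, and so on).
interface PostSink {
    save(post: Post): Promise<void>;
}

// In-memory sink, useful for local runs and tests.
class MemorySink implements PostSink {
    readonly saved: Post[] = [];

    async save(post: Post): Promise<void> {
        this.saved.push(post);
    }
}

// Drains any async source of posts into the sink and returns how many were stored.
async function persistPosts(source: AsyncIterable<Post>, sink: PostSink): Promise<number> {
    let count = 0;
    for await (const post of source) {
        await sink.save(post);
        count += 1;
    }
    return count;
}

// Stand-in for a scraper so the sketch runs without launching a browser.
async function* fakeScraper(total: number): AsyncGenerator<Post> {
    for (let i = 0; i < total; i++) {
        yield { id: `post-${i}`, imageUrl: `https://example.com/${i}.jpg` };
    }
}
```

A serverless handler would then only need to pick a `PostSink` implementation for its platform and call `persistPosts(scraper, sink)`, leaving the choice of photo store or database outside the scraping code.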

@necevil necevil added the enhancement New feature or request label Dec 21, 2019
@ScriptSmith
Owner

Your proposal is interesting, but there are a few things to consider.

  1. Cold-boot time for the lambda function would be prohibitively long, given that a browser has to launch, load the page, and retrieve the results from the API.

  2. Assuming it would act as an http endpoint, a containerised application would be just as simple to use, more cross-platform accessible, and could be run locally just as easily.

  3. This feature might need to be its own project rather than part of the instamancer package. Instamancer would remain the core module, and then you can build whatever server system you want around it. Having an authoritative server model doesn't really justify adding additional weight and complexity to the current module (except perhaps if it was just an extremely simple express server).

Interested to hear your thoughts. We could create a new repo in https://github.com/instamancer

@necevil
Author

necevil commented Dec 23, 2019

True.
I think a secondary repo would for sure make sense.

On the cold boot & browser spin up time
In my experience the bigger issue is the spin-up time for the browser. Most of the reasoning behind working on a more sophisticated deployment is to handle a larger use case (possibly with concurrent instances executing at once) and/or a more consistently executed / scheduled one.

Since the cold boot only applies to containers that haven't been run in a while, it USUALLY doesn't add a huge amount of overhead on its own, since really it only affects your first execution. This assumes running the container 50 or 100 times after the first execution (which warms it up).

In cases where thousands of Lambda / serverless executions occur before the container cools down, the warm-up overhead doesn't end up impacting things in a meaningful way (in my experience!).
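The amortization argument can be put into rough numbers. The sketch below is only arithmetic under simplifying assumptions: every invocation after the first hits the same warm container, and the browser launch is paid on every invocation. The function names and the millisecond figures in the comment are illustrative, not measurements.

```typescript
// Average per-invocation overhead (ms) when a single cold start is spread
// across `invocations` calls that all reuse the same warm container.
function amortizedColdStartMs(coldStartMs: number, invocations: number): number {
    if (invocations < 1) {
        throw new RangeError("invocations must be at least 1");
    }
    return coldStartMs / invocations;
}

// Average latency per invocation: the browser launch is paid every time,
// the cold start only once.
function averageLatencyMs(
    coldStartMs: number,
    browserLaunchMs: number,
    invocations: number,
): number {
    return browserLaunchMs + amortizedColdStartMs(coldStartMs, invocations);
}

// Illustrative figures only: a 5000 ms cold start and a 2000 ms browser launch.
// Over 100 warm invocations the cold start adds 50 ms on average,
// so the browser launch dominates: averageLatencyMs(5000, 2000, 100) === 2050.
```

With these assumed figures, a single isolated invocation averages 7000 ms, which is where the cold start actually hurts.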

The bigger issue is the browser spinning up each time. To me, though, this is just par for the course when using Chrome / Puppeteer to avoid the scraping defenses out there (in this case Instagram's), and I think it's worth it.

In my experience, almost all of the user behaviors that can be used to detect a scraper can be replicated in Puppeteer, so relying on Chrome adds a huge amount of value / resilience to the project, even though you eat the above-mentioned overhead.

If the reasoning behind moving toward serverless is to abstract away the management of regularly run (every day, every hour, etc.) Instamancer queries, then this separate project could also support proxies to allow concurrent executions for larger projects.
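As a rough illustration of the proxy idea, the sketch below spreads queries across a pool of proxies round-robin and runs them in fixed-size concurrent batches. Everything here is an assumption for illustration: `Job`, `assignProxies`, and `runInBatches` are hypothetical names, and the `run` callback stands in for whatever would actually launch a scrape through the given proxy.

```typescript
// Hypothetical unit of work: one query routed through one proxy.
interface Job {
    query: string;
    proxy: string;
}

// Assigns proxies to queries round-robin so load is spread evenly.
function assignProxies(queries: string[], proxies: string[]): Job[] {
    if (proxies.length === 0) {
        throw new Error("at least one proxy is required");
    }
    return queries.map((query, i) => ({ query, proxy: proxies[i % proxies.length] }));
}

// Runs jobs in consecutive batches of `batchSize`, awaiting each batch before
// starting the next. Results come back in the same order as the jobs.
async function runInBatches<T>(
    jobs: Job[],
    batchSize: number,
    run: (job: Job) => Promise<T>,
): Promise<T[]> {
    if (batchSize < 1) {
        throw new RangeError("batchSize must be at least 1");
    }
    const results: T[] = [];
    for (let i = 0; i < jobs.length; i += batchSize) {
        const batch = jobs.slice(i, i + batchSize);
        results.push(...(await Promise.all(batch.map(run))));
    }
    return results;
}
```

Batching is the simplest way to cap concurrency; a worker pool would keep all slots busy more evenly, at the cost of a little more code.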

I would like to play around with Instamancer a little more in a containerized environment but I don't see any reason why it would be super hard to configure.

Have you done any work on containerization / dockerization locally?
I can probably at the least contribute there!
