Serverless inference
Using the EfficientAD approach, trained on the cookies dataset number three!
Anomaly Results
Score: 1.3962
Time: 12.39 sec
Classification: anomaly
Serverless inference
This model was trained remotely on a g4dn.xlarge instance. It differs slightly from EfficientAD by using 5k training steps instead of 70k while still achieving similar results. The model deployed here uses the cookies dataset number three.
Why is it taking so much time?
For cost saving! The inference takes ~190 ms on a CPU and less than 3 ms on a GPU. It sits behind a SageMaker Serverless Endpoint, so a proof of concept like this costs about $0.20 per 1k inferences. You can make it very fast by using a provisioned concurrency endpoint, putting the price at ~$13/month to keep it running plus $0.18 per 1k inferences, or even instant by using real-time inference (starting at $42/month).
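A quick back-of-the-envelope comparison of the three options, using only the approximate prices quoted above (actual AWS pricing varies by region and configuration):

```python
def monthly_cost(inferences_per_month: int) -> dict:
    """Estimate monthly cost (USD) for each deployment option,
    using the approximate prices quoted in the text above."""
    k = inferences_per_month / 1000  # prices are quoted per 1k inferences
    return {
        "serverless": 0.20 * k,          # pay per inference only
        "provisioned": 13.0 + 0.18 * k,  # monthly base fee + per inference
        "real_time": 42.0,               # flat instance cost
    }

# At POC-scale volumes, serverless is by far the cheapest option.
print(monthly_cost(10_000))   # 10k inferences/month
print(monthly_cost(500_000))  # 500k inferences/month
```

At roughly 650k inferences per month the provisioned option breaks even with serverless; below that, serverless wins on cost.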
Using serverless allows you to have hundreds of endpoints to showcase different ideas without impacting your wallet; once you're ready to scale up, simply switch!
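Switching is cheap because the client-side call is identical for all endpoint types; only the endpoint configuration changes. A minimal sketch of the invocation (the endpoint name and content type below are assumptions for illustration, not the ones used by this POC):

```python
def build_request(endpoint_name: str, image_bytes: bytes) -> dict:
    """Assemble keyword arguments for sagemaker-runtime invoke_endpoint.
    The same call works for serverless, provisioned-concurrency, and
    real-time endpoints; only the server-side configuration differs."""
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/x-image",
        "Body": image_bytes,
    }

# Example usage (requires AWS credentials; not run here):
# import json, boto3
# client = boto3.client("sagemaker-runtime")
# response = client.invoke_endpoint(
#     **build_request("efficient-ad-demo", open("cookie.png", "rb").read()))
# result = json.loads(response["Body"].read())
```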
I'm getting a 502 error while uploading an image
There might be multiple reasons behind it:
- The endpoint configuration is set to allow two concurrent executions. If you try to upload images too quickly, it will result in a 502 error.
- The model has not been fully optimized to work with any image size, though it can be configured to do so. In this POC, use square images and ensure they are under 5MB.
- If you still have a problem, create an issue in the GitHub repository and provide the image you're using.
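The square-and-under-5MB constraint can be checked client-side before uploading. A minimal sketch for PNG files, reading the width and height directly from the IHDR chunk with only the standard library (the 5MB limit matches the POC constraint above):

```python
import struct

MAX_BYTES = 5 * 1024 * 1024  # 5MB upload limit for this POC

def check_png(data: bytes) -> bool:
    """Return True if `data` is a square PNG under the 5MB limit."""
    if len(data) > MAX_BYTES:
        return False
    # PNG signature, then the IHDR chunk type at byte offset 12
    if data[:8] != b"\x89PNG\r\n\x1a\n" or data[12:16] != b"IHDR":
        return False
    # Width and height are big-endian 32-bit ints at offsets 16 and 20
    width, height = struct.unpack(">II", data[16:24])
    return width == height
```

For other formats (JPEG, etc.) a library such as Pillow would be the more practical choice; this sketch only illustrates the check.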

