This operation shuts down an inference deployment, making it unavailable for handling requests.
The deployment will scale down to **0** replicas, overriding any minimum replica settings.
Once stopped, the deployment will **not** process any inference requests or SQS messages.
It will **not** restart automatically and must be started manually.
While stopped, the deployment will **not** incur any charges.