How to Deploy Stable Diffusion (SD.Next)
A guide to deploying a custom Stable Diffusion model on SaladCloud with SD.Next
High Level
Regardless of your choice of Stable Diffusion inference server, models, or extensions, the basic process is as follows:
- Get a Docker image that runs your inference server
- Copy any models and extensions you want into the Docker image
- Ensure the container is listening on an IPv6 address
- Push the new image up to a container registry
- Deploy the image as a SaladCloud container group
Find a Docker Image
Find a Docker image of SD.Next. Here is one that we have verified works on Salad:
SD.Next
- Git Repo: https://github.com/SaladTechnologies/sdnext
- Docker Image: saladtechnologies/sdnext:base
- Data Directory: /webui/data
- Model Directory: /webui/data/models
- Extension Directory: /webui/data/extensions
- Controlnet Model Directory: /webui/extensions-builtin/sd-webui-controlnet/models
Note that you will be interacting with this as an API, and not through the browser user interface.
Download Your Models and Extensions
Download any model files you plan to use. For our example, we’re going to use Dreamshaper 8, available on Civitai.com (https://blog.salad.com/civitai-salad/).
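If you would rather script the download than use a browser, a minimal Python sketch follows. It uses only the standard library. The URL and file name in the usage comment are placeholders (assumptions, not real download links), so substitute the download link for your own model.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url: str, dest: Path) -> Path:
    """Stream the resource at `url` to `dest` without loading it all into memory."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Hypothetical usage: both the URL and the file name are placeholders.
# download_file("https://example.com/path/to/model.safetensors",
#               Path("models/model.safetensors"))
```

Streaming with `shutil.copyfileobj` matters here because model checkpoints are typically several gigabytes.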
Create a Dockerfile
- Create a new file called Dockerfile and open it in your preferred text editor. At this point, your directory should look like this:
- Copy the following into your Dockerfile:
Build and Test Your Docker Image
- Build the docker image. You should change the specified tag to suit your purpose.
- (Recommended) Run the docker image locally to confirm it works as expected
Navigate to http://localhost:7860/docs in your browser to see the API docs for SD.Next.
Note that we set the HOST environment variable to 0.0.0.0 (all IPv4 interfaces) for local development. The default value of HOST is [::], which makes the server listen on all IPv6 interfaces. SaladCloud routes traffic to your container over IPv6, so keep the default when you deploy.
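To make the difference between the two HOST values concrete, here is a small illustrative Python sketch that classifies a bind address with the standard `ipaddress` module. It is not part of SD.Next, and the helper name `describe_bind_host` is hypothetical.

```python
import ipaddress


def describe_bind_host(host: str) -> str:
    """Describe a bind address such as "0.0.0.0" or "[::]"."""
    # Servers often write IPv6 literals in brackets; strip them before parsing.
    addr = ipaddress.ip_address(host.strip("[]"))
    scope = "all interfaces" if addr.is_unspecified else "a single address"
    return f"IPv{addr.version}, {scope}"


print(describe_bind_host("0.0.0.0"))  # IPv4, all interfaces
print(describe_bind_host("[::]"))     # IPv6, all interfaces
```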
- Test a Text-to-Image request
See the docs for more information on submitting a text-to-image request. Here’s an example JSON request body:
Submit this to the /sdapi/v1/txt2img endpoint as a POST request:
You will receive back a response in JSON format, including the generated image in Base64 encoded format:
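A minimal Python sketch of this request, using only the standard library, might look like the following. The endpoint path comes from this guide. The helper names are hypothetical, and the option names in the usage comment (`steps`, `width`, `height`) are assumptions based on the common txt2img schema, so check the API docs at /docs for the fields your server actually accepts.

```python
import json
import urllib.request


def build_txt2img_request(base_url: str, prompt: str, **options) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /sdapi/v1/txt2img endpoint."""
    body = {"prompt": prompt, **options}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/sdapi/v1/txt2img",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def txt2img(base_url: str, prompt: str, timeout: float = 120.0, **options) -> dict:
    """Send the request and return the parsed JSON response."""
    request = build_txt2img_request(base_url, prompt, **options)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example against a container running locally on port 7860.
# The option names below are assumptions; verify them in the API docs.
# result = txt2img("http://localhost:7860", "a cat wearing a hat",
#                  steps=20, width=512, height=512)
```

Separating request construction from sending keeps the body easy to inspect before anything goes over the network, and the same function works against either a local container or a deployed one by changing `base_url`.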
- Decode the Base64-encoded string into your image. You can do this in a free browser tool such as https://codebeautify.org/base64-to-image-converter or using CLI tools like jq and base64. For this method, first save your response to a file called response.json.
Then, run the following command:
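As an alternative to the CLI tools, the decoding step can be sketched in Python. This assumes the response stores its Base64 strings in a top-level `images` list and that the images are PNG files. Both are assumptions, so adjust the key and file extension to match your actual response. The helper name `save_images` is hypothetical.

```python
import base64
import json
from pathlib import Path


def save_images(response: dict, out_dir: Path) -> list[Path]:
    """Decode each Base64 string in response["images"] and write it to a file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, encoded in enumerate(response.get("images", [])):
        path = out_dir / f"image-{index}.png"  # assumes PNG output
        path.write_bytes(base64.b64decode(encoded))
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Reads the response.json file saved in the previous step, if present.
    response_file = Path("response.json")
    if response_file.exists():
        saved = save_images(json.loads(response_file.read_text()), Path("output"))
        print("Wrote:", ", ".join(str(p) for p in saved))
```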
Push and Deploy Your Docker Image
- Push your Docker image up to Docker Hub (or the container registry of your choice).
- Deploy your image on Salad, using either the Portal or the SaladCloud Public API
We’re going to name our container group something obvious, and fill in the configuration form. We’re going to use 3 replicas, to ensure coverage during node interruptions and reallocations.
Since this is a Stable Diffusion 1.5-based model, we’re going to give ourselves fairly modest hardware: 4 vCPUs, 12 GB RAM, and an RTX 3060 Ti GPU.
We want to add the container gateway to our deployment, so that we will get a URL we can use to access it. Make sure to set the port to 7860, or whatever you set the PORT environment variable to in your Dockerfile.
We need to enable a startup probe and a liveness probe, to make sure the container gateway only routes requests to nodes that are ready for them.
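Conceptually, an HTTP probe requests an endpoint repeatedly and treats the container as ready once it responds successfully. The sketch below illustrates that idea from the client side. It is not how SaladCloud implements probes, the helper names are hypothetical, and the path you probe is your own choice.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to `url` succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(check: Callable[[], bool], attempts: int = 30,
                     delay: float = 10.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `check` until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next attempt
    return False


# Example: poll the API docs page of a container running locally.
# wait_until_ready(lambda: http_ok("http://localhost:7860/docs"))
```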
Interact with Your Deployment
- Wait for the deployment to be ready.
  - First, SaladCloud pulls your container image into our own internal high-performance cache.
  - Next, SaladCloud locates eligible nodes for your workload, based on the configuration you provided.
  - Next, SaladCloud begins downloading the cached container image to the nodes that have been assigned to your workload.
    This step can take tens of minutes in some cases, depending on the size of the image, and the internet speed of the individual nodes. Note that our progress bars are only estimates, and do not necessarily reflect real-time download status. These slow cold starts, and the possibility of nodes being interrupted by their host without warning, are why we always want to provision multiple replicas.
  - Eventually, you will see instances listed as “running”, with a green check in the “ready” column.
- Submit your prompt to the provided Access Domain Name. You will get back a JSON response within a few seconds. See above for how to submit the request and process the response.