This guide is a continuation of our previous article, focusing on deploying the YOLO model for batch processing on Salad. This approach is ideal for processing large volumes of data where real-time analysis is not required, or if you want to process multiple streams not worrying’s which node will pick the job up.

You can check more detailed project overview in our previous guide Fast API for Real-Time Processing here.

Reference Architecture for Batch Processing for Asynchronous Workloads

  • Objective: Set up a batch processing system that reacts to new video stream links stored in Azure.
  • Process Flow:
    • Video stream links are saved in an Azure storage container.
    • An event grid with subscriptions is configured to monitor the storage container and create a message in a storage queue whenever a new file is added.
    • A Python script for batch processing is containerized and deployed across multiple Salad compute nodes.
    • This batch process routinely checks the storage queue. When a new message appears, it picks up the corresponding stream for processing.

Folder Structure

Our full solution is stored here: git repo

Here is the folder structure we will have once the project is done:

├─ src/
│  ├─ infrastructure/
│  │  ├─ main.bicep (azure resources deployment for batch)
│  ├─ python/
│  │  ├─ api (architecture 1)/
│  │  │  ├─ inference/
│  │  │  │  ├─ dev/
│  │  │  │  │  ├─ setup
│  │  │  │  ├─ (single link)
│  │  │  │  ├─ (multi-threading)
│  │  │  ├─
│  │  │  ├─ Dockerfile
│  │  ├─ batch (architecture 2)/
│  │  │  ├─ inference/
│  │  │  │  ├─ dev/
│  │  │  │  │  ├─ setup
│  │  │  │  ├─ (Batch processsing script)
│  │  │  │  ├─ (Batch processsing script with multi threading)
│  │  │  │  ├─ requirements.txt
│  │  │  ├─ .dockerignore
│  │  │  ├─
│  │  │  ├─ Dockerfile

Deploy Azure Resources

It’s essential to first deploy the necessary infrastructure on Azure.

We will deploy:

  • Storage Account for storing input and output data
    • Blob Containers for organizing the data within the Storage Account (default names: requests, yolo-results, failed-requests)
    • Storage Queue for storing messages (default names: requestqueue, poison-requestqueue). Our batch process will be checking if there are any new messages in the queue.
    • Event Grid System Topic and Event Subscription for triggering events when new blobs are created in the input container. Every time a new file is created in the requests container a message will be created in the queue.

Here’s a step-by-step guide to deploying these resources:

  1. Create an Azure Subscription and Resource Group: If you haven’t already, sign up for Azure and create a subscription. Within this subscription, create a resource group that will contain all the resources for your project.
  2. Login to Azure:
    • Open your command line interface (CLI).
    • Use the Azure CLI command to log in:
      az login --tenant <tenant_id>
    • Set your subscription:
      az account set --subscription <subscription_name>
  3. Deploy resources:
    • Navigate to the main.bicep file inside of infrastructure folder and run the following command:
      az cli
      az deployment group create \
        --resource-group <resource group name> \
        --template-file main.bicep
    • All resource names are set as parameters, so you can override them:
      az cli
       az deployment group create \
        --resource-group <resource group name> \
        --template-file main.bicep \
        --parameters RequestsQueueName=<NewName>
  4. Check resources:
    • Open your resource group in azure and check the resources. You should see something similar to this:

Local Environment

Before we deploy solution on Salad’s cloud infrastructure, it’s crucial to evaluate YOLOv8’s capabilities in a local setting. This allows us to troubleshoot any issues and make any make our solution meet our needs.

Local Development Setup: Installing Necessary Libraries

Setting up an efficient local development environment is essential for a smooth workflow. We ensure this by preparing setup and requirements files to facilitate the installation of all dependencies. These files help verify that the dependencies function correctly during the development phase. We provide the complete contents of the requirements file and the setup script below.

The Setup Script:

#! /bin/bash

set -e

echo "setup the curent environment"
CURRENT_DIRECTORY="$( dirname "${BASH_SOURCE[0]}" )"
echo "current directory: $( pwd )"
echo "setup development environment for inference"
YOLOv8_SRC_BATCH_DIR="$( cd .. && pwd )"
echo "dev directory set to: ${YOLOv8_SRC_BATCH_DIR}"
echo "remove old virtual environment"
rm -rf "${YOLOv8_SRC_BATCH_DIR}/.venv"
echo "create new virtual environment"
python3.9 -m venv "${YOLOv8_SRC_BATCH_DIR}/.venv"
echo "activate virtual environment"
source "${YOLOv8_SRC_BATCH_DIR}/.venv/bin/activate"
echo "installing dependencies ..."

(cd "${YOLOv8_SRC_BATCH_DIR}" && python -m pip install pip==21.1.1)
(cd "${YOLOv8_SRC_BATCH_DIR}" && pip install numpy)
(cd "${YOLOv8_SRC_BATCH_DIR}" && pip install --upgrade --force-reinstall "git+")
(cd "${YOLOv8_SRC_BATCH_DIR}" && pip install lap==0.4.0)
(cd "${YOLOv8_SRC_BATCH_DIR}" && pip install --upgrade pip && pip install -r requirements.txt)

To establish a clean virtual environment and install all the necessary libraries, you simply need to execute the script using following command:

bash dev/setup

Exploring YOLOv8’s Capabilities and Data Compatibility

As we delve into the practicalities of implementing YOLOv8 for object detection, a fundamental step is to understand the range of its capabilities and the types of data it can process effectively. This knowledge will shape our approach to solving the problem at hand. Here is a list of possible inputs:

Our project’s scope now narrows to the realm of video processing, with a particular focus on YouTube videos and live stream data.

Video processing

We’ll begin by experimenting with an example straight from the Ultralytics documentation, which illustrates how to apply the basic object detection model provided by YOLO on video sources. This example uses the ‘yolov8n’ model, which is the YOLOv8 Nano model known for its speed and efficiency.

Here’s the starting code snippet provided by Ultralytics for running inference on a video:

from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('')
# Define source as YouTube video URL
source = ' '
# Run inference on the source
results = model(source)  

To better understand processing capabilities of YOLOv8, we will integrate it with OpenCV (cv2), a powerful library for computer vision tasks. This setup will allow us to visualize object detection as it happens. Before running the script, ensure that you have installed the necessary packages, opencv-python, ultralytics, and cap_from_youtube, the last of which addresses OpenCV’s current limitations with YouTube video streams.

from ultralytics import YOLO
import cv2
from cap_from_youtube import cap_from_youtube

# Load the YOLOv8 model
model = YOLO("")

# Open youtube video
link = " "
cap = cap_from_youtube(link, "720p")

# Loop through the video frames
while cap.isOpened():
    # Read a frame from the video
    success, frame =
    if success:
        # Run YOLOv8 inference on the frame
        results = model(frame)
        # Visualize the results on the frame
        annotated_frame = results[0].plot()
        # Display the annotated frame
        cv2.imshow("YOLOv8 Inference", annotated_frame)
        # Break the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord("q"):
        # Break the loop if the end of the video is reached
# Release the video capture object and close the display window

With this script, as YOLO processes the video frames and identifies objects, bounding boxes will be drawn around them. You’ll see a display window pop up showing the video feed with the detections overlaid. Pay attention to the output displayed at the bottom of the screen for additional insights. Let’s run our code and see what happens.

Processing Live Video Stream from Youtube

Now we move on to the task of processing live video streams. While cap_from_youtube works well for YouTube videos by loading them into memory, live streams require a different approach due to their continuous and unbounded nature.

Pafy is a Python library that interfaces with YouTube content, providing various streams and metadata around YouTube videos and playlists. To make it work in the virtual environment we had to solve a few libraries issues we discussed earlier. To make the process easier for you run the dev/setup file we also mentioned earlier. For live video streams, Pafy allows us to access the stream URL, which we can then pass to OpenCV for real-time processing. Here’s how we can use Pafy to open a live YouTube stream:

import pafy

# Open youtube video with a live stream
link = " "
video =
best = video.getbest(preftype="mp4")
cap = cv2.VideoCapture(best.url)

Processing Live Video Stream (RTSP, RTMP, TCP, IP address)

Ultralytics gives us an example of running inference on remote streaming sources using RTSP, RTMP, TCP and IP address protocols. If multiple streams are provided in a *.streams text file then batched inference will run, i.e. 8 streams will run at batch-size 8, otherwise single streams will run at batch-size 1. We will include it in our code as well. Based on specific link parameters our process will pick which way to process the link. You can check that in the full script

from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('')

# Single stream with batch-size 1 inference
source = 'rtsp://'  # RTSP, RTMP, TCP or IP streaming address

# Multiple streams with batched inference (i.e. batch-size 8 for 8 streams)
source = 'path/to/list.streams'  # *.streams text file with one streaming address per row

# Run inference on the source
results = model(source, stream=True)  # generator of Results objects

Implementing Object Tracking

Having established the method to process live streams, the next crucial feature of our project is to track the identified objects. YOLOv8 comes equipped with a built-in tracking system that assigns unique IDs to each detected object, enabling us to follow their movement across frames. There is a way to pick between 2 tracking models, but for now we will just use the default one.

Enabling Tracking with YOLOv8:

To utilize the tracking functionality, we simply need to modify our inference call. By adding .track to our model call and setting persist=True, we instruct the model to maintain object identities consistently over time. This addition to the code will look like this:

results = model.track(frame, persist=True)

Let’s try it:

That is great, but our main goal is to get a summary of how long an object was present on the video. I will now remove the appearing window with the bounding boxes and we will pay more attention to the data. For this part of the project we are using Jupyter in vscode and pandas. Let’s check what results object is:

We can see that results object has some metadata and all the detections. Let’s only keep the detections and convert them to a more readable format:


To get a detailed summary of object detections, we’ll collate all the individual detection data into a pandas DataFrame. We will not only include the basic detection details provided by YOLOv8, such as bounding box coordinates, class ID, class name, and tracking ID, but also additional contextual information for a richer analysis.

Calculating Additional Metrics:

We defined a function to calculate the percentage of the frame that each detection occupies. Alongside this, we’ll record the video link, frame timestamp, original frame shape, and the processing speed for each frame.

Here’s how we’ll assemble our results into a DataFrame and enrich it with the additional data:

import datetime
import json
import pandas as pd

def calculate_percentage(bbox, original_shape):

    bbox_area = (bbox['x2'] - bbox['x1']) * (bbox['y2'] - bbox['y1'])

    original_shape_area = original_shape[0] * original_shape[1]

    percentage = (bbox_area / original_shape_area) * 100

    return percentage

# we will store all the results as a list of dictionaries
all_results = []

# Loop through the video frames
while cap.isOpened():
    # Read a frame from the video
    success, frame =
    if success:
        # Run YOLOv8 inference on the frame
        results = model.track(frame, persist=True)
        timestamp =
        # save every box with label
        for box in json.loads(results[0].tojson()):
            box["input"] = link
            box["timestamp"] = timestamp
            box["date"] = timestamp.strftime("%Y-%m-%d")
            box["time"] = timestamp.time().strftime("%H:%M:%S")
            box["origin_shape"] = results[0].orig_shape
            box["box_percentage"] = calculate_percentage(box["box"], results[0].orig_shape)

            box["full_process_speed"] = sum(results[0].speed.values())

        # Break the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord("q"):
            df = pd.DataFrame(all_results)

After the loop is interrupted (by pressing ‘q’), we convert the list of dictionaries into a pandas DataFrame. This DataFrame will then contain a complete log of every detection across the video’s frames, enriched with the additional data points we’ve calculated.

Let’s see what we’ve got.

Readable summary

With the DataFrame populated with tracking and detection data, we’re now ready to create a summary that achieves our main goal: calculating the duration each object was present in the video. We’ll start by filtering out any irrelevant data and then proceed to group the valid detections to summarize our findings.

Filter data

We need to ensure that we only include rows with valid tracking IDs. This means filtering out any detections that do not have a tracking ID or where the tracking ID is 0, which may indicate an invalid detection or an object that wasn’t tracked consistently.

Grouping Detections for Summary:

With a DataFrame of filtered results, we’ll group the data by tracking ID to find the earliest and latest timestamps for each object. Our object will get a new tracking ID if it leaves and re-enters the video. If in your project you need to track a specific label, you will need to add additional logic of grouping by class and summing the durations.

We’ll also determine the most commonly assigned class for each object, in case there are discrepancies in classification across frames, and calculate the average size of the object within the frame.

The summary function will perform these operations and output a readable string

def summary(df, filename, result_blob):
    if (
        "track_id" in df.columns
        and df["track_id"].notna().any()
        and df["track_id"].ne(0).any()
        df_filtered = df[(df["track_id"] != 0) & (df["track_id"].notna())].copy()
        # Group by 'track_id' and calculate duration, most frequent class and
        # corresponding name for each group
        # Group by track_id and calculate average box_percentage, min and max timestamp
        summary_df = (
                average_box_percentage=("box_percentage", "mean"),
                min_timestamp=("timestamp", "min"),
                max_timestamp=("timestamp", "max"),
                    lambda x: x.value_counts().index[0],
                ),  # Most common class per track_id
        # Calculate duration
        summary_df["duration"] = (
            summary_df["max_timestamp"] - summary_df["min_timestamp"]

        # Convert the DataFrame to a string
        output_string = "\n".join(
            f"{row['most_common_class']} with id {row['track_id']} was present in the video for {row['duration']} from {row['min_timestamp']} to {row['max_timestamp']} and was taking  {row['average_box_percentage']:.2f}% of the screen"
            for _, row in summary_df.iterrows()
        output_string = "No objects were detected in the video"


Executing the function will produce output like the following:

person with id 1.0 was present in the video for 0 days 00:01:34.273634 from 2023-11-08 03:24:55.866030 to 2023-11-08 03:26:30.139664 and was taking  8.28% of the screen
person with id 2.0 was present in the video for 0 days 00:00:52.874862 from 2023-11-08 03:24:55.866030 to 2023-11-08 03:25:48.740892 and was taking  3.87% of the screen
person with id 3.0 was present in the video for 0 days 00:01:02.194742 from 2023-11-08 03:24:55.866030 to 2023-11-08 03:25:58.060772 and was taking  6.97% of the screen
person with id 4.0 was present in the video for 0 days 00:00:13.343196 from 2023-11-08 03:24:55.866030 to 2023-11-08 03:25:09.209226 and was taking  1.07% of the screen
person with id 5.0 was present in the video for 0 days 00:00:12.491371 from 2023-11-08 03:24:55.866030 to 2023-11-08 03:25:08.357401 and was taking  1.42% of the screen
person with id 6.0 was present in the video for 0 days 00:00:37.937545 from 2023-11-08 03:24:55.866030 to 2023-11-08 03:25:33.803575 and was taking  1.60% of the screen
person with id 7.0 was present in the video for 0 days 00:00:12.994483 from 2023-11-08 03:24:55.866030 to 2023-11-08 03:25:08.860513 and was taking  2.12% of the screen
car with id 8.0 was present in the video for 0 days 00:00:05.370814 from 2023-11-08 03:24:55.866030 to 2023-11-08 03:25:01.236844 and was taking  3.89% of the screen
car with id 9.0 was present in the video for 0 days 00:00:01.655127 from 2023-11-08 03:24:55.866030 to 2023-11-08 03:24:57.521157 and was taking  2.39% of the screen

Run process on GPU

In our metadata Dataframe the last column “full_process_speed“ reflects a combination of preprocessing, inference and post-processing time. We can see that now we have the processing speed from 60 to 400 milliseconds per frame which is impressive, but we can do better. To unlock the full potential of our solution, we’ve added a simple enhancement: CUDA device detection. With just a few lines of code, our process now intelligently determines if a CUDA-compatible GPU is available and adjusts accordingly.:

# Check for CUDA device and set it
device = "0" if torch.cuda.is_available() else "cpu"
if device == "0":

This adjustment ensures that if the system has the capability, it will leverage the GPU, thus significantly boosting the processing speed.

The results speak for themselves. After running a live stream through our enhanced process for about 40 minutes, we observed a dramatic improvement in processing speed. The times dropped to a stunning 5 to 15 milliseconds per frame. This enhancement means our YOLOv8 solution is now processing frames up to 10 times faster on a GPU compared to CPU processing.

Files format and naming

Next we need to save our and summary into a suitable format for storage, such as CSV, JSON, or plain text. We picked csv for the DataFrame and txt for the summary. For Filename convention we use timestamp of when the process started and a part of the incoming link. That helps us easily identify where the stream is coming from and make file name unique.

Saving Results Incrementally:

  1. Timed Saves: We implement logic within the processing loop to save interim results at intervals that will be specified as environment variable. Since stream might be infinite we need to be able to check the results without breaking the connection to the stream.
  2. Final Save: Ensure all results are saved once the process is completed:

Function to save our Dataframe to Azure container:

def save_df(df, filename, result_blob):
    results_csv_file_name = f"{filename}.csv"
    results_blob_client = result_blob.get_blob_client(results_csv_file_name)
    csv_stream = io.StringIO()
    df.to_csv(csv_stream, index=False)
    # Convert the CSV data to bytes
    csv_bytes = csv_stream.getvalue().encode("utf-8")
    results_blob_client.upload_blob(csv_bytes, overwrite=True)

Add this part of code to the summary function to save the summary results to azure:

    results_txt_file_name = f"{filename}.txt"
    results_blob_client_txt = result_blob.get_blob_client(results_txt_file_name)
    results_blob_client_txt.upload_blob(output_string, overwrite=True)

Integrating with Azure for batch processing

To enable efficient batch processing using YOLO on Salad, establishing a connection between the Python code and Azure Storage resources is crucial.
We will start with initiating and maintaining connections to containers and storage queue:

from import ContainerClient
from import QueueClient

def connect_to_storage(storage_type: str, name: str, connection_str: str):
    """Connect to Azure Storage Container or Queue.
    if storage_type.lower() == "container":
        client = ContainerClient
    elif storage_type.lower() == "queue":
        client = QueueClient
        raise ValueError("Storage type should be 'container' or 'queue'")

    return client.from_connection_string(connection_str, name)

def azure_initiate(input_blob, output_blob, storage_queue, storage_connection_string):
    request = connect_to_storage("container", input_blob, storage_connection_string)
    result = connect_to_storage("container", output_blob, storage_connection_string)
    queue = connect_to_storage("queue", storage_queue, storage_connection_string)

    return request, result, queue

To ensure continuous monitoring of the queue for new messages, we implement an infinite loop in our main processing script. This loop will periodically check the queue for new messages, process them if available, and wait before checking again if the queue is empty:

while True:
            queue_length = (
            if queue_length > 0:
                queue_messages = request_queue_client.receive_messages()

Next our script will loop through messages in the Azure storage queue, decode the message, remove the message from queue to ensure that multiple containers do not process the same message and read the file we’ve put in the requests storage container. We’ll specifically handle JSON files containing video links, along with metadata indicating whether each video is pre-recorded or live:

Here is a link to the full script

Environment Variables

When deploying a batch processing solution using YOLO on Salad, leveraging environment variables provides a flexible and secure way to configure your application. These variables can be defined during the deployment process and accessed within your container. Here’s how you can integrate environment variables into your Python script for key configurations:

import argparse
parser = argparse.ArgumentParser()
args = parser.parse_args()

In this example, the script retrieves the STORAGE_CONNECTION_STRING environment variable. If it’s not set, it defaults to None.

Environment Variables Table

Environment Variable NameFunctionalityRequiredDefault Value
STORAGE_CONNECTION_STRINGConnection string for Azure Storage Account. Should be pulled from azure storage account access keys.YesNone
INPUT_CONTAINER_NAMEName of the input container in Azure Blob Storage where request files be saved to start the process. Value should match name of resource created in AzureNo“requests”
OUTPUT_CONTAINER_NAMEName of the output container where the result files be saved to. Value should match name of resource created in AzureNo“yolo-results”
INPUT_QUEUE_NAMEName of the queue for incoming requests. Value should match name of resource created in AzureNo“requestsqueue”
SAVE_EVERY_MINSInterval for saving results (in minutes).No30
QUEUE_CHECK_TIMERInterval for checking the queue (in seconds).No60

This table provides an overview of the environment variables used in the script, outlining their purpose, whether they are mandatory, and their default values if not explicitly set. This setup ensures that your batch processing application can be tailored to specific needs and conditions of the deployment environment on Salad.

Containerizing code with Docker

The next crucial step is to package our solution into a Docker image. This approach is key to facilitating deployment to our cloud clusters. Containerizing with Docker not only streamlines the deployment process but also ensures that our application runs reliably and consistently in the cloud environment, mirroring the conditions under which it was developed and tested.

When creating the Dockerfile, it’s crucial to select a base image that includes all the necessary system dependencies, because that might cause some networking issues. We’ve tested our solution with “python3.9“ base image, so if possible stick to it. If you have to use a different base image for any reason check out salad documentation on networking. Here is the full Dockerfile we will use to build an image:

# Use the official Python image as the base image
FROM python:3.9

# Set the working directory to /app

# Copy the inference folder to /app/inference
COPY /inference /app/inference

# Update pip and install requirements
RUN apt-get update && apt-get install -y git gcc python3-dev build-essential libgl1-mesa-dev libglib2.0-0 libsm6 libxrender1 libxext6
# RUN apt-get install -y socat
RUN python -m pip install pip==21.1.1
RUN pip install numpy
RUN pip install --upgrade --force-reinstall "git+"
RUN pip install lap==0.4.0
RUN pip install --upgrade pip
RUN pip install  -r inference/requirements.txt

WORKDIR /app/inference

# Set the entrypoint to run
ENTRYPOINT ["python3.9", ""]

We’ve also put together a bash script to easily publish your image to azure container registry. The script is located side by side with the Dockerfile in the git repo. You can use the following command to build and deploy your image:

bash <image name> <image version> <acr name>

Deploying Solution to Salad

We finally got to the last and most exiting part of our project. We will now deploy our full solution to the cloud.

Deploying your containerized application to Salad’s GPU Cloud is a very efficient and cost-effective way to run your object detection solution. Salad has a very user-friendly interface as well as an API for deployment. Let’s deploy our solution using salad portal.

First create your account and log into Create your organization and let’s deploy our container app.

Under container groups click “Deploy a Container Group“:

We now need to set up all of our container group parameters:

Configure Container Group:

  1. Create a unique name for your Container group
  2. Pick the Image Source: In our case we are using a private Azure Container Registry. Click Edit next to Image source. Now switch to “Private Registry“, under “What Service Are You Using“ pick Azure Container Registry.
    Get back to your azure portal and find the image name, username and password of our container registry repository.
    Find your acr in azure and click “repositories“ on the left Chose the image repository and click on the version you want to pull.
    Now under “Manifest“ you will see a Docker pull command in the followings format: docker pull {image name}: Copy the “image name“ and paste back to salad portal.
    Now let’s find acr username and password back in azure portal.
    Get back to your azure container registry page and click Access keys on the left. If your “Admin user” is not enabled, do so. Now copy the username and password and pass it back to Salad portal.
  3. Replica count: This will define how many videos can be processed simultaneously. For example if we have 10 containers we can put 10 files with links and they will all be picked up one by one.
  4. Pick compute resources: That is the best part. Pick how much cpu, ram and gpu you want to allocate to your process. The prices are very low in comparison to all the other cloud solutions, so be creative.
  5. Environment variables: This is where we need to add our environment variables we went through above. Specify them in a key-value format:

With everything in place, deploying your application on Salad is just a few clicks away. By taking advantage of Salad’s platform, you can ensure that your object detection API is running on reliable infrastructure that can handle intensive tasks at a fraction of the cost.
Now check “AutoStart container group once image is pulled“ and hit “Deploy“. We are all set let’s wait till our solution deploys and test it.

Benefits of Using Salad:

  • Cost-Effectiveness: Salad offers GPU cloud solutions at a more affordable rate than many other cloud providers, allowing you to allocate more resources to your application for less.
  • Ease of Use: With a focus on user experience, Salad’s interface is designed to be intuitive, removing the complexity from deploying and managing cloud-based applications.
  • Documentation and Support: Salad provides detailed documentation to assist with deployment, configuration, and troubleshooting, backed by a support team to help you when needed.

Test Full Solution deployed to Salad

Once your solution deployed on Salad you can put a file in the requests container and check the results.

Ones the video is processed 2 files will be saved to results container:

And our solution is now deployed to cloud compute and successful writes the results to our storage account.


Our journey into the realm of object detection using YOLOv8 and deploying it on Salad’s GPU cloud has been both challenging and rewarding. We’ve successfully navigated through all stages of development and deployment of our solution. Batch processing with Salad is a great solution that can provide simultaneous processing of files together with cost savings.

For those who might be interested we’ve also implemented multithreading functionality to our solution. With multithreading you can process several streams simultaneously at one node by providing multiple links in the request file:

Here is a link to the multi-threading code

If you are going to deploy the milti-threading solution, remember to change the python script name in the Dockerfile. Also make sure you have right hardware configurations for the number of threads you want to run.