How to Optimize Parameters for Decoding Multiple QR Codes in a Single Image Using YOLO

The Dynamsoft Barcode Reader SDK lets developers tune algorithm parameters for different barcode scanning scenarios. It ships with several preconfigured parameter templates that trade decoding speed against accuracy. Whether you want the fastest decoding, the highest accuracy, or a balance between the two, there is a template that fits. Because the parameters matter so much, how do you set them correctly? For most barcode scanning scenarios, the default parameters should work well. This article looks at a harder scenario: decoding multiple QR codes in a single image. We train a tiny YOLO model to detect QR codes, which tells us the barcode type and the expected number of barcodes to pass to the Dynamsoft Barcode Reader SDK. You will see how these two parameters affect decoding performance.

Installation

Dynamsoft Barcode Reader for Python

pip install dbr

Apply for a trial license

OpenCV Python

pip install opencv-python

Darknet

git clone https://github.com/AlexeyAB/darknet.git

How to Get and Use the Preconfigured Parameter Templates of the Dynamsoft Barcode SDK

1. Visit the Dynamsoft Barcode Reader online demo.

2. Select a mode and click Advanced Settings.

3. Scroll down to the Template section. Change ResultCoordinateType to Pixel and click the copy button.

4. Save the template as a JSON file. For comparison, save all templates, from best speed to best coverage, and name them l1.json through l5.json.
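Before running any comparison, it can help to confirm that all five template files were saved and that each one parses as JSON. The following is a minimal sketch using only the standard library; the helper name `check_templates` is hypothetical, and it deliberately does not assume anything about the keys inside a template.

```python
import json
from pathlib import Path

# File names follow the naming used in this article: l1.json (best speed) to l5.json (best coverage).
TEMPLATE_NAMES = [f"l{i}.json" for i in range(1, 6)]


def check_templates(directory, names=TEMPLATE_NAMES):
    """Return a dict mapping each template name to 'ok', 'missing', or 'invalid json'."""
    status = {}
    for name in names:
        path = Path(directory) / name
        if not path.is_file():
            status[name] = "missing"
            continue
        try:
            # Only checks that the file is well-formed JSON, not that it is a valid template.
            json.loads(path.read_text(encoding="utf-8"))
            status[name] = "ok"
        except json.JSONDecodeError:
            status[name] = "invalid json"
    return status
```

Calling `check_templates(".")` in the folder that holds the templates returns one status per file, so a copy-and-paste mistake shows up before the decoder is involved.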

Here is a test image containing multiple QR codes:

We can write a Python program to compare QR code decoding performance across the different parameter templates:

import cv2 as cv
import numpy as np
import time
from dbr import *
import os
reader = BarcodeReader()
# Apply for a trial license: https://www.dynamsoft.com/customer/license/trialLicense?product=dbr
license_key = "LICENSE-KEY"
reader.init_license(license_key)
def decode(filename, template_name):
    frame = cv.imread(filename)
    template_path = os.path.dirname(os.path.abspath(__file__)) + os.path.sep + template_name
    settings = reader.reset_runtime_settings() 
    error = reader.init_runtime_settings_with_file(template_path, EnumConflictMode.CM_OVERWRITE)
    before = time.time()
    results = reader.decode_buffer(frame)
    after = time.time()
    COLOR_RED = (0,0,255)
    thickness = 2
    if results is not None:
        found = len(results)
        for result in results:
            text = result.barcode_text
            points = result.localization_result.localization_points
            # Outline the barcode using its four corner points
            data = np.array([[points[0][0], points[0][1]], [points[1][0], points[1][1]], [points[2][0], points[2][1]], [points[3][0], points[3][1]]])
            cv.drawContours(image=frame, contours=[data], contourIdx=-1, color=COLOR_RED, thickness=thickness, lineType=cv.LINE_AA)
            cv.putText(frame, text, points[0], cv.FONT_HERSHEY_SIMPLEX, 0.5, COLOR_RED)
        cv.putText(frame, '%.2f s, Qr found: %d' % (after - before, found), (20, 20), cv.FONT_HERSHEY_SIMPLEX, 0.5, COLOR_RED)
    else:
        cv.putText(frame, '%.2f s, Qr found: %d' % (after - before, 0), (20, 20), cv.FONT_HERSHEY_SIMPLEX, 0.5, COLOR_RED)
    cv.imshow(template_name, frame)
decode("test.jpg", "l1.json")
decode("test.jpg", "l2.json")
decode("test.jpg", "l3.json")
decode("test.jpg", "l4.json")
decode("test.jpg", "l5.json")
cv.waitKey(0)

Performance

In a multi-QR-code scenario, we want to find as many QR codes as possible. Although the l5 template found the most QR codes, its decoding time is unacceptably long. Can we speed up QR decoding if we know how many QR codes are in the image? Let's test this hypothesis by using machine learning to detect the QR codes first.
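To compare templates by numbers rather than by reading the overlay text in each window, the timings can be collected in a small table. The sketch below assumes a hypothetical callable `decode_fn(template_name)` that returns a list of results or `None`; it is not part of the Dynamsoft API, and both helper names are illustrative.

```python
import time


def benchmark(decode_fn, template_names):
    """Call decode_fn once per template and record elapsed seconds and result count.

    decode_fn is a hypothetical callable: it takes a template name and
    returns a list of decoding results, or None when nothing was found.
    """
    rows = []
    for name in template_names:
        start = time.perf_counter()
        results = decode_fn(name)
        elapsed = time.perf_counter() - start
        rows.append({"template": name, "seconds": elapsed, "found": len(results or [])})
    return rows


def pick_fastest(rows, expected_count):
    """Return the fastest template that found at least expected_count codes, or None."""
    good = [row for row in rows if row["found"] >= expected_count]
    if not good:
        return None
    return min(good, key=lambda row: row["seconds"])["template"]
```

With the rows in hand, `pick_fastest(rows, 6)` would name the cheapest template that still finds all six codes in the test image, which is the trade-off this section is about.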

Training a QR Code Detector with YOLOv4

We train a YOLO model to serve as the QR code detector.

1. Get the public QR code dataset from boofcv.
2. Annotate the QR images with labelImg.
3. Download yolov4-tiny.conv.29.
4. Customize the configuration file based on darknet/cfg/yolov4-tiny-custom.cfg:

batch=64              # line 6
subdivisions=16       # line 7
width=640             # line 8
height=640            # line 9

max_batches = 6000    # line 20

steps=4800,5400       # line 22

filters=18            # line 212
classes=1             # line 220

filters=18            # line 263
classes=1             # line 269
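The numbers in this configuration are consistent with the rules of thumb in the AlexeyAB darknet README for custom training. The filters value before each YOLO layer is (classes + 5) x 3, max_batches is classes x 2000 but not less than 6000 or the number of training images, and steps are 80% and 90% of max_batches. A small sketch of that arithmetic, with hypothetical helper names:

```python
def yolo_filters(num_classes, anchors_per_layer=3):
    """Filters in the convolutional layer right before each [yolo] layer."""
    # Each anchor predicts 4 box coordinates, 1 objectness score, and one score per class.
    return (num_classes + 5) * anchors_per_layer


def training_schedule(num_classes, num_train_images=0):
    """Return (max_batches, steps) following the darknet README guidance."""
    max_batches = max(num_classes * 2000, num_train_images, 6000)
    # Integer arithmetic keeps the 80% and 90% marks exact.
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    return max_batches, steps


print(yolo_filters(1))        # single QR_CODE class
print(training_schedule(1))
```

For the single QR_CODE class this gives filters=18, max_batches=6000, and steps=4800,5400, matching the values set in the configuration file.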

5. Create the obj.names file:

QR_CODE

6. Create the obj.data file:

classes = 1
train  = data/train.txt
valid  = data/test.txt
names = data/obj.names
backup = backup/
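The two metadata files can also be generated from a list of class names, which keeps the classes count and the names file in step. This is a minimal sketch: the function name `write_darknet_metadata` is hypothetical, and the relative paths written into obj.data are the ones used in this article.

```python
from pathlib import Path


def write_darknet_metadata(data_dir, class_names):
    """Write obj.names (one class per line) and obj.data into data_dir."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    # obj.names: one class label per line, in class-id order.
    (data_dir / "obj.names").write_text("\n".join(class_names) + "\n", encoding="utf-8")

    # obj.data: paths are relative to the directory darknet is run from.
    lines = [
        f"classes = {len(class_names)}",
        "train  = data/train.txt",
        "valid  = data/test.txt",
        "names = data/obj.names",
        "backup = backup/",
    ]
    (data_dir / "obj.data").write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Calling `write_darknet_metadata("data", ["QR_CODE"])` produces a one-line obj.names and an obj.data with classes = 1.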

7. Use the following script to generate the training and validation data:

import os
import re
from shutil import copyfile
import argparse
import math
import random


def iterate_dir(source, ratio):
    source = source.replace('\\', '/')
    train_dir = 'data/obj/train'
    test_dir = 'data/obj/test'

    os.makedirs(train_dir, exist_ok=True)
    os.makedirs(test_dir, exist_ok=True)

    # Collect image files; the extension is matched case-insensitively.
    # The flag is passed as an argument because an inline (?i) in the middle
    # of a pattern is rejected by recent Python versions.
    images = [f for f in os.listdir(source)
              if re.search(r'([a-zA-Z0-9\s_\\.\-\(\):])+\.(jpg|jpeg|png)$', f, re.IGNORECASE)]

    num_images = len(images)
    num_test_images = math.ceil(ratio*num_images)

    image_files = []

    # Randomly move the requested share of images (and their labels) to the test set
    for i in range(num_test_images):
        idx = random.randint(0, len(images)-1)
        filename = images[idx]
        image_files.append("data/obj/test/" + filename)
        copyfile(os.path.join(source, filename),
                 os.path.join(test_dir, filename))
        txt_filename = os.path.splitext(filename)[0]+'.txt'
        copyfile(os.path.join(source, txt_filename),
                 os.path.join(test_dir, txt_filename))

        images.remove(images[idx])

    with open("data/test.txt", "w") as outfile:
        for image in image_files:
            outfile.write(image)
            outfile.write("\n")

    image_files = []

    # Everything that remains goes to the training set
    for filename in images:
        image_files.append("data/obj/train/" + filename)
        copyfile(os.path.join(source, filename),
                 os.path.join(train_dir, filename))
        txt_filename = os.path.splitext(filename)[0]+'.txt'
        copyfile(os.path.join(source, txt_filename),
                 os.path.join(train_dir, txt_filename))

    with open("data/train.txt", "w") as outfile:
        for image in image_files:
            outfile.write(image)
            outfile.write("\n")


def main():
    parser = argparse.ArgumentParser(description="Partition dataset of images into training and testing sets",
                                     formatter_class=argparse.RawTextHelpFormatter)
    parser.add_argument(
        '-i', '--imageDir',
        help='Path to the folder where the image dataset is stored. If not specified, the CWD will be used.',
        type=str,
        default=os.getcwd()
    )
    parser.add_argument(
        '-r', '--ratio',
        help='The ratio of the number of test images over the total number of images. The default is 0.1.',
        default=0.1,
        type=float)
    args = parser.parse_args()
    iterate_dir(args.imageDir, args.ratio)


if __name__ == '__main__':
    main()

Run the script:

python partition_dataset.py -i ../images -r 0.1
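The partition script copies a .txt label file for every image, so it fails partway through if an annotation is missing. A quick check of the image folder beforehand can catch that. The sketch below assumes the annotations were saved from labelImg in YOLO format, where each line is `class x_center y_center width height` with coordinates normalized to the range 0 to 1; the helper name `check_labels` is hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def check_labels(image_dir, num_classes=1):
    """Return a list of human-readable problems found in YOLO label files."""
    problems = []
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        label = image.with_suffix(".txt")
        if not label.is_file():
            problems.append(f"{image.name}: missing label file")
            continue
        for number, line in enumerate(label.read_text().splitlines(), start=1):
            if not line.strip():
                continue  # ignore blank lines
            parts = line.split()
            try:
                if len(parts) != 5:
                    raise ValueError("expected 5 fields")
                class_id = int(parts[0])
                coords = [float(p) for p in parts[1:]]
            except ValueError:
                problems.append(f"{label.name}:{number}: malformed line")
                continue
            if not 0 <= class_id < num_classes:
                problems.append(f"{label.name}:{number}: class id out of range")
            if not all(0.0 <= c <= 1.0 for c in coords):
                problems.append(f"{label.name}:{number}: coordinate outside 0..1")
    return problems
```

An empty list from `check_labels("../images")` means every image has a label file and every label line has the expected shape.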

8. Train the model:

darknet detector train data/obj.data yolov4-tiny-custom.cfg yolov4-tiny.conv.29

9. Validate the model:

darknet detector test data/obj.data yolov4-tiny-custom.cfg backup/yolov4-tiny-custom_last.weights sample/test.png

In the following code, we first use the machine learning model to detect QR codes in the image. Once a QR code is found, we crop its region and set the parameters expected_barcodes_count = 1 and barcode_format_ids = EnumBarcodeFormat.BF_QR_CODE:

import cv2 as cv
import numpy as np
import time
from dbr import *
import os
# Initialize Dynamsoft Barcode Reader
reader = BarcodeReader()
# Apply for a trial license: https://www.dynamsoft.com/customer/license/trialLicense
license_key = "LICENSE-KEY"
reader.init_license(license_key)
# Load YOLOv4-tiny model
class_names = open('obj.names').read().strip().split('\n')
net = cv.dnn.readNetFromDarknet('yolov4-tiny-custom.cfg', 'yolov4-tiny-custom_last.weights')
net.setPreferableBackend(cv.dnn.DNN_BACKEND_OPENCV)
model = cv.dnn_DetectionModel(net)
width = 640
height = 640
CONFIDENCE_THRESHOLD = 0.2
NMS_THRESHOLD = 0.4
COLOR_RED = (0,0,255)
COLOR_BLUE = (255,0,0)
def decode(filename, template_name):
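    frame = cv.imread(filename)
    # Start from the module-level defaults; assigning to width/height directly
    # inside the function would make them unbound locals for small images
    input_width, input_height = width, height
    if frame.shape[1] > 1024 or frame.shape[0] > 1024:
        input_width = 1024
        input_height = 1024
    model.setInputParams(size=(input_width, input_height), scale=1/255, swapRB=True)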
    template_path = os.path.dirname(os.path.abspath(__file__)) + os.path.sep + template_name
    settings = reader.reset_runtime_settings() 
    error = reader.init_runtime_settings_with_file(template_path, EnumConflictMode.CM_OVERWRITE)
    # YOLO detection
    yolo_start = time.time()
    classes, scores, boxes = model.detect(frame, CONFIDENCE_THRESHOLD, NMS_THRESHOLD)
    yolo_end = time.time()
    print("YOLO detection time: %.2f s" % (yolo_end - yolo_start))
    index = 0
    dbr_found = 0
    total_dbr_time = 0
    for (classid, score, box) in zip(classes, scores, boxes):
        label = "%s : %f" % (class_names[classid], score) 
        tmp = frame[box[1]:box[1] + box[3], box[0]: box[0] + box[2]]
        # Set parameters for DBR
        settings = reader.get_runtime_settings()
        settings.expected_barcodes_count = 1
        settings.barcode_format_ids = EnumBarcodeFormat.BF_QR_CODE
        reader.update_runtime_settings(settings)
        before = time.time()
        results = reader.decode_buffer(tmp)
        after = time.time()
        total_dbr_time += after - before
        if results is not None:
            found = len(results)
            for result in results:
                text = result.barcode_text 
                dbr_found += 1
                points = result.localization_result.localization_points
                data = np.array([[points[0][0], points[0][1]], [points[1][0], points[1][1]], [points[2][0], points[2][1]], [points[3][0], points[3][1]]])
                cv.drawContours(image=tmp, contours=[data], contourIdx=-1, color=(0, 0, 255), thickness=2, lineType=cv.LINE_AA)
                cv.putText(frame, text, (box[0], box[1] + 10), cv.FONT_HERSHEY_SIMPLEX, 0.5, COLOR_RED)
        else:
            found = 0
        index += 1
        cv.rectangle(frame, box, COLOR_BLUE, 2)
        cv.putText(frame, label, (box[0], box[1] - 10), cv.FONT_HERSHEY_SIMPLEX, 0.5, COLOR_BLUE, 2)
    cv.putText(frame, 'DBR+YOLO %.2f s, DBR found: %d, YOLO found: %d' % (yolo_end - yolo_start + total_dbr_time, dbr_found, len(classes)), (0, 15), cv.FONT_HERSHEY_SIMPLEX, 0.5, COLOR_RED)
    cv.imshow(template_name, frame)
decode("test.jpg", "l1.json")
decode("test.jpg", "l2.json")
decode("test.jpg", "l3.json")
decode("test.jpg", "l4.json")
decode("test.jpg", "l5.json")
cv.waitKey(0)
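The script above crops each detection with the raw YOLO box. Two details may be worth handling in practice: a box that touches or crosses the image border, and a box so tight that it cuts into the quiet zone around the QR code. The sketch below pads and clamps an (x, y, w, h) box and maps points found inside the crop back to full-image coordinates. Both helper names are hypothetical, and the margin is an assumption to tune for your images.

```python
def pad_box(box, margin, frame_width, frame_height):
    """Grow an (x, y, w, h) box by margin pixels on each side, clamped to the frame."""
    x, y, w, h = box
    left = max(0, x - margin)
    top = max(0, y - margin)
    right = min(frame_width, x + w + margin)
    bottom = min(frame_height, y + h + margin)
    return left, top, right - left, bottom - top


def to_frame_coords(points, box):
    """Shift (x, y) points measured inside a crop back into full-image coordinates."""
    offset_x, offset_y = box[0], box[1]
    return [(px + offset_x, py + offset_y) for px, py in points]


# Example: a detection near the top-left corner of a 640x480 image.
padded = pad_box((5, 5, 100, 100), margin=10, frame_width=640, frame_height=480)
print(padded)
print(to_frame_coords([(0, 0), (10, 20)], padded))
```

The padded box can be used for the NumPy slice in place of the raw box, and `to_frame_coords` is useful if the decoded corner points need to be reported or drawn relative to the original image rather than the crop.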

Performance

L1 + YOLO

L2 + YOLO

L3 + YOLO

L4 + YOLO

L5 + YOLO

Without machine learning, finding all 6 QR codes in this test image requires the l5 template and takes about 22.83 seconds. With YOLO detection, the time drops to 3.99 seconds.
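Put as a ratio, those two timings correspond to a speedup of roughly 5.7x. A short calculation using only the figures quoted above (the function name is illustrative):

```python
def compare_timings(baseline_seconds, improved_seconds):
    """Return (speedup factor, seconds saved, percent reduction)."""
    speedup = baseline_seconds / improved_seconds
    saved = baseline_seconds - improved_seconds
    reduction_pct = saved / baseline_seconds * 100
    return speedup, saved, reduction_pct


speedup, saved, reduction_pct = compare_timings(22.83, 3.99)
print("speedup: %.2fx, saved: %.2f s, reduction: %.1f%%" % (speedup, saved, reduction_pct))
```

These numbers describe a single test image, so they indicate the size of the effect rather than a general benchmark.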

Benchmark on the boofcv QR Image Set

We benchmarked the boofcv QR image set with the l3 template, which is the most balanced one. Decoding speed improves significantly.

Source Code

https://github.com/yushulx/barcode-qrcode-images/tree/main/darknet/sample/qr_decoding