YOLOv8 get predicted bounding box
I want to integrate OpenCV with YOLOv8 from ultralytics, so I want to obtain the bounding box coordinates from the model prediction. How do I do this?

from ultralytics import YOLO
import cv2

model = YOLO('yolov8n.pt')
cap = cv2.VideoCapture(0)
cap.set(3, 640)
cap.set(4, 480)

while True:
    _, frame = cap.read()
    
    img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    results = model.predict(img)

    for r in results:
        for c in r.boxes.cls:
            print(model.names[int(c)])

    cv2.imshow('YOLO V8 Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord(' '):
        break

cap.release()
cv2.destroyAllWindows()

I want to display the YOLO annotated image in OpenCV. I know I can use the stream parameter in model.predict(source='0', show=True). But I want to continuously monitor the predicted class names for my program, at the same time displaying the image output.

Cb answered 2/2, 2023 at 14:6 Comment(0)
S
26

This will:

  1. Loop through each frame in the video
  2. Pass each frame to YOLOv8, which will generate bounding boxes
  3. Draw the bounding boxes on the frame using Ultralytics' built-in Annotator:

from ultralytics import YOLO
import cv2
from ultralytics.utils.plotting import Annotator  # ultralytics.yolo.utils.plotting is deprecated

model = YOLO('yolov8n.pt')
cap = cv2.VideoCapture(0)
cap.set(3, 640)
cap.set(4, 480)

while True:
    ok, img = cap.read()
    if not ok:  # no frame was read (camera unavailable or stream ended)
        break

    # BGR to RGB conversion is performed under the hood
    # see: https://github.com/ultralytics/ultralytics/issues/2575
    results = model.predict(img)

    for r in results:

        annotator = Annotator(img)

        boxes = r.boxes
        for box in boxes:

            b = box.xyxy[0]  # get box coordinates in (left, top, right, bottom) format
            c = box.cls
            annotator.box_label(b, model.names[int(c)])

    img = annotator.result()
    cv2.imshow('YOLO V8 Detection', img)
    if cv2.waitKey(1) & 0xFF == ord(' '):
        break

cap.release()
cv2.destroyAllWindows()
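
To keep monitoring the predicted class names while the annotated frame is displayed, the class ids in r.boxes.cls can be mapped through model.names inside the same loop. Below is a minimal sketch of that mapping, assuming the ids have already been converted to a plain list (for example with r.boxes.cls.tolist()). The helper summarize_classes and the sample values are hypothetical and not part of ultralytics:

```python
from collections import Counter


def summarize_classes(cls_ids, names):
    """Count detections per class name.

    cls_ids: iterable of class ids (ints or floats), e.g. r.boxes.cls.tolist()
    names:   dict mapping class id -> class name, e.g. model.names
    """
    return Counter(names[int(c)] for c in cls_ids)


# Stand-in values for illustration only; inside the loop above they would
# come from r.boxes.cls.tolist() and model.names.
sample_names = {0: "person", 16: "dog"}
sample_ids = [0.0, 16.0, 0.0]
counts = summarize_classes(sample_ids, sample_names)
print(dict(counts))
```

Calling this once per frame gives a per-class count the rest of the program can react to, without touching the drawing code.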
Sabaean answered 3/2, 2023 at 7:43 Comment(7)
thanks.. @Mike B do you know how to turn off the printed output from model.predict? - Cb
model.predict(img, verbose=False) @Cb - Sabaean
I think the BGR2RGB conversion is a bug and should not be done, see github.com/ultralytics/ultralytics/issues/2575 - Brannan
also, what about r.plot() or results.plot() ? According to the documentation it should work, but it does not. - Brannan
"I think the BGR2RGB conversion is a bug and should not be done". Thank you for this clarification. Updating my answer! - Sabaean
The position of the box coordinates should be in this order: [left, top, right, bottom]. - Akene
Corrected comment. Thx @VARATBOHARA for pointing this out! - Sabaean
J
14

You can get all the information using the following code:

for result in results:
    # detection
    result.boxes.xyxy   # boxes in xyxy format, (N, 4)
    result.boxes.xywh   # boxes in xywh format (x, y are the box center), (N, 4)
    result.boxes.xyxyn  # xyxy format, normalized by image size, (N, 4)
    result.boxes.xywhn  # xywh format, normalized by image size, (N, 4)
    result.boxes.conf   # confidence scores, (N,)
    result.boxes.cls    # class ids, (N,)

    # segmentation (result.masks is None unless a segmentation model is used)
    result.masks.data  # masks, (N, H, W); called result.masks.masks in early releases
    result.masks.xyn   # normalized mask outlines, one array per object; formerly result.masks.segments

    # classification (result.probs is None unless a classification model is used)
    result.probs     # class probabilities

You can read more in the documentation.
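
The box formats listed above are related by simple arithmetic. A minimal sketch of the conversions in plain Python, assuming xywh uses the box center for x and y and that the normalized variants divide by the image width and height; the function names are hypothetical and only illustrate the relationship, they are not ultralytics APIs:

```python
def xyxy_to_xywh(box):
    """(left, top, right, bottom) -> (center_x, center_y, width, height)."""
    left, top, right, bottom = box
    width = right - left
    height = bottom - top
    return (left + width / 2, top + height / 2, width, height)


def xywh_to_xyxy(box):
    """(center_x, center_y, width, height) -> (left, top, right, bottom)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def normalize_xyxy(box, img_width, img_height):
    """Scale pixel xyxy coordinates to the 0..1 range."""
    left, top, right, bottom = box
    return (left / img_width, top / img_height,
            right / img_width, bottom / img_height)


# A made-up box inside a 640x480 frame
box = (100, 50, 300, 250)
print(xyxy_to_xywh(box))
print(normalize_xyxy(box, 640, 480))
```

In practice the attributes on result.boxes already provide all four formats, so conversions like these are mainly useful when boxes come from another source.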

Jarry answered 2/2, 2023 at 14:28 Comment(0)
S
0

Here is a way to retrieve the coordinates. The boxes object holds torch tensors, so the coordinates can be converted to a plain Python list with torch.Tensor.tolist.

import cv2
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from ultralytics import YOLO

images = ['im1.jpg', 'im2.jpg']
ims = [cv2.imread('/dir/' + name) for name in images]

model = YOLO('yolov8n.pt')
results = model.predict(source=ims)

fig, axs = plt.subplots(1, 2, figsize=(10, 6))
axs = axs.ravel()
plt.subplots_adjust(left=0.1, bottom=0.1,
                    right=0.9, top=0.9,
                    wspace=0.2, hspace=0.4)

fig.suptitle("images", fontsize=18, y=0.95)

for i, (r, im) in enumerate(zip(results, ims)):

    # OpenCV loads images as BGR; matplotlib expects RGB
    axs[i].imshow(cv2.cvtColor(im, cv2.COLOR_BGR2RGB))

    if len(r.boxes) == 0:  # nothing detected in this image
        continue

    c = r.boxes.xywh.tolist()[0]  # coordinates of the first detected box
    x, y, w, h = c  # x, y are the center coordinates

    axs[i].add_patch(Rectangle((x - w / 2, y - h / 2), w, h,
                     edgecolor='blue', facecolor='none',
                     lw=3))

plt.show()
Sensitize answered 11/11, 2023 at 16:46 Comment(0)
A
0

The ordering originally given for box.xyxy in the answer above was not correct. The correct order of the box coordinates is [left, top, right, bottom]. A snippet of my code is pasted below.

import cv2

def results(self, img, results):
    for result in results:
        for box in result.boxes:
            # box.xyxy has shape (1, 4): [left, top, right, bottom]
            # (np.int was removed in NumPy 1.24, so plain int is used instead)
            left, top, right, bottom = (int(v) for v in box.xyxy[0].tolist())
            width = right - left
            height = bottom - top
            center = (left + width // 2, top + height // 2)
            label = result.names[int(box.cls)]
            confidence = float(box.conf)

            cv2.rectangle(img, (left, top), (right, bottom), (255, 0, 0), 2)

            cv2.putText(img, label, (left, bottom + 20), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 1, cv2.LINE_AA)
    cv2.imshow('Filtered Frame', img)
    cv2.waitKey(0)
Akene answered 12/12, 2023 at 20:1 Comment(0)
L
0

Here is how I get the bounding box coordinates and use them to draw a rectangle with opencv-python.

for r in results:
    for box in r.boxes:
        coordinates = box.xyxy.tolist()[0]
        print(coordinates)  # the list of bounding box coordinates

        left, top, right, bottom = coordinates
        cv2.rectangle(img, (int(left), int(top)), (int(right), int(bottom)), (255, 0, 0), 2)

# show the image once all boxes are drawn; follow with cv2.waitKey(...) so the window is rendered
cv2.imshow('window', img)
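
The coordinates returned by box.xyxy.tolist() are floats, while cv2.rectangle expects integer points. The sketch below is one way to prepare them: it rounds each value and, as an extra precaution that is an assumption on my part rather than something the snippet above requires, clamps the result to the image bounds. The helper name to_pixel_box is hypothetical:

```python
def to_pixel_box(coords, img_width, img_height):
    """Round float xyxy coordinates to ints and clamp them to the image.

    Returns two points, (left, top) and (right, bottom), in the form
    cv2.rectangle takes for its pt1 and pt2 arguments.
    """
    left, top, right, bottom = (int(round(v)) for v in coords)
    left = max(0, min(left, img_width - 1))
    right = max(0, min(right, img_width - 1))
    top = max(0, min(top, img_height - 1))
    bottom = max(0, min(bottom, img_height - 1))
    return (left, top), (right, bottom)


# Made-up coordinates that spill slightly outside a 640x480 frame
pt1, pt2 = to_pixel_box([-3.2, 10.6, 700.4, 479.9], 640, 480)
print(pt1, pt2)
```

The two points can then be passed straight to cv2.rectangle(img, pt1, pt2, (255, 0, 0), 2).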
Lunt answered 29/6, 2024 at 6:45 Comment(0)
