YOLOv8 get predicted bounding box
I want to integrate OpenCV with YOLOv8 from ultralytics, so I need to obtain the bounding box coordinates from the model's predictions. How do I do this?

from ultralytics import YOLO
import cv2

model = YOLO('yolov8n.pt')
cap = cv2.VideoCapture(0)
cap.set(3, 640)   # frame width
cap.set(4, 480)   # frame height

while True:
    _, frame = cap.read()
    
    img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    results = model.predict(img)

    for r in results:
        for c in r.boxes.cls:
            print(model.names[int(c)])

    cv2.imshow('YOLO V8 Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord(' '):
        break

cap.release()
cv2.destroyAllWindows()

I want to display the YOLO-annotated image in OpenCV. I know I can use the stream parameter with model.predict(source='0', show=True), but I want to continuously monitor the predicted class names in my program while displaying the image output at the same time.
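
Something like this is what I am after (just a sketch of the idea; I am assuming stream=True yields one Results object per frame):

from ultralytics import YOLO

model = YOLO('yolov8n.pt')

# assumption: stream=True returns a generator with one Results object per frame,
# and show=True lets ultralytics display its own annotated window
for r in model.predict(source=0, stream=True, show=True):
    # continuously monitor the predicted class names
    print([model.names[int(c)] for c in r.boxes.cls])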

Cb answered 2/2, 2023 at 14:6 Comment(0)
26

This will:

  1. Loop through each frame in the video
  2. Pass each frame to YOLOv8, which will generate bounding boxes
  3. Draw the bounding boxes on the frame using the built-in ultralytics Annotator:

from ultralytics import YOLO
import cv2
from ultralytics.utils.plotting import Annotator  # ultralytics.yolo.utils.plotting is deprecated

model = YOLO('yolov8n.pt')
cap = cv2.VideoCapture(0)
cap.set(3, 640)   # frame width
cap.set(4, 480)   # frame height

while True:
    _, img = cap.read()
    
    # BGR to RGB conversion is performed under the hood
    # see: https://github.com/ultralytics/ultralytics/issues/2575
    results = model.predict(img)

    for r in results:
        
        annotator = Annotator(img)
        
        boxes = r.boxes
        for box in boxes:
            
            b = box.xyxy[0]  # get box coordinates in (left, top, right, bottom) format
            c = box.cls
            annotator.box_label(b, model.names[int(c)])
          
    img = annotator.result()  
    cv2.imshow('YOLO V8 Detection', img)     
    if cv2.waitKey(1) & 0xFF == ord(' '):
        break

cap.release()
cv2.destroyAllWindows()
Sabaean answered 3/2, 2023 at 7:43 Comment(7)
thanks.. @Mike B do you know how to turn off the printed output from model.predict? - Cb
model.predict(img, verbose=False) @Cb - Sabaean
I think the BGR2RGB conversion is a bug and should not be done, see github.com/ultralytics/ultralytics/issues/2575 - Brannan
also, what about r.plot() or results.plot()? According to the documentation it should work, but it does not. - Brannan
"I think the BGR2RGB conversion is a bug and should not be done". Thank you for this clarification. Updating my answer! - Sabaean
The position of the box coordinates should be in this order: [left, top, right, bottom]. - Akene
Corrected comment. Thx @VARATBOHARA for pointing this out! - Sabaean
14

You can get all the information using the following code:

for result in results:
    # detection
    result.boxes.xyxy   # box with xyxy format, (N, 4)
    result.boxes.xywh   # box with xywh format, (N, 4)
    result.boxes.xyxyn  # box with xyxy format but normalized, (N, 4)
    result.boxes.xywhn  # box with xywh format but normalized, (N, 4)
    result.boxes.conf   # confidence score, (N, 1)
    result.boxes.cls    # cls, (N, 1)

    # segmentation
    result.masks.masks     # masks, (N, H, W)
    result.masks.segments  # bounding coordinates of masks, List[segment] * N

    # classification
    result.probs     # cls prob, (num_class, )
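
For example, to pull plain Python values out of the box tensors (a small sketch using the attributes above):

for result in results:
    boxes = result.boxes
    for xyxy, conf, cls in zip(boxes.xyxy, boxes.conf, boxes.cls):
        left, top, right, bottom = xyxy.tolist()  # pixel coordinates
        score = float(conf)                       # confidence score
        name = result.names[int(cls)]             # class name
        print(name, score, (left, top, right, bottom))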

You can read more in the documentation.

Jarry answered 2/2, 2023 at 14:28 Comment(0)
0

Here is a way to retrieve the coordinates. The boxes object holds torch tensors; the coordinates can be retrieved with torch.Tensor.tolist().

from ultralytics import YOLO
from matplotlib import pyplot as plt
from matplotlib.patches import Rectangle
import cv2

im1 = cv2.imread('/dir/im1.jpg')
im2 = cv2.imread('/dir/im2.jpg')
images = [im1, im2]

model = YOLO('yolov8n.pt')
results = model.predict(source=images)

fig, axs = plt.subplots(1, 2, figsize=(10, 6))
axs = axs.ravel()
plt.subplots_adjust(left=0.1, bottom=0.1,
                    right=0.9, top=0.9,
                    wspace=0.2, hspace=0.4)

fig.suptitle("images", fontsize=18, y=0.95)

for i, (r, image) in enumerate(zip(results, images)):

    c = r.boxes.xywh.tolist()[0]  # coordinates of the first detected box
    x, y, w, h = c                # x, y are the center coordinates

    axs[i].imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))  # BGR -> RGB for matplotlib
    axs[i].add_patch(Rectangle((x - w / 2, y - h / 2), w, h,
                               edgecolor='blue', facecolor='none',
                               lw=3))

plt.show()
Sensitize answered 11/11, 2023 at 16:46 Comment(0)
0

The coordinate order given for the box.xyxy approach above was originally not correct. The correct order of the box coordinates is [left, top, right, bottom]. A snippet of my code is pasted below.

def results(self, img, results):
    for result in results:
        for box in result.boxes:
            # box.xyxy is ordered [left, top, right, bottom]
            # (np.int is removed in recent NumPy versions, so use np.int32)
            left, top, right, bottom = np.array(box.xyxy.cpu(), dtype=np.int32).squeeze()
            width = right - left
            height = bottom - top
            center = (left + width // 2, top + height // 2)
            label = result.names[int(box.cls)]
            confidence = float(box.conf.cpu())

            cv2.rectangle(img, (left, top), (right, bottom), (255, 0, 0), 2)
            cv2.putText(img, label, (left, bottom + 20),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 1, cv2.LINE_AA)
    cv2.imshow('Filtered Frame', img)
    cv2.waitKey(0)
Akene answered 12/12, 2023 at 20:1 Comment(0)
0

The following is my way of getting the bounding box coordinates and using them to draw a rectangle with opencv-python.

for r in results:
    for box in r.boxes:
        coordinates = box.xyxy.tolist()[0]
        print(coordinates)  # prints the bounding box coordinates [left, top, right, bottom]

        left, top, right, bottom = coordinates

        cv2.rectangle(img, (int(left), int(top)), (int(right), int(bottom)), (255, 0, 0), 2)

cv2.imshow('window', img)
cv2.waitKey(0)
Lunt answered 29/6 at 6:45 Comment(0)
