How to use Oriented Bounding Boxes (OBB) with YOLO

Roboflow

  1. Create a dataset at https://roboflow.com/
  2. Use the ‘Smart Polygon’ option to annotate the images, so that the orientation can be extracted later in the process.
  3. Export the dataset. Format: YOLOv8 Oriented Bounding Boxes. Download the zip to your computer.
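
For reference, each image in a YOLOv8 OBB export gets a .txt label file in which every line lists a class index followed by the four corner points of the box, normalized to the image size (the numbers below are made up for illustration):

class_index x1 y1 x2 y2 x3 y3 x4 y4
0 0.38 0.22 0.61 0.25 0.58 0.44 0.35 0.41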

Ultralytics HUB
4. Go to https://hub.ultralytics.com/
5. Upload the dataset you downloaded from Roboflow and select the task: OBB
Warning: At the bottom it says ‘see sample dataset’. Your dataset from Roboflow probably does not look like that, so you have to unzip your zip folder first, modify it, and zip it again. The names must match exactly! (A sketch of the expected layout follows after these steps.)
6. Go to the dataset and click on ‘Train model’
7. Follow these steps and use Google Colab for training
Warning: If you get an error message because a particular folder cannot be found, your dataset probably does not have the correct layout. Go back to step 5 and upload a newly modified dataset.
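
As a rough guide, the zipped dataset generally contains a YAML file plus image and label folders, something like the sketch below. Check the sample dataset on the upload page for the exact folder and file names, since those must match (the dataset name here is a placeholder):

my-dataset.zip
  data.yaml
  train/
    images/
    labels/
  valid/
    images/
    labels/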

Google Colab
8. After you have trained your model, it will show up on the left under ‘Files’. Go to runs/obb/train/weights and download ‘best.pt’. That is your trained model!
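
If you cannot find the file browser, you can also download the weights directly from a Colab cell (a minimal sketch; the path assumes the default run directory from the step above):

from google.colab import files

# Download the trained weights to your computer
files.download('runs/obb/train/weights/best.pt')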

Python code
9. Run the following code with your trained model

import cv2
import numpy as np
import pyrealsense2 as rs
from ultralytics import YOLO

# Connect to the camera
pipeline = rs.pipeline()

# Configure streams
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)

# Align depth camera to color camera
align_to = rs.stream.color
align = rs.align(align_to)

# Load YOLO model (point this at the best.pt you downloaded in step 8)
model = YOLO('YOUR-MODEL.pt', task='obb')

# Tracking settings (example values; tune them for your model)
PERSIST = True              # keep track IDs across frames
CONFIDENCE_THRESHOLD = 0.5  # minimum detection confidence
IMAGE_SIZE = 640            # inference image size

# Start camera stream
pipeline.start(config)

while True:
    # Wait until the streaming frames are available
    frames = pipeline.wait_for_frames()

    # Align the depth frame to color frame
    aligned_frames = align.process(frames)

    # Get aligned frames
    aligned_depth_frame = aligned_frames.get_depth_frame() # aligned_depth_frame is a 640x480 depth image
    aligned_color_frame = aligned_frames.get_color_frame()

    # Convert the frame data to numpy arrays
    depth_image = np.asanyarray(aligned_depth_frame.get_data())
    color_image = np.asanyarray(aligned_color_frame.get_data())

    # Convert the depth image to a colormap for display (tune alpha for your depth range)
    depth_color_image = cv2.applyColorMap(cv2.convertScaleAbs(depth_image, alpha=0.5),
                                          cv2.COLORMAP_JET)

    # Run YOLO OBB tracking on the color image and draw the detections
    results = model.track(color_image, persist=PERSIST, conf=CONFIDENCE_THRESHOLD, imgsz=IMAGE_SIZE)
    yolo_image = results[0].plot()

    cv2.imshow('frame', yolo_image)
    cv2.imshow('depth', depth_color_image)

    if cv2.waitKey(1) == 27:  # press Esc to stop
        break

pipeline.stop()
cv2.destroyAllWindows()
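
If you do not have a RealSense camera at hand, the same detection loop also works with a regular webcam (a minimal sketch, assuming the same trained model file; there is no depth stream in this case):

import cv2
from ultralytics import YOLO

# Load the trained OBB model (same best.pt as above)
model = YOLO('YOUR-MODEL.pt', task='obb')

# Open the default webcam
cap = cv2.VideoCapture(0)

while True:
    ok, frame = cap.read()
    if not ok:
        break

    # Run tracking and draw the oriented boxes
    results = model.track(frame, persist=True, conf=0.5, imgsz=640)
    cv2.imshow('frame', results[0].plot())

    if cv2.waitKey(1) == 27:  # press Esc to stop
        break

cap.release()
cv2.destroyAllWindows()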
10. To extract the orientation and the center of the object, add the following function to the script above:
import math  # add this to the imports at the top of the script

# Get the center position and rotation of the bounding boxes
def Boundingbox(results, yolo_image):
    pixel_values = []

    for result in results:
        obb = result.obb.cpu().numpy()
        results_xywhrs = obb.xywhr

        for results_xywhr in results_xywhrs:
            X_middle = int(results_xywhr[0])
            Y_middle = int(results_xywhr[1])

            # Convert the rotation from radians to degrees
            rotation = int(results_xywhr[4] * 180 / math.pi)

            # Mark the center of the box on the image
            cv2.circle(yolo_image, (X_middle, Y_middle), 5, (0, 0, 255), -1)
            pixel_values.append([X_middle, Y_middle, rotation])

    return pixel_values

# Call this inside the while loop, after model.track():
pixel_value = Boundingbox(results, yolo_image)
print(pixel_value)
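
Because the depth frame is already aligned to the color frame, you can also look up the distance at each detected center point. Below is a minimal sketch of how this could look inside the while loop, after calling Boundingbox(); the variable names match the script above, and rs2_deproject_pixel_to_point returns the 3D point in camera coordinates:

# Camera intrinsics of the aligned color stream
intrinsics = aligned_color_frame.profile.as_video_stream_profile().intrinsics

for X_middle, Y_middle, rotation in pixel_value:
    # Distance (in meters) at the center of the bounding box
    depth = aligned_depth_frame.get_distance(X_middle, Y_middle)

    # Deproject the pixel to a 3D point (x, y, z) in camera coordinates
    point_3d = rs.rs2_deproject_pixel_to_point(intrinsics, [X_middle, Y_middle], depth)
    print(rotation, point_3d)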