[Computer Vision] Object Detection and Bounding Boxes

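Image classification assumes that an image contains a single main object and focuses on identifying its category.

Sometimes, however, we want to know not only what the object is but also where it is located in the image. This task is called object detection.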

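Object detection is used in many fields.

In autonomous driving, for example, it is used to locate pedestrians, roads, and obstacles. In security, it can be used to flag anomalous objects so that risks can be handled in advance.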

Before covering object detection itself in upcoming posts, this post introduces how the position of an object is represented.

%matplotlib inline
from d2l import torch as d2l
import torch

d2l.set_figsize()
img = d2l.plt.imread('../img/catdog.jpg')
d2l.plt.imshow(img);

Bounding Box

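In object detection, the position of a target is usually described by a bounding box.

A bounding box is a rectangle, and its position can be expressed in two ways.

1) The (x, y) coordinates of the top-left corner and the (x, y) coordinates of the bottom-right corner
2) The (x, y) coordinates of the center of the rectangle together with its width and height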

#@save
def box_corner_to_center(boxes):
    """Convert from (upper_left, bottom_right) to (center, width, height)"""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    cx = (x1 + x2) / 2
    cy = (y1 + y2) / 2
    w = x2 - x1
    h = y2 - y1
    boxes = torch.stack((cx, cy, w, h), axis=-1)
    return boxes

#@save
def box_center_to_corner(boxes):
    """Convert from (center, width, height) to (upper_left, bottom_right)"""
    cx, cy, w, h = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    x1 = cx - 0.5 * w
    y1 = cy - 0.5 * h
    x2 = cx + 0.5 * w
    y2 = cy + 0.5 * h
    boxes = torch.stack((x1, y1, x2, y2), axis=-1)
    return boxes
    
# bbox is the abbreviation for bounding box
dog_bbox, cat_bbox = [60.0, 45.0, 378.0, 516.0], [400.0, 112.0, 655.0, 493.0]

boxes = torch.tensor((dog_bbox, cat_bbox))
print(box_center_to_corner(box_corner_to_center(boxes)) - boxes)

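box_corner_to_center converts bounding boxes from 1) the corner representation to 2) the center representation,

and box_center_to_corner converts them from 2) the center representation back to 1) the corner representation.

Because the goal here is to explain the concept rather than to actually detect objects,

dog_bbox and cat_bbox are entered by hand in 1) the corner representation.

The two boxes are stacked into boxes, converted from 1) the corner representation to 2) the center representation,

and then converted back to 1) the corner representation. Subtracting the original values prints a tensor of zeros, which confirms that the round trip recovers the original boxes.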
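The same round trip can be worked through by hand for a single box. The following is a minimal pure-Python sketch, not the d2l implementation: `corner_to_center` and `center_to_corner` are hypothetical single-box counterparts of the tensor functions above, applied to the dog_bbox values from this post.

```python
import math


def corner_to_center(box):
    """(x1, y1, x2, y2) -> (cx, cy, w, h) for one box (hypothetical helper)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def center_to_corner(box):
    """(cx, cy, w, h) -> (x1, y1, x2, y2) for one box (hypothetical helper)."""
    cx, cy, w, h = box
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)


dog_bbox = (60.0, 45.0, 378.0, 516.0)  # corner format, as in the post
center = corner_to_center(dog_bbox)
restored = center_to_corner(center)
print(center)    # (219.0, 280.5, 318.0, 471.0)
print(restored)  # (60.0, 45.0, 378.0, 516.0)

# Compare with a tolerance instead of ==: these values round-trip exactly,
# but arbitrary floating-point boxes are not guaranteed to.
round_trip_ok = all(math.isclose(a, b) for a, b in zip(restored, dog_bbox))
print(round_trip_ok)  # True
```

For tensors of boxes, a tolerance-based comparison is likewise a safer habit than exact equality once the coordinates are no longer simple values like the ones used here.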

#@save
def bbox_to_rect(bbox, color):
    """Convert bounding box to matplotlib format."""
    # Convert the bounding box (top-left x, top-left y, bottom-right x,
    # bottom-right y) format to matplotlib format: ((upper-left x,
    # upper-left y), width, height)
    return d2l.plt.Rectangle(
        xy=(bbox[0], bbox[1]), width=bbox[2]-bbox[0], height=bbox[3]-bbox[1],
        fill=False, edgecolor=color, linewidth=2)

fig = d2l.plt.imshow(img)
fig.axes.add_patch(bbox_to_rect(dog_bbox, 'blue'))
fig.axes.add_patch(bbox_to_rect(cat_bbox, 'red'));

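bbox_to_rect converts a bounding box into a matplotlib Rectangle patch so that it can be drawn on the image.

matplotlib's Rectangle does not use either of the two representations directly. It expects the (x, y) coordinates of the top-left corner together with the width and height, so the function computes the width and height from 1) the corner representation.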
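A box stored in 2) the center representation needs one extra step before drawing, because `matplotlib.patches.Rectangle(xy, width, height)` is anchored at a corner rather than at the center. A minimal sketch, assuming a hypothetical helper `center_to_rect_args` that is not part of d2l or matplotlib:

```python
def center_to_rect_args(box):
    """Map (cx, cy, w, h) to ((x, y), width, height); hypothetical helper."""
    cx, cy, w, h = box
    # Shift from the center to the anchor corner (the top-left one in image
    # coordinates, where y grows downward); width and height are unchanged.
    return (cx - w / 2, cy - h / 2), w, h


dog_center = (219.0, 280.5, 318.0, 471.0)  # dog_bbox in center format
xy, width, height = center_to_rect_args(dog_center)
print(xy, width, height)  # (60.0, 45.0) 318.0 471.0
```

The returned values could then be passed on as `Rectangle(xy=xy, width=width, height=height, fill=False)`, matching what bbox_to_rect builds from the corner representation.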

ref. Dive into Deep Learning, Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola, 2020
