Generate anchor boxes

Methods to generate anchor boxes of different aspect ratios.

Generating anchor boxes requires three basic pieces of information: the image size, the feature map size, and the aspect ratio.



bx

 bx (image_sz:(int, tuple), feature_sz:(int, tuple), asp_ratio:float=None,
     clip:bool=True, named:bool=True, anchor_sfx:str='a',
     min_visibility:float=0.25)

Calculate anchor box coords given an image size and feature size for a single aspect ratio.

| Parameter | Type | Default | Details |
|---|---|---|---|
| image_sz | (int, tuple) | | image size (width, height) |
| feature_sz | (int, tuple) | | feature map size (width, height) |
| asp_ratio | float | None | aspect ratio (width:height) |
| clip | bool | True | whether to apply np.clip |
| named | bool | True | whether to return (coords, labels) |
| anchor_sfx | str | a | suffix added to each anchor label |
| min_visibility | float | 0.25 | minimum visibility required for a box to be considered valid. The value corresponds to the ratio of the expected area of an anchor box to the calculated area after clipping to the image dimensions. |
| Returns | numpy.typing.ArrayLike | | anchor box coordinates in pascal_voc format; if named=True, a list of anchor box labels is also returned |

coords_1, labels_1 = bx(100, 10, 0.5)
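
To make the parameters above concrete, here is a minimal NumPy sketch of how anchor boxes for a single aspect ratio could be computed. It is not pybx's implementation: the function name `sketch_anchor_boxes` is hypothetical, the area-preserving width/height scaling is an assumption, and `min_visibility` is interpreted here as the clipped area divided by the expected area.

```python
import numpy as np

def sketch_anchor_boxes(image_sz, feature_sz, asp_ratio=1.0,
                        clip=True, min_visibility=0.25):
    """Hypothetical helper (not part of pybx): one box per feature cell."""
    # Accept an int (square) or a (width, height) tuple for both sizes.
    img_w, img_h = (image_sz, image_sz) if isinstance(image_sz, int) else image_sz
    f_w, f_h = (feature_sz, feature_sz) if isinstance(feature_sz, int) else feature_sz

    # Size of one feature cell projected back onto the image.
    cell_w, cell_h = img_w / f_w, img_h / f_h

    # Centre of every feature cell in image coordinates.
    cx, cy = np.meshgrid((np.arange(f_w) + 0.5) * cell_w,
                         (np.arange(f_h) + 0.5) * cell_h)
    cx, cy = cx.ravel(), cy.ravel()

    # Assumption: keep the cell area and set width:height to asp_ratio.
    box_w = cell_w * np.sqrt(asp_ratio)
    box_h = cell_h / np.sqrt(asp_ratio)

    # pascal_voc format: (x_min, y_min, x_max, y_max).
    coords = np.stack([cx - box_w / 2, cy - box_h / 2,
                       cx + box_w / 2, cy + box_h / 2], axis=1)

    if clip:
        expected_area = box_w * box_h
        coords[:, [0, 2]] = np.clip(coords[:, [0, 2]], 0, img_w)
        coords[:, [1, 3]] = np.clip(coords[:, [1, 3]], 0, img_h)
        clipped_area = (coords[:, 2] - coords[:, 0]) * (coords[:, 3] - coords[:, 1])
        # Assumed interpretation: drop boxes that lost too much area to clipping.
        coords = coords[clipped_area / expected_area >= min_visibility]
    return coords

sketch_coords = sketch_anchor_boxes(100, 10, 0.5)
print(sketch_coords.shape)
```

With these assumptions, every box in a 10x10 grid on a 100x100 image keeps more than a quarter of its area after clipping, so none is filtered out.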

Multiscale object detection usually needs anchor boxes for several feature_sz and asp_ratio values.

In multiscale object detection, feature maps from different convolution operations of the network are traced back to the input image to generate anchor boxes. The bxs method of pybx supports this.



bxs

 bxs (image_sz:(int, tuple), feature_szs:list=None,
      asp_ratios:list=None, named:bool=True, **kwargs)

Calculate anchor box coords given an image size and multiple feature sizes for multiple aspect ratios.

| Parameter | Type | Default | Details |
|---|---|---|---|
| image_sz | (int, tuple) | | image size (width, height) |
| feature_szs | list | None | list of feature map sizes, each an int or a tuple; [(8, 8), (2, 2)] is used when None |
| asp_ratios | list | None | list of aspect ratios for anchor boxes, each a float calculated as width:height; [1 / 2.0, 1.0, 2.0] is used when None |
| named | bool | True | whether to return (coords, labels) |
| kwargs | | | |
| Returns | numpy.typing.ArrayLike | | anchor box coordinates in pascal_voc format; if named=True, a list of anchor box labels is also returned |

coords, labels = bxs(100, [10, 8, 5, 2], [1, 0.5, 0.3])
coords.shape, len(labels)
((579, 4), 579)
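
The idea of combining several feature sizes and aspect ratios can be sketched as a loop that builds one grid of boxes per (feature size, aspect ratio) pair and stacks the results. This is only an illustration, not how pybx implements bxs: `grid_boxes` and `multiscale_boxes` are hypothetical names, the area-preserving scaling is an assumption, and no clipping or visibility filtering is applied.

```python
import numpy as np

def grid_boxes(image_sz, feature_sz, asp_ratio):
    """Hypothetical single-scale helper: one unclipped box per feature cell."""
    cell = image_sz / feature_sz
    centres = (np.arange(feature_sz) + 0.5) * cell
    cx, cy = np.meshgrid(centres, centres)
    # Assumption: keep the cell area and set width:height to asp_ratio.
    w, h = cell * np.sqrt(asp_ratio), cell / np.sqrt(asp_ratio)
    # pascal_voc format: (x_min, y_min, x_max, y_max).
    boxes = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=-1)
    return boxes.reshape(-1, 4)

def multiscale_boxes(image_sz, feature_szs, asp_ratios):
    """Hypothetical helper: stack the boxes of every scale and aspect ratio."""
    boxes = [grid_boxes(image_sz, f, a) for f in feature_szs for a in asp_ratios]
    return np.concatenate(boxes, axis=0)

sketch_boxes = multiscale_boxes(100, [10, 8, 5, 2], [1, 0.5, 0.3])
print(sketch_boxes.shape)  # (579, 4)
```

Without any filtering, the number of boxes is the sum of the squared feature sizes times the number of aspect ratios: (100 + 64 + 25 + 4) x 3 = 579.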

All methods also work with an asymmetric image_sz and/or asymmetric feature_szs:

coords, labels = bxs((100, 200), [10, 8, 5, 2], [1, 0.5, 0.3])
coords.shape, len(labels)
((654, 4), 654)
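
One plausible way to support both int and tuple sizes is to expand every size to a (width, height) pair before computing the cell size per axis. The sketch below illustrates that idea only: `as_pair` and `cell_size` are hypothetical helpers, not pybx functions, and the sketch does not attempt to reproduce the box counts shown above.

```python
def as_pair(sz):
    """Hypothetical helper: expand an int to (width, height); pass tuples through."""
    return (sz, sz) if isinstance(sz, int) else tuple(sz)

def cell_size(image_sz, feature_sz):
    """Width and height of one feature cell projected onto the image."""
    img_w, img_h = as_pair(image_sz)
    f_w, f_h = as_pair(feature_sz)
    return img_w / f_w, img_h / f_h

print(cell_size((100, 200), 10))       # (10.0, 20.0)
print(cell_size((100, 200), (10, 5)))  # (10.0, 40.0)
```

Under this assumption, a square feature map on a non-square image yields non-square cells, so anchor boxes stretch along the longer image axis.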