\section{Ball Detection}
\label{p sec ball detection}
The very first task that needed to be accomplished was to detect the ball,
which is uniformly red-colored and measures about 6 cm in diameter. We decided
to use a popular algorithm based on color segmentation \cite{ball-detect}. The
idea behind this algorithm is to find the biggest red area in the image and
assume that this is the ball. First, the desired color needs to be defined as
an interval of HSV (Hue-Saturation-Value) \cite{hsv} values. After that, the
image itself needs to be transformed into HSV colorspace, so that the regions
of interest can be extracted into a \textit{binary mask}. The contours of the
regions can then be identified in the mask \cite{contours}, and the areas of the
regions can be calculated using routines from the OpenCV library. The
center and the radius of the region with the largest area are then determined
and are assumed to be the center and the radius of the ball.
\begin{figure}[ht]
\includegraphics[width=\textwidth]{\fig ball-detection}
\caption{Ball detection. On the right is the binary mask}
\label{p figure ball-detection}
\end{figure}
It is sometimes recommended \cite{ball-detect} to eliminate the noise in the
binary mask by applying a sequence of \textit{erosions} and \textit{dilations},
but we found that for the task of finding the \textit{biggest} area the noise
doesn't present a problem, whereas performing erosions may completely erase the
image of the ball from the mask if it is relatively far from the robot and the
camera resolution is low. For this reason we decided not to process the binary
mask with erosions and dilations, which allowed us to detect the ball even over
long distances.
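For illustration, the whole pipeline can be sketched in a few lines of Python
with OpenCV. This is only a minimal sketch of the approach described above: the
HSV bounds are placeholder values that have to be replaced by the calibrated
ones, and the OpenCV~4 signature of \texttt{findContours} is assumed.
\begin{verbatim}
import cv2
import numpy as np

# Placeholder HSV bounds for the red ball; the real values come from
# color calibration (red may also need a second interval near hue 180).
RED_LOW = np.array([0, 120, 70])
RED_HIGH = np.array([10, 255, 255])

def detect_ball(image_bgr):
    """Return ((x, y), radius) of the largest red region, or None."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, RED_LOW, RED_HIGH)
    # No erosions/dilations here, so a distant (small) ball is not
    # wiped out of the mask.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    center, radius = cv2.minEnclosingCircle(largest)
    return center, radius
\end{verbatim}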
The advantages of the presented algorithm are its speed and simplicity. The
major downside is that careful color calibration is required for the algorithm
to function properly. If the HSV interval of the targeted color is too narrow,
the algorithm might miss the ball; if the interval is too wide, other big
red-shaded objects in the camera image will be detected as the ball. A possible
approach to alleviate these issues to a certain degree will be presented in
Section \ref{p sec field detect}. To conclude, we found this algorithm to be
robust enough for our purposes, provided that a sensible color calibration is
supplied.
\section{Goal Detection}
\label{p sec goal detect}
Goal detection proved to be a more difficult task. The color of the goal is
white, and there are generally many white areas in the image from the robot's
camera whose area is larger than that of the goal, for example the white field
lines and the big white wall in the room with the field. To deal with the
multitude of possible goal candidates, we propose the following heuristic
algorithm.
\begin{figure}[ht]
\includegraphics[width=\textwidth]{\fig goal-detection}
\caption{Goal detection}
\label{p figure goal-detection}
\end{figure}
First, all contours around white areas are extracted using a procedure similar
to the one described in Section \ref{p sec ball detection}. Unlike in the ball
detection, the resulting binary mask undergoes slight erosions and dilations,
since noise is undesirable when detecting the goal shape. Next, the
\textit{candidate preselection} takes place. During this stage only the $N$
contours with the largest areas are considered further (in our experiments we
empirically determined that $N=5$ provides good results). Furthermore, all
convex contours are rejected, since the goal is a highly non-convex shape.
After that, we check how many line segments are necessary to approximate each
remaining contour. The motivation is the following: the goal shape can be
perfectly approximated by a polyline with 8 straight segments. On an image from
the camera, the approximation is almost perfect when using only 6 line
segments, and in some degenerate cases, when the input image is noisy, it might
be necessary to use 9 line segments to approximate the shape of the goal. Any
contour that requires a different number of line segments to be approximated is
probably not the goal. The preselection stage ends here, and the remaining
candidates are passed to the scoring function.
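For illustration, the preselection stage could be sketched as follows in Python
with OpenCV. The approximation tolerance passed to \texttt{approxPolyDP} and
the exact set of accepted segment counts are assumptions made for this sketch;
convexity is checked on the simplified polygon rather than on the raw contour.
\begin{verbatim}
import cv2

N_CANDIDATES = 5                  # empirically determined preselection size
ALLOWED_SEGMENTS = range(6, 10)   # assumed acceptable segment counts

def preselect_goal_candidates(contours):
    """Keep only contours that could plausibly be the goal."""
    # Consider only the N contours with the largest areas.
    largest = sorted(contours, key=cv2.contourArea,
                     reverse=True)[:N_CANDIDATES]
    candidates = []
    for contour in largest:
        # Approximate the contour with a closed polyline; the tolerance
        # (1% of the perimeter) is a hypothetical value.
        epsilon = 0.01 * cv2.arcLength(contour, True)
        approx = cv2.approxPolyDP(contour, epsilon, True)
        # The goal is a highly non-convex shape, so convex candidates
        # are rejected.
        if cv2.isContourConvex(approx):
            continue
        # Reject contours that need an implausible number of segments.
        if len(approx) in ALLOWED_SEGMENTS:
            candidates.append(contour)
    return candidates
\end{verbatim}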
The scoring function calculates how much the properties of a candidate differ
from the properties that an idealized goal contour is expected to have. The
evaluation is based on two observations. The first is that the area of the goal
contour is much smaller than the area of its \textit{enclosing convex hull}
\cite{convex-hull}. The second is that all points of the goal contour must lie
close to the enclosing convex hull. A corresponding scoring function can then
be formulated as follows:
\begin{equation*}
S(c) = \frac{A(c)}{A(\operatorname{Hull}(c))}
     + \sum_{x_i \in c} \min_{h \in \operatorname{Hull}(c)} \lVert x_i - h \rVert
\end{equation*}
The contour that minimizes the scoring function while keeping its value under
a certain threshold is considered to be the goal. If no contour scores below
the threshold, the algorithm assumes that no goal was found. An important note
is that the algorithm is designed in such a way that preselection and scoring
are modular, which means that the current simple scoring function can later be
replaced by a function with a better heuristic, or even by a function that
employs machine learning models.
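A direct translation of this scoring function might look like the sketch below;
the threshold is a hypothetical placeholder, and \texttt{pointPolygonTest} is
used to obtain the distance from a contour point to the convex hull.
\begin{verbatim}
import cv2

SCORE_THRESHOLD = 50.0  # hypothetical threshold, to be tuned experimentally

def goal_score(contour):
    """Score a candidate contour; lower values are more goal-like."""
    hull = cv2.convexHull(contour)
    hull_area = cv2.contourArea(hull)
    if hull_area == 0:
        return float("inf")
    # First term: ratio of the contour area to the area of its convex hull.
    area_ratio = cv2.contourArea(contour) / hull_area
    # Second term: sum of distances from every contour point to the hull.
    # pointPolygonTest returns a signed distance, hence the abs().
    points = contour.reshape(-1, 2)
    distance_sum = sum(
        abs(cv2.pointPolygonTest(hull, (float(x), float(y)), True))
        for x, y in points
    )
    return area_ratio + distance_sum

def pick_goal(candidates):
    """Return the best-scoring candidate below the threshold, or None."""
    scored = [(goal_score(c), c) for c in candidates]
    if not scored:
        return None
    best_score, best = min(scored, key=lambda sc: sc[0])
    return best if best_score <= SCORE_THRESHOLD else None
\end{verbatim}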
Our tests have shown that, when the white color is calibrated correctly, the
algorithm can detect the goal almost without mistakes whenever the goal is
present in the image. Most irrelevant candidates are normally discarded in the
preselection stage, and the scoring function improves the robustness further.
Figure \ref{p figure goal-detection} demonstrates the algorithm in action. On
the right is the binary mask with all found contours. On the left are the
detected goal and one contour that passed preselection but was rejected during
scoring. One downside of this algorithm is that in some cases the field lines
might appear to have the same properties that the goal contour is expected to
have, and can therefore be mistaken for the goal. We describe how we dealt with
this problem in Section \ref{p sec field detect}.
\section{Field Detection}
\label{p sec field detect}
The algorithm for field detection is very similar to the ball detection
algorithm, but some concepts introduced in Section \ref{p sec goal detect} are
also used here. The algorithm extracts the biggest green area in the image,
finds its enclosing convex hull, and assumes everything inside the hull to be
the field. When extracting the field, we apply strong Gaussian blurring and a
combination of erosions and dilations to the binary mask, so that the objects
lying on the field are properly absorbed into the field region.
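A possible sketch of this extraction in Python with OpenCV is given below; the
green HSV bounds, the blur size, and the structuring element are placeholder
values that would have to be calibrated.
\begin{verbatim}
import cv2
import numpy as np

# Placeholder HSV bounds for the green field; calibration-dependent.
GREEN_LOW = np.array([35, 60, 60])
GREEN_HIGH = np.array([85, 255, 255])
KERNEL = np.ones((9, 9), np.uint8)  # hypothetical structuring element

def detect_field(image_bgr):
    """Return a mask of the field: the convex hull of the biggest green area."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, GREEN_LOW, GREEN_HIGH)
    # Strong blurring and erosion/dilation so that objects lying on the
    # field (ball, lines, robots) are absorbed into the field region.
    mask = cv2.GaussianBlur(mask, (15, 15), 0)
    mask = cv2.erode(mask, KERNEL, iterations=2)
    mask = cv2.dilate(mask, KERNEL, iterations=4)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    hull = cv2.convexHull(max(contours, key=cv2.contourArea))
    # Everything inside the hull is assumed to be the field.
    field_mask = np.zeros(mask.shape, np.uint8)
    cv2.fillConvexPoly(field_mask, hull, 255)
    return field_mask
\end{verbatim}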
\begin{figure}[ht]
\includegraphics[width=\textwidth]{\fig field-detection}
\caption{Field detection}
\label{p figure field-detection}
\end{figure}
Such a rather simple field detection has two important applications. The first
is that the robot should be aware of where the field is, so that it doesn't try
to walk away from it. Due to time constraints, we didn't implement this part of
the behavior. The second application of field detection is improving the
quality of goal and ball recognition. As was mentioned in the section on ball
detection, the current algorithm might get confused if there are other red
objects in the robot's field of view. However, there shouldn't be any red
objects on the field except the ball itself. So, if everything that is not on
the field is ignored when trying to detect the ball, the probability of
identifying a wrong object decreases. Conversely, the problem with the goal
detection algorithm was that it could be distracted by the field lines. So, if
everything on the field is ignored during goal recognition, the accuracy can be
improved.
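Combining the detectors then amounts to masking the input image. The sketch
below assumes the \texttt{detect\_ball} and \texttt{detect\_field} functions
from the previous sketches and a hypothetical \texttt{detect\_goal} function
that wraps the preselection and scoring steps.
\begin{verbatim}
import cv2

def detect_ball_and_goal(image_bgr):
    """Run ball detection inside the field and goal detection outside it."""
    field_mask = detect_field(image_bgr)
    if field_mask is None:
        # Without a field estimate, fall back to the unrestricted detectors.
        return detect_ball(image_bgr), detect_goal(image_bgr)
    # Keep only the pixels on the field for the ball detector ...
    on_field = cv2.bitwise_and(image_bgr, image_bgr, mask=field_mask)
    # ... and only the pixels off the field for the goal detector.
    off_field = cv2.bitwise_and(image_bgr, image_bgr,
                                mask=cv2.bitwise_not(field_mask))
    return detect_ball(on_field), detect_goal(off_field)
\end{verbatim}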
\begin{figure}[ht]
\includegraphics[width=\textwidth]{\fig combined-detection}
\caption{Using field detection to improve ball and goal detection}
\label{p figure combined-detection}
\end{figure}