assigned the portions
21  documentation/Pavel/colorpicker.tex  Normal file
@@ -0,0 +1,21 @@
\section{Color Calibration}

All our detection algorithms require color calibration, and when the lighting
conditions on the field change, the colors may have to be recalibrated. We
therefore needed a tool that simplifies this process as much as possible. For
this reason, we implemented a small OpenCV-based program that we called
\verb|Colorpicker|. The program can access various video sources, and can also
use still images for calibration. The main interface contains sliders for
adjusting the HSV interval, as well as a video area that shows the resulting
binary mask. The colors can be calibrated for three targets: ball, goal and
field; the quality of detection for the chosen target is demonstrated in the
tool's video area. When the program is closed, the calibration values are
automatically saved to the settings file \verb|nao_defaults.json|. The
interface of the Colorpicker is shown in figure \ref{p figure colorpicker}.
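
To give a feeling for the mechanism, the following is a minimal sketch of such
an HSV calibration loop with OpenCV; the window layout, the key handling and
the JSON schema are illustrative assumptions, not the actual Colorpicker code.

\begin{verbatim}
import json
import cv2

WINDOW = "colorpicker"
BARS = (("H min", 179), ("H max", 179), ("S min", 255),
        ("S max", 255), ("V min", 255), ("V max", 255))

cv2.namedWindow(WINDOW)
for name, limit in BARS:
    cv2.createTrackbar(name, WINDOW, 0, limit, lambda _: None)

capture = cv2.VideoCapture(0)   # any video source; a still image works too
while True:
    ok, frame = capture.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    lower = tuple(cv2.getTrackbarPos(n, WINDOW)
                  for n in ("H min", "S min", "V min"))
    upper = tuple(cv2.getTrackbarPos(n, WINDOW)
                  for n in ("H max", "S max", "V max"))
    mask = cv2.inRange(hsv, lower, upper)   # binary mask shown to the user
    cv2.imshow(WINDOW, mask)
    if cv2.waitKey(30) == 27:               # Esc closes the tool
        break

# hypothetical schema; the real layout is defined by nao_defaults.json
with open("nao_defaults.json", "w") as settings_file:
    json.dump({"ball": {"lower": lower, "upper": upper}}, settings_file)
\end{verbatim}
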
\begin{figure}[ht]
\includegraphics[width=\textwidth]{\fig colorpicker}
\caption{Interface of the Colorpicker}
\label{p figure colorpicker}
\end{figure}
40  documentation/Pavel/details.tex  Normal file
@@ -0,0 +1,40 @@
\chapter{Implementation Details}

\section{Code Organization}

Our code is organized as a standard Python package. The following command can
be used to make the robot run the whole goal scoring sequence:

\begin{verbatim}
python -m pykick
\end{verbatim}

Alternatively, individual modules can be run with the following command:

\begin{verbatim}
python -m pykick.[filename_without_.py]
\end{verbatim}

The main logic of our implementation can be found in the following files:
\begin{itemize}

\item \verb|__main__.py| contains the state machine described in section
\ref{p sec overview}.

\item \verb|striker.py| contains the implementation of higher-level behaviors,
such as aligning the ball and the goal, or turning to the ball.

\item \verb|finders.py| contains the implementations of our detection
algorithms.

\item \verb|imagereaders.py| contains convenience classes for capturing video
output from various video sources, such as the Nao cameras, web cameras or
video files.

\item \verb|movements.py| implements convenience movement-related functions,
such as walking, and also the kick.

\item \verb|nao_defaults.json| stores all project-global settings, such as
the IP address of the robot or the color calibration results; a minimal
loading example follows this list.

\end{itemize}
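
As an illustration, the settings file can be read with the standard
\verb|json| module; the key names below are hypothetical, since the actual
schema is defined by our code:

\begin{verbatim}
import json

with open("nao_defaults.json") as settings_file:
    settings = json.load(settings_file)

robot_ip = settings["ip"]    # hypothetical keys; see nao_defaults.json
ball_hsv = settings["ball"]  # e.g. the calibrated HSV interval
\end{verbatim}
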
25  documentation/Pavel/overview.tex  Normal file
@@ -0,0 +1,25 @@
\section{Strategy Overview}
\label{p sec overview}

\begin{figure}
\includegraphics[width=\textwidth]{\fig striker-flowchart}
\caption{Overview of the goal scoring strategy}
\label{p figure strategy-overview}
\end{figure}
Now that all of the milestones are completed, we will present a short overview
of the whole goal scoring strategy, the block diagram of which can be found in
figure \ref{p figure strategy-overview}. At the very beginning, the robot will
detect the ball and turn to the ball, as described in section \ref{j sec
turning to ball}. After that, the distance to the ball will be calculated, the
goal will be detected, and the direction to the goal will be determined. If
the ball is far away \textit{and} the ball and the goal are strongly
misaligned, the robot will try to approach the ball from the appropriate side;
otherwise the robot will approach the ball directly. These approach steps will
be repeated until the robot is close enough to the ball to start aligning to
the goal, but in practice one approach from the side followed by a short
direct approach should suffice. When the ball is close, the robot will check
whether it is between the goalposts, and will perform the necessary
adjustments if that is not the case. After the ball and the goal are aligned,
the robot will align its foot with the ball and kick it. For now, we assume
that the ball will reach the goal, and so the robot can finish execution.
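
As an illustration, the strategy can be summarized by the following
Python-style sketch; the helper names and thresholds are ours for this example
and do not correspond one-to-one to the actual pykick API:

\begin{verbatim}
# illustrative sketch; helper names and thresholds are assumptions
def score_goal(robot, far=1.0, max_misalignment=0.5, kick_range=0.25):
    robot.turn_to_ball()
    while True:
        distance = robot.ball_distance()
        robot.detect_goal()
        misalignment = robot.goal_direction()
        if distance <= kick_range:
            break
        if distance > far and abs(misalignment) > max_misalignment:
            robot.approach_ball_from_side()  # get behind the ball first
        else:
            robot.approach_ball_directly()
    if not robot.ball_between_goalposts():
        robot.adjust_alignment()             # align ball and goal
    robot.align_foot_with_ball()
    robot.kick()   # we assume the ball reaches the goal
\end{verbatim}
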
106  documentation/Pavel/perception.tex  Normal file
@@ -0,0 +1,106 @@
\section{Goal Detection}
\label{p sec goal detect}

Goal detection presented itself as a more difficult task. The color of the
goal is white, and there are generally many white areas in the image from the
robot's camera whose area is larger than that of the image of the goal, for
example the white field lines and the big white wall in the room with the
field. To deal with the multitude of possible goal candidates, we propose the
following heuristic algorithm.
\begin{figure}[ht]
\includegraphics[width=\textwidth]{\fig goal-detection}
\caption{Goal detection}
\label{p figure goal-detection}
\end{figure}
First, all contours around white areas are extracted using a procedure similar
to that described in section \ref{p sec ball detection}. Unlike in the ball
detection, the resulting binary mask undergoes slight erosions and dilations,
since noise is undesired in the goal shape detection. Next, the
\textit{candidate preselection} takes place. During this stage, only the $N$
contours with the largest areas are considered further (in our experiments,
$N=5$ was empirically determined to give good results). Furthermore, all
convex contours are rejected, since the goal is a highly non-convex shape.
After that, it is checked how many points are necessary to approximate the
remaining contours. The motivation behind this is the following: the goal
shape can clearly be approximated perfectly by a polyline with 8 straight
segments. In an image from the camera, the approximation is almost perfect
when using only 6 line segments, and in some degenerate cases, when the input
image is noisy, it might be necessary to use 9 line segments to approximate
the shape of the goal. Any contour that requires a different number of line
segments to be approximated is probably not the goal. The preselection stage
ends here, and the remaining candidates are passed to the scoring function.
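
A minimal sketch of this preselection with OpenCV might look as follows; the
approximation tolerance and the exact OpenCV flags are illustrative choices,
not necessarily the ones used in \verb|finders.py|:

\begin{verbatim}
import cv2

def preselect_goal_candidates(mask, n=5):
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # keep only the N contours with the largest areas
    contours = sorted(contours, key=cv2.contourArea, reverse=True)[:n]
    candidates = []
    for contour in contours:
        if cv2.isContourConvex(contour):
            continue              # the goal is a highly non-convex shape
        epsilon = 0.02 * cv2.arcLength(contour, True)  # illustrative tolerance
        approx = cv2.approxPolyDP(contour, epsilon, True)
        if 6 <= len(approx) <= 9:  # 6 to 9 segments can approximate the goal
            candidates.append(contour)
    return candidates
\end{verbatim}
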
The scoring function calculates how different the properties of the candidates
are from the properties that an idealized goal contour is expected to have.
The evaluation is based on two properties. The first is based on the
observation that the area of the goal contour is much smaller than the area of
its \textit{enclosing convex hull} \cite{convex-hull}. The second observation
is that all points of the goal contour must lie close to the enclosing convex
hull. A mathematical formulation of a corresponding scoring function can then
look like the following:
\begin{equation*}
S(c) = \frac{A(c)}{A(\mathrm{Hull}(c))}
     + \sum_{x_i \in c} \min_{h \in \mathrm{Hull}(c)} \lVert x_i - h \rVert
\end{equation*}
The contour that minimizes the scoring function while keeping its value under
a certain threshold is considered the goal. If no contour scores below the
threshold, the algorithm assumes that no goal was found. An important note is
that the algorithm is designed in such a way that preselection and scoring are
modular, which means that the current simple scoring function can later be
replaced by a function with a better heuristic, or even by one that employs
machine learning models.
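
A direct translation of $S(c)$ into OpenCV terms could look like this sketch,
where the distance of each contour point to the hull is measured with
\verb|cv2.pointPolygonTest|:

\begin{verbatim}
import cv2

def goal_score(contour):
    # S(c) = A(c)/A(Hull(c)) + sum of point-to-hull distances
    hull = cv2.convexHull(contour)
    area_ratio = cv2.contourArea(contour) / cv2.contourArea(hull)
    distance_sum = sum(
        abs(cv2.pointPolygonTest(hull, (float(p[0][0]), float(p[0][1])), True))
        for p in contour)
    return area_ratio + distance_sum
\end{verbatim}
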
Our tests have shown that when the white color is calibrated correctly, the
algorithm can detect the goal almost without mistakes whenever the goal is
present in the image. Most irrelevant candidates are normally discarded in the
preselection stage, and the scoring function improves the robustness further.
Figure \ref{p figure goal-detection} demonstrates the algorithm in action. On
the right is the binary mask with all found contours. On the left are the
goal, and one contour that passed preselection but was rejected during
scoring.
One downside of this algorithm is that in some cases the field lines might
appear to have the same properties that the goal contour is expected to have,
so the field lines can be mistaken for the goal. We describe how we dealt with
this problem in section \ref{p sec field detect}.
\section{Field Detection}
\label{p sec field detect}
The algorithm for field detection is very similar to the ball detection
algorithm, but some concepts introduced in section \ref{p sec goal detect} are
also used here. The algorithm extracts the biggest green area in the image,
finds its enclosing convex hull, and assumes everything inside the hull to be
the field. Here, when we extract the field, we apply strong Gaussian blurring
and a combination of erosions and dilations to the binary mask, so that the
objects on the field are properly consumed.
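
A sketch of this extraction might look as follows; the kernel size and the
blur strength are illustrative values, not our tuned parameters:

\begin{verbatim}
import cv2
import numpy as np

def field_hull(frame, lower_green, upper_green):
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, lower_green, upper_green)
    mask = cv2.GaussianBlur(mask, (15, 15), 0)     # strong blurring
    kernel = np.ones((11, 11), np.uint8)
    mask = cv2.erode(mask, kernel)                 # erosion-dilation pass
    mask = cv2.dilate(mask, kernel, iterations=2)  # consumes objects on field
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    biggest = max(contours, key=cv2.contourArea)   # the biggest green area
    return cv2.convexHull(biggest)
\end{verbatim}
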
\begin{figure}[ht]
\includegraphics[width=\textwidth]{\fig field-detection}
\caption{Field detection}
\label{p figure field-detection}
\end{figure}
This rather simple field detection has two important applications. The first
is that the robot should be aware of where the field is, so that it doesn't
try to walk off the field; due to time constraints, we didn't implement this
part of the behavior. The second application is improving the quality of goal
and ball recognition. As mentioned in the section on ball detection, the
current algorithm might get confused if there are any red objects in the
robot's field of view. However, there shouldn't be any red objects on the
field except the ball itself. So, if everything that's not on the field is
ignored when trying to detect the ball, the probability of identifying a wrong
object decreases. Conversely, the problem with the goal detection algorithm
was that it could be distracted by the field lines, so if everything on the
field is ignored for goal recognition, the accuracy improves, as sketched
below.
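
In OpenCV terms, this amounts to masking the color masks with the field hull
before running the detectors; a sketch under the same assumptions as above,
where \verb|frame|, \verb|hull|, \verb|red_mask| and \verb|white_mask| come
from the earlier steps:

\begin{verbatim}
import cv2
import numpy as np

# rasterize the field hull into a binary mask
field_mask = np.zeros(frame.shape[:2], np.uint8)
cv2.fillConvexPoly(field_mask, hull, 255)

# the ball must be on the field, the goal must be off it
ball_mask = cv2.bitwise_and(red_mask, field_mask)
goal_mask = cv2.bitwise_and(white_mask, cv2.bitwise_not(field_mask))
\end{verbatim}
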
\begin{figure}[ht]
\includegraphics[width=\textwidth]{\fig combined-detection}
\caption{Using field detection to improve ball and goal detection}
\label{p figure combined-detection}
\end{figure}
21  documentation/Pavel/pov.tex  Normal file
@@ -0,0 +1,21 @@
\section{Video Recording from the Nao Camera}

For the purposes of debugging, and also for the final presentation, we wanted
to record what the robot sees during program execution. The NAOqi SDK provides
a function to write the camera video to a file, but it only allows capturing
from one camera at a time, which was not optimal for us. We overcame this
limitation by exploiting the fact that the NAOqi SDK doesn't impose any
restrictions on reading individual frames from the cameras into memory. So,
during the test runs we started a separate thread in which frames from both
cameras were read into memory one by one, and after the robot had completed
its task, the recorded frame sequences were written to video files with the
help of OpenCV. A downside of this approach is that the frames can only be
read at irregular and unpredictable intervals, so the framerate of the
resulting video couldn't be calculated, which means that the playback speed of
the videos needed to be adjusted afterwards using video editing programs.
Furthermore, due to the computational resource limitations of the Nao, the
frames could only be captured in low resolution. However, the quality of the
resulting videos was sufficient for successful debugging and also for the
presentation. Some of the illustrations for this report, such as figure
\ref{p figure direct-approach}, were created with the help of those videos.
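
The following sketch shows the idea for one camera; the NAOqi calls are
reduced to a hypothetical \verb|grab_frame| helper and a placeholder
\verb|camera| handle, and the fixed framerate passed to \verb|VideoWriter| is
a guess that is corrected in post-processing:

\begin{verbatim}
import threading
import cv2

frames = []
running = True

def record_loop(camera):
    while running:
        frames.append(grab_frame(camera))  # hypothetical NAOqi frame grabber

recorder = threading.Thread(target=record_loop, args=(camera,))
recorder.start()
# ... the robot executes its task here ...
running = False
recorder.join()

height, width = frames[0].shape[:2]
writer = cv2.VideoWriter("run.avi", cv2.VideoWriter_fourcc(*"MJPG"),
                         10.0, (width, height))  # framerate is a guess
for frame in frames:
    writer.write(frame)
writer.release()
\end{verbatim}
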