\usepackage{todonotes}
\usepackage{hyperref}

\usepackage{fancyhdr}
\pagestyle{fancy}
\rhead{\thepage}
\lhead{Humanoid Robotic Systems}

\def\BibTeX{{\rm B\kern-.05em{\sc i\kern-.025em b}\kern-.08em
T\kern-.1667em\lower.7ex\hbox{E}\kern-.125emX}}

\begin{document}

\title{TUM ICS Humanoid Robotic Systems \\ ``Teleoperating NAO''}

\author{Pavel Lutskov, Luming Li, Lukas Otter and Atef Kort}

with a current estimation of the operator's pose, a sensor feedback based robot
pose, as well as with the camera feed from both NAO's cameras and with the
webcam view of the operator. In order for the user to be able to give explicit
commands to the robot, such as a request to open or close the hands or to
temporarily suspend the operation, we implemented a simple voice command
system. Finally, to be able to accommodate different users and to perform
control in different conditions, a small calibration routine was developed,
which quickly takes the user through the process of setting up the
teleoperation. We elaborate on the tools and approaches that we used for the
implementation of the user-facing features in \autoref{ssec:interface}.

An example task that can be done using our teleoperation package might be the
following. The operator can safely and precisely navigate the robot through an
the transforms of the markers with respect to the \verb|odom| frame

\subsection{Interface}\label{ssec:interface}

\paragraph{Speech Commands}

Based on the NAOqi API and NAO's built-in voice recognition, we built a Python
speech recognition server, providing a ROS action as a means of accessing it.
In principle, it was possible to reuse the results of HRS Tutorial 7, where a
speech recognition node was already implemented. Those results, however, were
not flexible enough for our purposes, and making the necessary adjustments
would have been more time-consuming than implementing a node in Python from
scratch. It was our design constraint that the robot only accepts commands
which lead to state changes that are reachable from the current state. We will
provide further detail on how the state dependency is implemented and how the
speech recognition is integrated with our system in \autoref{sec:integration}.

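To make this structure concrete, the following is a condensed sketch of how
such an action-based recognition server can be organized on top of
\verb|actionlib| and the NAOqi Python SDK. The \verb|RecognizeAction|
definition and the \verb|teleop_msgs| package are hypothetical placeholders
rather than the actual names from our code.

\begin{verbatim}
import rospy
import actionlib
from naoqi import ALProxy
# Hypothetical action: goal = list of words, result = recognized word.
from teleop_msgs.msg import RecognizeAction, RecognizeResult

class SpeechServer(object):
    def __init__(self, nao_ip):
        self.asr = ALProxy("ALSpeechRecognition", nao_ip, 9559)
        self.memory = ALProxy("ALMemory", nao_ip, 9559)
        self.server = actionlib.SimpleActionServer(
            "recognize", RecognizeAction, self.execute, auto_start=False)
        self.server.start()

    def execute(self, goal):
        # Listen only for the words allowed in the current system state.
        self.asr.setVocabulary(list(goal.words), False)
        self.memory.insertData("WordRecognized", [])
        self.asr.subscribe("teleop_asr")
        try:
            while not rospy.is_shutdown():
                if self.server.is_preempt_requested():
                    self.server.set_preempted()  # goal cancelled by master
                    return
                word = self.memory.getData("WordRecognized")
                if word and word[0] in goal.words:
                    self.server.set_succeeded(RecognizeResult(word=word[0]))
                    return
                rospy.sleep(0.05)
        finally:
            self.asr.unsubscribe("teleop_asr")

if __name__ == "__main__":
    rospy.init_node("speech_server")
    SpeechServer(rospy.get_param("~nao_ip"))
    rospy.spin()
\end{verbatim}
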
\begin{table}[h]
\centering
\begin{tabular}{|c|c|c|}
\hline
\textbf{Command}&\textbf{Action}&\textbf{Available in state} \\
``Close'' & Close hands & Idle, Imitation \\
\hline
\end{tabular}
\caption{Commands of the speech recognition module}
\label{tab:speech-states}
\end{table}

\autoref{tab:speech-states} lists the available commands, depending on the
state of the system. We tried to make them as short and as distinguishable as
possible in order to minimize the number of misunderstood commands. As a
confirmation, the NAO repeats the recognized command, or says ``nope'' if it
detected some speech but could not recognize a valid command. Such brevity
greatly speeds up the speech-based interaction, compared to the case where NAO
would talk in full sentences.

\paragraph{Teleoperation Interface}

In order to make it possible to operate

The NAO-part contains video streams of the top and bottom cameras on the
robot's head. These were created by subscribing to their respective topics
(FIND NAME) using the \textit{rqt\_gui} package. Moreover, it also consists of
an rviz window which gives a visual representation of the NAO. For this, the
robot's joint positions are displayed by subscribing to the topic \verb|tf|,
where the coordinates and the different coordinate frames are published. We
further used the \textit{NAO-meshes} package to create the 3D model of the
NAO.

\subsection{Navigation}\label{ssec:navigation}

Next, our system needed a way for the operator to command the robot to a
desired location. Furthermore, the operator has to be able to adjust the speed
of the robot's movement. To achieve this we use the approach that we call the
``Human Joystick''. We implement this approach in a module called
\verb|walker|.

Through the calibration procedure we determine the initial position of the
operator. Furthermore, we track the position of the operator by locating the
ArUco marker on the operator's chest. Then, we can map the current position of
the user to the desired direction and speed of the robot. For example, if the
operator steps to the right from the initial position, then the robot will be
moving to the right until the operator returns to the initial position. The
further the operator is from the origin, the faster the robot will move. In
order to control the rotation of the robot, the operator can slightly turn the
body clockwise or counterclockwise while staying in the initial position, so
that the marker can still be detected by the webcam. The speed of the rotation
can also be controlled by the magnitude of the operator's rotation. The
process is schematically illustrated in \autoref{fig:joystick}.

\begin{figure}
\centering
\label{fig:joystick}
\end{figure}

There is a small region around the original position in which the operator can
stay without causing the robot to move. As soon as the operator exceeds the
movement threshold in some direction, the robot will slowly start moving in
that direction. We use the following relationship for calculating the robot's
speed:

$$v = v_{min} + \frac{d - d_{thr}}{d_{max} - d_{thr}}(v_{max} - v_{min})$$

Here, $d$ denotes the operator's distance from the origin in that direction,
$d_{thr}$ is the minimum distance required for starting the movement, and
$d_{max}$ is the boundary of the control zone; $d_{thr}$ and $d_{max}$ are
determined through the calibration process. Currently, there can only be
movement in one direction at a time, so in case the operator exceeds the
threshold in more than one direction, the robot will move in the direction
with the highest precedence. The forwards-backwards motion has the highest
precedence, then comes the sideways motion and, finally, the rotation.

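As an illustration, the mapping from the tracked marker displacement to a
velocity command can be sketched as follows. This is a simplified sketch
rather than a verbatim excerpt from the \verb|walker| module; the calibration
structure and the numeric values are assumptions.

\begin{verbatim}
from collections import namedtuple

# Assumed calibration result: movement threshold d_thr and control-zone
# boundary d_max for each axis, as determined by the calibration routine.
Calib = namedtuple("Calib", "x_thr x_max y_thr y_max yaw_thr yaw_max")

V_MIN, V_MAX = 0.1, 1.0   # normalized walk speeds (illustrative values)

def scale(d, d_thr, d_max):
    # v = v_min + (d - d_thr) / (d_max - d_thr) * (v_max - v_min)
    s = (abs(d) - d_thr) / (d_max - d_thr)
    v = V_MIN + min(max(s, 0.0), 1.0) * (V_MAX - V_MIN)
    return v if d > 0 else -v

def joystick_command(dx, dy, dyaw, c):
    """Map marker displacement to (vx, vy, vtheta), one axis at a time."""
    if abs(dx) > c.x_thr:      # forwards/backwards has the highest precedence
        return (scale(dx, c.x_thr, c.x_max), 0.0, 0.0)
    if abs(dy) > c.y_thr:      # then the sideways motion
        return (0.0, scale(dy, c.y_thr, c.y_max), 0.0)
    if abs(dyaw) > c.yaw_thr:  # and finally the rotation
        return (0.0, 0.0, scale(dyaw, c.yaw_thr, c.yaw_max))
    return (0.0, 0.0, 0.0)     # the operator is inside the buffer zone
\end{verbatim}

The resulting triple can then be passed to NAOqi's
\verb|ALMotion.moveToward(vx, vy, vtheta)|, which expects velocities
normalized to $[-1, 1]$.
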
Our tests have shown that having control over the speed is crucial for the
success of the teleoperation. The alignment to an object is impossible if the
robot is walking at its maximum speed; on the other hand, walking around the
room at a fraction of the maximum speed is too slow.

\subsection{Imitation}\label{ssec:imitation}

One of the main objectives of our project was the imitation of the operator

joint motions need to be calculated by means of Cartesian control.

At first, we tried to employ the Cartesian controller that is shipped with the
NAOqi SDK. We soon realized, however, that this controller was unsuitable for
our task because of two significant limitations. The first problem with the
NAO's controller is that it freezes if the target is being updated too often:
the arms of the robot start to stutter, and then make a final erratic motion
once the program is terminated. However, arm teleoperation requires smoothness
and therefore frequent updates of the target position, and the NAO controller
did not fit these requirements. A possible reason for such behavior could be a
bug in the implementation, and it might be possible that this problem was
fixed in later versions of the NAOqi SDK.

Secondly, the controller of the NAO is not robust against
\textit{singularities}. Singularities occur when the kinematic chain loses one
or more degrees of freedom, and so in order to reach a desired position, the
joint motors must apply infinite torques. Practically, for the imitation task
this would mean that once the robot has its arms fully stretched, the arms
would execute violent erratic motions which would hurt the robot or cause it to

$$\dot{\theta} = J^{-1}\dot{r}$$

In this formula, $\dot{r}$ denotes the 3D speed of the target, which is the
result of the posture retargeting, namely $r_{hand,NAO}^{torso,NAO}$. $J$ is
the Jacobian matrix \cite{jacobian}. The Jacobian matrix gives the
relationship between the joint angle speeds and the resulting speed of the
effector at the end of the kinematic chain which the Jacobian matrix
describes.

We now apply a common simplification and state that

The other method that we employed was to calculate the Jacobian matrix
analytically. Since only rotational joints were available, the approximation
for the Jacobian matrix, which is the tangent in rotational joints, can be
calculated using the cross product between the rotational axis of a joint,
denoted by $e_j$, and the vector $r_{end}-r_{j}$, where $r_{end}$ is the
position of the end effector (i.e.\ hand) and $r_{j}$ is the position of the
joint. The following relation gives us one column of the Jacobian matrix.

$$
J_j = \frac{\partial r_{end}}{\partial\theta_j} =
e_j \times (r_{end}-r_j)
$$

We can get the rotational axis of a joint and the position of the joint in the
torso frame through the NAOqi API. This can be repeated for each rotational
joint until the whole matrix is filled.

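In code, this construction amounts to only a few lines of NumPy. The sketch
below assumes that the joint axes $e_j$ and positions $r_j$ have already been
queried from the NAOqi API and expressed in the torso frame.

\begin{verbatim}
import numpy as np

def jacobian(axes, positions, r_end):
    """Assemble the 3xN positional Jacobian of a chain of N
    rotational joints. axes[j] is the rotation axis e_j of joint j,
    positions[j] its origin r_j, both in the torso frame; r_end is
    the current end-effector (hand) position."""
    cols = [np.cross(e_j, r_end - r_j)    # J_j = e_j x (r_end - r_j)
            for e_j, r_j in zip(axes, positions)]
    return np.column_stack(cols)
\end{verbatim}
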
The next step for the Cartesian controller is to determine the inverse
Jacobian matrix for the inverse kinematics. For this, singular value
decomposition is

Then we can avoid the singularity behavior by setting to $0$ the entries in
$\Sigma^{-1}$ that are above a threshold value $\tau = 50$, which we
determined through experimentation.

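A minimal NumPy sketch of this damping step, under the thresholding rule
described above, could look as follows.

\begin{verbatim}
import numpy as np

TAU = 50.0   # threshold determined through experimentation

def damped_inverse(J):
    """Invert J via SVD, discarding near-singular directions:
    entries of Sigma^{-1} that would exceed tau are set to 0, so no
    motion is commanded along the degenerate directions."""
    U, s, Vt = np.linalg.svd(J)
    with np.errstate(divide="ignore"):
        s_inv = np.where(1.0 / s < TAU, 1.0 / s, 0.0)
    S_inv = np.zeros((Vt.shape[0], U.shape[0]))  # shape of J transposed
    np.fill_diagonal(S_inv, s_inv)
    return Vt.T @ S_inv @ U.T    # J^{-1} = V Sigma^{-1} U^T
\end{verbatim}
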
The final control objective for the current loop iteration can be stated as:

$$\theta_{targ} = \theta_{cur} + \Delta\theta$$

Our tests have shown that our controller does not have the freezing behavior
which is present in the NAO's own controller, and therefore the target of the
control can be updated with arbitrary frequency. Furthermore, our controller
shows no signs of producing violent arm motions, which means that our strategy
for handling singularities was effective. The implementation of the whole
imitation routine resides in the \verb|imitator| module of our system.

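Putting the pieces together, a single iteration of the imitation control loop
can be sketched as below, reusing the \verb|jacobian| and \verb|damped_inverse|
sketches from above; \verb|motion| is an \verb|ALMotion| proxy. The control
period and the speed fraction are illustrative assumptions, not our exact
parameters.

\begin{verbatim}
import numpy as np

DT = 0.05   # assumed control period in seconds

def control_step(motion, names, r_dot, axes, positions, r_end):
    """theta_targ = theta_cur + J^{-1} * r_dot * dt"""
    J = jacobian(axes, positions, r_end)
    theta_dot = damped_inverse(J).dot(r_dot)
    theta_cur = np.array(motion.getAngles(names, True))  # sensed angles
    theta_targ = theta_cur + theta_dot * DT
    # Non-blocking NAOqi joint command; a low speed fraction keeps the
    # motion smooth between frequent target updates.
    motion.setAngles(names, theta_targ.tolist(), 0.3)
\end{verbatim}
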
\section{System Implementation and Integration}\label{sec:integration}

Now that the individual modules were designed and implemented, the whole
system needed to be assembled. The state machine that we designed can be seen
in \autoref{fig:overview}.

The software package was organized as a collection of ROS nodes, controlled by
a single master node. The master node keeps track of the current system state,

until the fall recovery is complete.

We will now illustrate our architecture by using the interaction between the
walker node and the master node as an example. This interaction is depicted in
\autoref{fig:master-walker}. The walker node subscribes to the \verb|tf|
transform of the chest ArUco marker, and requests a position update every 0.1
seconds. If in the current cycle the marker happens to be outside of the buffer
zone (see \autoref{fig:joystick}), or the rotation of the marker exceeds the
A final piece of our system is the speech-based command interface. Since in our
system the acceptable commands vary between states, the speech recognition
controller must be aware of the current state of the system; therefore, the
master node is responsible for this functionality. The master node runs an
auxiliary loop, in which a recognition target is sent to the speech server
node, described in \autoref{ssec:interface}. If a relevant word is detected,
the master receives the result, updates the state accordingly and then sends a
new recognition target. If a state change occurred before any speech was
detected, then the master sends a cancellation request to the speech server for
the currently running objective and, again, sends a new target.

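Sketched with an \verb|actionlib| client, and reusing the hypothetical
\verb|RecognizeAction| definition from \autoref{ssec:interface}, this loop
looks roughly as follows; the \verb|master| accessors stand in for the master
node's internal state-machine interface.

\begin{verbatim}
import rospy
import actionlib
from teleop_msgs.msg import RecognizeAction, RecognizeGoal  # hypothetical

def speech_loop(master):
    client = actionlib.SimpleActionClient("recognize", RecognizeAction)
    client.wait_for_server()
    while not rospy.is_shutdown():
        state = master.state
        # Offer only the commands that are valid in the current state.
        client.send_goal(RecognizeGoal(words=master.commands_for(state)))
        while not client.wait_for_result(rospy.Duration(0.1)):
            if master.state != state:
                client.cancel_goal()   # state changed: drop the stale goal
                break
        else:
            # A valid word was recognized; let the state machine react.
            master.apply_command(client.get_result().word)
\end{verbatim}
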
\section{Conclusion and Possible Drawbacks}

Upon completion of this project, our team successfully applied the knowledge
that we acquired during the HRS lectures and tutorials to a complex practical