worked on the report some more

2019-03-01 16:43:12 +01:00
parent fb04a32788
commit 700d01ecef


@@ -14,12 +14,17 @@
\usepackage{todonotes}
\usepackage{hyperref}
\usepackage{fancyhdr}
\pagestyle{fancy}
\rhead{\thepage}
\lhead{Humanoid Robotic Systems}
\def\BibTeX{{\rm B\kern-.05em{\sc i\kern-.025em b}\kern-.08em
T\kern-.1667em\lower.7ex\hbox{E}\kern-.125emX}}
\begin{document}
\title{TUM ICS Humanoid Robotic Systems \\ ``Teleoperating NAO''}
\author{Pavel Lutskov, Luming Li, Lukas Otter and Atef Kort}
@@ -66,12 +71,12 @@ with a current estimation of the operator's pose, a sensor feedback based robot
pose, as well as with the camera feed from both NAO's cameras and with the
webcam view of the operator. In order for the user to be able to give explicit
commands to the robot, such as a request to open or close the hands or to
temporarily suspend the operation, we implemented a simple voice command
system. Finally, to be able to accommodate different users and to perform
control in different conditions, a small calibration routine was developed,
which would quickly take a user through the process of setting up the
teleoperation. We elaborate on the tools and approaches that we used for
implementation of the user-facing features in \autoref{ssec:interface}.

An example task that can be done using our teleoperation package might be the
following. The operator can safely and precisely navigate the robot through an
@@ -126,13 +131,21 @@ the transforms of the markers with respect to the \verb|odom| frame
\subsection{Interface}\label{ssec:interface}
\paragraph{Speech Commands}
Based on the NAOqi API and NAO's built-in voice recognition, we built a Python
speech recognition server, providing a ROS action as a means of accessing it.
It was possible to reuse the results of HRS Tutorial 7, where a speech
recognition node was already implemented. Those results, however, were not
flexible enough for our purposes, and making the necessary adjustments was more
time-consuming than implementing a node in Python from scratch. It was our
design constraint that the robot only accepts commands which lead to state
changes that are reachable from the current state. We will provide further
detail on how the state dependency is implemented and how the speech
recognition is integrated with our system in \autoref{sec:integration}.
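As an illustration only, the following sketch shows roughly how such an action
server could be structured on top of NAOqi's \verb|ALSpeechRecognition| and
\verb|ALMemory| proxies. The action definition \verb|RecognizeAction|, the
package name \verb|teleop_msgs|, the node name and the address handling are
hypothetical placeholders, not the exact code of our package.
\begin{verbatim}
#!/usr/bin/env python
# Hypothetical sketch of a speech recognition action server on top of NAOqi.
import rospy
import actionlib
from naoqi import ALProxy
# RecognizeAction/RecognizeResult are assumed custom action messages.
from teleop_msgs.msg import RecognizeAction, RecognizeResult


class SpeechServer(object):
    def __init__(self, nao_ip, nao_port=9559):
        self.asr = ALProxy("ALSpeechRecognition", nao_ip, nao_port)
        self.memory = ALProxy("ALMemory", nao_ip, nao_port)
        self.server = actionlib.SimpleActionServer(
            "recognize", RecognizeAction, self.execute, auto_start=False)
        self.server.start()

    def execute(self, goal):
        # The goal carries the vocabulary allowed in the current state.
        self.asr.setVocabulary(list(goal.vocabulary), False)
        self.memory.insertData("WordRecognized", ["", 0.0])
        self.asr.subscribe("teleop_asr")  # starts the recognition engine
        result = RecognizeResult()
        try:
            while not rospy.is_shutdown():
                if self.server.is_preempt_requested():
                    self.server.set_preempted()
                    return
                # ALMemory keeps the last result as [word, confidence, ...].
                word, confidence = self.memory.getData("WordRecognized")[:2]
                if word in goal.vocabulary and confidence > 0.4:
                    result.word = word
                    self.server.set_succeeded(result)
                    return
                rospy.sleep(0.1)
        finally:
            self.asr.unsubscribe("teleop_asr")


if __name__ == "__main__":
    rospy.init_node("speech_server")
    server = SpeechServer(rospy.get_param("~nao_ip", "10.0.0.1"))
    rospy.spin()
\end{verbatim}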
\begin{table}[h]
\centering
\begin{tabular}{|c|c|c|}
\hline
\textbf{Command}&\textbf{Action}&\textbf{Available in state} \\
@@ -150,10 +163,18 @@ Based on NAOqi API and NAO built-in voice recognition
``Close'' & Close hands & Idle, Imitation \\
\hline
\end{tabular}
\caption{Commands of the speech recognition module}
\label{tab:speech-states}
\end{table}
\autoref{tab:speech-states} lists the available commands, depending on the
state of the system. We tried to make the commands as short and distinguishable
as possible in order to minimize the number of misunderstood commands. As a
confirmation, the NAO repeats the recognized command, or says ``nope'' if it
detected some speech but couldn't recognize a valid command. Such brevity
greatly speeds up the speech-based interaction, compared to the case where the
NAO talks in full sentences.
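Purely as an illustration, the confirmation could be produced with NAOqi's
text-to-speech proxy; the address handling below mirrors the hypothetical
speech server sketch above and is not our exact code.
\begin{verbatim}
# Illustrative confirmation step using NAOqi's text-to-speech.
from naoqi import ALProxy

NAO_IP = "10.0.0.1"  # placeholder address
tts = ALProxy("ALTextToSpeech", NAO_IP, 9559)

def confirm(word):
    # Repeat the recognized command, or reject unintelligible speech.
    tts.say(word if word else "nope")
\end{verbatim}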
\paragraph{Teleoperation Interface}
In order to make it possible to operate
@@ -165,7 +186,7 @@ The NAO-part contains video streams of the top and bottom cameras on the robots
head. These were created by subscribing to their respective topics (FIND NAME)
using the \textit{rqt\_gui} package. Moreover, it also consists of an rviz
window which gives a visual representation of the NAO. For this, the robot's
joint positions are displayed by subscribing to the topic \verb|tf| where the
coordinates and the different coordinate frames are published. We further used
the \textit{NAO-meshes} package to create the 3D model of the NAO.
@@ -191,20 +212,24 @@ the \textit{NAO-meshes} package to create the 3D model of the NAO.
\subsection{Navigation}\label{ssec:navigation}
Next, our system needed a way for the operator to command the robot to a
desired location. Furthermore, the operator has to be able to adjust the speed
of the robot's movement. To achieve this we use an approach that we call the
``Human Joystick''. We implement this approach in a module called
\verb|walker|.

Through the calibration procedure we determine the initial position of the
operator. Furthermore, we track the position of the operator by locating the
ArUco marker on the operator's chest. Then, we can map the current position of
the user to the desired direction and speed of the robot. For example, if the
operator steps to the right from the initial position, the robot will move to
the right until the operator returns to the initial position. The further the
operator is from the origin, the faster the robot will move. In order to
control the rotation of the robot, the operator can slightly turn the body
clockwise or counterclockwise while staying in the initial position, so that
the marker can still be detected by the webcam. The speed of the rotation is
likewise controlled by the magnitude of the operator's rotation. The process is
schematically illustrated in \autoref{fig:joystick}.
\begin{figure}
\centering
@@ -213,6 +238,28 @@ automatically through calibration.
\label{fig:joystick}
\end{figure}
There is a small region around the original position, in which the operator
can stay without causing the robot to move. As soon as the operator exceeds the
movement threshold in some direction, the robot will slowly start moving in
that direction. We use the following relationship for calculating the robot's
speed:
$$v = v_{min} + \frac{d - d_{thr}}{d_{max} - d_{thr}}(v_{max} - v_{min})$$
Here, $d$ denotes the operator's distance from the origin in that direction,
$d_{thr}$ is the minimum distance required for starting the movement and
$d_{max}$ is the boundary of the control zone; $d_{thr}$ and $d_{max}$ are
determined through the calibration process. Currently, there can only be
movement in one direction at a time, so in case the operator exceeds the
threshold in more than one direction, the robot will move in the direction with
the higher precedence. The forwards-backwards motion has the highest
precedence, followed by the sideways motion and, finally, the rotation.
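A simplified sketch of how this displacement-to-velocity mapping and the
precedence rule could look is given below; the function names, the
\verb|calib| container and the parameter layout are illustrative assumptions,
not the exact code of our \verb|walker| node.
\begin{verbatim}
# Illustrative mapping from operator displacement to a walking command.
def joystick_velocity(d, d_thr, d_max, v_min, v_max):
    """Interpolate linearly between v_min and v_max outside the buffer zone."""
    if abs(d) <= d_thr:
        return 0.0
    scale = (min(abs(d), d_max) - d_thr) / (d_max - d_thr)
    v = v_min + scale * (v_max - v_min)
    return v if d > 0 else -v

def walking_command(dx, dy, dyaw, calib):
    """Return (vx, vy, wz); only one component is non-zero at a time.

    Precedence: forwards/backwards > sideways > rotation.
    calib holds the thresholds and bounds found during calibration (floats).
    """
    vx = joystick_velocity(dx, calib.d_thr, calib.d_max,
                           calib.v_min, calib.v_max)
    if vx != 0.0:
        return (vx, 0.0, 0.0)
    vy = joystick_velocity(dy, calib.d_thr, calib.d_max,
                           calib.v_min, calib.v_max)
    if vy != 0.0:
        return (0.0, vy, 0.0)
    wz = joystick_velocity(dyaw, calib.a_thr, calib.a_max,
                           calib.w_min, calib.w_max)
    return (0.0, 0.0, wz)
\end{verbatim}
The resulting triple could then be handed to NAO's walking interface (for
instance \verb|ALMotion.moveToward|, which expects normalized velocities).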
Our tests have shown that having control over the speed is crucial for the
success of the teleoperation. Aligning to an object is impossible if the
robot is walking at its maximum speed; on the other hand, walking around the
room at a fraction of the maximum speed is too slow.
\subsection{Imitation}\label{ssec:imitation}
One of the main objectives of our project was the imitation of the operator
@@ -303,17 +350,17 @@ joint motions need to be calculated by the means of Cartesian control.
At first, we tried to employ the Cartesian controller that is shipped with the
NAOqi SDK. We soon realized, however, that this controller was unsuitable for
our task because of two significant limitations. The first problem with
the NAO's controller is that it freezes if the target is being updated too
often: the arms of the robot start to stutter, and then make a final erratic
motion once the program is terminated. However, arm teleoperation requires
smoothness and therefore frequent updates of the target position, and the NAO
controller didn't fit these requirements. A possible reason for such behavior
could be a bug in the implementation, and it might be possible that this
problem was fixed in later versions of the NAOqi SDK.

Secondly, the controller of the NAO is not robust against
\textit{singularities}. Singularities occur when the kinematic chain loses one
or more degrees of freedom, and so in order to reach a desired position, the
joint motors must apply infinite torques. Practically, for the imitation task
this would mean that once the robot has its arms fully stretched, the arms
would execute violent erratic motions which would hurt the robot or cause it to
@@ -330,9 +377,9 @@ $$\dot{\theta} = J^{-1}\dot{r}$$
In this formula $\dot{r}$ denotes the 3D speed of the target, which is the
result of the posture retargeting, namely $r_{hand,NAO}^{torso,NAO}$. $J$ is
the Jacobian matrix \cite{jacobian}. The Jacobian matrix gives the relationship
between the joint angle speeds and the resulting speed of the effector at the
end of the kinematic chain it describes.
We now apply a common simplification and state that
@@ -374,19 +421,18 @@ The other method that we employed was to calculate the Jacobian matrix
analytically. Since only rotational joints were available, the approximation
for the Jacobian matrix, which is the tangent in rotational joints, can be
calculated using the cross product between the rotational axis of a joint,
denoted by $e_j$, and the vector $r_{end}-r_{j}$, where $r_{end}$
is the position of the end effector (i.e.\ hand) and $r_{j}$ is the position of
the joint. The following relation gives us one column of the Jacobian matrix.
$$
J_j = \frac{\partial r_{end}}{\partial\theta_j} =
e_j \times (r_{end}-r_j)
$$
We can get the rotational axis of a joint and the position of the joint in the
torso frame through the NAOqi API. This can be repeated for each rotational
joint until the whole matrix is filled.
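Under the assumption that the joint axes $e_j$ and the joint positions $r_j$
have already been queried from the robot and expressed in the torso frame, the
column-by-column assembly could be sketched as follows (the function name and
argument layout are illustrative).
\begin{verbatim}
import numpy as np

def position_jacobian(axes, joint_positions, r_end):
    """Assemble the 3xN positional Jacobian column by column.

    axes[j]            -- rotation axis e_j of joint j (torso frame)
    joint_positions[j] -- position r_j of joint j (torso frame)
    r_end              -- position of the end effector (hand, torso frame)
    """
    r_end = np.asarray(r_end, dtype=float)
    columns = [np.cross(e_j, r_end - np.asarray(r_j, dtype=float))
               for e_j, r_j in zip(axes, joint_positions)]
    return np.column_stack(columns)
\end{verbatim}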
The next step for the Cartesian controller is to determine the inverse Jacobian
matrix for the inverse kinematics. For this, singular value decomposition is
@@ -410,18 +456,22 @@ Then we can avoid the singularity behavior by setting to $0$ the entries in
$\Sigma^{-1}$ that are above a threshold value $\tau = 50$, which we determined
through experimentation.
The final control objective for the current loop iteration can be stated as:
$$\theta_{targ} = \theta_{cur} + \Delta\theta$$
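A small numpy sketch of this truncated pseudo-inverse and the resulting joint
update is given below; it assumes that $J$ and the Cartesian increment
$\Delta r$ from the retargeting step are already available, and the function
name is illustrative.
\begin{verbatim}
import numpy as np

def damped_joint_update(J, delta_r, tau=50.0):
    """Joint increment via a truncated SVD pseudo-inverse of J.

    Entries of Sigma^{-1} that would exceed tau (i.e. near-zero singular
    values) are set to zero, which suppresses erratic motions near
    singularities.
    """
    U, s, Vt = np.linalg.svd(J, full_matrices=False)
    s_inv = np.where(s > 1.0 / tau, 1.0 / np.maximum(s, 1.0 / tau), 0.0)
    J_pinv = Vt.T.dot(np.diag(s_inv)).dot(U.T)
    return J_pinv.dot(delta_r)

# One control iteration:
# theta_targ = theta_cur + damped_joint_update(J, delta_r)
\end{verbatim}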
Our tests have shown that our controller doesn't have the freezing behavior
which is present in the NAO's own controller, and therefore the target of the
control can be updated with arbitrary frequency. Furthermore, our controller
shows no signs of producing violent arm motions, which means that our strategy
for handling singularities was effective. The implementation of the whole
imitation routine resides in the \verb|imitator| module of our system.
\section{System Implementation and Integration}\label{sec:integration}
Now that the individual modules were designed and implemented, the whole system
needed to be assembled together. The state machine that we designed can be
seen in \autoref{fig:overview}.

The software package was organized as a collection of ROS nodes, controlled by
a single master node. The master node keeps track of the current system state,
@@ -447,7 +497,7 @@ until the fall recovery is complete.
We will now illustrate our architecture using the interaction between the walker
node and the master node as an example. This interaction is depicted in
\autoref{fig:master-walker}. The walker node subscribes to the \verb|tf|
transform of the chest ArUco marker, and requests a position update every 0.1
seconds. If in the current cycle the marker happens to be outside of the buffer
zone (see \autoref{fig:joystick}), or the rotation of the marker exceeds the
@@ -487,14 +537,14 @@ A final piece of our system is the speech-based command interface. Since in our
system the acceptable commands vary between states, the speech recognition
controller must be aware of the current state of the system; therefore the
master node is responsible for this functionality. The master node runs an
auxiliary loop, in which a recognition target is sent to the speech server
node, described in \autoref{ssec:interface}. If a relevant word is detected,
the master receives the result, updates the state accordingly and then sends a
new recognition target. If a state change occurred before any speech was
detected, then the master sends a cancellation request to the speech server for
the currently running objective and, again, sends a new target.
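As a rough, non-authoritative sketch of this auxiliary loop, the master node
could drive an action client as follows; the message and method names mirror
the hypothetical server sketch from \autoref{ssec:interface}, and the
\verb|master| object is assumed to expose the current state, the per-state
vocabulary and a command handler.
\begin{verbatim}
# Hypothetical sketch of the master node's speech loop (client side).
import rospy
import actionlib
from teleop_msgs.msg import RecognizeAction, RecognizeGoal


class SpeechLoop(object):
    def __init__(self, master):
        self.master = master
        self.client = actionlib.SimpleActionClient("recognize", RecognizeAction)
        self.client.wait_for_server()

    def spin(self):
        while not rospy.is_shutdown():
            state = self.master.current_state
            goal = RecognizeGoal(vocabulary=self.master.vocabulary_for(state))
            self.client.send_goal(goal)
            # Poll until either a word is recognized or the state changes.
            while not rospy.is_shutdown() and self.master.current_state == state:
                if self.client.wait_for_result(rospy.Duration(0.1)):
                    self.master.handle_command(self.client.get_result().word)
                    break
            else:
                # The state changed before any speech was detected: cancel the
                # currently running objective; a new target follows on the
                # next pass of the outer loop.
                self.client.cancel_goal()
\end{verbatim}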
\section{Conclusion and Possible Drawbacks}
Upon completion of this project, our team successfully applied the knowledge
that we acquired during the HRS lectures and tutorials to a complex practical