An Integrated Decision Making Approach for Adaptive Shared Control of Mobility Assistance Robots

Mobility assistance robots provide support to elderly people or patients during walking. The design of a safe and intuitive assistance behavior is one of the major challenges in this context. We present an integrated approach for the context-specific, online adaptation of the assistance level of a rollator-type mobility assistance robot by gain-scheduling of low-level robot control parameters. A human-inspired decision-making model, the drift-diffusion model, is introduced as the key principle to gain-schedule parameters and thereby adapt the provided robot assistance in order to achieve a human-like assistive behavior. The mobility assistance robot is designed to provide (a) cognitive assistance to help the user follow a desired path towards a predefined destination as well as (b) sensorial assistance to avoid collisions with obstacles while allowing for an intentional approach of them. Further, the robot observes the user's long-term performance and fatigue to adapt the overall level of (c) physical assistance provided. For each type of assistance, a decision-making problem is formulated that affects different low-level control parameters. The effectiveness of the proposed approach is demonstrated in technical validation experiments. Moreover, the proposed approach is evaluated in a user study with 35 elderly persons. The obtained results indicate that the proposed gain-scheduling technique, which incorporates ideas from human decision-making models, shows high potential for application in adaptive shared control of mobility assistance robots.


Introduction
A sufficient motor performance that allows performing physical daily activities is a critical requirement for maintaining mobility and vitality, especially for elderly people and patients. Changes due to aging or disease may result in the limitation of human motor performance, sensing capabilities and cognitive functions, and thus reduce the ability to perform activities of daily living such as walking, transferring or performing personal hygiene. This again often leads to less autonomy and a decreased quality of life and self-esteem. Thus, the constantly increasing elderly population, especially in industrialized countries, has led to a strong demand for healthcare specialists and assistive devices. Mobility assistance robots (MARs) can partly cover this demand by providing physical, sensorial, and cognitive assistance [31,44,55].
How to adapt the provided assistance depending on the actual context is a major challenge in the controller design of assistive robots. An assistive robot under direct user control can have difficulties guaranteeing acceptable performance and safety due to the cognitive, sensorial and physical weaknesses of the target users, who are elderly or disabled people. On the other hand, a fully autonomous system that ignores the user's intention can result in user dissatisfaction and dangerous situations in case of human-robot disagreement. The latter can strongly affect the acceptance of such systems by their end-users (elderly people and patients) [1,3,12,14,20]. Therefore, a shared control approach that allows human and robot to share the control over resulting actions is typically employed. Shared control has been studied for different applications of human-machine interaction: for example, [2,4,28,40,53] investigated shared control for teleoperation, space and aviation systems, [35-38] explored similar principles for surgery applications, while [7,54] report on shared control for powered wheelchairs.
In the literature, most adaptive shared control mechanisms attempt to tune the level of assistance to improve metrics related to the task. Thus, an inherent difficulty lies in deciding on suitable metrics and adaptation strategies such that the overall robot assistance results in a behavior that feels natural to the user. In this context, natural refers to an intuitive cooperative control scheme that considers human and robot to collaborate as peers, meaning that the robot is allowed to make its own decisions and adjust its level of assistance online, taking current and past information on the user and environment into account. We believe that an intuitive and natural behavior can be achieved if the robot decides on the provided level of assistance in a way similar to humans. Thus, we formulate the allocation of control authority as a decision-making problem and employ human-inspired decision-making models. We use the drift-diffusion (DD) model, first proposed by [9], which describes the decision-making mechanism in humans as a process in which decisions are based on past decisions and the decision criteria are continuously adjusted in order to maximize the reward obtained throughout task execution. Following the principles of the DD model, we propose a mathematical formulation for an integrated control architecture to adapt the parameters of the shared control system of a rollator-type MAR. The proposed architecture allows intuitive adaptation of the short-term (a) cognitive assistance, which helps the user to follow a desired path towards a predefined destination, the robot's (b) sensorial assistance to avoid collisions with obstacles and to allow an intentional approach of them, and the more long-term adaptation of the robot's (c) physical assistance based on measured user performance and fatigue. We illustrate the effectiveness of the proposed architecture in experiments and evaluate its performance by conducting a user study with elderly users.
The obtained results indicate acceptable user satisfaction and show a generally high potential of the proposed adaptive shared control architecture for MARs. This paper is organized as follows: Section 2 reviews related work. Section 3 introduces the MAR and the implemented admittance control approach. The integrated adaptive shared control architecture is presented in Sect. 4, while Sect. 5 provides details on the implementation of the adaptation policies for the sensorial, cognitive and physical assistance. Finally, Sect. 6 discusses the experimental setup and reports on technical validation experiments and the performed user study with elderly users. Section 7 concludes the work.

Related Work
This section reviews literature on adaptive shared control of MARs as well as studies on decision making in humans and related models.

Adaptive Shared Control for MARs
Variable admittance control is the most common control scheme in MARs. An admittance model defines the sensitivity of the device to the applied human forces according to a specified desired mass and damping that should be rendered by the device. The behavior of the system can be modified by adapting this admittance, or by manipulating the force applied by the user. In [32,33,57], for example, the authors improve maneuverability by applying a transformation to the user force that allows the center of rotation of the mobility assistant to be modified online. In [24,26,27] the authors propose to also include a braking force in the admittance law and to achieve the desired robot behavior, such as fall prevention, gravity compensation on slopes or step avoidance, by proper activation of the brakes. Different environment-adaptive approaches, mainly based on the inclusion of additional forces/torques in the admittance model for obstacle avoidance and goal-seeking (generated based on environment information), can be found in [23-25,34,48,49,56]. These approaches can result in an active robot behavior, which can lead to dangerous situations, for example when the human releases the handles and the robot continues to move, or when the human plans to walk on a straight path while the system accidentally turns to circumvent an obstacle.
Only few works consider the history of the human performance during the interaction with the robot in the adaptation law of the admittance controller. In [64] the author proposes a cost function with forgetting factor evaluating the user's performance by combining multiple criteria like the proximity to obstacles, the deviation from the planned trajectory and human stability criteria. This allows to realize an adaptive shared control with varying force gains, which provides more authority to the human or the robot assistant depending on the accumulated human performance. Similarly, in [62] the authors propose to shift authority from the human user to the robotic system or vice versa depending on the specific context and logical rules allowing e.g. for the implementation of a no assist mode, an assist mode (human and robot share the execution of the task), a safety mode (robot acts fully autonomous) or an override mode (robot is under full control of the human). In [30,61] again a logical rule-based method is proposed that evaluates the interaction force to estimate the human intentional direction which is defined as "the direction into which a person intends to move" and then select the admittance parameters among some defined values. Different admittance parameters are studied to provide the user a comfortable feeling while walking and to avoid manoeuvres in unintended directions.
Apart from the use of variable admittance control, a few other approaches exist that address the problem of shared control. A Bayesian network approach that combines sensor information with user inputs (read by an interface with three buttons for moving forward, turning left or turning right) and that activates the respective autonomous robot behavior is proposed in [41]. An autonomous path planning and obstacle avoidance approach is discussed in [15-17,42] that lets the user decide on the robot velocity, leaving partial authority for modifying the path with the user. The author employs advanced methods for dynamic path planning (e.g. elastic bands [51]) to allow for dynamic obstacle avoidance, smooth path planning and modifications according to user inputs. In [56] three robot guiding behaviors, namely obstacle avoidance, wall following and goal seeking, are designed for an omnidirectional mobile robot by evaluating laser sensor data and by fusing these three behaviors by means of a Fuzzy Kohonen Clustering Network. In [29] the authors use the forces and moments a user applies to a walker's handle, in addition to information on the local environment and the walker's state, to derive the most likely human intention and thus the path to follow. Depending on the identified intention, the angle of the robot front wheel is set by the mobility assistant, leaving the user the freedom to decide on the velocity at which to move along the identified path. Finally, a switching controller to avoid a forward fall of the human and human-robot collisions is proposed in [13].
In summary, although a series of adaptive shared control approaches for mobility assistance robots have been studied in the literature as mentioned above, to the best of the authors' knowledge none of the aforementioned approaches used human-inspired decision-making models to define adaptation policies for the provided level of assistance, which is expected to result in a natural and safe human-robot interaction. Thus, for the first time we study human decision-making models as a mechanism to gain-schedule low-level control parameters and thereby vary the level of assistance provided, and we evaluate the effectiveness of this approach for real end-users.

Human Decision Making Models
In cognitive science, human decision making has been widely studied in so called two-alternative forced-choice (TAFC) tasks. TAFC tasks require a human to make a sequence of choices between two predefined alternatives. After every choice, the subject is given a reward based on the current choice and the previous N choices. The subject's goal is to maximize the accumulated reward over a sequence of choices. TAFC tasks were used to study optimal decision strategies, see [9,47], or sub-optimal strategies, see [21,22]. In human subject experiments, it was observed that for a majority of human subjects working with particular reward structures, decisions are centered around particular points, termed matching points, where the reward return curves for the two options cross.
Mathematical investigations focusing on potential underlying mechanisms of human decision making have involved, among others, Markov decision processes (MDP) and drift-diffusion (DD) models. The authors of [58] consider TAFC tasks and a DD model together as a Markov process and show that, under certain assumptions, the DD model analytically exhibits matching behavior as observed in human subjects. In [5], convergence to a matching point is proven for a particular task called the matching-shoulders-type task using the DD model with a time decay extension. In [47,59], a combination of a DD model and MDP is used to address empirical and analytical effects of social context (decisions and rewards of other people) on decision making.
Although several extensions to the concept of decision making based on the DD model in TAFC tasks exist, see for example [50,63], its application to assistive robotics has not received much attention. In this work we extend our previous work [8] and explore the applicability of the DD model to MARs supporting elderly people and patients.

System Description
Our rollator-type MAR consists of rear and front wheels, a chassis, supportive handle bars and a range of sensors to measure environment and human data, see Fig. 1. The prototype has two actuated rear wheels and two front castors and is equipped with two 6-DoF JR3 force-torque sensors at the handles, a Hokuyo laser range finder at the front to monitor the environment, one at the back to observe human gait patterns and two Kinect sensors to monitor the human posture. The system is further equipped with an inertial measurement unit (IMU), an XSens MTi-G-700 GPS/INS, in order to estimate the robot angular acceleration, and with two 2-DoF arms to support sit-to-stand transfers. The rollator is of active and non-holonomic type, meaning that translational motion of the robot along the heading direction as well as rotational motion about its center of rotation are possible, while motions in lateral direction are restricted. With reference to Fig. 2, the non-holonomic constraint is given by
$$\dot{x}_r \sin\theta_r - \dot{y}_r \cos\theta_r = 0$$
and therefore the kinematic model can be written as
$$\dot{x}_r = v \cos\theta_r, \qquad \dot{y}_r = v \sin\theta_r, \qquad \dot{\theta}_r = \omega,$$
where $v$ and $\omega$ are the two available control inputs for the linear velocity and the angular velocity around the vertical axis, and $q = [x_r, y_r, \theta_r]^T$ are the states of the robot.
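As an illustration of the kinematic model described above, the following minimal sketch integrates a standard unicycle model and checks that the lateral velocity vanishes. The unicycle form, the explicit-Euler integration, the function names and the numerical values are assumptions made for illustration and are not taken from the robot implementation.

```python
import math

def unicycle_step(q, v, omega, dt):
    """One explicit-Euler step of the assumed unicycle model.

    q = (x_r, y_r, theta_r); v and omega are the linear and angular velocity inputs.
    """
    x, y, theta = q
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)

def lateral_velocity(q, q_next, dt):
    """Velocity component perpendicular to the heading of q (non-holonomic constraint)."""
    x, y, theta = q
    vx = (q_next[0] - x) / dt
    vy = (q_next[1] - y) / dt
    return vx * math.sin(theta) - vy * math.cos(theta)

# Hypothetical state and inputs, chosen only for illustration.
q = (0.0, 0.0, math.pi / 4)
q_next = unicycle_step(q, v=0.5, omega=0.2, dt=0.01)
print(q_next, lateral_velocity(q, q_next, 0.01))
```

In this sketch the lateral velocity is zero up to rounding error, which reflects that the platform cannot move sideways regardless of the commanded inputs.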

Admittance Control
Two force/torque sensors mounted at the handles of the rollator are used to drive the differential-drive MAR. Force components along and around the heading direction are used for motion control. An admittance controller is implemented, which allows the desired dynamic behavior of the system with respect to the user's applied force to be designed by selecting proper admittance parameters. The admittance controller emulates a dynamic system and gives the user the feeling of interacting with the system specified by the admittance model. A mass-damper system for the linear and angular motion is considered (2), where $M_d$ and $D_d$ are the desired inertia and damping matrices, respectively, and $F_h = [f_{hx}, f_{hy}, \tau_h]$ are the driving forces applied by the user. Therefore, the desired reference velocity for the robot is specified by the desired admittance parameters and is based on the human input in terms of applied force. The robot reference velocity is then tracked by a low-level controller.
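The effect of the admittance parameters on the reference velocity can be sketched for a single axis as follows. The scalar mass-damper form, the explicit-Euler discretization and all numerical values are assumptions chosen for illustration; they are not the parameters of the actual platform.

```python
def admittance_step(vel, force, mass, damping, dt):
    """Explicit-Euler step of the assumed model: mass * dvel/dt + damping * vel = force."""
    acc = (force - damping * vel) / mass
    return vel + acc * dt

def simulate(force, mass, damping, dt=0.01, steps=2000):
    """Reference velocity reached after applying a constant force from rest."""
    vel = 0.0
    for _ in range(steps):
        vel = admittance_step(vel, force, mass, damping, dt)
    return vel

# Hypothetical parameters: same user force and mass, two damping values.
v_low_damping = simulate(force=10.0, mass=15.0, damping=10.0)
v_high_damping = simulate(force=10.0, mass=15.0, damping=40.0)
print(v_low_damping, v_high_damping)
```

For a constant force the reference velocity settles near force/damping, so a larger damping renders a platform that feels heavier and moves more slowly for the same user input.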

Shared Control Architecture
We propose an integrated architecture that allows to adapt the robot's short-term cognitive and sensorial assistance as well as the long-term physical assistance provided. The cognitive assistance provides required support to the user in path following situations guiding the user from an initial to a desired destination. The sensorial assistance reduces the risk of the robot colliding with obstacles and allows for the intentional approach of obstacles. The physical assistance tunes the robot contribution according to the long-term user performance, which may be affected due to fatigue. The latter is particularly important since considerable changes in performance are observed due to user fatigue after continuous activity, which may render performing daily activities at a desired level of performance difficult , see [10,52]. With reference to Fig. 3, we propose an integrated adaptive shared control framework for MARs. Three decision-maker blocks for sensorial, cognitive and physical assistance are responsible for online adapting the parameters of the admittance controller in order to achieve the desired system behavior. The Decision on cognitive assistance block evaluates the planned path towards the goal which is generated by the path planner block, the human navigational intention in form of force and torque applied to the robot handles as well as the actual human performance. The Decision on sensorial assistance block uses human input and the information provided by the Environment state block, which provides information on the position of obstacles around the robot. Finally, the Decision on physical assistance block processes all inputs and adjusts the level of active support provided accordingly.
The concept of the robot assistance is implemented by manipulating the admittance control parameters. We decompose and extend the admittance controller (2) into a translational and a rotational part, where the parameters $m_x$, $d_x$ and $f_{hx}$ are the mass, damping and human force components along the heading direction of the robot (aligned with the unit vector $x$ of the robot in Fig. 2). The variables $I_\theta$, $d_\theta$ and $\tau_h$ are the inertia, damping and human torque components. The parameters $d_x$, $d_\theta$ and $k_2$ are tuned to provide the aforementioned sensorial, cognitive and physical assistance. Increasing the value of $d_x$ decelerates the robot motion in heading direction; since the robot is non-holonomic, this effect can be used for the robot's sensorial assistance. Manipulating $d_\theta$ influences the resistance felt when trying to change the robot orientation and can thus help to prevent deviations from the desired path towards the destination. Finally, an increase of $k_2$ increases the robot's active contribution to the control of its orientation. This effect is used for varying the physical assistance provided by the robot. The adaptation of the parameters $d_x$ and $d_\theta$ results in a passive and thus intrinsically safe support strategy. Active support is added by tuning the parameter $k_2$ whenever the passive support strategy alone cannot provide the desired system behavior, e.g. when the user is exhausted and can hardly guide the robot towards his/her desired destination. The decision-making systems that decide on the specific tuning of these parameters are discussed in the following sections.

Decision Making for MARs
The individual decision making policies that decide on the specific level of robot assistance provided are formulated based on the DD model to achieve an intuitive online adaptation of the robot assistance. In the following sections, we first introduce the DD model, and then detail its application for designing an adaptive robot assistance for a MAR.

Decision Making Principle Based on DD Model
In a two-alternative forced-choice (TAFC) task, a human is asked to repeatedly choose between two alternatives. Each choice is associated with a specific reward. The human, not knowing the underlying reward structures, typically explores the options and gradually optimizes the overall intake. Different reward structures have been proposed in the literature to study human decision-making behavior. In this paper, we mainly focus on the matching shoulder reward structure. The matching shoulder structure consists of two reward functions with inverse relationships, as encountered for example whenever two goals are conflicting and a decision has to be taken to improve either the one or the other. The specific form of the two crossing reward functions is a design factor and allows different kinds of behavior to be programmed, favouring one goal over the other in some situations and the other goal in other situations. Thus, in general the matching shoulder structure consists of two intersecting curves that diminish with increasing/decreasing performance. Consider the human performance measures $p_A$ and $p_B$ associated with the choices A and B and the associated rewards $r_A$ and $r_B$. Further, and only assumed in the context of this manuscript, the general relationship between a reward $r$ and a performance measure $p$ is given by (6), where $p_{\mathrm{offset},z}$, $r_{0,z}$, $k_z$ and $n_z$ are the user- and task-defined tunable variables for each specific reward structure ($z \in \{A, B\}$).
The drift-diffusion (DD) model has been shown to implement the optimal mechanism for TAFC decision-making tasks and accounts for an impressive amount of behavioral and neuroscientific data. The characteristic of the DD model can be formulated as a soft-max model, first introduced by [9] to describe human decision making in TAFC tasks. The soft-max model as a main component in human decision-making processes was also shown by [45] and formulated using a sigmoidal function
$$P_A(t+1) = \frac{1}{1 + e^{-\mu\,(w_A(t) - w_B(t))}} \qquad (7)$$
According to this model, the probability of the human preferring choice A at time $t+1$ is $P_A(t+1)$, which is computed using (7), where $w_A(t)$ and $w_B(t)$ are the accumulated evidences for choosing option A or B, respectively. The parameter $\mu$ is used to manipulate the slope of the sigmoid function, and therefore the level of certainty in making a decision.
The values $w_A(t)$ and $w_B(t)$ are updated with the help of a learning rule. The authors of [46] have proposed a discrete-time linear update rule. Considering the decision set $z \in \{A, B\}$ at each time $t$, the update is given by (8), where $z$ is the decision just made, $r_z(t)$ the obtained reward for $z$, $\lambda \in [0, 1]$ a forgetting factor and $T$ the sample time of the system. We consider the same initial value for both weightings $w_z$, which implies no initial preference for either of the two choices.
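The interplay between the sigmoidal choice probability and the evidence update can be sketched as follows. Both the logistic form of the probability and the specific update (exponential forgetting applied to the chosen option only) are assumptions made for this illustration; the exact update rule is the one referenced as (8), and the values of $\mu$, $\lambda$ and the rewards below are hypothetical.

```python
import math

def choice_probability(w_a, w_b, mu):
    """Assumed soft-max (sigmoid) probability of preferring choice A."""
    return 1.0 / (1.0 + math.exp(-mu * (w_a - w_b)))

def update_evidence(w, chosen, reward, lam):
    """Assumed linear update: blend the chosen option's evidence with its reward.

    lam in [0, 1] acts as forgetting factor; the other option is left unchanged.
    """
    w = dict(w)
    w[chosen] = (1.0 - lam) * w[chosen] + lam * reward
    return w

# Equal initial evidence implies no initial preference.
w = {"A": 0.5, "B": 0.5}
p0 = choice_probability(w["A"], w["B"], mu=5.0)

# Option A is chosen repeatedly and yields a high reward.
for _ in range(20):
    w = update_evidence(w, "A", reward=1.0, lam=0.2)
p_a = choice_probability(w["A"], w["B"], mu=5.0)
print(p0, p_a)
```

A larger $\mu$ makes the sigmoid steeper, so the same difference in accumulated evidence leads to a more decisive preference.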
In the following sections we employ the DD model as a key element for the gain-scheduling of low-level control parameters, resulting in varying levels of physical, sensorial and cognitive assistance. To this end, the problem of fulfilling two conflicting goals is formulated for each type of assistance studied. Then, associated performance metrics are defined and the corresponding matching shoulder reward structures are introduced. Next, the level of the provided assistance is decided upon by evaluating the DD model (7), which determines which of the two conflicting goals should be prioritized according to the accumulated evidence in order to improve the overall intake. Finally, a linear homotopy is applied for gain-scheduling the respective low-level control parameter $c$ between a predefined minimum and maximum value, based on the determined probabilities for deciding on either of the two choices.
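The final gain-scheduling step can be sketched as a linear interpolation between the two parameter bounds. The direction of the mapping (probability one selects the minimum value) mirrors the homotopy stated for $d_x$ in (19); the function name and the numerical bounds are hypothetical.

```python
def schedule_gain(p_a, c_min, c_max):
    """Linear homotopy between c_min (p_a = 1) and c_max (p_a = 0)."""
    if not 0.0 <= p_a <= 1.0:
        raise ValueError("p_a must be a probability in [0, 1]")
    return p_a * c_min + (1.0 - p_a) * c_max

# Hypothetical bounds for a low-level control parameter c.
for p in (0.0, 0.25, 1.0):
    print(p, schedule_gain(p, c_min=2.0, c_max=10.0))
```

Because the interpolation is linear in the probability, the scheduled parameter always stays within the predefined bounds and varies smoothly as the accumulated evidence shifts.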

Decision on Cognitive Assistance
In this section, we formulate the problem of providing adaptive, passive cognitive assistance as a human decision-making problem. We employ the DD model for gain-scheduling of the low-level control parameter $d_\theta$ to adjust the level of the provided robot cognitive assistance online.

Problem Formulation
An important functionality of the MAR is guiding the user from an initial to a target destination, especially for users who are cognitively impaired and thus have difficulties in locating themselves and finding their way. An ideal robot assistance makes the user feel comfortable by giving him/her enough control over the platform, while the user is safely guided towards the desired destination. In particular, we aim at improving human-robot agreement by giving the user enough freedom in controlling the platform as long as the deviation from the desired path stays within acceptable limits, and at shifting priority towards improving task performance by reducing the human control authority when the task deviation is slowly approaching its allowed maximum but the user does not react properly to prevent this. This trade-off is formulated as a decision-making problem. The assistance is realized by a passive guidance that prevents movements in directions perpendicular to the desired path while giving the user the freedom to control the robot when moving along the reference path. Consider a path-following task from an initial to a final location where the desired path is known to the robot assistant. The human forces $f_h = [f_{hx}, f_{hy}]^T$, represented by the linear components (first two entries) of $F_h$ in (2), are used to control the linear robot motion in the robot reference frame. They can be split into two main components, the human force along the reference path ($f_\parallel$) and perpendicular to it ($f_\perp$). With reference to Fig. 2, the magnitudes of these forces depend on the orientation error $\theta_e = \theta_{ref} - \theta_r$, where $\theta_{ref}$ is the desired angle between the reference path and the global x-axis. We believe that proper control of the robot orientation error is sufficient for the purpose of providing cognitive assistance.
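The split of the human force into components along and perpendicular to the reference path can be sketched as a rotation by the orientation error. The projection below is an assumption based on the geometry described above (force expressed in the robot frame, path direction at angle $\theta_e$ relative to the heading); the function name and the example values are hypothetical.

```python
import math

def split_force(f_hx, f_hy, theta_ref, theta_r):
    """Project the human force (robot frame) onto the path direction and its normal."""
    theta_e = theta_ref - theta_r
    f_par = f_hx * math.cos(theta_e) + f_hy * math.sin(theta_e)
    f_perp = -f_hx * math.sin(theta_e) + f_hy * math.cos(theta_e)
    return f_par, f_perp

# Robot aligned with the path: the pushing force acts entirely along the path.
print(split_force(10.0, 0.0, theta_ref=0.0, theta_r=0.0))

# Robot heading 30 degrees off the path: part of the push acts across the path.
print(split_force(10.0, 0.0, theta_ref=0.0, theta_r=math.radians(30.0)))
```

The decomposition only redistributes the force between the two directions, so its magnitude is preserved.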
To ensure a safe robot behavior, we propose a passive assistance that adapts the damping parameter $d_\theta$ and thus indirectly manipulates the robot angular velocity and orientation error, while giving the user the freedom to move along the path. This reduces the problem to the adaptation of only one parameter, namely the damping parameter $d_\theta$. The adaptation law for this parameter is formulated as a decision-making problem using the DD model.

Performance Measures
Task performance is measured using the rotational and translational tracking error with respect to the desired path over an observation window of $N_C$ samples, where the subscript $i$ refers to the value of the variable at sample $i$, $e$ is the robot position error, $p_{T,C}$ is the normalized task performance computed over $N_C$ samples, and $k_{C,e}$ and $k_{C,\theta_e}$ are two user-defined factors distributing the weighting between translation and orientation. The maximum value used for normalization is initialized with the maximum acceptable error with respect to the task and is updated if a larger value is observed during the interaction process. Disagreement is assumed to occur when the user and the robot assistant apply forces in opposite directions, leading to so-called internal forces. These internal forces provide important information on haptic interaction, see e.g. [18]. Minimizing disagreement can enhance the quality of human-robot interaction, as the robot then behaves according to human expectations. Considering the task of providing cognitive assistance described in the previous section, we define an internal moment $\tau_{int}$, where $l_f$ is a variable representing the Euclidean distance between the robot position and the reference point on the desired path. The value of $\tau_{robot}$ can be computed by any orientation controller, similar to the one proposed for $\tau_{assis}$ in (26); it is only used as a virtual input to calculate a potential human-robot disagreement and is not applied to the real robot, as we aim for a fully passive cognitive assistance. The disagreement metric is then computed over $N_C$ samples and normalized to define the agreement performance $p_{A,C}$. The final performance set to be considered for each decision is $p_C \in \{p_{T,C}, p_{A,C}\}$.

Reward Structure and Decision Making
Following ideas of the DD model in TAFC tasks, a reward function is associated with each performance measure. For the considered decision making problem, we propose a matching shoulder structure with two intersecting reward functions as depicted in Fig. 4 and both functions expressed using (6).
The proposed reward structure is designed to fit the requirements introduced in Sect. 5.2.1. The assistant faces a trade-off between providing low assistance to improve human-robot agreement and providing high assistance to improve task performance. When the user is following the desired path, high agreement (agreement measure at its maximum) and high task performance (task performance measure at its minimum) are typically observed and thus the maximum corresponding rewards are associated with both choices. The maximum reward associated with human-robot agreement is designed to be larger than the maximum reward for improving task performance. This implies a preference of the assistant for improving agreement over task performance whenever the user's deviation from the reference path is acceptable. When both performances are decreasing, the reward for task performance decreases at a slower rate than the one for human-robot agreement. This implies a change of the preference from improving agreement to improving task performance. On the other hand, when rewards are improving again, even a small increase of human-robot agreement results in a quick change of the preference towards improving human-robot agreement because of the higher rate of change in the reward associated with it (except in phases of very low task and interaction performance, where task performance dominates).
Fig. 4 Reward structure for adapting the cognitive assistance. The blue function is the reward $r_{T,C}$ associated with the task performance measure $p_{T,C}$ and the red function is the reward $r_{A,C}$ associated with the agreement performance measure $p_{A,C}$
The probability to assist the human to improve human-robot agreement at time $t+1$ is calculated using the DD model represented by (7), considering $P_A = P_{A,C}$, $w_A = w_{A,C}$, $w_B = w_{T,C}$ and $\mu = \mu_C$. The values of $w_{A,C}$ and $w_{T,C}$ are updated according to (8). Finally, the level of the provided cognitive assistance is adapted with the help of a linear homotopy defined as follows
$$d_\theta(t) = P_{A,C}(t+1)\, d_{\theta,\min} + \bigl(1 - P_{A,C}(t+1)\bigr)\, d_{\theta,\max} \qquad (15)$$
where $d_{\theta,\min}$ and $d_{\theta,\max}$ are the minimum and maximum considered values of the damping factor.

Decision on Sensorial Assistance
The formulation of the sensorial assistance problem and the proposed adaptation policy for gain-scheduling of the low-level control parameter $d_x$ based on the described decision-making approach are discussed in the following sections.

Problem Formulation
Although a collision-free path is typically planned for robot assistants, reducing the risk of colliding with dynamic obstacles that are unknown at the time of planning has to be considered in the design of the robot control architecture. Further, an intentional approach to objects (detected as obstacles by the robot) can be desirable, e.g., when approaching a table to grasp an object. This requires the robot to determine the user's intention and to decide on a proper support taking the specific context into account. Specifically, we aim at improving task performance in terms of collision avoidance by reducing the human control authority, while allowing the intentional approach of objects by shifting the control authority to the human if large human-robot disagreement is detected. This is formulated as a decision-making problem.
Since the most critical collisions occur between obstacles and the front part of the robot, we aim for collision avoidance by adapting the robot heading velocity towards obstacles. Considering the distance between the robot and a detected obstacle, virtual forces/moments can be generated based on an artificial potential field, see [39]. We consider an artificial potential field $U(q)$, where $d_{obs}$ is defined as the shortest distance between the nearest obstacle in front of the robot and a representative point on the robot, see Figs. 5, 6, $d_{obs,max}$ is the radius of the area in which the potential field becomes active and $k$ is a positive constant gain. Therefore, the value of $U(q)$ increases whenever the robot approaches an obstacle, and its value is zero if $d_{obs}(q)$ is larger than $d_{obs,max}$.
[Figure: Reward structure for adapting the sensorial assistance. The blue function is the reward $r_{T,S}$ associated with the task performance measure $p_{T,S}$ and the red function is the reward $r_{A,S}$ associated with the agreement performance measure $p_{A,S}$]
Artificial forces applied by the robot are defined as $F(q) = -\nabla U(q)$, where $\nabla U$ is the gradient vector of $U$. $F(q)$ is then transformed to the robot frame to determine the virtual forces and moments $F_{obs} = [f_{obs}, \tau_{obs}]$ applied by the obstacle to the center of rotation of the MAR.
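As an illustration of how such virtual forces can be obtained, the following minimal sketch assumes the classic repulsive potential $U = \tfrac{1}{2} k (1/d_{obs} - 1/d_{obs,max})^2$ for $d_{obs} \le d_{obs,max}$ and $U = 0$ otherwise. This particular form, the function names and the point-obstacle simplification are assumptions made for illustration; only the stated properties (the potential grows when approaching an obstacle and vanishes beyond $d_{obs,max}$) and the relation $F = -\nabla U$ are taken from the text. The moment $\tau_{obs}$ is omitted.

```python
import math


def repulsive_potential(d_obs: float, d_obs_max: float, k: float) -> float:
    """Assumed repulsive potential: zero beyond d_obs_max, growing as d_obs shrinks."""
    if d_obs <= 0.0:
        raise ValueError("distance must be positive")
    if d_obs >= d_obs_max:
        return 0.0
    return 0.5 * k * (1.0 / d_obs - 1.0 / d_obs_max) ** 2


def repulsive_force(robot_xy, obstacle_xy, d_obs_max: float, k: float):
    """F = -grad U in the world frame; the force points away from the obstacle."""
    dx = robot_xy[0] - obstacle_xy[0]
    dy = robot_xy[1] - obstacle_xy[1]
    d = math.hypot(dx, dy)
    if d <= 0.0:
        raise ValueError("robot and obstacle coincide")
    if d >= d_obs_max:
        return (0.0, 0.0)
    # dU/dd = -k (1/d - 1/d_max) / d^2 and grad(d) = (dx, dy) / d
    magnitude = k * (1.0 / d - 1.0 / d_obs_max) / d ** 2
    return (magnitude * dx / d, magnitude * dy / d)


def to_robot_frame(force_xy, heading: float):
    """Rotate a world-frame force into the robot frame (heading in rad)."""
    c, s = math.cos(heading), math.sin(heading)
    fx, fy = force_xy
    return (c * fx + s * fy, -s * fx + c * fy)
```

In a shared control setting such as the one described here, the resulting force would only be evaluated (for example to quantify disagreement with the user's input) rather than applied to drive the platform.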
In a fully autonomous system, the forces $F_{obs}$ are typically used to actively drive the MAR and avoid collisions with obstacles. However, in a shared control system, where the robot is (at least partially) under human control and passive support is intended, using $F_{obs}$ directly can result in active and unsafe behavior. We therefore only evaluate it and passively tune the robot heading velocity $v$. Here, this problem is simplified to the decision on the adaptation of $d_x$, which allows decelerating the robot whenever an obstacle is detected.

Performance Measures
Considering the task of collision avoidance, task performance is defined according to the distance to the nearest obstacle in front of the robot over an observation window of $N_S$ samples [see (16)], where $d_{obs,i}$ is the respective vector for sample $i$.

Similar to Sect. 5.2.2, internal forces are considered to provide important information on the quality of interaction during collision avoidance. The internal forces $f_{int}$ represent the level of disagreement between the force applied by the human ($f_h$) and the repulsive force generated by the detected obstacle ($f_{obs}$). Human-robot agreement $A_S$ is determined from them over $N_S$ samples and normalized, where $f_{int,i}$ refers to sample $i$. Thus, the set of performances to be considered for the sensorial assistance is $p_S \in [p_{T,S}, p_{A,S}]$.

Again, the DD model is adopted for decision making. The probability to improve human-robot agreement, $P_{A,S}$, is calculated by (7), where $w_A = w_{A,S}$ and $w_B = w_{T,S}$ are the evidences for choosing to improve human-robot agreement or task performance (as defined in Sect. 5.3.2). The evidences are calculated using (8), considering the set of decisions $z \in [T_S, A_S]$ for each time $t$. Finally, the level of the robot sensorial assistance is modified by means of the following homotopy for the damping parameter $d_x$:

$$d_x(t) = P_{A,S}(t+1)\, d_{x,\min} + \bigl(1 - P_{A,S}(t+1)\bigr)\, d_{x,\max} \tag{19}$$

where $d_{x,\min}$ and $d_{x,\max}$ are the minimum and maximum considered values of the damping factor.
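The homotopy in (19) is a convex combination of the two damping limits, weighted by the preference for human-robot agreement. The following minimal sketch illustrates it. The blending function follows (19) directly, whereas the logistic form used for the soft-max of (7), the parameter name `mu` and the function names are assumptions made for illustration, since (7) is defined in an earlier section.

```python
import math


def preference_probability(w_a: float, w_b: float, mu: float) -> float:
    """Assumed two-alternative soft-max: probability of preferring option A over B.

    A small mu gives a smooth transition, a large mu an almost binary decision.
    """
    x = mu * (w_a - w_b)
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # numerically safe branch for large negative x
    return e / (1.0 + e)


def damping_from_preference(p_agree: float, d_min: float, d_max: float) -> float:
    """Homotopy of (19): p_agree = 1 yields d_min, p_agree = 0 yields d_max."""
    if not 0.0 <= p_agree <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return p_agree * d_min + (1.0 - p_agree) * d_max
```

With equal evidences the assumed soft-max returns 0.5 and the damping lies midway between its limits; as the evidence for human-robot agreement dominates, the damping tends to $d_{x,\min}$ and the user regains authority.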

Reward Structure and Decision Making
We believe that the proposed reward structure satisfies the objectives for providing sensorial assistance as introduced in Sect. 5.3.1. When no obstacle is detected in front of the robot, the task performance measure is at its minimum [see (16)] and therefore a high reward is associated with it. Likewise, the absence of obstacles implies no disagreement between human and robot (based on the definition of the performance measures), which results in a large value of the human-robot agreement measure and therefore a high reward. The maximum reward for human-robot agreement was chosen to be slightly larger than the maximum reward for task performance, which implies a preference to improve human-robot agreement whenever no risk of collision is detected. In other words, the value of $P_{A,S}$ is close to one because the evidence $\Delta w_S = w_{A,S} - w_{T,S}$ is at its maximum according to the rewards defined.
As soon as an obstacle is detected, the reward for task performance decreases at a slower rate than the reward for human-robot agreement. This allows a fast change from preferring human-robot agreement to preferring task performance: the value of $\Delta w_S$ decreases, which results in an increase of the level of assistance. Finally, if the human insists on continuing the forward motion despite the resistance provided by the robot (which can imply the user's interest in approaching the obstacle), the task performance measure tends to its maximum value (corresponding to the lowest reward), while the human-robot agreement measure tends to its lowest value (also corresponding to a low reward). In this case, the overall preference turns back towards improving human-robot agreement, since its minimum reward is larger than the minimum reward for task performance. This results in an increase of $\Delta w_S$, allowing the user to approach the obstacle. Approaching the obstacle nevertheless carries a very low risk of collision, since the robot velocity has been reduced significantly and the human remains under partial robot assistance.

Decision on Physical Assistance
Individualization of the robot support is considered by adapting the physical robot assistance by gain-scheduling the parameter $k_2$, as detailed in the following sections.

Problem Formulation
The demand for assistance of elderly persons and patients may increase with continuing activity due to fatigue. An assistance strategy that adapts to the current physiological state can meet this demand and can thus result in a higher user satisfaction during interaction with the robot. This requires that the MAR not only evaluates the user performance with respect to the desired task, but also estimates the physiological state of the user in order to decide on the level of the provided robot assistance. Specifically, we aim at shifting the control authority to the robot if task performance is low and human fatigue is high, and at gradually returning authority to the user when task performance improves and human fatigue decreases. Again, this is formulated as a decision-making problem.
We propose an active support by applying an assistive torque to the admittance model. Considering (4) and (5), the input torque can be manipulated by a proper selection of the parameter $k_2$.

Performance Measures
In general, two different types of human fatigue are studied in the literature: mental and physical. Physical fatigue, which we focus on in this paper, represents the maximum level of exhaustion at which the human cannot exert any more work.² In the literature, medical indicators of human fatigue are mostly discussed based on heart rate or the total performed work. Since the former requires an external monitoring system, e.g. a heart rate sensor, we mainly focus on the latter. Physical fatigue is directly related to the total power consumed in the human muscles and therefore to the total work performed, as presented by [11]. The total work performed by a person during walking is related to the user's walking velocity and the total weight of the user. The authors in [6] propose a formula that relates the consumed calories per kilogram per hour, $l_{cal}$, to the user's velocity $v_h$ during walking. We use this formula to formulate the level of human fatigue during walking. Considering a person with total weight $M$ pushing a MAR with apparent mass $m_x$ and moving with linear velocity $v_h = v(t)$ at time $t$, the normalized level of human fatigue $F$ is estimated, where $\Delta t$ denotes the sampling time of the system and $l_{cal,fat}$ the maximum possible consumed calories resulting in human fatigue.³ A performance measure correlating with the estimated human fatigue is defined accordingly.

The overall task performance is defined based on the tracking error of the desired path as well as the distance to the nearest obstacle in front of the robot, where $\delta_i$ is defined as a measure of total task performance at sample $i$, $\delta_{max}$ is the maximum value of $\delta$, and $p_{T,O}$ is the observed task performance over the observation window of length $N_O$. We consider a larger value for $N_O$ than for $N_C$ and $N_S$ (defined in Sects. 5.2.2 and 5.3.2, respectively) for a better estimation of the more long-term changes in human task performance rather than specific reactions to a given situation. The values of $k_{O,\theta_e}$, $k_{O,e}$ and $k_{O,obs}$ are weighting factors, which can be tuned according to the importance of following the path or avoiding obstacles.

² Please note that the definitions of mental and physical fatigue are closely related and it is commonly known that physical fatigue impairs mental fatigue. However, [43] has only recently shown that mental fatigue can also imply physical fatigue. Therefore, we just consider the effect of physical fatigue since this is the most probable cause of fatigue in a mobility assistance scenario.

Reward Structure and Decision Making
The reward structure for the two performance measures is shown in Fig. 7.
The linear structure has been chosen as there is no specific preference for improving the overall task performance or increasing the support because of human fatigue. This structure allows the decision to change gradually whenever changes in human fatigue or performance are detected.
The level of the physical assistance is finally tuned according to the DD model. The estimated level of the robot physical assistance $P_O$ is computed using (7) with $w_A = w_{F,O}$ and $w_B = w_{T,O}$. The evidences are computed using (8), assuming the decision set $z \in [F_O, T_O]$ at each time $t$. Thus, the level of the robot overall assistance is adapted by tuning the weighting factor $k_2$ presented in (4) between $k_{2,min}$ and $k_{2,max}$, the minimum and maximum considered values for $k_2$. We propose a very smooth soft-max function by considering a small value for the $\mu$ parameter in (7). This allows a gradual shift of the preference between the human and the assistant controlling the robot steering velocity.
Finally, to recover the orientation error, a robot assistive moment can be generated using a control law in which $K_{p1}$ and $K_{p2}$ are user-specific gains.

Experimental Results
This section illustrates the effectiveness of the proposed approach, first by means of experiments aiming for a technical validation with a healthy user interacting with the platform and then by means of a user study involving 35 elderly persons.

Technical Validation
In the following sections we technically validate the proposed decision making algorithm realizing adaptive shared control in MARs.

Experimental Setup
The robotic platform shown in Fig. 1 was used for validation of the presented adaptive shared control approach. The controller of the robot mobile base was implemented using MATLAB/Simulink Real-Time Workshop. The robot velocity was controlled using a low-level high-gain PD controller.
The control loop was set to run at a sampling time of T = 1 ms. The robot handles were not actuated and were kept at a constant height throughout the experiments. A static map of the experimental room was built in the Robot Operating System (ROS) using the OPENSLAM Gmapping library package based on captured laser scanner, IMU and robot odometry data. A path planner from the move_base package in ROS was used, which provides a fast interpolated path planning function to create plans for the mobile base.
For determining the closest point, we used a planner that assumes a circular robot and operates on a cost map, which produces a global path from a starting robot pose to an end pose in a grid. Then, an algorithm was used that searches iteratively on the global path to find the closest points to the current robot position. To solve ambiguity in case two or more closest points are found, we implemented a look-ahead checker, which processes past closest points and returns the next closest point which is located ahead of the robot and has the maximum orientation alignment with the current robot pose.
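The closest-point search with look-ahead disambiguation described above can be sketched as follows. This is one possible reading of the procedure, not the implementation used on the platform: the function name, the tie tolerance `tol`, the path representation as `(x, y, yaw)` tuples and the use of the previously selected index as the search start are all assumptions.

```python
import math


def closest_path_index(path, pose, last_index=0, tol=1e-6):
    """Select the path point closest to the robot pose.

    path: list of (x, y, yaw) tuples; pose: (x, y, yaw) of the robot.
    The search starts at last_index, i.e. at the previously selected point.
    """
    if not 0 <= last_index < len(path):
        raise ValueError("last_index outside the path")
    x, y, yaw = pose
    dists = {i: math.hypot(path[i][0] - x, path[i][1] - y)
             for i in range(last_index, len(path))}
    d_min = min(dists.values())
    tied = [i for i, d in dists.items() if d - d_min <= tol]
    if len(tied) == 1:
        return tied[0]

    def ahead(i):
        # A non-negative projection onto the heading means the point lies ahead.
        return ((path[i][0] - x) * math.cos(yaw)
                + (path[i][1] - y) * math.sin(yaw)) >= 0.0

    def alignment(i):
        # Cosine of the heading difference: 1 means perfectly aligned.
        return math.cos(path[i][2] - yaw)

    # Prefer points ahead of the robot; fall back to all tied points otherwise.
    preferred = [i for i in tied if ahead(i)] or tied
    return max(preferred, key=alignment)
```

Restricting the search to indices at or beyond the last selected point prevents the reference from jumping backwards when the path passes close to itself.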
Robot localization was performed using an Adaptive Monte Carlo Localization (amcl) approach, which was implemented in ROS as part of the nav_stack package and provides an estimate of the robot's pose against a known map. It continuously registers the robot pose on the map and corrects possible odometry errors.
An obstacle map based on the front laser scanner was constructed in order to provide information about the closest obstacle in defined zones around the robot. We split the area in front of the robot into 5 zones and computed the distance from the robot to the nearest obstacle in each zone, see Fig. 8 for a snapshot.
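A zone-wise nearest-obstacle computation of this kind can be sketched as follows. The zone geometry is an assumption here: the sketch uses five equal angular sectors over a 180° frontal field of view, whereas the actual zones are those drawn in Fig. 8. The function and parameter names are likewise hypothetical.

```python
import math


def nearest_obstacle_per_zone(angles, ranges, n_zones=5, fov=math.pi):
    """Return the smallest valid range in each of n_zones equal angular sectors.

    angles are in rad with 0 pointing straight ahead; beams outside
    [-fov/2, fov/2] and invalid readings (NaN, inf, non-positive) are ignored.
    Zones without any valid reading report infinity.
    """
    nearest = [float("inf")] * n_zones
    half = fov / 2.0
    width = fov / n_zones
    for angle, r in zip(angles, ranges):
        if not -half <= angle <= half:
            continue
        if not math.isfinite(r) or r <= 0.0:
            continue
        zone = min(int((angle + half) / width), n_zones - 1)
        nearest[zone] = min(nearest[zone], r)
    return nearest
```

The minimum over all zones then gives a single frontal obstacle distance, while the per-zone values indicate on which side of the robot the obstacle lies.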

Test Scenarios
The presented approach was tested using two scenarios. In the first scenario the integration of the cognitive, sensorial and physical assistance was tested, while in the second scenario we specifically investigated the performance of the realized sensorial assistance and its ability to avoid obstacles or allow their intentional approach.
Scenario I The user was asked to define a desired destination on the map of the experimental area shown on the screen mounted on the robot frame. According to the user's choice, a reference path to the final destination was generated automatically. The user was asked to follow the path while trying to deviate from it at least once. Halfway, another person was asked to pass in front of the robot, simulating a dynamic obstacle. The user was instructed not to pay attention to this dynamic obstacle, pretending not to have noticed it. Towards the end of the path, the user was asked to keep the robot orientation slightly off the reference path to test the effect of the robot physical assistance.

The parameters used for realizing the cognitive assistance were as follows: $N_C = 2500$, $k_{C,e} = 5$ and $k_{C,\theta_e} = 10$. We considered $\mu_C = 0.6$ in order to increase certainty in the decision making and to avoid chattering. For the sensorial assistance functionality, we set $d_{obs,max} = 0.85$ m, $N_S = 2500$ and $\mu_S = 10$. For the overall assistance, we exaggerated the value of $l_{cal,fat} = 10^4$ for the sake of presentation, to be able to detect human fatigue after a short duration of walking, although the real value of $l_{cal,fat}$ is much higher and can be determined from the literature. We mostly focused on the error of the robot orientation with respect to the reference path in order to actively point the human towards the destination. Therefore, we set $k_{O,\theta_e} = 8$, $k_{O,e} = 5$ and $k_{O,obs} = 1$. Further, the values $N_O = 10^4$ and $\mu_O = 12$ were selected. The forgetting factor $\lambda = 0.6$ was used in all cases. To fulfill the requirements of the desired robot assistance in all three cases, the reward functions were defined as presented in Table 1. Moreover, the parameters for the desired inertia of the admittance controller were set to $m_x = 15$ kg and $I_\theta = 5$ kg m$^2$.

Figure 8 shows some snapshots taken during the experiment. The map of the experimental area, the robot and the defined obstacle zones, the detected obstacles in front of and around the robot as well as the desired and traveled paths are shown.

Fig. 8 Snapshots taken during human-robot cooperation in scenario I. The map of the area is depicted in gray, while the dark gray areas show the occupied static obstacles found during the map building. The yellow points indicate the location of observed obstacles during the experiment. The blue point clouds are clusters around each obstacle in the vicinity of the robot (this is only for presentation purposes and has no application in the presented approach). The area in front of the robot is divided into 5 zones as shown by thick red lines. The generated reference path is shown as a thin red line, while the path traveled by the robot is shown as a yellow line (visible near the reference path behind the robot). The snapshots present the following information from left to right: (1) initial phase of walking where no obstacles are detected and the user is following the path well, (2) a dynamic obstacle moves in front of the robot, (3) the user is deviating from the reference path, (4) the increase of the user's deviation is restricted by the robot and therefore the user comes back to the path, (5) the user keeps an orientation error at the end of the experiment, and (6) the robot overall assistance recovers the orientation error. (Color figure online)

Table 1 Defined reward functions for robot assistance

Fig. 9 Results of the sensorial assistance during human-robot cooperation in scenario I
At the beginning of the experiment, a dynamic obstacle (another person) passed in front of the robot (≈ 30 < t < 32 s). As depicted in Fig. 9, when the robot approaches the obstacle, the task performance measure increases. Moreover, since the user was asked not to react to the obstacle, the measure of agreement between the robot, which aims to avoid the obstacle, and the human, who does not react properly, decreases. Given the defined reward structure, the human receives a quite low reward, which triggers the robot's decision to increase the assistance. This was achieved by automatically increasing the damping factor and thereby reducing the velocity with which the robot approached the obstacle. As soon as the dynamic obstacle had passed and the risk of collision had decreased again, the robot decided to return the authority over the robot motion to the user. This happened smoothly but quickly (compared to the first decision to increase the assistance) in order to avoid the user pushing against a blocked robot while there is no obstacle in front of it.

When the user tried to deviate from the path (≈ 35 < t < 37 s), as shown in Fig. 10, the task performance measure increased, while the agreement measure decreased, since the robot preferred to stay on the path while the human was deviating from it. Therefore, the robot assistance hindering the user from deviating further from the path was activated and the value of the damping $d_\theta$ was increased. This notifies the user that the current direction of motion is not aligned with the desired reference path. However, as soon as the user adapts the input and aligns the robot with the desired path, the robot assistance quickly returns the authority to control the platform to the user.
For the last part of the path, when the user was simulating fatigue, we considered a value of $l_{cal,fat} = 10^4$ in order to visualize the effect of the realized algorithm even after only 50 s of walking, see Fig. 11. With increasing duration of walking, the estimate of human fatigue, and thus the corresponding performance measure, increased, while the overall human task performance measure varied according to the distance of the human to obstacles, the overall deviation from the path and the orientation error.⁴ By increasing the orientation error in the last phase of the experiment, the corresponding performance measure was influenced and therefore a lower reward was associated with it. This resulted in a change of the decision towards increasing the level of active assistance by increasing the robot contribution to the control of the robot's orientation. Therefore, the value of $k_2$ was increased to its maximum, which we set to 0.6 for the sake of safety.

⁴ Please note that the emphasis on the orientation error in the overall task performance measure was chosen only for the sake of presentation. One may assign different values to the contribution of each of the terms to the overall task performance.

Fig. 11 Results of the physical assistance during human-robot cooperation in scenario I
Scenario II In this scenario, we focused on the evaluation of the robot sensorial assistance and tested its ability to distinguish between intentional and accidental approaches to obstacles. To be able to focus on the sensorial assistance functionality, the cognitive and overall assistance were deactivated so that they could not influence the results. Figure 12 shows snapshots taken during the experiment. Two static obstacles were positioned in front of the robot, one after the other in heading direction. A third obstacle (a table) was further considered as an intentional goal. The user was asked to approach the table and grasp an object located on it, assuming that the two obstacles are initially not detected due to, e.g., bad sight. As shown in Fig. 13, when approaching the first two obstacles (the first at ≈ 36 < t < 37.5 s and the second at ≈ 40 < t < 43 s), the robot task performance measure increased while the agreement measure decreased, which implies a risk of collision. The robot correctly decided to prevent the collision with the obstacles by increasing the damping factor $d_x$, and only returned the authority to the human once he/she had changed the orientation of the robot and the risk of collision had thus decreased (the damping factor $d_x$ was decreased quickly). However, in the third case, where the human pushed the robot towards the intentional obstacle (at ≈ 46 < t < 52 s), the robot initially reduced the approaching velocity (the damping factor $d_x$ was increased), but then returned the authority to the human to allow for a further safe approach to the intentional obstacle (the damping factor $d_x$ was reduced to 30). This change in the authority allocation happened even though task performance was low (task performance measure high), as the robot was at a very close distance to the obstacle.

Fig. 12 Snapshots taken during human-robot cooperation in scenario II. The map of the area is depicted in gray, while the dark gray areas show the occupied static obstacles found during the map building. The yellow points indicate the location of observed obstacles during the experiment. The blue point clouds are clusters around each obstacle in the vicinity of the robot (this is only for presentation purposes and has no application in the presented approach). The area in front of the robot is divided into 5 zones as shown by thick red lines. The path traveled by the robot is shown as a yellow line behind the robot. The snapshots present the following information from left to right: (1) initial phase of walking where an obstacle is detected in front of the robot, (2) close distance between the robot and the obstacle, which increases the risk of collision and results in the robot reacting to avoid the collision, (3) the second obstacle is detected and the robot reacts to avoid a collision, (4) the user is guiding the robot towards a new obstacle he wants to approach intentionally, (5) the robot allows for a very close approach of the intentional obstacle, and (6) the user leaves the intentional obstacle.

User Study
An intensive evaluation with 35 elderly subjects was performed to assess the effectiveness of the proposed adaptive shared control approach. Thirty-one women and four men participated in the evaluation, which took place over six weeks at the rehabilitation centre of the Agaplesion Bethanien Hospital/Geriatric Centre at the University of Heidelberg.

Test Conditions
The adaptive shared control approach for sensorial assistance was implemented on the robotic platform and compared with an existing approach from the literature. We considered three different conditions:
- C1: Walking assistance without obstacle avoidance functionality, implementing a constant virtual inertia and damping.
- C2: Walking assistance with obstacle avoidance based on the approach presented by [24].
- C3: Walking assistance with obstacle avoidance based on the decision-making algorithm presented in this manuscript.
The main reason for focusing on the evaluation of the sensorial assistance in the user study is that, besides the baseline C1, there is hardly any directly comparable algorithm available for the other two modes.
For a fair comparison, base values of $m_x = 15$ kg and $I_\theta = 5$ kg m$^2$, and of $d_x = 10$ Ns/m and $d_\theta = 10$ Nms/rad were considered for each condition. These values were selected after discussion with rehabilitation experts. While these values were kept constant for condition C1, the values of $d_x$ and $d_\theta$ were adapted up to their maxima of $d_{x,max} = 110$ Ns/m and $d_{\theta,max} = 80$ Nms/rad in C2 and C3. The maximum values were selected following discussions with rehabilitation experts as well as tests to achieve a good maneuverability of the device with respect to a standard non-motorized walker. We considered a distance of 70 cm between the robot and obstacles as the activation distance, i.e. in C2 and C3 the base values were used only for distances larger than 70 cm, while the adaptation laws were applied for distances of less than 70 cm.

Experimental Setup
A special test environment was prepared within the Bethanien rehabilitation center to test the proposed adaptive shared control approach. Figure 14 shows the map of the test environment and a representative example of a test path. The test environment covered an area of about 10 × 9 m, with a test path of approximately 40 m that started from an initial position, passed through the narrow corridor while avoiding obstacles and returned to the same initial position. The height of the obstacles varied in different sections of the area. The round trip allowed us to record the same number of left and right turns. Over the whole trial, the user faced 17 obstacles and had to perform a minimum of 16 turns, either to avoid collisions with obstacles or to follow the path. No reference path was marked on the ground during the tests.

Evaluation Method
Before participants completed the test trials, each of them was asked to drive freely through the course. For this first run, no instructions concerning obstacle avoidance and walking speed were given by the test supervisor, and no sensorial assistance was provided by the robot platform. This trial was intended to familiarize the participants with the device and course.
Each participant then completed the obstacle course under the three conditions described in Sect. 6.2.1. The order of the conditions tested with each participant was randomized to exclude learning effects. The participants were not told which condition was used during the three different trials. Before starting each trial, the participants were instructed to complete the course as fast as possible. After each trial, a sufficient recovery phase was provided to the participants in order to prevent fatigue.

Evaluation Results
Two performance metrics were considered in order to verify the effectiveness of the proposed sensorial assistance: number of collisions (with the front of the robotic platform) and task completion time.
Differences in the number of collisions and task completion time between the three conditions were statistically analysed by a one-way analysis of variance (ANOVA); the results are shown in Figs. 15, 16 and 17. No significant differences between conditions C1, C2 and C3 were identified in terms of task completion time. However, significant differences were found for the number of collisions and the approaching velocity to obstacles. Post-hoc tests (Bonferroni corrected) showed a reduced number of collisions and a reduced approaching velocity for C3 (sensorial assistance based on the decision-making algorithm) compared to condition C1 (p < .05), but no significant differences between other conditions (C2 vs. C1 / C3: p = .07/.99). The lowest approaching velocity to obstacles was found for C3.
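For readers less familiar with this type of analysis, the following minimal sketch shows how the F statistic of a one-way ANOVA and a Bonferroni correction can be computed. It is not the analysis script of the study; the function names are hypothetical, and any numbers passed to these functions in an example would be placeholders rather than measured data.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists, one list of measurements per condition.
    """
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the condition means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the samples around their own condition mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni(p_values):
    """Bonferroni correction: scale each p-value by the number of comparisons."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

The p-value of the omnibus test follows from the F distribution with the returned degrees of freedom, which a statistics library would typically provide. With three conditions, three pairwise post-hoc comparisons are made, so each raw p-value is multiplied by three.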

Discussion
The technical performance of the proposed approach was tested in two scenarios and resulted in the desired robot behavior, as the robot cognitive, sensorial and physical assistance were activated as needed. The effectiveness of the proposed approach was demonstrated in the user study performed with end-users. The lowest number of collisions, together with the lowest approaching velocity to obstacles, was found when the user was passing the obstacle course using our newly proposed algorithm. At the same time, the similar task completion times for all conditions indicate that the proposed sensorial assistance approach does not interfere with the normal activity of the patients, while still allowing a safe intentional approach to obstacles if needed.
One of the main practical challenges in the presented work was tuning the base and maximum values of the adjustable parameters. We finally agreed on the chosen values based on discussions with experts. Further, the selection of suitable performance metrics and reward structures strongly affects the performance of the algorithm, and a series of alternative performance metrics and related reward structures could have been chosen instead. We do not claim that our selection is the best, but that it fulfills the desired purpose of improving sensorial, cognitive and physical assistance.

Conclusion
An integrated approach for the context-specific, on-line adaptation of the assistance provided by a rollator-type MAR was presented. The shared control architecture distinguishes between short-term adaptations providing (a) cognitive assistance to support the user to follow a desired path towards a predefined destination and (b) sensorial assistance to avoid collisions with obstacles and to allow for an intentional approach of them. Further, it considers a long-term adaptation of (c) the physical assistance based on long-term user performance and observed fatigue. To achieve an intuitive and human-like adaptation policy of the provided assistance, a decision making model explored in cognitive science, the Drift-Diffusion model, was employed.
We illustrated the effectiveness of the proposed architecture by means of experiments technically validating each of the three aforementioned functionalities of the architecture. Moreover, the performance of the algorithm with real end-users was demonstrated by conducting a user study with 35 elderly persons, focusing specifically on the sensorial assistance functionality. The obtained results indicate that the required functionalities can be realized with the proposed decision-making algorithm, showing a generally high potential of the proposed adaptive shared control architecture for MARs.