“If only I had taken the other road...”: Regret, risk and reinforced learning in informed route-choice

This paper presents a study of the effect of regret on route choice behavior when both descriptional information and experiential feedback on choice outcomes are provided. The relevance of Regret Theory in travel behavior has been well demonstrated in non-repeated choice environments involving decisions on the basis of descriptional information. The relation between regret and reinforced learning through experiential feedbacks is less understood. Using data obtained from a simple route-choice experiment involving different levels of travel time variability, discrete-choice models accounting for regret aversion effects are estimated. The results suggest that regret aversion is more evident when descriptional information is provided ex-ante compared to a pure learning from experience condition. Yet, the source of regret is related more strongly to experiential feedbacks rather than to the descriptional information itself. Payoff variability is negatively associated with regret. Regret aversion is more observable in choice situations that reveal risk-seeking, and less in the case of risk-aversion. These results are important for predicting the possible behavioral impacts of emerging information and communication technologies and intelligent transportation systems on travelers’ behavior.


Introduction
In recent years there has been a growing interest in the design and deployment of intelligent transportation systems, especially advanced traveler information services. These systems use information and communication technology (ICT) to inform, monitor, control and even charge travelers (Bonsall 2008). It is commonly assumed that providing travelers with more reliable information will improve the individual traveler's route-choice decisions and consequently the network's performance and safety (European Commission 2008). However, a better understanding of travelers' response to information is still needed to obtain the full benefits of such applications. This response depends on travelers' decision-making behavior under conditions of risk and uncertainty.
Expected Utility Theory (Bernoulli 1738; Luce and Raiffa 1957; Von Neumann and Morgenstern 1944) has been the dominant paradigm in analyzing travel behavior under risk and uncertainty and particularly in route-choice (Arentze and Timmermans 2005; De Palma and Picard 2005). It suggests that maximization of a linear combination of end states and probabilities of these states normatively represents choice behavior. Random utility models have been widely developed using various specifications to predict route-choice decisions, providing valuable behavioral insights (see Prashker and Bekhor 2004 for a detailed review). Chorus et al. (2009) demonstrate the use of a Bayesian EUT framework to assess the effects of travel information on route-choice.
Behavioral decision research has empirically revealed systematic violations of some of the assumptions of Expected Utility Theory (EUT). Some researchers have even raised concern over its validity in forecasting travel behavior (Gärling and Young 2001). The most common behavioral theory to substitute EUT is Prospect Theory (Kahneman and Tversky 1979; Tversky and Kahneman 1992). Prospect Theory (PT) asserts that decision makers frame possible outcomes as gains or losses based on a subjective point of reference and not according to final states as the classic interpretation of EUT suggests. Whereas in EUT decision makers are usually assumed to be risk-averse, in PT people will usually reveal risk-averse behavior in the case of gains and risk-seeking behavior in the case of losses. In addition, PT postulates that people are more sensitive to a loss than to an equivalent gain, implying loss aversion. Furthermore, unlike in EUT, probabilities are not treated linearly; rather, an inverse S-shaped weighting function is applied, whereby small probabilities are overweighted and large probabilities underweighted. PT has also been tested in route-choice contexts and found to have added explanatory value (e.g. Avineri and Prashker 2004; Avineri and Bovy 2008; Gao et al. 2010; Katsikopoulos et al. 2002). However, its main caveat is the selection of the perceived reference point, which poses a considerable 'headache' for modeling purposes since it is not well defined in the literature.
Regret Theory (RT) is another behavioral decision theory that has been discussed in the literature. Interestingly, RT was originally developed by Loomes and Sugden (1982) as an alternative theory to PT, and particularly as a response to the difficulty of defining a reference point. RT postulates that choice behavior is affected not only by the attractiveness of a considered alternative, as in EUT and PT, but also by the anticipation of regretting not choosing a foregone (i.e., non-chosen) alternative. The theory postulates regret aversion, i.e., the greater the feeling of regret, the less attractive is the chosen alternative. Contrary to PT, RT has a non-arbitrary reference point dependent on the choice set rather than the choice context. Like PT, which has been extended to multinomial choice situations with the formulation of Cumulative PT (Tversky and Kahneman 1992), RT has been similarly extended by Quiggin (1994). However, unlike PT, RT still treats probabilities of states-of-the-world linearly. Compared to EUT, RT only makes use of an additional regret aversion parameter, making it more parsimonious than PT, which necessitates identifying four additional parameters. Although RT has attracted considerable attention in behavioral decision research (Kahneman and Riepe 1998; Starmer 2000), it has received less attention in travel behavior research, as discussed by Chorus et al. (2006, 2008) and Chorus (2010).
The three aforementioned behavioral theories implicitly assume situations involving one-shot decisions where the outcomes of the choice are not explicitly revealed ex-post. In reality travelers' behavior is influenced not only by descriptional information regarding possible alternative routes but also by experiential information gained through a process of Reinforced Learning (RL) based on feedbacks. Studies based on RL (Busemeyer and Townsend 1993; Erev and Barron 2005) assert that experience leads to adaptive learning but, at the same time, this is also a function of sampling available information on the basis of past experience from memory. Moreover, as also demonstrated for route-choice by Avineri and Prashker (2003) and Ben-Elia et al. (2008), the choice behavior in RL is quite sensitive to the degree of uncertainty in the environment.
EUT has been adapted to repeated travel choice situations using the notion of utility updating over time (Horowitz 1984). Here, route-choice is based on a process of adaptive learning whereby all sources of information, whether descriptive or experiential, are applied to update the level of knowledge over the road network (e.g., Cascetta and Cantarella 1991; Mahmassani and Liu 1999; Srinivasan and Mahmassani 2003; Watling and van Vuren 1993). PT has also been tested in dynamic contexts. However, unlike EUT, the basic assumptions of PT do not necessarily hold when moving from one-shot decisions to sequences of choices. For example, Barron and Erev (2003) found risk attitude reversals when feedback is introduced in repeated choice experiments. Contrary to one-shot decisions, they showed that in repeated choice situations with feedbacks, participants tend to avoid risks when faced with losses and accept more risks when faced with prospective gains. In route-choice, Ben-Elia and Shiftan (2010) showed that risk-seeking behavior is apparent mainly in the short run, when knowledge over the network's performance is relatively limited, whereas in the long run, when learning is sufficiently reinforced, the trend is towards risk aversion. Moreover, they did not find, in the context of a choice model, behavior completely consistent with PT. Although differences relative to a reference point (mean travel time in this case) seem to have some significance in explaining route-choice behavior, no real difference between gains and losses was identified (i.e., no evidence for loss aversion), nor was the PT-based specification better in terms of model fit than an EUT specification. However, given that only one reference point definition was tested, it is difficult to generalize from their findings on the appropriateness of PT in dynamic decisions. A recent behavioral study by Erev et al. (2008) asserts that in repeated choice situations with immediate feedback, behavioral tendencies previously related to loss aversion in decisions from experience are better described as consequences of diminishing sensitivity to absolute payoffs. These studies put a question mark on the appropriateness of PT to explain choice behavior in repetitive situations.
In relation to RT, as in the case of EUT and PT, behavioral research regarding regret aversion, the theory's principal behavioral factor, also demonstrates the importance of expected feedback for the perception of regret. According to the original version of RT (Loomes and Sugden 1982), before choosing, the decision maker compares 'what is' under a particular state with 'what might have been' for an alternative under the same state, which results in anticipated regret or rejoice (the opposite of regret). However, as argued by Zeelenberg (1999), in order to evaluate an alternative by comparing 'what is' with 'what might have been', the decision maker must learn, after the choice, what 'what might have been' actually implies. In other words, both the chosen and foregone alternatives must be resolved for anticipated regret or rejoice to influence behavior. The original version of RT does not explicitly account for the resolution of the outcomes of foregone alternatives in stimulating anticipated regret. A different view is expressed by Humphrey (2004), suggesting that the resolution of foregone alternatives is less crucial than the ability of the decision maker to learn exactly which state-of-the-world has occurred. This is especially relevant in situations where only the outcome of a chosen alternative is revealed ex-post (i.e., experiential feedback) but not those of foregone alternatives, and the decision maker is not fully informed which state-of-the-world has actually occurred. In comparing the importance of expected regrets that will be experienced ex-post with those that will not, Larrick (1993) suggests that feedback about what definitely would have occurred could well have a greater potential for regret than abstract knowledge of what was statistically likely to occur, as assumed in original RT. This assertion forms the rationale of a revised version of RT called Feedback-conditional RT (Humphrey 2004).
This theory postulates that the effect of feedback (following a choice) on the anticipated emotion of rejoice or regret depends on whether the state-of-the-world is revealed (i.e., whether the foregone alternative is resolved). More specifically, it predicts that for any outcomes x and y, where the utility of x is larger than the utility of y, rejoice for x is greater when having x fully reveals the state-of-the-world than when it does not, whereas regret for y is smaller when having y does not fully reveal the state-of-the-world.
Returning to the transportation realm, in most situations it is highly likely that travelers receive feedback on their chosen route but not necessarily on the non-chosen routes. Feedback on a chosen route is almost immediate (e.g., the travel time experienced in reaching the destination), whereas discovering the travel times on non-chosen routes requires an active search for information and is not immediately available. Since RT has shown considerable potential for explaining travel behavior, there is added value in investigating its salience as realistically as possible, for instance when experiential feedback is available only on the chosen route.
In order to test the impact of RT on route choice we reinvestigate the route-choice behavior data collected in the experiment conducted by Ben-Elia et al. (2008). We apply an RT-based modeling framework as suggested by Chorus (2010) and incorporate the effect of experiential feedbacks in the specification of regret based on the rationale of Feedback-conditional RT (Humphrey 2004). The rest of the paper is organized as follows: the Experiment and data section presents the experimental method, the Behavior modeling section describes the modeling frameworks and the tested specifications, the Results and discussion section presents and discusses the results, and the Conclusions section discusses the practical implications of the findings and presents several future research directions.

Experiment and data
Design
A route-choice experiment was designed on the basis of a simple binary network and one origin-destination pair (work and home). Route A is on average faster than route B. The faster route has a mean travel time of 25 min and the slower one a mean of 30 min. Three traffic scenarios were designed by manipulating the routes' travel time ranges (i.e., deviation around the mean value). These ranges are ±5 or ±15 min for each route. Table 1 presents the three travel time scenarios applied in the experiment. The experiment consists of 100 choice trials in each scenario. Each trial simulates a daily trip. The order of the scenarios follows a counterbalanced (blocked randomization) design. The treatment condition (here: informed) in the experiment consisted of the provision of ex-ante travel information in the form of a travel time range corresponding to a particular traffic scenario, simulating a simple variable message sign (VMS) presented to travelers before a route diversion. This information was not provided in the control condition (here: non-informed).

Participants and procedure
A total of 49 participants (undergraduate Technion students, 30 men and 19 women) arriving in random order at the lab were divided randomly into two groups: the treatment condition (N = 24) and the control condition (N = 25). Each participant was also allocated randomly to one of the six (that is, 3!) possible scenario orders. Table 2 presents the descriptive statistics of the participant sample. Each participant was seated in front of a computer terminal and provided with written on-screen instructions about the task ahead. The instructions were also read out loud by the assistant. The task was to choose (by selecting a radio button) between two routes to return home after a day's work. It was explained to participants that this task would be repeated several times for different commuting days and in several different scenarios. Participants were not informed in advance how many 'days' or how many scenarios they were expected to complete. However, they were told when one scenario ended and a new scenario was about to begin. In the treatment condition only, participants were also told that they would receive travel information before each daily choice. No other explanation was given as to the nature of the experimental task. Each participant had a budget of 100 ILS (Israeli Shekel, where 1 ILS equals about 0.26 USD), and for each minute spent travelling 0.01 ILS was deducted from the budget. Participants who saved time during the experiment could keep the money left over. An additional flat rate of 20 ILS was paid after completing the experiment as a participation reward. Participants were instructed to complete the task by themselves and were forbidden to communicate with each other during the experiment.
Before commencing the experiment, participants filled in a simple on-screen questionnaire regarding their socio-demographic characteristics and their usual travel behavior patterns to campus. Once the experiment started, in each trial, participants in the treatment condition received ex-ante information about the travel time range (the minimum and maximum travel times) predicted for each of the two routes according to the design. However, a small degree of random variation was programmed (between 0 and 5 min around the daily mean) so that the information did not appear constant throughout the entire scenario. This information was unavailable in the control condition. In addition, all the participants, after confirming a choice, were shown on screen the 'experienced' travel time (in minutes) for that day on the chosen route. This travel time was randomly drawn from the distribution of the chosen route's travel time range according to the particular scenario. This also guaranteed that participants in the treatment condition would have confidence in the accuracy of the provided ex-ante information. Foregone payoffs (i.e., feedback on the non-chosen route) were not provided. After the experienced travel time was revealed, participants were asked to press a button to go to the route-choice for the next day. When the last scenario was completed, participants were shown how much time they had spent travelling in total and the total monetary cost of their travel time. Overall, a typical session lasted no more than 15 min per participant. For further details on the experiment design, see Ben-Elia et al. (2008).
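The trial mechanics described above (a uniform draw of the 'experienced' travel time on the chosen route, and a deduction of 0.01 ILS per minute from a 100 ILS budget) can be sketched as follows. This is a minimal illustration and not the software used in the experiment; the function names, the random seed, and route B's range are assumptions, since Table 1 is not reproduced here.

```python
import random

COST_PER_MINUTE_ILS = 0.01   # deduction per minute travelled (from the text)
INITIAL_BUDGET_ILS = 100.0   # starting budget (from the text)


def draw_feedback(mean_min, range_min, rng):
    """Draw an 'experienced' travel time uniformly from the route's range."""
    half = range_min / 2.0
    return rng.uniform(mean_min - half, mean_min + half)


def run_trials(choices, routes, rng):
    """Return (feedback per trial, remaining budget) for a sequence of choices."""
    budget = INITIAL_BUDGET_ILS
    feedback = []
    for route in choices:
        mean_min, range_min = routes[route]
        travel_time = draw_feedback(mean_min, range_min, rng)
        feedback.append(travel_time)           # shown only for the chosen route
        budget -= COST_PER_MINUTE_ILS * travel_time
    return feedback, budget


# Illustrative values only: route A has mean 25 and range 10 min;
# route B's range is an assumption, not taken from Table 1.
routes = {"A": (25.0, 10.0), "B": (30.0, 10.0)}
rng = random.Random(0)                         # hypothetical seed
feedback, budget = run_trials(["A", "B", "A"], routes, rng)
print([round(t, 1) for t in feedback], round(budget, 2))
```

Note that no travel time is ever generated for the non-chosen route, which mirrors the absence of foregone payoffs in the design.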

Approach
The data collected by Ben-Elia et al. (2008) consist of a series of choices under different conditions of risk. These data were not designed with the a priori objective of testing RT or any other behavioral theory. Therefore, if regret appears as a significant effect, it provides a strong indication of the relevance of regret in similar route-choice decisions.
Since the data contain a panel of choices for each participant, we use a modified version of the mixed logit discrete choice model. Mixed Logit (MXL, also referred to as Logit Kernel or Mixed Multinomial Logit) is an advanced and highly flexible discrete choice model. MXL accommodates random taste variation, unrestricted substitution patterns, and correlation in unobserved factors over time (McFadden and Train 2000) and can be derived under a variety of different specifications (Ben-Akiva and Bolduc 1996; Bhat 1998). It is also easily generalized to allow for repeated choices, i.e., panel data, as well as lagged variables (Bhat 1999; Revelt and Train 1998; Train 1999).
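As a rough illustration of how a panel mixed logit treats repeated choices, the sketch below simulates the probability of one participant's sequence of binary route choices by averaging over draws of a random travel-time coefficient that is held fixed across that participant's trials. The normal mixing distribution, the parameter values, and all function names are assumptions made for illustration; they are not the specification estimated in this paper.

```python
import math
import random


def logit_prob_a(v_a, v_b):
    """Binary logit probability of choosing route A given systematic utilities."""
    return 1.0 / (1.0 + math.exp(v_b - v_a))


def simulated_panel_prob(times_a, times_b, chose_a,
                         beta_mean, beta_sd, n_draws, rng):
    """Average, over draws of beta, the product of one person's trial probabilities."""
    total = 0.0
    for _ in range(n_draws):
        beta = rng.gauss(beta_mean, beta_sd)   # one draw, fixed across the panel
        sequence_prob = 1.0
        for t_a, t_b, picked_a in zip(times_a, times_b, chose_a):
            p_a = logit_prob_a(beta * t_a, beta * t_b)
            sequence_prob *= p_a if picked_a else (1.0 - p_a)
        total += sequence_prob
    return total / n_draws


# Hypothetical three-trial panel with travel times in minutes.
p = simulated_panel_prob([25, 27, 23], [30, 28, 33], [True, False, True],
                         beta_mean=-0.1, beta_sd=0.05, n_draws=500,
                         rng=random.Random(1))
print(round(p, 4))
```

Holding each draw fixed over the whole sequence is what induces correlation between a participant's trials, as opposed to drawing a fresh coefficient for every choice.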
For our purposes two types of models are specified: the first for expected utility (EU) and the second for expected modified utility (EMU) which includes the regret effect based on the formulation of Chorus (2010). We use the term 'modified utility' (MU) to distinguish the utility function according to RT from the term 'utility' (U) according to EUT.
Formally, the utility (U) of alternative i for person n in response t is (Eq. 1):

$$U_{int} = \beta X_{int} + \varepsilon_{int}$$

where $\beta$ is a vector of fixed and random coefficients for the alternatives' attributes $X$, and $\varepsilon$ is a vector of independently, identically distributed (iid) extreme-value type I error terms. $\beta$ has some distribution $f(\beta)$ with mean $\beta_0$ and covariance matrix $\Sigma_\beta$. This term also captures the panel effects, varying between participants but remaining constant across the observations of the same participant. Accordingly, the expected utility (EU) of alternative i for person n in response t is (Eq. 2):

$$EU_{int} = \sum_{j \in S} p_j \, \beta X_{int}^{j} + \varepsilon_{int}$$

where $p_j \in [0,1]$ is the probability that state-of-the-world j will occur at response t, out of the set S of J possible states of the world. Conversely, MU depends on both the considered and foregone alternatives. Following Chorus (2010), the modified utility (MU) of alternative i for person n in response t is (Eq. 3):

$$MU_{int} = \beta X_{int} + \left\{ 1 - \exp\left[ -\rho \left( \beta X_{int} - \beta X_{knt} \right) \right] \right\} + \varepsilon_{int}$$

where $\beta$, $X$ and $\varepsilon$ are as in Eq. 1 and the term in curly brackets represents the effect of regret towards alternative k when considering i. That is, in considering i, person n also accounts for the utility difference attributed to $X$ for the foregone alternative k. $\rho \in [0, +\infty)$ is a regret aversion parameter. Higher values imply that person n will become more and more sensitive to regret compared to an equivalent rejoice. In other words, a higher value suggests that if, for attribute $X$, k outperforms i (i.e., a regret emotion), this will decrease the attractiveness of i more than the reverse case, where i outperforms k (i.e., a rejoice emotion), will increase it.
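The asymmetry between regret and rejoice can be checked numerically. The sketch below assumes the exponential regret term 1 − exp(−ρ·Δ) that is visible in Eq. 9, where Δ is the utility difference between the considered and the foregone alternative; the function names and the numerical values are illustrative assumptions, and the error term is omitted.

```python
import math


def regret_term(u_i, u_k, rho):
    """1 - exp(-rho * (u_i - u_k)): negative (regret) when k outperforms i."""
    return 1.0 - math.exp(-rho * (u_i - u_k))


def modified_utility(x_i, x_k, beta, rho):
    """Systematic modified utility of i given foregone alternative k (no error term)."""
    u_i, u_k = beta * x_i, beta * x_k
    return u_i + regret_term(u_i, u_k, rho)


beta, rho = -0.1, 1.0            # hypothetical travel-time and regret coefficients
# Same 5-minute gap, seen from the faster and from the slower route.
rejoice = regret_term(beta * 25.0, beta * 30.0, rho)
regret = regret_term(beta * 30.0, beta * 25.0, rho)
print(rejoice, regret)
print(modified_utility(25.0, 30.0, beta, rho), modified_utility(30.0, 25.0, beta, rho))
```

For any positive ρ the regret term is larger in magnitude than the rejoice term for the same absolute utility difference, and with ρ = 0 the modified utility reduces to the ordinary linear utility.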
Similarly, the expected modified utility (EMU) of alternative i for person n in response t is (Eq. 4):

$$EMU_{int} = \sum_{j \in S} p_j \left[ \beta X_{int}^{j} + \left\{ 1 - \exp\left[ -\rho \left( \beta X_{int}^{j} - \beta X_{knt}^{j} \right) \right] \right\} \right] + \varepsilon_{int}$$

Assumptions and considerations
The purpose of the model estimation here is to test whether regret influences route-choice behavior as it appears in the data by comparing various model specifications. To accomplish this, several simplifications were allowed and further assumptions were made. First, given both the small size (49 participants) and the homogeneous nature (undergraduates) of the sample used in the experiment, it is not possible to include individual-specific factors (see also the Discussion in Ben-Elia and Shiftan 2010).
Second, to allow a smooth comparison between alternative specifications (with and without regret) we decided to include travel time as the only attribute explaining the route choice. The data provide us with two sources of travel time: ex-ante travel time information (description) and 'actual' travel time (feedback). The latter is specified as a lagged variable. To keep the specifications simple, a generic coefficient is used for all sources of travel time. That is, in relation to Eqs. 1 and 3, $\beta$ corresponds to the travel time coefficient and is specified as the same coefficient for both routes and for all sources. Initially we also tested different coefficients for the two sources of travel time; however, they were not found to be significantly different from each other, suggesting that the generic form is sufficient. For examples of more comprehensive models using the same data, see Ben-Elia and Shiftan (2010).
Third, in the treatment condition, the information received by the participants in each trial of the experiment simulates a simple Variable Message Sign (VMS) presenting a description of the expected travel time as a range from a minimum to a maximum value. This range creates the possible states-of-the-world that a participant would anticipate in his or her decision-making process. Although the inherent assumption in both EUT and RT is that the decision maker can mentally produce the matrix of state-contingent outcomes even if it is not explicitly provided in the description of the decision problem, it is unlikely that a human mind would be able to mentally account for a large number of states-of-the-world. Likewise, given a range of possible outcomes, it is also unlikely that the mean value (the mid-range) would be the only state-of-the-world accounted for. Hence, it is assumed that participants would regard at least two points on the range as being identified with the possible states-of-the-world: one below the mean value (i.e., the first quartile) and the other above it (i.e., the third quartile) (see Fig. 1). These are referred to as mean-high ($MH_i$) and mean-low ($ML_i$), whereby $Mean_i < MH_i < Max_i$ and $Min_i < ML_i < Mean_i$. For example, if for a certain trial Route A has a travel time mean of 25 min with a range of 10 min (as in Scenario 1), then $MH_A = 27.5$ and $ML_A = 22.5$, which are exactly the upper and lower quartiles of the travel time range. Naturally, any assumption regarding these or other sets of points suggested by the modeler is valid. However, it is reasonable to assume that participants would view extreme outcomes as less likely than the middle one. It should also be noted that the participants were not aware that the travel time distribution was in fact uniform, meaning that all outcomes had the same probability of occurring. Moreover, using extreme values, such as the best and worst travel times on the range, might well inflate the estimates of regret we are looking for, which seems counterproductive. Therefore, if these two mid points reveal significant evidence for regret, this should provide an indication of what most participants consider as a base for comparing possible outcomes. Cognitive limitations would likely inhibit the number of combinations that travelers could mentally reproduce. For example, splitting the range by an additional point for each percentile increases the number of possible states to 8 for each considered route, making the number of combinations quite difficult to contend with.
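The two assumed state points follow directly from the displayed range. A minimal sketch, with a hypothetical function name, reproduces the worked example for Route A (mean 25 min, range 10 min):

```python
def state_points(mean_min, range_min):
    """Return (mean-low, mean-high): the lower and upper quartiles of the range."""
    half = range_min / 2.0
    lowest, highest = mean_min - half, mean_min + half
    mean_low = (lowest + mean_min) / 2.0     # midway between minimum and mean
    mean_high = (mean_min + highest) / 2.0   # midway between mean and maximum
    return mean_low, mean_high


ml_a, mh_a = state_points(25.0, 10.0)
print(ml_a, mh_a)   # 22.5 27.5
```

Under the uniform distribution used in the experiment, each of the two points is the conditional mean of its half of the range, which is one way to motivate assigning them equal probabilities.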
Fourth, as presented in the Introduction, the behavioral literature suggests that emotions of regret or rejoice can plausibly also be triggered by the expected feedback received after a choice is made. It is then likely that in anticipating regret, participants would factor in, in some way, both the experiential feedbacks and the information describing the expected outcomes. In this respect, Bar-Hillel and Neter (1996) have shown that in some cases regret effects generated by counterfactual thinking can be as strong as those generated by actual feedback. In each choice trial the participant receives the actual experienced travel time on the chosen route as an ex-post feedback, but not simultaneously that of the foregone route. Consequently, he or she cannot know for certain which state-of-the-world occurred at a specific trial on the route not chosen. Therefore, it is assumed that participants can compare the outcome of the considered route in the last trial with the memory of what had been experienced the last time that the alternative route was actually chosen. Accordingly, one can assume that regret emotions can be triggered not only by the differences attributed to descriptive information (as in the original version of RT) but also by the comparison between what was experienced the last time the considered route was chosen (i.e., experiential feedback) and the memory of what had occurred when the alternative route had been chosen. This can be regarded as a variant of Feedback-conditional RT. Weights can also be specified for descriptional and experiential information to account for the difference in the cognitive importance given to expected and experienced regret (or rejoice) in the choice behavior.
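The memory assumption above amounts to carrying forward, for each route, the travel time experienced the last time that route was chosen. A minimal sketch of this lagged-feedback bookkeeping is given below; the function name and the example values are hypothetical, and leaving a route absent from memory until it has been tried at least once is an assumption of the sketch, not a rule stated in the text.

```python
def last_feedback_series(choices, feedback):
    """For each trial, the most recent feedback per route known before choosing."""
    memory = {}
    history = []
    for route, travel_time in zip(choices, feedback):
        history.append(dict(memory))   # snapshot of what is remembered so far
        memory[route] = travel_time    # only the chosen route is updated
    return history


choices = ["A", "A", "B", "A"]
feedback = [24.0, 27.0, 31.0, 22.0]    # experienced minutes on the chosen route
for trial, remembered in enumerate(last_feedback_series(choices, feedback), start=1):
    print(trial, remembered)
```

In the fourth trial of this example the participant compares 27 min (last experience on A) with 31 min (last experience on B), even though those two outcomes occurred on different days.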
The fifth consideration relates to the treatment of risk perceptions (i.e., risk aversion or risk seeking) and how risk is related to regret. In his formulation, Chorus (2010) accounts for constant risk aversion by assuming a non-linear convex EU function. We initially tested the effect of constant risk aversion on EU, but found it not significant with our data. Consequently, all our models applied a linear specification of utility. The literature suggests that risk aversion and regret aversion are often confounded in many experimental settings (Zeelenberg 1999). This can make the differentiation between the two effects quite difficult. Moreover, Zeelenberg et al. (1996) report an experiment where regret aversion can induce both risk-averse and risk-seeking choices depending on the type of feedback, experiential or foregone. The latter induces more risk-seeking behaviors than the former. Therefore, we decided to test indirectly for a relation between risk and regret by specifying different coefficients of regret aversion for each of the three travel time scenarios. By design, each scenario frames the two routes either as risky or as reliable depending on the level of variability represented in the travel time range (see Table 1).
Last, in all the discussions of Regret Theory, the inherent assumption is that the decision maker is presented with a description of the possible alternatives one can choose from. As noted above, feedback received following a choice can also be considered, but so far only in addition to the initial description. However, there is no apparent reason why regret cannot be triggered by an outcome of a choice which is not based on a complete description of the alternatives but rather on a gradual process of sampling and reinforced learning based on experiential feedback information. One can assume that ex-post regret could well occur regardless of the type of information provided, especially when the choice environment allows participants to test, more than once, each of the two alternative routes. Hence, there is added value to verify whether there is a real difference in the strength of regret emotions triggered by exposure to descriptional versus experiential feedback information. Given that the experimental design consists of two groups, i.e., conditions with and without descriptional information, it is possible to jointly estimate the strength of regret emotions under both conditions simultaneously.

Specifications
Based on the above discussion, six models are specified. Models A through D are based on the descriptional information (i.e., travel time ranges) and, therefore, are only applicable for the group of participants in the informed condition (N = 24). Models E and F are based on the full dataset and include a joint estimation of regret under both the informed and non-informed conditions (N = 49).

Model A: description-based expected utility
Model A corresponds to a simple EU model where only the considered route influences its perceived attractiveness and utility is based on the provided descriptional information. This model is estimated as a control for comparison with more sophisticated specifications based on regret. The two points corresponding to the two states of the world assumed for a given route i were described above (see also Fig. 1). Since EUT assumes that decision makers treat probabilities linearly (unlike, e.g., PT, which uses subjective weights) and as the distribution of travel times in the experiment is uniform, the probabilities of the states of the world are assumed to be equal. Therefore, there is a probability of 0.5 of being in the high or low state-of-the-world for each route. Consequently, the appropriate specification (for simplicity the person and trial notations are removed) for Route A (Route B is similar, only with subscript B) is (Eq. 5):

$$EU_A = 0.5\,\beta MH_A + 0.5\,\beta ML_A + \varepsilon_A$$

where $\beta$ and $\varepsilon$ are as defined in Eq. 1.

Model B: description-based regret
Model B corresponds to RT under the assumptions of the original theory (Loomes and Sugden 1982). In this case, the modified utility function, MU, is influenced by both the attributes of the considered route and the alternative one. The choice between the two routes is influenced only by the description of the alternatives, i.e., the information presented by the VMS prior to the actual choice. Each route is assumed to have two possible outcomes ($MH_i$ and $ML_i$) and four states of the world are generated (according to the 2 × 2 combination of high and low values). Each state of the world has an equal probability of 0.25 of occurring. This combination is also illustrated in Fig. 1. Consequently, the appropriate specification for Route A is (Eq. 6):

$$
\begin{aligned}
EMU_A ={} & 0.25\left\{\beta MH_A + 1 - e^{-\rho(\beta MH_A - \beta MH_B)}\right\} + 0.25\left\{\beta MH_A + 1 - e^{-\rho(\beta MH_A - \beta ML_B)}\right\} \\
& + 0.25\left\{\beta ML_A + 1 - e^{-\rho(\beta ML_A - \beta MH_B)}\right\} + 0.25\left\{\beta ML_A + 1 - e^{-\rho(\beta ML_A - \beta ML_B)}\right\} + \varepsilon_A
\end{aligned}
$$

where $\beta$, $\rho$ and $\varepsilon$ are as defined in Eq. 3.
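The four equally likely states can be enumerated mechanically. The sketch below computes the systematic part of a description-based expected modified utility, assuming the exponential regret term 1 − exp(−ρ·Δ) that appears in Eq. 9 and omitting the error term; the function name, the coefficient values, and Route B's state points are illustrative assumptions.

```python
import math
from itertools import product


def emu_description(points_i, points_k, beta, rho):
    """Expected modified utility of route i over all equally likely state pairs."""
    states = list(product(points_i, points_k))   # 2 x 2 = 4 states of the world
    total = 0.0
    for x_i, x_k in states:
        u_i, u_k = beta * x_i, beta * x_k
        total += (u_i + 1.0 - math.exp(-rho * (u_i - u_k))) / len(states)
    return total


beta, rho = -0.1, 1.0                 # hypothetical coefficients
points_a = (27.5, 22.5)               # (MH_A, ML_A) for mean 25, range 10
points_b = (32.5, 27.5)               # (MH_B, ML_B), assumed mean 30, range 10
print(emu_description(points_a, points_b, beta, rho))
print(emu_description(points_b, points_a, beta, rho))
```

Setting ρ = 0 removes the regret term in every state, so the function then returns the description-based expected utility of Model A.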

Model C: description and experienced-based regret
As presented in the previous section, it is quite possible that participants are influenced both by descriptional information as presented by the VMS and by the experiential feedback information provided following each route choice. Accordingly, for each state-of-the-world as defined in Model B, we can specify the regret function as composed of the differences in the descriptional information (the four points on the travel time ranges) and the difference in the feedback information. The latter is based on the assumption that participants can recall the outcomes experienced the last time each of the two routes was chosen.
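Weights are assigned to both types of information to capture the difference in cognitive importance. The appropriate specification for Route A is (Eq. 7):

$$
\begin{aligned}
EMU_A ={} & 0.25\left\{\beta MH_A + 1 - e^{-\rho\left[w(\beta MH_A - \beta MH_B) + (1-w)(\beta F_A - \beta F_B)\right]}\right\} \\
& + 0.25\left\{\beta MH_A + 1 - e^{-\rho\left[w(\beta MH_A - \beta ML_B) + (1-w)(\beta F_A - \beta F_B)\right]}\right\} \\
& + 0.25\left\{\beta ML_A + 1 - e^{-\rho\left[w(\beta ML_A - \beta MH_B) + (1-w)(\beta F_A - \beta F_B)\right]}\right\} \\
& + 0.25\left\{\beta ML_A + 1 - e^{-\rho\left[w(\beta ML_A - \beta ML_B) + (1-w)(\beta F_A - \beta F_B)\right]}\right\} + \varepsilon_A
\end{aligned}
$$

where $0 < w < 1$ is the weight attributed to the descriptive information ($MH_i$, $ML_i$) and $1 - w$ is the weight for feedbacks; $F_i$ is the feedback received for Route i the last time i was chosen; $\beta$, $\rho$ and $\varepsilon$ are as defined in Eq. 3. Note that w = 1 would mean that only descriptional information influences regret, and in this case the formulation would be the same as Model B. Conversely, w = 0 would mean that only feedbacks influence regret emotions but descriptional information provided ex-ante does not. In this case, the descriptive information enters the utility but does not appear in the regret function, and the formulation collapses to two states of the world. The appropriate specification, for Route A, in Model C, is now (Eq. 8):

$$EMU_A = 0.5\left\{\beta MH_A + 1 - e^{-\rho(\beta F_A - \beta F_B)}\right\} + 0.5\left\{\beta ML_A + 1 - e^{-\rho(\beta F_A - \beta F_B)}\right\} + \varepsilon_A$$

where $\beta$, $\rho$ and $\varepsilon$ are as defined in Eq. 3.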
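The role of the weight w and its two limiting cases can be illustrated numerically. The sketch below again assumes the exponential regret term visible in Eq. 9 and omits the error term; the function name, coefficient values, state points and remembered feedbacks are hypothetical.

```python
import math
from itertools import product


def emu_mixed(points_i, points_k, f_i, f_k, beta, rho, w):
    """Expected modified utility with regret driven by description (w) and feedback (1 - w)."""
    states = list(product(points_i, points_k))
    total = 0.0
    for x_i, x_k in states:
        diff = (w * (beta * x_i - beta * x_k)
                + (1.0 - w) * (beta * f_i - beta * f_k))
        total += (beta * x_i + 1.0 - math.exp(-rho * diff)) / len(states)
    return total


beta, rho = -0.1, 1.0                       # hypothetical coefficients
points_a, points_b = (27.5, 22.5), (32.5, 27.5)
f_a, f_b = 24.0, 31.0                       # last remembered feedback per route
for w in (0.0, 0.5, 1.0):
    print(w, emu_mixed(points_a, points_b, f_a, f_b, beta, rho, w))
```

With w = 1 the result no longer depends on the remembered feedbacks, and with w = 0 the regret term is identical in all four states, so the expression reduces to the two-state form described in the text.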

Model D: description and experienced-based regret with risk perception
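Model D expands the specification to include perception of risk by specifying different regret coefficients for each of the travel time scenarios (see Table 1 for the definitions) (Eq. 9):

$$
\begin{aligned}
&0.25\left\{\beta\,\mathrm{MH}_{As} + 1 - e^{-\rho_s\left[w(\beta\,\mathrm{MH}_{As} - \beta\,\mathrm{MH}_{Bs}) + (1-w)(\beta F_{As} - \beta F_{Bs})\right]}\right\} \\
&+ 0.25\left\{\beta\,\mathrm{MH}_{As} + 1 - e^{-\rho_s\left[w(\beta\,\mathrm{MH}_{As} - \beta\,\mathrm{ML}_{Bs}) + (1-w)(\beta F_{As} - \beta F_{Bs})\right]}\right\} \\
&+ 0.25\left\{\beta\,\mathrm{ML}_{As} + 1 - e^{-\rho_s\left[w(\beta\,\mathrm{ML}_{As} - \beta\,\mathrm{MH}_{Bs}) + (1-w)(\beta F_{As} - \beta F_{Bs})\right]}\right\} \\
&+ 0.25\left\{\beta\,\mathrm{ML}_{As} + 1 - e^{-\rho_s\left[w(\beta\,\mathrm{ML}_{As} - \beta\,\mathrm{ML}_{Bs}) + (1-w)(\beta F_{As} - \beta F_{Bs})\right]}\right\}
\end{aligned}
$$

where β and ε are as defined in Eq. 7, and ρ_s (s = 1, 2, 3) is the coefficient of regret aversion in scenario s. Here, significantly different values estimated for ρ_s would imply that regret aversion is not risk neutral.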
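The four-state expression above can be evaluated numerically. The following is a minimal sketch, assuming the regret term has the form 1 − exp(−ρ·x) as it appears in Eq. 9; the function names and all numeric values are hypothetical and are not taken from the experiment.

```python
import math

def regret_term(rho, diff):
    """Regret/rejoice term 1 - exp(-rho * diff), following the form of Eq. 9."""
    return 1.0 - math.exp(-rho * diff)

def expected_modified_utility(beta, rho, w, mh_a, ml_a, mh_b, ml_b, f_a, f_b):
    """Expected modified utility of Route A over four equiprobable states.

    mh_*/ml_* are the described high/low travel times of each route,
    f_* is the most recent feedback for each route (hypothetical inputs).
    """
    feedback_diff = beta * f_a - beta * f_b
    total = 0.0
    for own in (mh_a, ml_a):        # outcome of the considered route
        for other in (mh_b, ml_b):  # outcome of the foregone route
            described_diff = beta * own - beta * other
            mixed = w * described_diff + (1.0 - w) * feedback_diff
            total += 0.25 * (beta * own + regret_term(rho, mixed))
    return total

# Hypothetical values: travel times in minutes, beta < 0, rho > 0.
emu_a = expected_modified_utility(beta=-0.1, rho=0.5, w=0.0,
                                  mh_a=30, ml_a=20, mh_b=35, ml_b=25,
                                  f_a=22, f_b=33)
```

Setting w = 1 makes the result independent of the feedbacks (the Model B case), while ρ = 0 removes the regret term entirely and leaves only the average of β·MH and β·ML.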

Model E: regret aversion with and without descriptional information
Model E uses the full dataset to estimate the effect of regret for each of the experiment's groups (treatment and control), i.e., with and without (descriptional) information. Here, the motivation is not to identify whether the source triggering regret is description or experience based, as in the previous models. Rather, it is to verify whether exposing travelers to descriptional information, as simulated in the VMS, in addition to information gained from experience results in different degrees of regret aversion. Accordingly, the model utilizes a scale parameter to estimate the group effect on the rest of the parameter estimates. This scale multiplies all the estimates relating to the non-informed group. It is similar to the approach used in a joint estimation of a discrete choice model based on revealed and stated preference data sources (Ben-Akiva and Morikawa 1990; Bhat and Castelar 2002). Moreover, whereas the modified utility of informed participants remains similar to Model C (as in Eq. 7), the modified utility of the non-informed participants is specified as composed only of the experiential feedback information. Since there is no possible way here to identify the state-of-the-world occurring, the only effect is that of the recent outcomes the last time each of the two routes was chosen. The appropriate specification for Route-A in Model E is given by Eq. 10, where F_i, β, ρ and ε are as defined in Eq. 7. Superscript NI indicates that this corresponds to the non-informed group.

Model F: joint estimation of regret with risk perception
Model F expands Model E to account for the perception of risk as in Model D. The only change is the modified utility for the non-informed condition, which is now specified for Route A as (Eq. 11):

$$
\beta F_{As} + 1 - e^{-\rho_s\left(\beta F_{As} - \beta F_{Bs}\right)}
$$

where F_i, β, ρ_s and ε are as defined in Eq. 9. NI indicates that this corresponds to the non-informed group.

Model estimation
In all the models, EU and EMU are estimated using a log-likelihood (LL) maximization procedure. The EMU model's LL function (for EU replace EMU with EU) for the probability (P) to choose route i is given by Eq. 12, where β is a normally distributed vector of random coefficients with mean β_0 and variance σ²_β for the travel time attribute; ρ is the regret aversion coefficient to be estimated; λ is the non-informed group's scale and λ_nt = [(1 − δ_nt,I) × λ] + δ_nt,I, with δ_nt,I = 1 if the observation for person n and trial t belongs to group I (i.e., informed) and 0 otherwise; N = 49 is the number of participants (N = 24 in Models A-D), T = 300 is the number of route-choice trials, and K = 2 is the number of alternative routes. As the unconditional probability is obtained by integration over the random coefficients and this integral has no closed form, simulated log likelihood (SLL) is applied using random draws (Bhat 1999; Train 2003), as given by Eq. 13, where R is the number of draws (r). We used BIOGEME version 2.1 (Bierlaire 2003, 2009) for model estimation. Simulated log likelihoods of all models were estimated with 1,000 Halton draws (Halton 1960), which significantly reduce the number of draws required compared to pseudo-random draws (Bhat 2003; Train 2000). The models were estimated with 100, 500 and 1,000 draws. The differences between the last two sets were negligible. The results presented here are for the set of 1,000 draws. We also applied appropriate guidelines to ensure proper identification (Walker et al. 2004). The CFSQP optimization algorithm was used (Lawrence et al. 1997). Since the weight parameter in Model C can be confounded with the attribute coefficient (β), they cannot be estimated simultaneously. Therefore, the weights were specified as constants constrained to sum to 1. Different sets of weights were tested in increments of 0.1 through a trial and error process.
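To illustrate the simulation step described above, the following is a minimal sketch of a simulated log likelihood for a binary panel logit with one normally distributed travel time coefficient. It is not the BIOGEME implementation: it uses a plain EU-type utility without regret terms, a hand-written base-2 Halton sequence, and the same draws for every participant. The function names and the data layout are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def halton(index, base=2):
    """Return the index-th element (index >= 1) of the Halton sequence."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result

def normal_draws(n_draws, base=2):
    """Map Halton points in (0, 1) to standard normal draws via the inverse CDF."""
    inv_cdf = NormalDist().inv_cdf
    return [inv_cdf(halton(i, base)) for i in range(1, n_draws + 1)]

def simulated_log_likelihood(panel, beta_mean, beta_sd, n_draws=1000):
    """panel: list of persons; each person is a list of
    (time_a, time_b, chose_a) trials (hypothetical layout)."""
    draws = normal_draws(n_draws)
    sll = 0.0
    for trials in panel:
        total = 0.0
        for z in draws:
            beta = beta_mean + beta_sd * z  # one coefficient per draw
            prob_sequence = 1.0             # probability of the whole choice sequence
            for time_a, time_b, chose_a in trials:
                p_a = 1.0 / (1.0 + math.exp(-(beta * time_a - beta * time_b)))
                prob_sequence *= p_a if chose_a else 1.0 - p_a
            total += prob_sequence
        sll += math.log(total / n_draws)    # average over draws, then log
    return sll
```

Averaging the sequence probability over draws before taking the logarithm is what keeps the same coefficient across all trials of one participant, which is the panel structure the random coefficient is meant to capture.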

Estimation results
The estimation results are presented in Tables 3 and 4. Goodness of fit (final log-likelihood) is compared using the likelihood ratio test. When computed for the informed group only (Models A through D; see Table 3), the test shows that Models B, C and D, which account for regret, fit significantly better than Model A, the simple EU model (χ²_B,A = 310.06, p < 0.001; χ²_C,A = 152.83, p < 0.001; χ²_D,A = 341.36, p < 0.001). Model D, which also accounts for risk perception effects, has the best goodness of fit of the four models, and the likelihood ratio test against Model B, the second best, is significant (χ²_D,B = 31.3, p < 0.001). When comparing the jointly estimated models for both groups (Models E and F; see Table 4), the goodness of fit of Model F, which accounts for the risk effects, is better (χ²_E,F = 258.87, p < 0.001). Naturally, the goodness of fit of the joint models cannot be compared with that of the single-group models.
All the coefficients in all six models are significant. The coefficient σ_β is significant (p < 0.001), implying that the specification of the longitudinal panel is appropriate for the data structure. The coefficient for the mean travel time (β) is negative as expected and significant in all the models (p < 0.001). The coefficients for general regret (ρ) are all significant (p < 0.001). However, in Model B the sign of the coefficient is negative and incorrect according to the assertions of RT. A negative sign for ρ implies that a considered alternative is preferred when it is outperformed by a foregone alternative, which seems unreasonable.
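The likelihood ratio comparisons above can be reproduced from final log-likelihoods with a few lines of code. The sketch below is only illustrative: the log-likelihood values are hypothetical placeholders chosen to give a statistic of 31.3, and the degrees of freedom (2, i.e., three scenario-specific regret coefficients instead of one) are an assumption, since neither is stated in this section.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting values for df = 1 (odd) or df = 2 (even).
    if df % 2:
        k, q = 1, math.erfc(math.sqrt(half))
    else:
        k, q = 2, math.exp(-half)
    # Recurrence Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    while k < df:
        q += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return q

def likelihood_ratio_test(ll_restricted, ll_unrestricted, df):
    """Return the LR statistic 2 * (LL_u - LL_r) and its p-value."""
    stat = 2.0 * (ll_unrestricted - ll_restricted)
    return stat, chi2_sf(stat, df)

# Hypothetical log-likelihoods giving the reported statistic of 31.3 (df assumed).
stat, p_value = likelihood_ratio_test(-1000.0, -984.35, df=2)
```

With these inputs the p-value falls far below 0.001, in line with the significance level reported for the comparison of Models D and B.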
In contrast, the estimate obtained for the regret aversion coefficient in Model C has the correct sign and seems reasonable and comparable with the values used by Chorus (2010). However, in terms of weighting of the descriptive and the feedback information, the best result was obtained with w = 0. This implies that the descriptive information does not seem to influence regret; rather, the feedbacks are apparently responsible for generating the emotion of regret aversion. This result suggests that regret aversion is important and has a substantial effect in the experimental data. Moreover, it is evident that here regret aversion is more associated with the ex-post experiential feedback information than with the ex-ante descriptive information. When accounting only for the descriptive information (Model B), the wrong sign of the regret parameter indicates that RT, in its original formulation, is not the appropriate theory to account for the observed behavior in this case. The high t-statistic of the coefficient suggests it is capturing some variability in the data, but with a wrong specification.
However, when regret aversion is specified in terms of the feedbacks experienced by the participants according to the actual travel time payoffs (as in Model C), the results suggest that it is really the feedback information that better explains the choice behavior. This leads us to assert that emotions of regret are likely generated by the experiential feedback information rather than the descriptional information. A possible explanation for this result is that the feedbacks are more closely related in the traveler's mind with the objective of minimizing travel costs and less so to the description of the alternatives themselves. To the best of our knowledge, this result has not been demonstrated before in an empirical travel behavior study. As noted by Ben-Elia and Shiftan (2010), the effect of information was mostly relevant for the short run, when participants lacked experience and had little knowledge about the payoff distribution of each route, whereas over time the effect of feedbacks and experience became more dominant. The results regarding the effect of regret seem to concur with these findings as well.
The results obtained for Model D suggest that regret aversion is evident but changes among the different scenarios. Recall that each participant completes all three scenarios (in different orders). The estimates obtained for ρ indicate that regret aversion is stronger when the risk associated with the choice environment is low, as demonstrated in scenario 3 where both routes have low variability. Conversely, in scenarios 1 and 2, where one of the two routes is associated with more risk, regret seems to be weaker. This suggests that increasing the variability in the choice environment (what psychologists have referred to as the effect of payoff variability) decreases regret aversion. Regret seems stronger when it is more certain to occur. Low variability makes regret appear more certain to the participant. In contrast, high variability makes the loss of not choosing the alternative route appear less obvious. This is likely attributable to the hampering of learning, as also demonstrated by Ben-Elia et al. (2008). That is, as variability in the choice environment increases, the ability to learn which route provides on average a better payoff decreases.
In addition, the results seem to suggest that risk seeking might correspond to more regret aversion compared to risk aversion. In the case of scenario 1, where Route A, which is also on average faster, is associated with low variability and the slower Route B with high variability, the estimate of regret aversion is not significant. This suggests that when the alternative resulting in better payoffs, on average, is also regarded as less risky, regret is not observed. Nonetheless, it is possible that the effect of risk aversion here is also confounding regret aversion. In comparison, in scenario 2, where the faster route (A) is associated with greater risk, regret aversion is significantly higher. We recall that Ben-Elia and Shiftan (2010) demonstrate that attitudes towards risk in scenario 2 reveal on average more risk seeking tendencies. One possible explanation is that when facing a choice in a domain of losses (which also induces risk seeking behavior, i.e., gambling), the emotional amplitude of regret is greater when contending with a negative affective state, i.e., an outcome that leads to a possible loss. Conversely, when choosing the safer alternative also results in good outcomes (as in scenario 1), negative affect is not induced and regret is likely to be much weaker and even masked by risk aversion. In sum, these results indicate that payoff variability in the choice environment appears to be negatively associated with the strength of regret aversion. Moreover, attitudes towards risk related to regret appear to be quite relevant, as demonstrated by Zeelenberg et al. (1996), especially in the case of risk seeking.
The results of the jointly estimated models do not contradict the results above and present the same trends for the estimated coefficients. In particular, the assertions that regret is associated more closely with feedback information and with the level of payoff variability appear to hold for both groups. However, an additional result is demonstrated by the estimate obtained for the non-informed group scale (λ). λ is significant in both Model E (p < 0.001) and Model F (p < 0.001). The estimates for λ suggest that without descriptional information (as in the non-informed group), regret is significantly weaker. This means that regret aversion can be triggered even without any available description of the travel time distributions (i.e., without the VMS), simply from a gradual trial and error sampling of available alternatives and learning reinforced through experiential feedback information. However, in the presence of descriptional information (as in the informed group), regret emotions become much stronger. This leads us to the assertion that informed travelers are more likely to experience higher degrees of regret aversion than non-informed ones.
In terms of theory, though not a concrete proof, the results seem to indicate the relevance of recent theoretical contributions such as feedback-conditional regret theory (FCRT; Humphrey 2004). However, given that the experiment did not allow for foregone payoffs, it is not possible with the data we hold to completely investigate FCRT. This is left for future research endeavors. In addition, the results obtained for Models D and F demonstrate that risk perception and corresponding attitudes are likely to be correlated with regret.

Reducing possible threats to validity
A common concern in longitudinal designs is the problem of participant fatigue confounding treatment effects or alternatives' attributes, thus threatening the validity of the obtained estimates and results. Fatigue occurs when participants tire over time, causing performance to deteriorate in later conditions or assessments. Some marketing and psychology studies find that the precision of respondents' choices declines moderately with repeated choice tasks because they become fatigued (Elrod et al. 1992). Conversely, learning effects manifest themselves in participants becoming better the more often they do the experimental task.
Proper experiment design is the first step in reducing the magnitude of fatigue problems, e.g., by counterbalancing treatment orders so that order effects can be assessed. If order effects are not found, the problem of carryover effects can be regarded as less detrimental to the internal validity of the results (Shadish et al. 2002). The second step is careful analysis of the obtained results. The approach typically used to distinguish between the two effects is based on the assertion that learning implies a lower noise-to-signal ratio in observed choices, whereas fatigue results in a larger noise-to-signal ratio. A typical measure is to examine the change over time in the variability of the response (e.g., changes in the standard deviation). Third, in choice modeling terms, learning is usually observed as a decrease (increase) in the magnitude of the variance (scale) parameter as the respondent progresses through the sequence of questions or (at least) until fatigue sets in. Fatigue, in contrast, is evident in an increasing value for the variance of the error term in later choices, or equivalently, in a decreasing scale (Bateman et al. 2008). Several studies have been carried out to investigate the magnitude of fatigue and/or learning in choice models. However, the evidence remains inconclusive. Bradley and Daly (1994) find fatigue effects in stated preference (SP) choice experiments involving a small number of repetitions. In contrast, Brazell et al. (1995) suggest fatigue effects may be minimal, whereas learning may sometimes occur as respondents are exposed to more replications. Furthermore, Brazell and Louviere (1997) reveal equivalent survey response rates and parameter estimates for respondents answering 12, 24, 48 and 96 choice questions in a particular choice task. Swait and Adamowicz (1996) show that task complexity is inversely related to fatigue.
Savage and Waldman (2008) find that delivery formats, whether online surveys or mail-back questionnaires, result in different scales, with fatigue more apparent in online formats. Hess et al. (2012)  As fatigue or boredom is always a plausible alternative explanation that could confound the treatment effects, we have conducted a separate analysis of fatigue threats based on the state of the art.

Analysis of fatigue threats
As noted earlier, a common threat in repeated choice designs is the threat of fatigue or boredom confounding the results and threatening their validity. To verify whether fatigue might have interfered with our estimates, we applied the methods suggested in the literature. First, we analyzed the robustness of the design. Second, we measured the signal to noise ratio obtained in the results by plotting the mean standard deviation (SD) of the maximization rate (i.e., the share of the Fast route in each trial). Third, following the debate in the choice modeling literature, we estimated the logit scale for different stages of the experiment, per group, in blocks of ten trials. Beginning with an evaluation of the design, Ben-Elia et al. (2008), who studied the same dataset, did not find significant order effects in their analysis. This suggests that the within-subjects repeated design was successful in counterbalancing the treatment orders, therefore minimizing the risk of carryover effects threatening validity. This implies that the participants did in fact relate to each scenario independently and hence the risk that fatigue and learning were carried over from one scenario to the next is relatively small.
Next, regarding the signal to noise ratio, Fig. 2 shows the mean standard deviation of the maximization rate over 100 trials (averaged over all three scenarios in blocks of 10 trials). The results show that for both groups, informed and non-informed, the noise-to-signal ratio (i.e., the SD) decreases as the experiment progresses. This indicates that learning is indeed taking place, at a faster rate in the informed group, whereas fatigue is much less evident. In fact, we can assert that after the first ten trials, on average, participants become quite experienced in making the correct route choice that minimizes their time penalties. As demonstrated by Ben-Elia and Shiftan (2010), the learning curve, as seen in the mean maximization rate, suggests that providing descriptional information does expedite the learning rate for the informed group, while the trial and error learning of the non-informed group takes a longer time. The graphs of the SD demonstrate similar trends.
Last, we estimated a very simple mixed logit model similar to Model A (i.e., EU) without regret effects. For consistency, in both groups, the only attribute included in this tested specification is the obtained travel time payoff (i.e., the experiential feedback in each route, F_i). The expected utility function for Route A is given by Eq. 14. Ten scale parameters are specified for each experimental group (20 parameters in total), each corresponding to a block of ten trials out of 100. For normalization purposes the scale of the first block, i.e., for trials 1-10 in each group, is set to 1. Scenarios are ordered according to the treatment orders initially assigned to each participant. Like all the previous estimations, the model is estimated with 1,000 Halton draws. The simulated log likelihood function is given by Eq. 15, where λ_gnt is the group scale for group g out of G = 20 groups; λ_1nt = λ_11nt = 1 and all other parameters are as in Eqs. 12 and 13. Figure 3 presents the scale estimates (detailed results can be obtained from the authors on request). Scale estimates that are not significantly different from 1 (i.e., p > 0.05) are marked with empty markers and the values in italics. To facilitate understanding of the results, polynomial regression lines are plotted alongside the raw estimates (R² = 0.91 and 0.92, respectively). The scale estimates show that the non-informed group scales are gradually increasing as the experiment progresses, indicating a learning effect. Where there is a decrease, it appears in most of the blocks to be quite small and does not change the overall trend. It may be that there is some element of fatigue towards the end of the session. The informed group has a rapid increase in scale (indicating expedited learning) and then a period where scales are going up and down but with no clear trend. This stage is likely indicative of neither learning nor fatigue. 
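The blockwise variability measure can be computed along the following lines. This is a minimal sketch under one possible reading of the procedure: the SD across participants of the 0/1 indicator of choosing the faster route is computed for each trial and then averaged within blocks of trials. The function name and the toy data are hypothetical.

```python
from statistics import mean, pstdev

def blockwise_mean_sd(choices, block_size=10):
    """choices: one list per participant of 0/1 flags per trial
    (1 = the faster route was chosen). Returns the mean per-trial SD
    across participants for each consecutive block of trials."""
    n_trials = len(choices[0])
    per_trial_sd = [pstdev([person[t] for person in choices])
                    for t in range(n_trials)]
    return [mean(per_trial_sd[start:start + block_size])
            for start in range(0, n_trials, block_size)]

# Toy data: four participants, four trials, blocks of two trials.
toy = [[1, 1, 1, 1],
       [0, 0, 1, 1],
       [1, 1, 1, 1],
       [0, 0, 1, 1]]
sd_by_block = blockwise_mean_sd(toy, block_size=2)
```

In the toy data the participants disagree in the first block and all choose the faster route in the second, so the measure falls from one block to the next, which is the pattern interpreted in the text as learning.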
It suggests informed participants are relying on the descriptional information to make their choices, whereas non-informed participants are still learning from trial and error. Towards the final blocks of trials the scale increases once more indicating further learning. Here, informed participants have gained sufficient confidence based on the combined effects of description and experience to choose efficiently as can also be seen in Fig. 2.
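The interpretation of the block scales can be illustrated with a small numerical example. The sketch assumes the standard logit convention in which the scale multiplies the systematic utility β·F_i; this is consistent with the description above but the exact form of Eq. 14 is not reproduced here, and all values are hypothetical.

```python
import math

def choice_probability_a(beta, scale, f_a, f_b):
    """Binary logit probability of choosing Route A when the systematic
    utilities are scale * beta * F_i (assumed form)."""
    return 1.0 / (1.0 + math.exp(-scale * (beta * f_a - beta * f_b)))

# Hypothetical feedbacks in minutes: Route A is 5 minutes faster.
p_first_block = choice_probability_a(beta=-0.1, scale=1.0, f_a=20, f_b=25)
p_later_block = choice_probability_a(beta=-0.1, scale=2.0, f_a=20, f_b=25)
```

For the same travel time difference, a larger scale yields a higher probability of choosing the faster route, i.e., less noisy choices, which is why an increasing scale over blocks is read as learning and a decreasing scale as fatigue.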
To summarize, the analysis of the fatigue threat does not provide sufficient evidence to suggest a significant threat to the validity of the results. Moreover, the analysis here shows similar patterns to those already demonstrated by Ben-Elia and Shiftan (2010) and Ben-Elia et al. (2008) who discuss the key role of learning in informed and feedback-based route-choice situations. The estimated scales raise another interesting issue related to how regret is influenced by learning. One possible hypothesis is that learning mitigates the amplitude of regret emotions as participants' subjective confidence in their choices gains strength. Our results on regret aversion show that on average, regret does seem to be an issue that arises under certain conditions. However, with the current data limited to 49 participants there is not enough variation to allow a proper analysis of this issue (i.e., to estimate regret aversion parameters for the different learning stages). We leave this for other researchers to ponder on.

Conclusions
Regret Theory (RT) has been recently suggested as a viable behavioral theory, in addition to traditional Expected Utility Theory and the well documented Prospect Theory, to explain travel behavior phenomena including route-choice. These three theories have also been adapted or at least tested in situations involving sequences of repeated choices where the decision makers can learn by being provided with experiential feedbacks. Repeated choices also characterize the day to day dynamics of travelling such as commuting.
In this study we made use of an existing dataset collected by Ben-Elia et al. (2008) in a relatively simple binomial repeated route-choice experiment where participants could make their decision based both on descriptional information and experience. This dataset was not designed a priori to account for the occurrence of regret. Different model specifications accounting for different sources of regret were applied and compared to a simple choice model based on expected utility. In addition, a joint estimation was conducted for comparing the strength of regret with and without descriptional information.
Fig. 3 Logit scale estimates and corresponding polynomial regressions in blocks of ten trials
The results assert that emotions of regret do appear in the observed data and that regret aversion is likely generated by the experiential travel time feedbacks received by the participants following their route choices rather than the descriptional information provided to them before choosing. This result also concurs with the assertions of the more recent theories involving regret which account for feedbacks, such as feedback-conditional regret theory (FCRT; Humphrey 2004). However, regret aversion is much more evident when participants are provided with descriptional information, whereas without such information regret aversion exists but is significantly weaker. Therefore it is the combination of both descriptional and experiential information that results in higher levels of regret aversion. These results suggest that with the proliferation of emerging ICT for intelligent transport systems on road networks, it is likely that travelers will experience more regret with their route choices. Increasing emotions of regret aversion can have significant impacts on network equilibrium, as also demonstrated theoretically by Chorus (2010). This needs to be further investigated in a congested-network-like experimental setting which accounts for equilibrium (e.g., Lu et al. 2011). Furthermore, in accounting for perception of risk, it seems that regret aversion is more apparent in situations involving less risk, whereas riskier choices seem to inhibit regret. Perhaps this is due to the difficulty in perceiving the differences in outcomes (the payoff variability effect) and to other emotional effects of affective states related to risk attitudes.
Notwithstanding, several limitations of this study and directions for future research should be noted. First, it is necessary to obtain further evidence for the importance of reinforced learning in route choice behavior in experimental settings that also provide feedback on foregone (i.e., non-chosen) alternatives. This would allow a better comparison with the feedback-theoretical stream in Regret Theory such as FCRT. It would also provide an indication of the behavioral effects of future intelligent ICT that could well provide immediate foregone feedback. In addition, although fatigue does not seem to play a major role in repeated route choice, learning effects and their influence in partially informed choice environments, such as transportation, are clearly an important topic worth further research. Moreover, it is of added value to understand how regret and risk perceptions are influenced by long-term learning. It is possible that with learning these effects might decline. Currently we can demonstrate that regret (and to a certain extent risk perception) is, on average, an emotion which is likely to arise when both descriptional and experiential information are provided. However, whether and how regret changes over time is still an open question. A study involving a larger panel of participants would make it possible to investigate the hypothesis that learning could well mitigate the amplitude of regret aversion.
Second, in this study descriptional information was presented to participants as a travel time range. Though useful to allow a visualization of travel time variability, this is not necessarily the only way to describe expected travel times. The framing effect illustrated by Kahneman and Tversky (1979) suggests that different forms of presenting information will likely affect how choices are made. Recently, Waygood and Avineri (2011) have also observed framing effects in mode choice when travelers are provided with different information formats regarding the environmental friendliness of modes (CO2 emissions). Moreover, we used a relatively strong assumption regarding how the information of travel time ranges would be processed (the upper and lower quartiles) and how this in turn corresponds to regret aversion estimates. However, nothing precludes other possible assumptions, such as the best and worst travel times on the range, or even a greater degree of heterogeneity in how travelers are likely to view travel time ranges. There is, therefore, room to study more flexible travel information representations that do not result in cognitive overload and how these could perhaps assist in mitigating regret.
Third, as shown by Gao et al. (2010), travelers could well anticipate the provision of information on a route downstream, resulting in more strategic behavior involving routing policies. There is added value in investigating how emotions of regret could be related to choosing among routing strategies and how this corresponds to the evolution of equilibrium in simulated networks.
Nevertheless, our study provides additional empirical support to warrant further investigations of regret in other travel behavior settings and especially in relation to the possible behavioral impacts of intelligent transportation systems.
Robert Ishaq is a Research Associate in the Transportation Research Institute, Faculty of Civil and Environmental Engineering at the Technion-Israel Institute of Technology. His research interests include travel behavior and demand models. He has extensive experience in transportation master plans, transit network planning, and traffic and transit assignments.
Yoram Shiftan is the Head of the Transportation and Geo-Information Department in the Faculty of Civil and Environmental Engineering at the Technion, the Israel Institute of Technology. He is the editor of Transport Policy and the vice chair of the International Association of Travel Behavior Research (IATBR). Transportation (2013) 40:269-293 293