Erratum to: Generalized Thompson sampling for sequential decision-making and causal inference
© Ortega and Braun; licensee Springer. 2014
Received: 24 June 2014
Accepted: 7 August 2014
Published: 1 October 2014
Decisions in the presence of latent variables
We correct errors in equations (14), (15) and (19) of the main text.
Equations (14) and (15)
and the normalization constant and an outer variational problem as described by equation (16) in the main text. Note that deliberation renders the two variables x and θ dependent.
where we have expanded the fraction by .
with normalization constant . In the limit α → ∞ and β → 0, the Thompson sampling agent is determined by the solutions and p(θ)=p0(θ). Sampling an action from is much cheaper than sampling an action from equation (18) because of the reversed causal order in θ and x, which implies that β/α → 0 in equation (ii) instead of β/α → ∞ as in equation (17).
which is exactly equivalent to p(x) in equation (19). To sample from equation (19), we draw θ ~ p0(θ) and accept if u ≤ e^{αU(x,θ)}/e^{αT}, where .
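The accept/reject test above can be illustrated with a minimal sketch. It rests on several assumptions that this erratum does not state: the priors p0(x) and p0(θ) are discrete and passed as dictionaries, the candidate action is proposed from p0(x), u is drawn uniformly from [0, 1), and T is a constant upper bound on U(x,θ), so that the acceptance ratio never exceeds 1. The function name `rejection_sample_action` and all of its parameters are hypothetical. Under these assumptions the accepted actions are distributed proportionally to p0(x) Σ_θ p0(θ) e^{αU(x,θ)}; the sketch makes no claim that this matches the authors' definition of T.

```python
import math
import random


def rejection_sample_action(p0_x, p0_theta, utility, alpha, T, rng, max_tries=100_000):
    """Draw one action x with an accept/reject test (illustrative sketch).

    p0_x, p0_theta: dicts mapping values to prior probabilities (assumed discrete).
    utility: callable U(x, theta).
    T: assumed upper bound on U(x, theta), so the acceptance ratio is at most 1.
    rng: a random.Random instance, passed in for reproducibility.
    """
    xs, px = zip(*p0_x.items())
    thetas, ptheta = zip(*p0_theta.items())
    for _ in range(max_tries):
        theta = rng.choices(thetas, weights=ptheta)[0]  # theta ~ p0(theta)
        x = rng.choices(xs, weights=px)[0]              # assumed proposal: x ~ p0(x)
        u = rng.random()                                # assumed: u ~ Uniform[0, 1)
        # exp(alpha * (U - T)) equals e^{alpha U} / e^{alpha T} but avoids overflow.
        if u <= math.exp(alpha * (utility(x, theta) - T)):
            return x
    raise RuntimeError("no candidate accepted within max_tries")


if __name__ == "__main__":
    # Toy problem (hypothetical): utility is 1 when the action matches theta.
    match = lambda x, theta: 1.0 if x == theta else 0.0
    action = rejection_sample_action(
        {"a": 0.5, "b": 0.5}, {"a": 0.7, "b": 0.3}, match,
        alpha=2.0, T=1.0, rng=random.Random(0),
    )
    print(action)
```

Each loop iteration evaluates U once and needs no normalization constant, which is the practical appeal of this kind of sampler. The acceptance rate falls as α grows whenever most (x, θ) pairs have utility well below T.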
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.