18 August 2017

Surprise and belief update

In a previous post, I started discussing a paper [1] on the (un)surprising nature of a long streak of heads in a coin toss. My conclusion was that the surprise is not intrinsic to the particular sequence of throws, but rather resides in its relation to our prior information. I will detail this reasoning here, before returning to the paper itself.

Let us accept as prior information the null hypothesis \(H_0\) "the coin is unbiased". The conditional probabilities of throwing heads or tails are then equal: \(P(H|H_0) = P(T|H_0)=1/2\). With the same prior, the probability of any sequence \(S_k\) of 92 throws is the same: \(P(S_k|H_0) = 2^{-92}\), where \(k\) ranges from \(1\) to \(2^{92}\).

Assume now that the sequence we actually get consists of all heads: \(S_1 = \lbrace HH \ldots H\rbrace\). What is the (posterior) probability of getting heads on the 93rd throw? Let us consider two options:
  1.  We can hold steadfast to our initial estimate of lack of bias \(P(H|H_0) = 1/2\).
  2. We can update our "belief value" and say something like: "although my initial assessment was that the coin is unbiased [and the process of throwing is really random and I'm not hallucinating etc.], having thrown 92 heads in a row is good evidence to the contrary, and on the next throw I'll probably also get heads". Thus, \(P(H|H_0 S_1) > 1/2\), and in fact much closer to 1. How close exactly depends on the strength of our initial confidence in \(H_0\), but I will not do the calculation here (I sketched it in the previous post).
I would say that most rational people would choose option 2 and abandon \(H_0\); holding on to it (option 1) would require an extremely strong confidence in our initial assessment.
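
To make the dependence on the initial confidence concrete, here is a minimal sketch in Python. It rests on a simplifying assumption that is not in the post: the only alternative to the fair coin \(H_0\) is a coin that always lands heads. The function name `predictive_heads` and the one-in-a-million prior are purely illustrative.

```python
from fractions import Fraction

def predictive_heads(prior_fair, n_heads):
    """P(heads on the next throw | n_heads heads in a row), toy model.

    Assumption (not from the post): the only alternative to the fair
    coin H0 is a coin that always lands heads.
    """
    prior_fair = Fraction(prior_fair)
    like_fair = Fraction(1, 2) ** n_heads   # P(S_1 | H0) = 2^-n
    like_trick = Fraction(1)                # P(S_1 | always heads) = 1
    evidence = prior_fair * like_fair + (1 - prior_fair) * like_trick
    post_fair = prior_fair * like_fair / evidence   # Bayes' rule for H0
    # Weight the two heads-probabilities by the posterior of each hypothesis.
    return post_fair * Fraction(1, 2) + (1 - post_fair) * 1

# Hypothetical prior: one chance in a million that the coin is a trick coin.
prior = 1 - Fraction(1, 10**6)
for n in (0, 10, 20, 92):
    print(n, float(predictive_heads(prior, n)))
```

In this toy model the prediction stays at exactly 1/2 only when the prior confidence in \(H_0\) is exactly 1 (option 1); for any smaller confidence it drifts towards 1 as the streak of heads grows (option 2).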

Note that for a sequence \(S_2\) consisting of 46 heads and 46 tails (in any order) the distinction above is moot, since \(P(H|H_0 S_2) =P(H|H_0) = 1/2\). The difference between \(S_1\) and \(S_2\) lies not in their prior probability [2] but in the way they challenge (and update) our belief.
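
A short worked version of this comparison, assuming that a possibly biased coin is described by a single heads probability \(p\) (the symbols \(p\), \(h\) and \(t\) are introduced here only for illustration): for a sequence \(S\) with \(h\) heads and \(t\) tails, the likelihood relative to that of the fair coin is

\[ \frac{P(S|p)}{P(S|H_0)} = \frac{p^h (1-p)^t}{2^{-(h+t)}} = (2p)^h \, [2(1-p)]^t . \]

For \(S_1\) (\(h = 92\), \(t = 0\)) the ratio is \((2p)^{92}\), which grows up to \(2^{92}\) at \(p = 1\). For \(S_2\) (\(h = t = 46\)) it is \([4p(1-p)]^{46} \le 1\), with equality only at \(p = 1/2\), so no biased alternative is favored over \(H_0\).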

Back to Martin Smith's paper now: what makes him adopt the first choice? I think the most revealing phrase is the following:

When faced with this result, of course it is sensible to check [...] whether the coins are double-headed or weighted or anything of that kind. Having observed a run of 92 heads in a row, one should regard it as very likely that the coins are double-headed or weighted. But, once these realistic possibilities have been ruled out, and we know they don’t obtain, any remaining urge to find some explanation (no matter how farfetched) becomes self-defeating.[italics in the text]

As I understand it, he implicitly distinguishes between two kinds of propositions: observations (such as \(S_1\)) and checks (which are "of the nature of" \(H_0\), although they can occur after the fact), and bestows upon the second category a protected status: conclusions of this type, e.g. "the coin is unbiased", survive even in the face of overwhelming evidence to the contrary (at least when that evidence results from observation).

There is, however, no basis for this distinction, since checks are also empirical findings: by visual inspection, I conclude that the coin does indeed exhibit two different faces; by more elaborate experiments I deduce that the center of mass is indeed at the geometrical center of the coin, within experimental precision; by some unspecified method I conclude that the "throwing process" is indeed random; by pinching myself I decide that I am not dreaming, etc. At this point, however, the common-sense remark is: "if you want to check the coin against bias, the easiest way would be to throw it about 92 times and count the heads".

If we estimate the probability of the observations (given our prior belief) we should also update our belief in light of the observations. Recognizing this symmetry gives quantitative meaning to the "surprise" element, which is higher for some sequences than for others.



1. Martin Smith, Why throwing 92 heads in a row is not surprising, Philosophers' Imprint (forthcoming) 2017.
2. We only considered here the probabilities before and after the 92 throws. One might also update one's belief after each individual throw, so that \(P(H)\) would increase gradually.

17 August 2017

How surprising is it to throw 92 heads in a row?

Martin Smith (from the University of Edinburgh) discusses the relation between surprise and belief [1].

As a striking introduction, he claims that, in a coin toss, throwing a large number of heads in a row is not surprising. He deploys a version of the sorites argument insofar as "surprise" is concerned: if the individual events \(e_k\) of getting heads on the \(k\)-th throw are unsurprising, then so is their conjunction.

This particular example is easily dealt with by noting the importance of prior information: the sequence of heads is surprising because we know that "The coins don’t appear to be double-headed or weighted or anything like that – just ordinary coins", as Smith insists in his first paragraph [2]. I know nothing about the theories of Shackle and Spohn, but I doubt his analysis would survive adding event \(e_0\): "We checked that all coins were unbiased". On the other hand, I believe a Bayesian treatment similar to that given by Jaynes in §5.2 of [3] would be quite satisfactory (see also Chapter 4 for a general presentation and §9.4 for an example dealing specifically with coin tossing and bias). Once again, the information brought by \(e_1\) through \(e_{92}\) contradicts \(e_0\); this is why it is surprising (or informative), not because the events have an intrinsic "surprising" character.

The author's insistence on the equivalence of the various results ("[E]ach one of these sequences is just as unlikely as 92 heads in a row.") glosses over the fact that the sequences differ in how compatible they are with the fairness assumption \(e_0\). Let us introduce the probability \(p\) of throwing heads. Then, \(e_0\) amounts to saying that the (prior) probability distribution \(f(p)\) of parameter \(p\) is peaked at \(0.5\) and has a certain width \(w\). The higher our confidence in coin fairness, the smaller \(w\).

It is only in the case of absolute certainty \(w \to 0\) (\(f(p)\) is a Dirac delta) that the results are equivalent. As soon as \(w\) exceeds a ridiculously small value, the evidence brought by the 92 heads dramatically shifts the peak of the (posterior) probability distribution \(f'(p)\) close to 1. A sequence with 46 heads, although exactly as improbable, has no such effect (at most, it may lead to a modest decrease in \(w\).)
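
The contrast between the two sequences can be sketched numerically. The post does not specify the form of \(f(p)\); the example below assumes a Beta prior, chosen only because its Bayesian update has a closed form, and uses the posterior mean as a stand-in for the position of the peak. The helper names and the Beta(5, 5) prior are hypothetical.

```python
from math import sqrt

def beta_update(a, b, heads, tails):
    """Conjugate update: Beta(a, b) prior on p -> Beta(a + heads, b + tails)."""
    return a + heads, b + tails

def beta_mean_std(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, sqrt(var)

# Hypothetical prior (an assumption): Beta(5, 5), centered on p = 0.5.
prior = (5, 5)
print("prior: mean = {:.3f}, width = {:.3f}".format(*beta_mean_std(*prior)))
for label, heads, tails in (("92 heads", 92, 0), ("46 heads, 46 tails", 46, 46)):
    mean, std = beta_mean_std(*beta_update(*prior, heads, tails))
    print(f"{label}: mean = {mean:.3f}, width = {std:.3f}")
```

With this prior, 92 heads pull the distribution towards \(p = 1\), while the balanced sequence leaves it centered on 0.5 and only narrows it. Within the Beta family the size of the shift is set by the ratio of the prior parameters to the 92 throws, so a much more confident prior moves less.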

The surprise is not related to the probability of a particular sequence, but to the extent to which it challenges our belief. I believe this statement to be rather trivial (or at least uncontroversial), and indeed Smith reaches pretty much the same conclusion in the last, and most interesting, section of the paper (to be discussed in a future post), although he cannot see the element of surprise in the coin toss experiment.

1. Martin Smith, Why throwing 92 heads in a row is not surprising, Philosophers' Imprint (forthcoming) 2017.
2. Had we known that all coins were double-headed, throwing only heads would not only be unsurprising, it would be certain.
3. E. T. Jaynes and G. L. Bretthorst, Probability Theory: The Logic of Science, Cambridge University Press 2003.


7 January 2017

Liberalism vs. Conservatism

I have always found the liberal/conservative distinction difficult to draw, largely due to the several meanings of each term (e.g. the concept of "liberal" in political science and its casual use in the United States, on the one hand, and in Europe on the other). Motivated by my recent reading of David Gress' From Plato to NATO, I tried to define each side by a set of principles that is as small and as general as possible. This is my first attempt (work very much in progress):

Liberal

(L1) Individuals are equal.
(L2) The individual precedes the community (ontologically).

Conservative

(C1) The community takes precedence over the individual.
(C2) The "essence" of the community defines a set of values (religious, national etc.) that limits individual freedom.

Below the fold I discuss some consequences of these definitions.

4 January 2017

From Plato to NATO - review

      I have spent some time reading through David Gress' From Plato to NATO. I was not very impressed by the book, but the exercise helped me reflect on the definition of liberalism and on how it differs from conservatism, so I took the opportunity to write a reaction (more than a review).

      Gress presents his whole interpretation as opposed to a "Grand Narrative" (GN) that has supposedly been very popular and that hides the true origins of Western civilization. His Introduction starts with:

      Liberty grew because it served the interests of power. [...] The key historical insight underlying this book is that liberty, and Western identity in general, are not primarily to be understood in the abstract, but as a set of practices and institutions that evolved, not from Greece, but from the synthesis of classical, Christian, and Germanic culture that took shape from the fifth to the eighth centuries A.D.

      This passage summarizes not only the author's conclusions, but also his strategy of discourse: he switches between the history of ideas and that of events, favoring the latter but also resorting to the former when needed.

      18 December 2016

      Business as usual at the White House

      After the generalized commotion surrounding the US elections died down, the feeling of surprise lingered in the press, combined with predictions of imminent disaster. Against this alarmist tendency, I'm putting forward three obvious points:
      1. Although Trump's victory was unexpected (i.e. went against poll predictions), it follows a long-term pattern of Republican and Democratic presidents alternating every eight years. This has been the case since Eisenhower, with the exception of Reagan's first term.
      2. Speaking of Reagan, there are some striking similarities with Trump: both are (were) charismatic showmen, but not very intellectual and with a penchant for made-up stories. Time will tell how deep this resemblance goes.
      3. In contrast with his populist campaign speeches, Trump's post-election declarations and his cabinet choices signal that he will probably follow very closely the Republican platform: pro-big business, anti-abortion, pro-Israel, anti-environment control, for increased military spending, tax cuts for the rich, free trade and reductions in welfare programs. How many of these points were also on Clinton's agenda is left as an exercise for the reader.
      Finally, what alarms me is not how far Trump is from mainstream Republicans, but rather how close the Republican party is to Trump.

      1 December 2016

      CNRS positions - the 2017 campaign

      The details of the 2017 campaign for permanent research positions at the CNRS (Centre national de la recherche scientifique) have been published in the Journal Officiel (see links below) and the submission site is open. The submission deadline is January 6th 2017. There are 211 open positions at the CR2 level (4 fewer than in 2016), 75 CR1 (2 fewer), 256 DR2 (+3) and 2 DR1 (+2). The total number has been stable over the last five years, as shown in the graph below:

      The official texts: CR2, CR1, DR2, DR1.

      20 November 2016

      The Ewald sphere

      The Ewald sphere is a widely used concept, but one that is quite difficult to grasp at first (at least it was for me, as well as for some of my colleagues). It can be seen as a way of converting vectors between the "real" space, in which the experiment is performed, and "reciprocal" space.