Comments on “Pop Bayesianism: cruder than I thought?”

Comments

acupuncture

Peter Steel 's picture

I am sorry to see that your opinion about acupuncture is plain wrong. There is a very large research base supporting Traditional Chinese Medicine nowadays, and while some obviously is of low quality, some of that which is available is of the highest order.
See for example

PubMed:
http://www.ncbi.nlm.nih.gov/pubmed/22965186
(many more articles on PubMed available of course)

Merck Manual:
http://www.merckmanuals.com/home/special_subjects/complementary_and_alte...

The World Health Organisation:
http://apps.who.int/medicinedocs/en/d/Js4926e/5.html

"Nature" magazine:
http://www.nature.com/nature/journal/v480/n7378_supp/

And even Harvard Medical School teaches a course in acupuncture nowadays:
http://cme.med.harvard.edu/index.asp?SECTION=CLASSES&ID=00342317&SO=N

Acupuncture: not worthless

Hi Peter,

Thank you very much for this. I'm definitely willing to have my mind changed by evidence.

However, I think you may be responding to something I didn't say. What I wrote was: "it has little practical value, and its elaborate theoretical framework is nonsense."

I didn't say acupuncture doesn't work at all, or that it is worthless. I deliberately chose it (over, say, homeopathy) because I knew that it has been validated as having some beneficial effect in some cases. Analogously, Bayes' Rule is actually useful in some cases; it's just that the Bayesians' claims for it are highly exaggerated.

I guess "little" is a subjective judgement. Are there cases in which acupuncture is significantly superior to alternatives? What I had read previously was that it has been shown to be superior to placebo for pain treatment, and that's the subject of the first article you cited. It concludes that "acupuncture is more than a placebo"; however, its superiority to placebo is "relatively modest," which doesn't suggest great practical value. Maybe useful for patients who are allergic to opiates?

The Merck piece says acupuncture "may" be useful as an adjunct treatment for other conditions. If there's better evidence than "may have some effect," I'd be interested to hear about it. (The WHO thing was too long to read, and didn't seem to try to assess the credibility of the studies.)

The Nature issue is on TCM generally, not acupuncture; it has only one acupuncture article, "Adenosine A1 receptors mediate local anti-nociceptive effects of acupuncture." In other words, acupuncture relieves pain as a result of neurotransmitter release.

That highlights the second half of my claim: "its elaborate theoretical framework is nonsense." The reasons acupuncture works evidently have nothing to do with yin/yang balancing, the five elements, qi meridians, or any of the rest of TCM's intricate philosophy. Further, my impression from talking to practitioners is that they mainly ignore the theoretical framework in practice.

Anthropologists have observed this as a general principle: magical/religious/healing systems often have a pretentious intellectual structure that has little to do with the routine course of the ritual. Its actual function is to create a priesthood, or in-group who can spout baffling jargon that justifies their power.

This seems to be the case for Bayesianism also. Leaders pepper their writing with allusions to the obscure metaphysics and math, which are only vaguely related to their actual conduct of reasoning.

If, after this, you still think I'm being unfair to acupuncture, please let me know why.

Thanks,

David

David -

Jason's picture

David -

I suggest you slightly misconstrue the point of Galef's video. You summarize her message as “don’t be so sure of your beliefs; be less sure when you see contradictory evidence.” Perhaps you would get more out of it if you viewed it rather as a suggestion to use probabilities AT ALL, rather than yes/no dichotomous beliefs.

For example, if I ask the average person, "Do you believe God exists?," most people would answer either "Yes" or "No." A few (agnostics) would answer "Maybe." How many people would say "I assign a 37% chance to God's existence"? Probably not many. Well, of course nobody's going to say that, you'd sound like a freak, but I read Galef's comments as saying that is how we should train ourselves to think, at least implicitly.

This I think is the meaning behind her segment describing the difference between changing one's mind - i.e. making a total switch from "I believe X" to "I believe not-X" in one big jump in the face of overwhelming evidence - VS. making small changes in one's probability assignment with each small piece of new evidence. In other words, beliefs in "grayscale" rather than "black and white."

One benefit of this is that we go from examining our beliefs purely for Correctness to also looking at how well they are Calibrated. My ability to form factually correct beliefs will always be limited by my available information, my cognitive abilities ("intelligence") and the time I have available for research and consideration of the topic. However, I can learn to be internally epistemically well-calibrated. That is, if I take all of the propositions to which I assign 20% probability - do 20% of them turn out to be correct? They should be.
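A minimal sketch of that calibration check, assuming you have kept a record of (stated probability, outcome) pairs. The function name `calibration_table` and the track record below are hypothetical, made up purely for illustration:

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Group (stated_probability, came_true) pairs by stated probability
    and return the observed frequency of true outcomes in each group."""
    buckets = defaultdict(list)
    for stated, came_true in forecasts:
        # Round to one decimal so that 0.19 and 0.21 land in the same group.
        buckets[round(stated, 1)].append(came_true)
    return {
        stated: sum(outcomes) / len(outcomes)
        for stated, outcomes in sorted(buckets.items())
    }


# Hypothetical track record: ten claims held at 20%, five held at 80%.
forecasts = (
    [(0.2, True)] * 2 + [(0.2, False)] * 8
    + [(0.8, True)] * 4 + [(0.8, False)] * 1
)

table = calibration_table(forecasts)
for stated, observed in table.items():
    print(f"stated {stated:.0%}  observed {observed:.0%}")
```

A well-calibrated record is one where the two columns roughly agree. With only a handful of claims per group the observed frequencies are noisy, so a real check would need many more entries than this toy example.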

This is useful for the same reason it is useful to focus on one's technique and personal performance in sport, not just the final outcome. Winning a match depends on many outside factors. Getting better over time is about internal factors, i.e. my own performance. Nevertheless this will tend to result in winning more often. Similarly if I think about things probabilistically and learn to be better calibrated, then over time my picture of the world will grow more accurate.

Furthermore, since I cannot update on beliefs if I assign them a 100% or 0% probability, I have to admit to less-than-complete certainty in any case, leading hopefully to more intellectual modesty.
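The claim about 0% and 100% follows directly from the formula. Here is a small sketch; the function name `bayes_update` and the likelihood values are assumptions chosen for illustration, not anything taken from the video:

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior P(H | E) from the prior P(H) and the two likelihoods
    P(E | H) and P(E | not H)."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator


# Illustrative evidence that is nine times likelier if H is true than if it is false.
for prior in (0.0, 0.5, 1.0):
    print(prior, bayes_update(prior, 0.9, 0.1))
```

A prior of exactly 0 multiplies the numerator to 0, and a prior of exactly 1 removes the competing term from the denominator, so in both cases the posterior equals the prior no matter what the evidence is. Only the intermediate prior moves.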

I hope this perspective is useful to you.

Jason

Grayscale

Hi Jason,

Thanks for your comment!

So, I have three queries about what you suggest.

1) Do most people really think in black and white? Or is this a straw man?

I suggest that "do you believe in God?" is a special case, because it's not really about a belief in the epistemic sense, it's a question of tribal affiliation. So let's set that aside. And, more generally, when asked "do you believe X", perhaps you are supposed to answer "yes" or "no"; and this may not reflect anything more than social convention.

I bet that experiments would show that everyone does keep track of uncertainty and some sort of informal measure of confidence. (I can't cite any evidence off-hand. If you think that's implausible, we can go looking.)

It's probably true that people are generally more confident than they should be, and education can correct that.

2) Are numerical values a good way to think about uncertainty in general?

Clearly there are cases when using explicit probabilities is the right approach. Times in which you have extensive frequency data, and reason to think the future will be similar to the past, are examples.

Part of my technical criticism of Bayesianism is to suggest that such cases are unusual. In most cases, using explicit numerical values will add confusion and spurious meta-level confidence, not clarity.

See my comments on this Slate Star Codex post.

3) Does anyone actually consistently use numerical probabilities in everyday situations of uncertainty?

I suspect that even the most committed Bayesians don't, in practice. If you forget where you left your keys, you don't make up a list of all the possible places they might be, with numbers attached to them.

As I mentioned in the OP, Julia Galef acknowledges this: she doesn't often do the actual numerical calculations in practice, despite having based her career on Bayesianism. My suggestion is that this is because doing so would be obviously dumb.

So I come back round to the view that the valuable teachings of Bayesianism are:

A) Be aware of uncertainty; and
B) Explicit use of probability theory is sometimes extremely useful.

Both points I agree with strongly. But pop-Bayesianism goes much further, into claims I find distinctly dubious.

I see your Buddhist blog

Scott Alexander's picture

I see your Buddhist blog criticizing Bayesianism and raise you a Bayesian blog criticizing Buddhism.

I actually have a point there, which is that I think you and he are making similar errors - attacking an entire philosophy because the popular well-known symbol of the philosophy used in very short introductions for laypeople doesn't contain the entire philosophy in and of itself.

I think Bayes' Theorem is pretty neat, but it's neat for reasons you probably can't figure out if you just stick to Bayes' Theorem. I think this essay gives some good examples, if you haven't already read it: http://yudkowsky.net/rational/technical . And from there you can branch out to all sorts of interesting things like complexity priors and calibration training and cognitive biases, but the basic insight is that reasoning can be mathematical and systematic and you can use knowledge to make it better.

Where to direct a critique

Hi Scott,

It's very nice to see you here. I hugely enjoy and admire your blog.

I see your Buddhist blog criticizing Bayesianism and raise you a Bayesian blog criticizing Buddhism.

That's really funny!

I think you and he are making similar errors - attacking an entire philosophy because the popular well-known symbol of the philosophy used in very short introductions for laypeople doesn't contain the entire philosophy in and of itself.

Hmm. What I tried, and probably failed, to convey in this post was that watching the video made me realize that there is this sort of very superficial introduction. And that made me re-think how to direct a critique. I had only been familiar with the more sophisticated presentations by Eliezer Yudkowsky and others on LW.

The central point of the critique in both cases would be the same. Namely, numbers are sometimes the right way to think about uncertainty; but in many/most cases they are unhelpful, or worse than useless because they give an illusion of understanding and control that isn't there. What I mostly find missing in the Bayesian world is an acknowledgement of that—much less an active investigation to discover heuristics for deciding when to use what sorts of techniques.

What I find bizarrely "spiritual" is the insistence that Bayesianism is The Answer. That phenomenon is something I want to understand, and I think I do half-understand. And I think my half-understanding is illuminating for other phenomena that are maybe not obviously related.

However, the way I'd present that explanation would be very different depending on who the audience is—people who find this video meaningful, or people who can read the "Technical Explanation of Technical Explanation."

A second uncertainty is whether to address the critique to the community itself, for the benefit of the community; or whether to analyze it as a sort of specimen, for the benefit of outsiders. (Or, of course, neither of the above!) Again, the presentation in those two cases would be quite different.

I think Bayes' Theorem is pretty neat, but it's neat for reasons you probably can't figure out if you just stick to Bayes' Theorem. I think this essay gives some good examples...

I've just now read the first half, and timed out. Then I looked for a conclusion and couldn't find one.

His radiator example at the beginning seemed to exactly address my criticism—that Bayesianism predisposes you toward the wrong kinds of models for most things. (I was amused to find that I got the right answer after reading only the first two sentences of the Verhagen quote.) But then he seemed to drop that and wander off into outer space. (To be clear, none of this stuff is mathematically difficult for me; I understand the technical details, I just disagree with their application.)

the basic insight is that reasoning can be mathematical and systematic and you can use knowledge to make it better

To that, I say: HELL YEAH! To the extent that y'all are promoting that, I'm totally on your side.

But, Bayesian stats is one small branch of mathematical understanding.

In the piece you linked, Yudkowsky writes:

Let's get it out of our systems: Bayes Bayes Bayes Bayes Bayes Bayes Bayes Bayes Bayes...

The sacred syllable is meaningless, except insofar as it tells someone to apply math.

Right. So why doesn't he get it out of his system? Here he's the one calling it a "sacred syllable." Apparently he's aware of the quasi-religious nature of what he's doing. What's up with that? This squicks me slightly. And then when Singularitarianism is added, the squick gets explosive...

David

an ex-acupuncturist nontheist, statistics fan Agrees !

Excellent write up, David!
This is something I have thought about but not explored - thank you for the summary of important points.

The hyper-rationality (deluding oneself about the role of rationality in everyday life) seems to me to be a 6/10 MTAS threat for atheists; for the general population I probably agree with you that it is a 2/10 threat.

So sending your counter-meme out there more actively could serve atheists superbly. Yet, I could see theists picking up on this and getting a simplistic summary of this post itself to dismiss the value of probabilistic skills too.

Ah, the complexity of human self and other manipulation.

Your last paragraphs on non-theistic eternalism are fantastic -- I will have to start sharing them in appropriate circles.

As a former acupuncturist, I largely agree with your evaluation. (countering Peter Steel -- to whom your comment was perfect). And as you said, Homeopathy would be a more dangerous element.

I look forward to hearing from the folks you responded to. I will be back to read more and think about it. Thank you.

More dangerous for atheists

Hi, Sabio,

Yes, I think that some kind of ideologized Rationality-worship is a much more serious risk for American atheists than the general population. Rationality, after all, was what saved most American atheists from theism. Worshiping it is practically inevitable as a stage in personal development. (I didn't do graduate work in mathematical logic by accident...)

In countries where atheism is the de facto default—much of Europe, now—Rationality-worship isn't particularly prevalent.

Still, the emotional need for ultimate explanations remains, and most atheists remain eternalists of some sort. Political ideologies and psychotherapy have been popular substitutes for Christianity in Europe.

I hope to provide a box of tools for atheists to identify and free themselves from such ideologies, by going to their root—the stances, and emotional needs they address—rather than arguing with belief systems.

Excellent post, I

Excellent post, I unsurprisingly agree with almost all of it.

I think if you looked at the history of Bayes cultists you would find that many of them are refugees from another cult, that of Ayn Rand and objectivism. If you are caught in that dismal trap, a new belief system that says you don't have to have absolute certainty about everything would be both a relief and an improvement.

Objectivism and Bayesianism

Thanks, that's an interesting connection!

I don't know much about Objectivism, but I gather it also worships some ideologized concept of Rationality. So this makes sense.

One thing I find interesting about Bayesianism is that it starts by acknowledging incomplete knowledge and inherent uncertainty. Those are manifestations of nebulosity, and so undercut all eternalistic systems. This is the right starting point for any realistic approach to meaningness.

What Bayesianism claims to offer is an optimal response to nebulosity. (In some rare situations, it actually does, and in those cases Bayesian decision theory is the right framework, of course.)

That makes it appealing if you can mis-use it to blind yourself to nebulosity in general. "Situation is ill-defined, unstable, complex, mostly unknown? No problem! We'll just apply Bayesianism, and that guarantees we'll take the best possible action!"

This eliminates any need for genuine doubt. You still don't know what the outcome will be, but you don't have to worry that you are approaching life totally wrong.

Certainty about uncertainty

Yes, you have captured it perfectly. They are in some sense working the same area as you, but instead of fully coming to terms with nebulosity they are trying to construct a solid layer underneath it. Good luck with that!

Meaningness

Jason's picture

David -

This conversation has moved in interesting ways since I was last here!

Your comments suggest you are looking at "pop-Bayesianism" through the lens of your own work on meaningness, which is probably unfair, at least as far as groups like Less Wrong are concerned.

Meaningness is about meaning(lessness), value, purpose, significance. Bayes (actually now I think they are all about Solomonoff Induction as The Answer, which I think has pretty obvious problems that are being overlooked) is about drawing conclusions and making decisions under conditions of uncertainty about FACT, not value.

Now, these groups are doing work on ethics elsewhere (in this area I think many are hindered by the fact that they consider the philosophical tradition wrongheaded and beneath them and so ignore the many contributions other thinkers have made on the same topics they are exploring), but at least at their more basic/fundamental levels, these Rationality groups are trying to formulate methods to come to more correct beliefs and to more effectively achieve one's ends (they do not necessarily specify what these ends should be).

You stated,

"What Bayesianism claims to offer is an optimal response to nebulosity."

I would say this is untrue, or at least only true in the area of factual uncertainty (vs. value-, purpose-, or meaning-type ambiguities).

"That makes it appealing if you can mis-use it to blind yourself to nebulosity in general. "Situation is ill-defined, unstable, complex, mostly unknown? No problem! We'll just apply Bayesianism, and that guarantees we'll take the best possible action!""

Do you think it is a priori impossible to develop best-practices for decision making under conditions of uncertainty?

Less Wrong-types think that it is possible, and they are trying to build it. Interpreting them charitably (and I think accurately), I would say they accept that there are problems with human cognition (heuristics & biases), they want to do better, they need a standard for what better means, and they use mathematical probabilities (with Bayes as the poster child) as that ideal standard. Their question is how to get real humans in real-world situations closer to that ideal. Currently there's a muddle in the middle.

Another way to say it would be that they are trying to apply the theoretical insights of (aspects of) probability theory and cognitive psychology in real life and/or daily life. A very different project than yours.

Nebulosity of fact too

Thanks for your continued interest!

I actually use "nebulosity" to analyze matters of fact as well as value. That's not mentioned in the introductory page on it. Unfortunately, most of even the structure of the book is currently missing from the site, which naturally causes all kinds of confusion! Highly frustrating; I wish I could get time to write more.

There's an underlying ontology, or more accurately anti-ontology, which is hinted at here. I'll make a case that breadbox-sized factual reality can't ever be precisely described.

I did a graduate seminar in Solomonoff stuff in 1985 or so. Beautiful mathematics; I loved it! But it has absolutely nothing to do with the real world. Trying to use it to fix the problem of finding Bayesian priors is a move of religious desperation.

Do you think it is a priori impossible to develop best-practices for decision making under conditions of uncertainty?

Probably. But the main point is that Bayesian methods are only useful when a situation is sufficiently well-characterized. You have to already mostly understand what is going on. This is the point that Bayesians seemingly either don't understand, or deliberately obscure.

One could make a list of features that have to minimally apply before Bayesian methods are useful. I.e. what "well-characterized" means in this context. I haven't seen such a list anywhere; have you? Maybe drawing this up would be useful...

When a situation isn't well-characterized, Bayesianism is worse than useless, because it makes you feel like you have control when you don't. Overconfidence can lead to disaster.

there are problems with human cognition (heuristics & biases), they want to do better

Absolutely. Teaching people about Kahneman-Tversky type stuff is really important, for instance. And everyone should learn the basics of probability in high school.

But probability isn't rationality—it's a technical tool, useful only in some circumstances.

Denying this is the central point of Yudkowsky's My Bayesian Enlightenment, which is about how he came to found pop Bayesianism. His enlightenment experience was realizing that probability theory is not a tool, it's The Answer To Life, The Universe, And Everything.

Another way to say it would be that they are trying to apply the theoretical insights of (aspects of) probability theory and cognitive psychology in real life and/or daily life. A very different project than yours.

Much less different than one would suppose, based on what I've written here so far! That may or may not become apparent as I write more.

I'm mostly trying to aim this book at a popular, spiritual-seeker type of audience. But it's deeply informed by having done a PhD in artificial intelligence, and having read a lot of cognitive psychology as part of that.

meaning

Jason Clark's picture

So is there meaning to life or not?

is there meaning to life?

Oh. Well, the short version of my view is:

  • There isn't a meaning to life; life is too complicated for that
  • Meanings are everywhere; we're awash in them
  • There is no ultimate source of meaning; and there are no inherent, objective meanings
  • However, meanings are not purely subjective either
  • There are many sources of meaning, some of which we don't understand and probably will never understand

Is that helpful? (I'm not sure what point of view you are coming from, so it's a bit hard to know how to respond.)

No One Wrote (or is writing) Your Life

@ David

You said:

I hope to provide a box of tools for atheists to identify and free themselves from such ideologies, by going to their root—the stances, and emotional needs they address—rather than arguing with belief systems.

It is an excellent project. On my site, I flip between 'arguing with belief systems' so as to point them to their root emotional uses (or pragmatic uses), and then 'going to our shared roots', hoping to point out how various belief systems can address them. (See my last diagram.)

So I agree with your project, but do feel arguing belief systems can be useful. And that being specific/concrete can be useful.

Segue:

I love your reply to Jason Clark on the question "Is there meaning to life or not?" And especially your approach that understanding a person's point-of-view can help us make more meaningful replies -- speaking in the abstract can often be a waste of time.

But some additional thoughts:

(1) The mistaken notion that there is some THING called "Life" has been clear to me for a long time. We are so easily fooled by our own invention -- language. But to say "life is too complicated" to have meaning seems odd -- because complicated things have meaning all the time. Perhaps a person's life is not a "purpose intended thing" and thus can't be evaluated as a whole to have meaning, much like a play or a novel. Only a mythologizer can accomplish that.

(2) I have no understanding of what you mean by "meanings are not purely subjective" unless by "meaning" you mean "patterned relationships." Which is a different sense than the normal nuance implied in "The Meaning of Life", I think. Point: "Meaning" has a lot of senses that get a conversation into knots quickly.

Specifics

Thanks for your point about life not being a thing! Correction gratefully accepted.

I agree that arguing with specific beliefs can be useful. I'm not going to do that for Christianity, for instance, because there are already many people doing a good job (including you).

Relatedly, I highly recommend the most recent blog post by Scott Alexander (who commented above), which is about how difficult it is to know what will be most helpful when blogging about controversial topics. He points out that Reddit's r/atheism has been hugely useful in giving some fundamentalist kids permission to exit, while discrediting atheism for some other people.

What is on this site so far is mostly very abstract, because it's the introduction. The "meat" of the book will be more specific.

I'm not sure how much to argue with particular non-theistic eternalist ideologies. I'm publicly flip-flopping about whether to do a detailed critique of Bayesianism, for instance. The point of that would be to show its problematic emotional dynamics, rather than to argue that The Other Leading Brand of statistics is more correct technically. But some technical argument is needed, to show that Bayesians systematically overlook the limited applicability of what is, actually, just a statistical technique, not an eternal ordering principle.

Some other examples are psychotherapy-ism, assorted political ideologies, progress-ism, Romanticism, quantum codswallop, utilitarianism, and UFO cults. The better-known ones of those already have been extensively critiqued, and I don't want to duplicate the literature on how communism functioned as a religion, for example. On the other hand, picking on the little-known ones seems unfair, like playground bullying or something.

I have no understanding of what you mean by "meanings are not purely subjective" unless by "meaning" you mean "patterned relationships."

Yes, that... Although maybe not in the sense you meant; I'm not sure.

Which is a different sense than the normal nuance implied in "The Meaning of Life", I think.

Well, part of my job will be to try to persuade you that it isn't a different sense.

To put it a different way, there are several different ideas wrapped up in "The Meaning of Life." One is that meanings, or at least "real" meanings, are necessarily objective, and inherent in something other than oneself. That idea is just wrong (as many people would now agree). If you subtract that, the only alternative might seem to be that meanings are mental objects, or subjective. That's also wrong, which is non-obvious.

I'll argue that the most simple, mundane meanings (like the meaning of breakfast) are relational—they involve you, your yogurt and jam, the spoon and table, the people you are eating breakfast with, and (to decreasing extents) everyone involved in creating that situation, and all the non-human actors who were also involved. The more of that stuff you remove, the less meaning is there.

Someone committed to the representational theory of mind will say that, no, you don't need any of that stuff, the meaning is just in your head, and it's good enough to represent all those other things. This idea is deeply wrong, but so ingrained in current American academic philosophy, and therefore cognitive science, that it's hard to shake people out of.

Much work to do...

Curiosities about Thomas Bayes

I don't think it contributes much to the discussion, but I found it interesting that it wasn't mentioned that Thomas Bayes was a Presbyterian minister.

From Wikipedia:

He is known to have published two works in his lifetime, one theological and one mathematical:

1- Divine Benevolence, or an Attempt to Prove That the Principal End of the Divine Providence and Government is the Happiness of His Creatures (1731)

2- An Introduction to the Doctrine of Fluxions, and a Defence of the Mathematicians Against the Objections of the Author of the Analyst (published anonymously in 1736), in which he defended the logical foundation of Isaac Newton's calculus ("fluxions") against the criticism of George Berkeley, author of The Analyst

By the way, I studied probability (and Bayes' theorem) in high school, and I liked it, but it was pretty obvious that it was pretty useless in everyday life (as is almost everything taught in high school, but that's another issue).

Redefining "God": as a method

To fight the naive magical thinking, the blinding patriotic rhetoric and the tribal exclusivism of religion I see several strategies:

(1) Fight the particular idiocies as they come up (frustrating but useful, even if not getting to the root of the problem)

(2) Fight religion in general (not my choice)

(3) Fight the aspects of the "Believing Mind" that generate those and similar secular habits. (my favorite)

(4) Redefine "God" so as to neuter (dis-empower) those aspects (very powerful, but not my path)

@Alfayate:
I agree with your points on Bayes. I also think many writers centuries ago could not escape their culture enough to allow any of the options above except perhaps #4. Some versions of "God" are far better than others, don't you think? Perhaps Bayes was, like others, using "God" as a tool to get people to believe what he does. We see the same done in Buddhism - re-definitions, re-workings, re-interpretations -- all methods to convince others of another path: harmful or helpful or both.

Value of Bayes' Rule

Also, the formula is almost never useful in everyday life.

On the other hand, once you understand the basic principles of probability, Bayes’ Rule is obvious, and not particularly important.

I disagree. Although it's true that the formula is almost never useful in everyday life in the sense that you'd need to do any calculations with it, it's very useful to understand the extent to which it's crucial for all your reasoning processes.

For example:

  • Understanding Bayes' Rule helps clarify just what kinds of things should count as "strong evidence" - it tells you that Y is strong evidence about X being true to the extent Y is the kind of thing that you would only see if X were true (i.e. only X and nothing else would make you see Y). That's a very generally applicable rule that's useful for evaluating the correctness of a lot of different things. Yes, it is certainly possible to understand the above even without knowing the exact formula, but at least I personally found that knowing the exact reason (read: the exact math) for that rule made it feel like I understood it better.
  • It also helps understand why people tend to hear what they expect to hear instead of what the other speaker is saying: see e.g. http://lesswrong.com/lw/hv9/rationality_quotes_july_2013/9alt . Not only that, it makes me personally more aware of the way in which I personally might misunderstand the claims or experiences of others, and reminds me to consciously consider alternative explanations to somebody's words/motives besides just the first explanation that pops to mind, since I know that my priors may or may not be correct.
  • http://lesswrong.com/lw/2el/applied_bayes_theorem_reading_people/ (not very happy with this post, but it should get the rough ideas across)
  • And generally it helps me remember that in order for my beliefs to be accurate, I have to update them in a way that actually makes them correspond with reality, and Bayes' theorem shows many of the necessary preconditions for that.
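The "strong evidence" point in the first bullet can be made concrete with the odds form of Bayes' Rule: posterior odds equal prior odds times the likelihood ratio P(Y | X) / P(Y | not X). What follows is a rough sketch only; the function name `posterior` and every number in it are assumptions picked for illustration:

```python
def posterior(prior, p_y_given_x, p_y_given_not_x):
    """P(X | Y) computed via the odds form of Bayes' Rule.
    Assumes 0 < prior < 1 and p_y_given_not_x > 0."""
    prior_odds = prior / (1 - prior)
    likelihood_ratio = p_y_given_x / p_y_given_not_x
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


prior = 0.1  # hypothetical starting belief in X

# Y is seen almost as often when X is false: weak evidence.
weak = posterior(prior, p_y_given_x=0.8, p_y_given_not_x=0.6)

# Y is hardly ever seen unless X is true: strong evidence.
strong = posterior(prior, p_y_given_x=0.8, p_y_given_not_x=0.01)

print(f"weak evidence:   {weak:.3f}")
print(f"strong evidence: {strong:.3f}")
```

In both cases Y is equally likely if X is true. What separates weak from strong evidence is how often Y would turn up anyway if X were false, which is the "only X and nothing else would make you see Y" condition.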

That said, I do agree that just saying "Bayes' theorem!" is often unhelpful, and that one would instead need lots of worked examples of its implications to make the theorem really useful. (And maybe you could just give the examples and skip the formula entirely.) I tried to do something like that with the "What is Bayesianism" article as well as the "Applied Bayes' Theorem" one, but I don't think I did very well on either.

Bayes Rule without numbers

Hi Kaj,

Thank you very much for this thoughtful comment; sorry to be so slow to reply to it.

Bayes' Rule is a way of calculating probabilities when you have numbers. The uses you describe here are ones that don't involve numbers. And, this seems to be almost always the case in "pop Bayesianism."

Basically, this doesn't make any sense to me! (And your examples, I'm afraid, don't help me.) But I'm trying to find a sympathetic interpretation. So maybe it goes something like this:

Bayes' Rule as such is almost never useful. However, it is valuable as a symbol of a general rational world-view, in which probability theory plays an important heuristic role. The Rule's value is as a sort of handle for the whole world-view, or reminder to think rationally.

It occurs to me that whether the Rule seems compelling may have to do with the order in which a particular person encounters various ideas. I learned classical probability and statistics and decision theory and game theory and predicate calculus years before I came across Bayesianism. With that background, Bayes' Rule is trivial and seems nearly useless.

But if Bayes' Rule was the first example of formal rationality that someone encountered, it could be really exciting! And rightfully so—even though it's formal rationality that's what's valuable, not the Rule as such. If that's how you got started, it would make a deep emotional impression, and would remain a useful "handle" to invoke the whole rest of rationality.

Formal and informal rules

Hi David,

your offered formulation isn't quite what I was after, though I do appreciate the attempt to find a sympathetic interpretation. Let me try to rephrase...

If I understood you correctly, you said that "pop Bayesianism" doesn't seem to make sense, because it claims to use Bayes' Rule - which is a rule for calculating probabilities with numbers - despite almost never actually invoking any specific numbers. Would any of these examples make more sense to you?

  • A physicist, after having learned Newton's laws of motion, knows that his infant child should be tightly secured while in a car, because the child will continue its motion even during a sudden stop, and won't be easily held in place.
  • A computer scientist, after having learned the formal definitions for computational complexity classes and had some experience with applying them, finds that knowledge useful in doing informal guesses of how hard some problem might be, despite never running any numbers while doing so.
  • A philosopher, after having learned formal logic, intuitively recognizes patterns of logic in people's statements, even without doing a formal analysis on them.

What I'm trying to convey here is the notion that once you have learned a formal rule describing some phenomenon, then you will start recognizing the structure described by that rule even in situations where you don't have the exact numbers... that even if you can't do an exact calculation, knowing the formalism behind the exact calculations will make it possible to get a rough hunch of how the thing in question could be expected to behave, given the formalism and your intuitive guesses of the general magnitude of the various numbers involved. The physicist might not bother to calculate the exact speed at which an unsecured child would be moving during a sudden stop, but he knows that it is far too fast.

The examples in my previous post, then, were intended as examples of ways in which knowing Bayes' Rule and having some experience of some of its applications lets you recognize the "Bayes structure" of things and get a rough handle of how they behave, even when you can't do the exact numbers.

All of that said, I do admit that I am still not personally sure of exactly how much of the "pop Bayesian" influence on my thought has actually come from an understanding of the actual Bayes' rule itself, since I've only spent a rather limited time actually plugging in numbers to it. It could be that what most benefited me were the various qualitative examples of how to reason in a probabilistic manner, and general discussions about things like priors, the origins of our knowledge, and what counts as valid evidence. I would guess the same to be the case with many other "pop Bayesians", so you could be right that it's more of a symbol than anything.

Still, I do think that there is value in also teaching the rule itself, since it can help make the various informal arguments more clear and concrete...

How well does that work?

Thanks for your reply!

I think I understand the point of your examples, although I find each of them unconvincing. (For instance, I don't believe there is any physicist who deduced that he should use a child's car seat (even without numbers). You do that because everyone else does, because it's legally required, and because it is an obviously good idea, based on your own felt experience with seatbelts and cars stopping suddenly.)

The claim that "knowing Bayes' Rule is heuristically valuable even in the absence of numbers" is empirical. Is there any evidence for it? How valuable? In what sorts of circumstances? I would be interested to see studies on this. I haven't found any (but haven't made any serious attempt to look for them, either!). I'd be much more favorably disposed toward informal Bayesianism if it had some evidential support, rather than the vague a priori handwaving I've read on LW. :-)

It could be that what most benefited me were the various qualitative examples of how to reason in a probabilistic manner, and general discussions about things like priors, the origins of our knowledge, and what counts as valid evidence.

That would be my guess.

Still, I do think that there is value in also teaching the rule itself, since it can help make the various informal arguments more clear and concrete...

To be clear, I am glad I know Bayes' Rule, and think it has some value; perhaps huge value in some rare cases.

However, I think that teaching the "balls in boxes" (frequentist) formulation for probability would probably be much better. Once you understand that formulation, it's easy to see how to apply it to many different sorts of probabilistic circumstances. And, you can trivially derive Bayes' Rule from a Venn diagram once you understand "balls in boxes." But the reverse doesn't apply (as far as I know?). That is, knowing Bayes' Rule doesn't make it easy to solve a broad range of probability problems. ("How many red balls should I expect to find in the box?")

The formulation is independent of the metaphysics; even if you reject frequentist metaphysics, being able to calculate solutions to balls-in-boxes problems is useful.
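A small sketch of the "balls in boxes" point, in Python. The box contents are hypothetical numbers chosen for illustration. Every probability is just a count divided by a total, and Bayes' Rule falls out as an identity between two ways of counting the same balls.

```python
from fractions import Fraction

# Hypothetical contents: (box, colour) -> number of balls.
counts = {
    ("A", "red"): 3, ("A", "blue"): 1,
    ("B", "red"): 2, ("B", "blue"): 6,
}
total = sum(counts.values())


def prob(pred):
    """Fraction of all balls that satisfy pred(box, colour)."""
    return Fraction(sum(n for key, n in counts.items() if pred(*key)), total)


p_a = prob(lambda box, colour: box == "A")
p_red = prob(lambda box, colour: colour == "red")
p_a_and_red = prob(lambda box, colour: box == "A" and colour == "red")

# Direct counting: of the red balls, what fraction sit in box A?
p_a_given_red = p_a_and_red / p_red

# The same number via Bayes' Rule: P(A|red) = P(red|A) * P(A) / P(red).
p_red_given_a = p_a_and_red / p_a
via_bayes = p_red_given_a * p_a / p_red

# A question Bayes' Rule alone does not answer: expected number of red
# balls in 6 draws with replacement from the pooled balls.
expected_red = 6 * p_red

print(p_a_given_red, via_bayes, expected_red)
```

The two routes to P(A|red) agree because both reduce to the same ratio of counts, which is the Venn-diagram derivation in numerical form.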

Might or might not work :-)

The claim that "knowing Bayes' Rule is heuristically valuable even in the absence of numbers" is empirical. Is there any evidence for it?

All of my evidence is purely anecdotal, with all the confidence that that implies. :-)

However, I think that teaching the "balls in boxes" (frequentist) formulation for probability would probably be much better. Once you understand that formulation, it's easy to see how to apply it to many different sorts of probabilistic circumstances.

Oh, I've been implicitly presuming all along that the target audience already knows the very rudiments of probability, and the basic frequentist formulation of it. Though now that you point it out, that's probably an artifact of me spending excessive amounts of time in geek circles, and not very representative of the population at large...

But yeah, if I personally were to write an introduction to these kinds of things now, I might not even mention Bayes' rule until some later "advanced course".

Anecdotal evidence

Well, hmm, it seems to me that LW has built an enormous edifice, or castle in the air, on anecdotal evidence for the efficacy of Bayesianism. Oughtn't that to bother those in the movement? Especially since evidence is the sacred principle of the movement?

I'm trying not to be obnoxious about this, but it seems like self-refuting silliness...

I've been implicitly presuming all along that the target audience already knows the very rudiments of probability, and the basic frequentist formulation of it.

Well, I implicitly presumed that too, until I saw Julia Galef's video, which is what prompted my original post here. CFAR, apparently, is targeting people-off-the-street, and teaching Bayes' Rule as the one thing you need to know about rationality.

Once I watched the video, I went back and re-read some LW stuff, and got the impression that LW is mostly doing that too. Does the average LW reader actually understand classical decision theory and how to apply it in real-world situations? I now think: probably not.

Confidence

Well, hmm, it seems to me that LW has built an enormous edifice, or castle in the air, on anecdotal evidence for the efficacy of Bayesianism. Oughtn't that to bother those in the movement? Especially since evidence is the sacred principle of the movement?

When you're saying "Bayesianism", here, are you referring specifically to using Bayes' rule as an instruction tool, or the whole broader emphasis on how to use probabilistic thinking? What I meant to say was that I admit that the specific claim of "knowing Bayes' Rule is heuristically valuable even in the absence of numbers" doesn't necessarily have strong support. But "Bayesianism" in the LW sense covers much more than that, and is more broadly about probabilistic thinking and e.g. the nature of evidence in general...

And from what little contact I've had with them, CFAR's agenda is even more broad - one of their instructors was in town and trialed a small workshop that was basically about staying calm in an argument and being able to analyze what each person in the argument actually wanted, so that you could deal with the ultimate issues instead of degenerating into a shouting match. Of course, they probably wouldn't claim that that falls under "Bayesianism", but I don't think they'd claim that Bayes' Rule is all you need to know about rationality, either. (Though I don't know CFAR that well.)

Does the average LW reader actually understand classical decision theory and how to apply it in real-world situations? I now think: probably not.

I would guess so, too. Though of course, the average LW reader is also likely to be a lurker, and less likely to be loudly praising Bayesianism; but the statement may quite likely still hold even if you change it to refer to the average LW commenter. Then again, I'm not sure of the extent to which the average LW commenter will be loudly praising Bayesianism, either... actually, now that I think of it, Yudkowsky is the only one whose LW writings I've seen explicitly saying that Bayesianism is something great and fantastic and incredible.

Bayesianism is a shifting target

Well, taking a step back here, my interest is in the emotional dynamics of spiritual ideologies. I find LW/Bayesianism to be one of those. (As well as an exploration of technical methods.) I first encountered Bayesianism around 1985, and it already had the messianic, quasi-religious This is The Answer to Everything! schtick going then.

Bayesianism (as a spiritual ideology) offers seeming certainty in the face of uncertainty, and seeming control in the face of chaos. Christianity does that too, but the distinctive feature of Bayesianism is its appeal to math (rather than divine revelation) as its proof. I think the purported proof decisively (and relatively obviously) fails. It's not that the math is wrong; it's that it doesn't apply.

I agree that probabilistic reasoning is massively valuable, and everyone should be taught it. If that were the agenda, I'd be totally on board. But I don't think it is. There's no clear explanation of classical probability theory on LW, as far as I know (but I haven't looked!). Instead, Bayes is invoked constantly; but there's very little practical explanation for how you'd use it.

If you don't have numbers, what do you do? How, concretely? Why should we believe that works?

When are Bayesian methods better than classical ones? I think there are about eight conditions that all have to be true for Bayesian methods to be superior. Does anyone on LW explain this? How often does that conjunction of conditions occur in practice? (Very rarely, I expect, but this is an empirical question.)

Probability, of whatever sort, is only a small part of rationality. Where did the rest go?

I'm probably guilty of overgeneralizing about CFAR from the little I know. I'm glad to hear they teach a broader range of methods. Still, if Galef's video is representative, I'm unimpressed.

LW's definition of Bayesianism

The closest thing that LW has to an introduction to classical probability is probably Eliezer's Intuitive Explanation of Bayes' Theorem, though I think that it already assumed some understanding of what probability is; I'm not entirely sure.

One thing that's worth keeping in mind is that LW's definition of "Bayesianism" is somewhat idiosyncratic. While Eliezer does often make disparaging comments about frequentist statistics, the main ideas aren't so much about how to apply formal Bayesian statistics - indeed, formal statistics of either kind haven't been very much discussed on LW. What LW calls "Bayesian" is more of a general mindset which says that there exist mathematical laws prescribing the optimal way of learning from the information that you encounter, and that although those laws are intractable in practice, you can still try to judge various reasoning processes and arguments by estimating the extent to which they approximate the laws.

"Frequentists" or "non-Bayesians" in LW lingo doesn't refer so much to people who use frequentist statistical methods, but to people who don't think of reasoning and probability in terms of laws that you have to obey if you wish to be correct. (Yes, this is a confusing way of using the terms, which I think is harmful.) For example, from Eliezer's Beautiful Probability:

And yet... should rationality be math? It is by no means a foregone conclusion that probability should be pretty. The real world is messy - so shouldn't you need messy reasoning to handle it? Maybe the non-Bayesian statisticians, with their vast collection of ad-hoc methods and ad-hoc justifications, are strictly more competent because they have a strictly larger toolbox. It's nice when problems are clean, but they usually aren't, and you have to live with that.

After all, it's a well-known fact that you can't use Bayesian methods on many problems because the Bayesian calculation is computationally intractable. So why not let many flowers bloom? Why not have more than one tool in your toolbox?

That's the fundamental difference in mindset. Old School statisticians thought in terms of tools, tricks to throw at particular problems. Bayesians - at least this Bayesian, though I don't think I'm speaking only for myself - we think in terms of laws.

Looking for laws isn't the same as looking for especially neat and pretty tools. The second law of thermodynamics isn't an especially neat and pretty refrigerator.

The Carnot cycle is an ideal engine - in fact, the ideal engine. No engine powered by two heat reservoirs can be more efficient than a Carnot engine. As a corollary, all thermodynamically reversible engines operating between the same heat reservoirs are equally efficient.

But, of course, you can't use a Carnot engine to power a real car. A real car's engine bears the same resemblance to a Carnot engine that the car's tires bear to perfect rolling cylinders.

Clearly, then, a Carnot engine is a useless tool for building a real-world car. The second law of thermodynamics, obviously, is not applicable here. It's too hard to make an engine that obeys it, in the real world. Just ignore thermodynamics - use whatever works.

This is the sort of confusion that I think reigns over they who still cling to the Old Ways.

No, you can't always do the exact Bayesian calculation for a problem. Sometimes you must seek an approximation; often, indeed. This doesn't mean that probability theory has ceased to apply, any more than your inability to calculate the aerodynamics of a 747 on an atom-by-atom basis implies that the 747 is not made out of atoms. Whatever approximation you use, it works to the extent that it approximates the ideal Bayesian calculation - and fails to the extent that it departs.

Bayesianism's coherence and uniqueness proofs cut both ways. Just as any calculation that obeys Cox's coherency axioms (or any of the many reformulations and generalizations) must map onto probabilities, so too, anything that is not Bayesian must fail one of the coherency tests. This, in turn, opens you to punishments like Dutch-booking (accepting combinations of bets that are sure losses, or rejecting combinations of bets that are sure gains).

You may not be able to compute the optimal answer. But whatever approximation you use, both its failures and successes will be explainable in terms of Bayesian probability theory. You may not know the explanation; that does not mean no explanation exists.

So you want to use a linear regression, instead of doing Bayesian updates? But look to the underlying structure of the linear regression, and you see that it corresponds to picking the best point estimate given a Gaussian likelihood function and a uniform prior over the parameters.

You want to use a regularized linear regression, because that works better in practice? Well, that corresponds (says the Bayesian) to having a Gaussian prior over the weights.

Sometimes you can't use Bayesian methods literally; often, indeed. But when you can use the exact Bayesian calculation that uses every scrap of available knowledge, you are done. You will never find a statistical method that yields a better answer. You may find a cheap approximation that works excellently nearly all the time, and it will be cheaper, but it will not be more accurate. Not unless the other method uses knowledge, perhaps in the form of disguised prior information, that you are not allowing into the Bayesian calculation; and then when you feed the prior information into the Bayesian calculation, the Bayesian calculation will again be equal or superior.

When you use an Old Style ad-hoc statistical tool with an ad-hoc (but often quite interesting) justification, you never know if someone else will come up with an even more clever tool tomorrow. But when you can directly use a calculation that mirrors the Bayesian law, you're done - like managing to put a Carnot heat engine into your car. It is, as the saying goes, "Bayes-optimal".
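The regression correspondence claimed in the quoted passage can be spelled out. This is the standard textbook derivation, not something given in the quoted post; the symbols $w$, $x_i$, $y_i$, $\sigma^2$, $\tau^2$ and $\lambda$ are introduced here as assumptions, along with the assumption of independent Gaussian noise.

```latex
% Assumed model: y_i = w^\top x_i + \varepsilon_i, with \varepsilon_i \sim \mathcal{N}(0, \sigma^2) i.i.d.
\begin{align*}
\log p(y \mid X, w) &= -\frac{1}{2\sigma^2} \sum_i \bigl(y_i - w^\top x_i\bigr)^2 + \text{const} \\[4pt]
% Uniform (flat) prior over w: the posterior mode is the least-squares fit.
\hat{w}_{\text{flat}} &= \arg\max_w \log p(y \mid X, w)
  = \arg\min_w \sum_i \bigl(y_i - w^\top x_i\bigr)^2 \\[4pt]
% Gaussian prior w \sim \mathcal{N}(0, \tau^2 I): \log p(w) = -\frac{1}{2\tau^2}\lVert w \rVert^2 + \text{const}
\hat{w}_{\text{MAP}} &= \arg\max_w \bigl[\log p(y \mid X, w) + \log p(w)\bigr] \\
  &= \arg\min_w \sum_i \bigl(y_i - w^\top x_i\bigr)^2 + \lambda \lVert w \rVert^2,
  \qquad \lambda = \frac{\sigma^2}{\tau^2}
\end{align*}
```

Under these assumptions, ordinary least squares is the posterior mode with a flat prior, and ridge (L2-regularized) regression is the posterior mode with a zero-mean Gaussian prior on the weights. Both are point estimates, not full Bayesian updates.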

The power in that mindset, I would say, is that you can no longer just believe in something or think something and just automatically assume that it's correct. Instead, you are forced to constantly question and evaluate your thought processes: is this the kind of an inference that would actually cause me to have true beliefs?

As for your question of how you actually use it... in my original comment I gave some examples of ways to check your reasoning processes by checking whether they follow Bayes' rule. There are a bunch of other LW articles that apply either Bayes' rule or the more general mindset of lawful reasoning. It feels a little rude to throw a dozen links at someone in a conversation like this, but since you asked for examples, some semi-randomly picked ones: Absence of Evidence is Evidence of Absence, Conservation of Expected Evidence, Update Yourself Incrementally, One Argument Against An Army, What is Evidence?, What Evidence Filtered Evidence?, and Privileging the Hypothesis (I'll single out "What is Evidence?" as one that I particularly like, and "Privileging the Hypothesis" points out a fallacy that I started to realize I was often guilty of, after I read the article).

I wouldn't agree with your characterization of Bayesianism - at least in its LW version - offering certainty, however. Yes, it talks about laws that might lead us to the truth if we follow them... but it also makes a strong attack on various ways to rationalize pleasant beliefs to yourself, and undermines the implicit notion of "I can just believe whatever I want" that many people subconsciously have, even if they wouldn't admit it out loud.

This undermining happened to me - for example, I used to have beliefs that I didn't want to talk about in public because I knew that I couldn't defend them, but reading LW for sufficiently long made me actually internalize the notion that if I can't defend it, I don't have any reason for believing in it myself. That might sound obvious, but it is a lot easier said than done.

The message I get out of LW-style Bayesianism, and what a lot of people seem to get out of it, is rather one of massive uncertainty - that we cannot know anything for sure, and that even if we make our best effort, nothing requires the universe to play fair and not screw us over anyway... certainly to me, it feels like reading LW made me become far, far less certain in anything.

As for your question of "where did the non-probability parts of rationality go" - well, I've only been discussing the probability parts of rationality because those were the topic of your original post. Certainly there's a lot of other rationality discussion on LW (and CFAR), too. Though LW has traditionally been most focused on epistemic rather than instrumental rationality, and probability is a core part of epistemic rationality. I gather that CFAR is more focused on instrumental rationality than LW is. I would assume that this "Checklist of Rationality Habits" on CFAR's website would be more representative of the general kind of stuff they do.

Shifting target and meta-certainty

Thanks for the CFAR link! I was clearly just wrong about that. Sloppy; sorry!

I titled my previous comment "Bayesianism is a shifting target", but forgot to actually write about that point! My original post was not an attempt at criticizing LW Bayesianism; it was musing about whether it would be useful to write such a critique, and how. What I observed is that the movement spans many levels of understanding, from the audience of Galef's video to people who read and understand Jaynes' book.

A successful critique would probably have to treat each level of understanding separately, because otherwise adherents will keep shifting:

"No, we don't teach Bayes' Rule as the foundation of rationality, we're doing something much more sophisticated." Except, at the maximally pop end of the spectrum, the video does do that.

"Yes, Bayesian methods are uncomputable and therefore useless, but we teach heuristic approximations instead." But mostly there are no justifications for the utility of the approximations other than to insist that the uncomputable methods are Ultimate Truth.

What I find, reading LW, is an enormous wall of writing, which gives a superficial impression of substance partly just from its mass. When I read any individual page in the Sequences, I find nothing there. Yudkowsky carries the reader along with his enthusiasm, his certainty, and throwing in just enough math to confuse people (but rarely enough that they can learn anything useful). But in the end, each page fails to make a substantive point. Or, that's my experience, anyway.

Bayesians - at least this Bayesian, though I don't think I'm speaking only for myself - we think in terms of laws.

This is a statement of eternalism (as I use that word). "There is a fundamental principle of the universe which makes everything make sense." It's this eternalism at the heart of Yudkowsky's philosophy that I find most interesting. (There's a stronger statement of eternalism at "My Bayesian Enlightenment".)

It's an unusual eternalism in its pretense to be grounded in math and science, and its explicit acknowledgement of un-knowing and uncertainty. That makes it almost attractive to me personally; but I think it comprehensively fails.

One thing that's worth keeping in mind is that LW's definition of "Bayesianism" is somewhat idiosyncratic... What LW calls "Bayesian" is more of a general mindset

This is the difficulty for me: debunking a general mindset as opposed to any specific claim.

there exist mathematical laws prescribing the optimal way of learning from the information that you encounter, and that although those laws are intractable in practice, you can still try to judge various reasoning processes and arguments by estimating the extent to which they approximate the laws.

I don't think this is true, except in quite restricted cases; and I think that pursuing this as a general strategy will have bad results.

I can argue this, but it would require extensive steelmanning, because (so far as I can tell) LW doesn't make the claim specific enough (and presents little if any empirical evidence).

If I did that, which would take months of work, would it change many people's minds? (Genuine question; I genuinely wonder whether it's worth doing.)

Here's part of the story, in case it's interesting:

For Bayesian methods to even apply, you have to have already defined the space of possible evidence-events and possible hypotheses and (in a decision theoretic framework) possible actions. The universe doesn't come pre-parsed with those. Choosing the vocabulary in which to formulate evidence, hypotheses, and actions is most of the work of understanding something. Bayesianism gives you no help with that. Thus, I expect it predisposes you to take someone else's wrong vocabulary as given.

the main ideas aren't so much about how to apply formal Bayesian statistics - indeed, formal statistics of either kind haven't been very much discussed on LW.

Quite so. (No substance! And statistics are boring and difficult; not what the readership wants.)

Instead, Bayesian methods are alluded to as the incomprehensible voodoo that justifies everything. How many LW readers are impressed with Solomonoff induction? How many of them actually understand it? I did a graduate seminar in that [or, to be scrupulously accurate, Kolmogorov induction, but that's essentially the same] at MIT. I think it's a really cool piece of math and I think it has absolutely nothing whatsoever to do with physical reality.

"Frequentists" or "non-Bayesians" in LW lingo doesn't refer so much to people who use frequentist statistical methods, but to people who don't think of reasoning and probability in terms of laws that you have to obey if you wish to be correct. (Yes, this is a confusing way of using the terms, which I think is harmful.) For example, from Eliezer's Beautiful Probability...

Thanks, this is a helpful clarification!

Let me reformulate: in the last couple of paragraphs you quoted, Yudkowsky is confusing two distinctions: frequentist vs. Bayesian, and ad hoc vs. exact methods. The latter pair concerns the relationship between statistics and probability. Ad hoc methods either do not exactly match what you'd get from probabilistic calculations, or depend on complicated and perhaps unstated assumptions about distributions. Exact methods are computations that conform perfectly to probability and make all assumptions explicit. Bayesian and frequentist probability always give exactly the same numerical answers; they differ only in metaphysics, not in mathematics.

So what he's saying here is that exact methods are better. Sure, when you can use them, of course they are! Sometimes you can't, and then sometimes ad hoc methods are better than nothing. This is a boring fact about how stats work in practice. You get the same throughout engineering and science: exact computations are often infeasible, and then you try to validate an approximation.

That is something everyone ought actually to know; but this valuable (if not very exciting) point was lost in the paean to the superiority of Bayesianism.

The message instead is that "Bayesianism is The Truth, and everything else is just a bunch of wrong-headed confusion." And, Bayesianism is not just the right way to do statistics. (Statistics is the most boring subject on earth; no one cares about it.) Bayesianism is the Right Way, period. Bayesian decision theory is the right way for everyone to live.

The power in that mindset, I would say, is that you can no longer just believe in something or think something and just automatically assume that it's correct. Instead, you are forced to constantly question and evaluate your thought processes: is this the kind of an inference that would actually cause me to have true beliefs?

Yes; that's a very good thing. But is the LW approach the best way to bring about that sort of questioning? There are many other pedagogical approaches available (e.g. "critical thinking" in the humanities, or just getting a decent general STEM education). Empirically, LW seems to lead people into metaphysical speculation and obsession with peculiar unlikely future scenarios.

I read some more of the articles you linked about "how to use it". I didn't find them compelling. The best point out particular fallacies (e.g. "privileging the hypothesis"), which is a good thing. But, I think there are clearer, more concise, more comprehensive presentations of good and bad reasoning elsewhere.

I wouldn't agree with your characterization of Bayesianism - at least in its LW version - offering certainty, however. Yes, it talks about laws that might lead us to the truth if we follow them... but it also makes a strong attack on various ways to rationalize pleasant beliefs to yourself, and undermines the implicit notion of "I can just believe whatever I want" that many people subconsciously have, even if they wouldn't admit it out loud.

Yes... What's interesting is a kind of dual motion: uncertainty at the object level, and certainty at the meta level. LW seems to go out of its way to provoke anxiety about factual uncertainty ("maybe evil computers are about to take over the world!"), which it then relieves at the meta level ("but we know the eternal, ultimate principles and methods that allow us to devise an optimal response strategy!").

This seems to me relevantly similar to the emotional appeal of salvation religions: "you will almost certainly go to hell! except that we have the magic principles and methods that guarantee eternal bliss instead."

The core of LW Bayesianism

Thanks for the CFAR link! I was clearly just wrong about that.

Glad I could be of help!

I can argue this, but it would require extensive steelmanning, because (so far as I can tell) LW doesn't make the claim specific enough (and presents little if any empirical evidence).

If I did that, which would take months of work, would it change many people's minds? (Genuine question; I genuinely wonder whether it's worth doing.)

I guess that would depend on what you'd count as changing people's minds? What exactly is the position that you find harmful and wish to subvert, here?

As you've noted, the core technical argument for Bayesianism isn't really made properly explicit in the Sequences. That being the case, it makes me dubious about whether a detailed refutation of the technical argument would do much to change people's minds, since the persuasiveness of the argument never came from there in the first place. Rather, I think that what has made the Sequences so attractive to people is the small practical insights. For example, if I were to briefly summarize the insights that made me link to the posts that I did in my earlier comment:

  • Absence of Evidence is Evidence of Absence: What the name says.
  • Conservation of Expected Evidence: If you would interpret a piece of evidence to be in favor of your hypothesis, you should interpret evidence that's in the opposite direction as being contrary evidence for the hypothesis; you can only seek evidence to test a theory instead of confirming it.
  • Update Yourself Incrementally: Hypotheses aren't all-or-nothing: it's fine to be aware of contrary evidence to a hypothesis and still hold that hypothesis, as long as the contrary evidence isn't too strong.
  • One Argument Against An Army: You should always make sure that you aren't selectively double-counting some of your evidence.
  • What Is Evidence?: Evidence is an event entangled by links of cause and effect with something that you want to know about; for an event to be evidence about a target of inquiry, it has to happen differently depending on the state of the target.
  • What Evidence Filtered Evidence?: If you're expecting the possibility of your source filtering away some of the evidence that would go against their conclusion, you shouldn't trust their conclusion too strongly.
  • Privileging the Hypothesis: A hypothesis has to already have considerable evidence in its support before you should even seriously consider it; it's no use to ask ”but there's no strong evidence against it, right?” if there's no strong evidence in favor of it.
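The first two items in the list above can be checked numerically. What follows is a minimal Python sketch with made-up numbers (the prior and the two likelihoods are illustrative assumptions, not taken from the Sequences). It shows that seeing the evidence raises the probability of the hypothesis, that failing to see it lowers the probability, and that the two posteriors, weighted by how likely each outcome is, average back to the prior.

```python
def posteriors(prior, p_e_given_h, p_e_given_not_h):
    """Return (P(E), P(H|E), P(H|not E)) for a binary hypothesis H and evidence E."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    p_h_given_e = p_e_given_h * prior / p_e
    p_h_given_not_e = (1 - p_e_given_h) * prior / (1 - p_e)
    return p_e, p_h_given_e, p_h_given_not_e

# Illustrative assumptions: a 30% prior, and evidence that is more likely if H is true.
prior = 0.3
p_e, up, down = posteriors(prior, p_e_given_h=0.8, p_e_given_not_h=0.2)

# Conservation of expected evidence: the expected posterior equals the prior.
expected = p_e * up + (1 - p_e) * down
print(round(up, 3), round(down, 3), round(expected, 3))
```

The sketch says nothing about where the prior or the likelihoods come from; it only shows that the qualitative rules are consequences of the arithmetic once those numbers are fixed.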

Now the argument that you just made against Bayesianism – of the tough part being the selection of the space of evidence and hypotheses and actions – sounds reasonable and probably correct to me. Does that argument shake my confidence in any of the insights I summarized above? Well, since I should update myself incrementally ;-), I need to admit that this is something that I should count as evidence against any insights derived from a Bayesian framework... but it doesn't feel like very strong evidence against them. The content of these posts still seems generally right, supporting the case for Bayesianism – and it intuitively seems like, even if we cannot yet figure out the answers to all the hard questions that you outlined, the fact that this line of reasoning has provided a coherent unifying framework for all these (obvious in retrospect) ideas suggests that the truth is at least somewhere in roughly the same direction as Bayesianism. I would expect that, to the extent that it effectively arrives at the truth, my brain still works along Bayesian lines - even if I don't know exactly how it does that.

I'm reminded of Scott's notion about people having different standards for what counts as a ”trivial” problem for a philosophical theory. To many LW users (me included), there exists an immense amount of small practical insights in the Sequences, ones which seem obvious in retrospect but which might have taken us a long while to think of ourselves: and a large part of those insights are presented in the context of a unifying Bayesian framework. So once you point out that there are deep unsolved - perhaps even unsolvable in principle - problems with formally applying Bayesian methods... then this particular technical failure of the framework, when that framework has provided plenty of practical successes, feels likely to register as a ”trivial” problem.

I would expect that if you wanted to actually de-convince people on LW of Bayesianism, it wouldn't be enough to just show the problems that it has – you'd also need to answer the question of ”if not Bayesianism, then what?”. Even if your argument was successful in reducing people's confidence in Bayesianism being correct, that still doesn't do much if they don't have any plausible alternative hypotheses.

As an aside, I suspect that part of the reason why many people found the Sequences so convincing and why many other people don't find them particularly insightful, has to do with the way that they are (were) read. The posts were originally written over an extended period, and many of us began reading them as a bit of interesting entertainment that would pop up in our RSS feeds once a day. In order to properly learn a concept, you need to encounter it several times with slight variations, and the actual message being spread out over many posts was originally helpful in this respect. It spread out the message over several days of reading and thus helped internalize it better than if there had been just one clear, to-the-point post - that you read once and then forgot.

Compare reading the Sequences over several years with reading, over a couple of days, a book that had all of those same insights expressed in a more concise way. One might be quite impressed with the book, but with there being so much information packed into such a short time, people would end up just forgetting most of it within a few days or weeks. The Sequences, in contrast, offered a series of posts examining a particular mindset from slightly different angles, keeping a few ideas at a time active in the reader's mind as the reader went on with her daily life. That gave the reader a much better opportunity to actually notice it when she encountered something related to those ideas in her life, making her remember it and all the related ideas better.

But now that nobody is reading the posts at a one-per-day rate anymore, the style and format seem harmful to getting the message across. When you're reading through a (huge) archived sequence of posts, unnecessary fluff will just create a feeling of things having been written in a needlessly wordy way.

Yes; that's a very good thing. But is the LW approach the best way to bring about that sort of questioning? There are many other pedagogical approaches available (e.g. "critical thinking" in the humanities, or just getting a decent general STEM education).

Well, the LW approach has clearly worked for some people. For others, other kinds of approaches are more effective. As far as I can tell, CFAR's agenda is to experiment with different kinds of approaches and figure out the ones that are the most effective for the largest fraction of the populace.

Empirically, LW seems to lead people into metaphysical speculation and obsession with peculiar unlikely future scenarios.

I would expect that a large part of the metaphysical speculation on LW is due to the LW approach appealing to the kinds of people who already enjoy abstract metaphysical speculation. As for the peculiar unlikely future scenarios... well, as someone who found the AI risk argument a major and serious one even before LW existed, I cannot consider it a bad thing if more people become aware of it and give it serious attention. :-)

Yes... What's interesting is a kind of dual motion: uncertainty at the object level, and certainty at the meta level. LW seems to go out of its way to provoke anxiety about factual uncertainty ("maybe evil computers are about to take over the world!"), which it then relieves at the meta level ("but we know the eternal, ultimate principles and methods that allow us to devise an optimal response strategy!").

This seems to me relevantly similar to the emotional appeal of salvation religions: "you will almost certainly go to hell! except that we have the magic principles and methods that guarantee eternal bliss instead."

That's an interesting observation. While people have made religious comparisons of LW before, I'm not sure if I've seen it phrased in quite this way earlier.

LessWrong 2.0?

Aha! With your most recent comment, I now understand things I did not before. Thank you very much indeed for your generosity and perseverance.

The two new things I think I understand:

  • The valuable part of LW, for many people, is a collection of simple, practical insights into reasoning, rather than the complex technical framework.
  • LW began with one smart guy live-blogging his thought process as he learned about some interesting stuff, which was exhilarating for readers at the time.

The small practical insights you listed are all excellent. Everyone should know them!

I'd suggest that the Bayesian framework is not necessary to understand any of them, and perhaps not helpful (except maybe for "Update Yourself Incrementally"). Maybe this depends on one's cognitive style. For some people, understanding that all those insights loosely relate to a mathematical framework would be satisfying and helpful; for others, the framework would be difficult to understand and an unnecessary distraction.

On the other topic, as a tentative model, I'd suggest that LW is attractive for at least these reasons:

  1. Small practical insights
  2. Yudkowsky's personality, writing ability, and willingness to engage readers
  3. A discussion community of other smart, interesting, decent people
  4. Presentations of some cool technical stuff (including Bayesian probability and statistics)
  5. A quasi-religion in which the cool technical stuff is misused as magic dogma

Of these, what I find most interesting is the quasi-religion. It's also what I find dangerous and repellent, and it's what I would hope to weaken.

I think the presentations of cool technical stuff are mostly not very good explanations, are often entwined with the religion, and make exaggerated claims that fail on technical grounds.

What I was wondering is whether pointing out the technical problems would help undermine the religion. If most readers are there for the small practical insights, and ignore the religion, then this wouldn't work; but also wouldn't be necessary!

The community of smart, interesting, decent people is what I find attractive about LW. Caring about them is mainly what makes me consider saving them from the religion.

This makes me wonder if it's time for LessWrong 2.0: a discussion community regarding rationality that isn't based on Yudkowsky's blog, that is non-religious, and doesn't invoke technical stuff it doesn't understand as magical justification.

Part of LessWrong 2.0 could be a new presentation of the small practical insights that didn't involve the big, wrong claims about Bayesianism. (It should involve the basics of probability theory, which could be Bayesian, although I'd advocate starting with the frequentist framework because it's easier to apply to a broader range of cases.)

Maybe LessWrong is already evolving into something like that? Maybe this is what CFAR is about? I don't know, because I'm not close enough to the community. If this is the way things are headed, I'd probably want to be more involved, in a positive way.

The evolution of LW

Glad to be of help, again. :)

Your tentative model sounds roughly correct. I'm not sure of exactly how much of the quasi-religion even is present in practice: while it's clearly there in the original Sequences, I haven't observed very much of it in the discussions on the site.

I would say that LW is already evolving in the direction you described. For example, looking at this year's "Promoted" articles, only 4 out of 43 are by Eliezer, and those four are all either summaries of math papers or, in one case, an advertisement of MetaMed. And like I already mentioned, I don't get a very strong "magic" vibe from the actual discussions in general. The only exceptions I can think of are some of the solstice meetup reports.

My impression is also that CFAR is very much what you described as LW 2.0, but I'm again not very familiar with them, as they're basically focused on doing things in real life and have been rather US-centric so far, while I'm here in Europe.

LW's Bayesianism

Vaniver's picture

David,

I thought you might appreciate some additional comments; for background, I'm the guy that wrote LW's introduction to Ron Howard-style decision analysis. LWers do seem to appreciate discussion of the technical bits, though I don't think everyone follows them.

As someone who understands all the technical detail, I agree with you that the quasi-religious aspects of LW are troubling. But I think a lot of that is that it's more fun for EY to talk that way, and so he talks that way, and I don't think that it's a significant part of the LW culture.

I think the actual quantitative use of Bayes is not that important for most people. I think the qualitative use of Bayes is very important, but is hard to discuss, and I don't think anyone on LW has found a really good way to do that yet. (CFAR is trying, but hasn't come up with anything spectacular so far to the best of my knowledge.)

I think that tools are best when they make the hard parts hard and the easy parts easy, which I think Bayes is good at doing and non-Bayesian tools are bad at doing. With Bayes, coming up with a prior is hard (as it should be) and coming up with a decision rule for the posterior is hard (as it should be). With, say, a null hypothesis test with p=.05, the prior and decision rule are both selected by default. Are they appropriate? Maybe, maybe not, but such discussions are entirely sidestepped without Bayes, and someone can go through the motions without realizing that what they're doing doesn't make sense. (This xkcd strip comes to mind. Yes, they used the wrong test, but is it obvious that they did so?)

For instance, I don't believe there is any physicist who deduced that he should use a child's car seat (even without numbers). You do that because everyone else does, because it's legally required, and because it is an obviously good idea, based on your own felt experience with seatbelts and cars stopping suddenly.

You might be interested in this anecdote, in which a physicist explains to a mother what deceleration is like. The relevant bit:

“But I’m holding on to him!”, the woman started saying, but by the time she made it to the end of her sentence, she was full-on screaming, completely losing her cool. “Nothing bad is going to happen to him!!”

“Right, please hear me out,” Syd got them back on track. “Say that you go from 30 miles per hour to zero in the space of about a foot and a half, right? That is an acceleration of about 20 times the earth’s gravity. That means that your 1-stone baby would go from 1 stone of weight in your arms, to about 20 stone moving away from you.”

The woman just stared at Syd, but her eyes showed that she was trying to envision holding on to a 20-stone baby.

“I guess what I am asking: Would you be able to hold a 20-stone item the size of a large watermelon in your arms in the middle of a car crash?”

“I…”, she said.

“Be honest.” he said. “No, you wouldn’t. You wouldn’t have a chance. Which means that if you guys had been in a crash, your baby would have gone flying straight into the windshield. There, he wouldn’t have had the benefit of being slowed down gradually: Instead of being slowed down by the seat belt over a distance of about a foot and a half, he would be brought to a stop in an inch or less. Then…”

Syd was on a roll, and I could tell that he was about to launch into a further explanation. He was so caught up in his own discussion, that he hadn’t seen how pale the woman had gotten. I shook my head at him. He spotted me, and stopped his train of thought.

“You get the picture; I don’t need to explain. Suffice to say that it’s unlikely that your baby would survive an impact like that.”

The woman went even more pale, and was hugging her child closely. She looked, for a moment, as if she might keel over, but her husband stepped in and put an arm around her.

Babies don't bounce

Vaniver, thank you very much for this!

I think the qualitative use of Bayes is very important, but is hard to discuss

Maybe that's my essential confusion?

I take "update confidence in the direction of evidence" for granted. Is that the whole of "the qualitative use of Bayes"? Or is there more to it? (The Kahneman/Tversky type errors are more subtle than just failing to update given new evidence, and need at least approximate quantities.)

Certainly, people fail to update sometimes. Maybe I live in a high-education/high-IQ bubble where people rarely fail to update (other than in special domains like religion). So maybe I misunderestimate the magnitude of this problem. It would be very interesting to get real-life quantitative data on that.

Note that I'm not arguing frequentism vs. Bayesianism. My main substantive critique of Bayesianism is that probability theory, of whatever stripe, is a very small part of rationality.

Thanks for the anecdote! I've done some updating, accordingly :-)

However, I don't think it really illustrates the point in the way you'd like. Kaj Sotala wrote:

A physicist, after having learned Newton's laws of motion, knows that his infant child should be tightly secured while in a car, because the child will continue its motion even during a sudden stop, and won't be easily held in place.

In the anecdote, someone (who, incidentally, is not a physicist as far as I could tell from searching the whole page) explains to someone else that they should use a baby seat because in a 30mph crash, the deceleration is equal to about 20g. I think the force of this explanation depends critically on that quantitative information (even if it's somewhat approximate).
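The 20g figure in the anecdote can be reproduced from constant-deceleration kinematics, a = v² / (2d). The Python sketch below is only a rough check: it assumes the deceleration is uniform over the stopping distance, and the function name `stopping_g` is made up for illustration.

```python
MPH_TO_MS = 0.44704   # metres per second in one mile per hour
FT_TO_M = 0.3048      # metres in one foot
G = 9.80665           # standard gravity, m/s^2

def stopping_g(speed_mph, distance_ft):
    """Average deceleration, in multiples of g, for a stop from speed_mph
    to zero over distance_ft (uniform deceleration assumed)."""
    v = speed_mph * MPH_TO_MS
    d = distance_ft * FT_TO_M
    return v ** 2 / (2 * d) / G

g_load = stopping_g(30, 1.5)         # belted stop over about a foot and a half
g_windshield = stopping_g(30, 1 / 12)  # stop in about an inch
print(round(g_load, 1), round(g_windshield))
```

Under these assumptions the first figure comes out at roughly 20 times gravity, matching the anecdote, and the one-inch stop is eighteen times harsher again, since the deceleration scales inversely with stopping distance.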

Some questions

Anonymous's picture

Hey David, you have quite a lot of thought-provoking stuff here on this website. I found a link to this article while browsing Lesswrong, and, after seeing you display a high level of sanity in comments, have read through most of your "book contents".

Unfortunately, I can't quite seem to make sense of your core ideas and I'd appreciate you clarifying a few points.

  • What is your definition of the word "meaning" that you use throughout the book? I have tried substituting both my own intuitive definition and the dictionary one - both substitutions seem to result in nonsense.
  • Are stances (within one specific dimension) discrete categories of thinking that each person falls squarely in, or are the two confused stances bounds of the worldview distribution, with most people being in-between?
    If it's the continuous case, is the complete stance just a roughly 50% A 50% B mix? You describe the completes like they are transcendent with respect to confuseds, not merely a partway point.

    It seems to me that the discrete case is what you believe in, but the continuous one is what fits my real-world experiences the best.

  • You go in detail on each of the complete stances, but are they the only/the only true compromises between confuseds? In most dimensions, I partially agree with/compromise between/partially accept both confuseds, and yet my worldviews of those dimensions look nothing like your descriptions of the complete stances.
  • Are your views falsifiable? Can you think up and tell us a few specific examples of evidence that, if observed, would cause you to abandon your whole worldview of "meaningness"?
  • The whole worldview kind of looks like yet another attempt at the Answer to Life, the Universe, and Everything, but you said you don't like those. What am I missing here that makes it special?

Not analytic philosophy

Hey, thanks, glad this is interesting (and I sound sane)!

Before addressing your specific questions, a meta point: I'm explicitly not doing analytic philosophy on this site. In fact, I'm not doing philosophy at all; what I offer are practical methods for dealing with questions of meaning.

Analytic philosophy starts out by saying "let's get really clear about what the relevant words mean, because otherwise we have no idea what we are talking about, and we'll just get more confused." That's often a good idea, but it won't work in this subject domain. Philosophers have tried to get clear about purpose, ethics, value, etc., for thousands of years, and have failed so far. In fact, they've mostly given up. (Probably that is because these issues are inherently nebulous.)

But these topics are too important to give up on, so we can't afford to wait around for definitional clarity before proceeding. We need good-enough practical answers for right now.

To the specifics:

I deliberately don't have a definition of "meaning." The closest thing is an informal taxonomy of "dimensions of meaningness." But that is explicitly open-ended; it's not meant to be complete or definitive.

Stances are momentary attitudes, not enduring beliefs. They are highly unstable, and anecdotally we all frequently bounce back and forth between opposite ones without noticing. (See the monologue in "Stances are unstable.") At any given time, we're probably squarely in one or the other of an opposing pair of confused stances. However, individuals have tendencies, so maybe I spend a lot more time in the nihilist stance than the eternalist one. You might say I'm 0.73 nihilist.

Yes, the complete stances are definitely not midpoints. Each pair of confused stances is based on a wrong assumption about how meaningness works, which is shared by both sides of the opposition. The complete stance rejects that assumption.

Example: eternalism and nihilism share the wrong idea that "real" meaning has to be objectively given. Eternalism asserts that there is objective meaning; nihilism denies that there is any meaning at all, since it (accurately) recognizes that there is no objective meaning; the complete stance recognizes that non-objective meaning is real.

You go in detail on each of the complete stances, but are they the only/the only true compromises between confuseds?

Well, the complete stances are not compromises. But I'd be interested in alternative ways of resolving the false dichotomies!

yet my worldviews of those dimensions look nothing like your descriptions of the complete stances

I guess I'm glad about that, in a way; it suggests that I'm doing something unusual and maybe new.

Are your views falsifiable? Can you think up and tell us a few specific examples of evidence that, if observed, would cause you to abandon your whole worldview of "meaningness"?

Due to the nebulosity of the subject matter, falsification would be difficult.

But since the orientation here is practical, "is it useful?" is a better question than "is it true?". So falsification is not really at issue. Assessing efficacy would be more valuable.

Anecdotally, the approach here has been helpful for some people. In the long run, I'd love to see it validated (or rejected or amended) empirically, in terms of "how helpful is it?"

For now, I've struggled to find time even just to write up the basic ideas. Rigorous testing will have to wait...

The whole worldview kind of looks like yet another attempt at the Answer to Life, the Universe, and Everything...

It kind of looks like that because it addresses the same sorts of issues that Cosmic Theories address. It's different in that it explicitly rejects the possibility of Answers. (That would be eternalistic.) Instead, it offers practical methods. The methods are explicitly not exhaustive, systematic, guaranteed, or well-defined. They're just heuristics.

This discussion strikes home for me

Grant's picture

As someone who lurks on LW without a background in classical statistics, I thought you might find my experiences helpful.

The "small, practical insights" are great! Unfortunately they are often found within walls of not-small text, inter-mixed with personal narrative and quasi-religious language. It all seems a bit cult-ish, which Eliezer obviously realizes and pokes fun at. However the feeling remains that he hasn't engaged the wider scientific community and defended his beliefs.

The lack of discussion of non-bayesian methods is especially troubling. I rarely use much statistics in my work (software engineering); when I do I look up specific tools which apply to the task at hand, use them, then largely forget about them. I just don't use statistics often enough to stay skilled in it. So when I encountered LW I thought "oh neat, they've found a simpler, more general theory of probability I can more easily remember".

So I started fiddling with and thinking about Bayes' Rule, and it seemed like you could infer some important insights directly from it. Things like "absence of evidence is evidence of absence" (which we already felt, but it's nice to know), and how priors influence our beliefs as new evidence is acquired.

Useful stuff, I thought. But without any contrast to classical methods, I don't know how unique or even valid these insights are.

When I need to Get Something Done that requires I learn a concept in statistics, decision theory or rationality (this has happened all of twice), I will always consult wikipedia and its peer-reviewed sources before LW. LW feels good for provoking thought and discussion, but not actually getting stuff done. In the Hansonian sense it feels "far", while sources such as wikipedia are "near".

Analogical inference as a possible alternative to Bayes

Hi David,

For years I've been extremely sceptical of Yudkowsky's claim that Bayesianism is some sort of ultimate foundation of rationality.

I think you got to the heart of the matter when you pointed out that Bayes cannot define the space of hypotheses in the first place; it only works once a set of pre-defined concepts is assumed. As you state:

"The universe doesn't come pre-parsed with those. Choosing the vocabulary in which to formulate evidence, hypotheses, and actions is most of the work of understanding something."

Exactly so!

What you describe is the task of knowledge representation, or categorization, which is closely related to the generation of analogies, and is PRIOR to any probabilistic calculations. Now it may turn out to be the case that these things can be entirely defined in Bayesian terms, but there is no reason for believing this, and every reason for disbelieving it. Some years ago, on a list called the everything-list, I argued the case against Bayes and suggested that analogical inference may turn out to be a more general framework for science, of which Bayes will only be a special case.

Here's the link to my arguments:
https://groups.google.com/forum/#!topic/everything-list/AhDfBxh2E_c

In my summing up, I listed '5 big problems with Bayes' and pointed out some preliminary evidence that my suggested alternative (analogical inference) might be able to solve these problems. Here was my summary:

(1) Bayes can’t handle mathematical reasoning, and especially, it can’t deal with Godel undecidables
(2) Bayes has a problem of different priors and models
(3) Formalizations of Occam’s razor are uncomputable and approximations don’t scale.
(4) Most of the work of science is knowledge representation, not prediction, and knowledge representation is primary to prediction
(5) The type of pure math that Bayesian inference resembles (functions/relations) is lower down the math hierarchy than that of analogical inference (categories)

For each point, there's some evidence that analogical inference can handle the problem:

(1) Analogical reasoning can engage in mathematical reasoning and bypass Godel (see Hofstadter, Godelian reasoning is analogical)
(2) Analogical reasoning can produce priors, by biasing the mind in the right direction by generating categories which simplify (see Analogy as categorization)
(3) Analogical reasoning does not depend on huge amounts of data thus it does not suffer from uncomputability.
(4) Analogical reasoning naturally deals with knowledge representation (analogies are categories)
(5) The fact that analogical reasoning closely resembles category theory, the deepest form of math, suggests it’s the deepest form of inference

Base-rate fallacy

I just watched Galef's brief video, and I must say the point of her talk seemed to me to be something that you have apparently totally missed. What she is talking about is a formal but flawed mode of reasoning used frequently by scientists, and something that happens all the time when people use informal reasoning. It's called the base-rate fallacy, it's crappy reasoning, and when you know what Bayes' theorem means, it's obviously wrong: P(H | D) is not the same as P(D | H). This is what her example about being or not being a good driver was about - it's not enough that the hypothesis fits the data, you must look at how other hypotheses fit the data also.
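To make the gap between P(D | H) and P(H | D) concrete, here is a small Python sketch using the good-driver framing. All the numbers are hypothetical, chosen only to show the effect: the data fit the hypothesis almost perfectly, but they fit the alternative nearly as well, so the posterior barely moves.

```python
def p_h_given_d(prior_h, p_d_given_h, p_d_given_not_h):
    """Bayes' theorem for a binary hypothesis: returns P(H | D)."""
    numerator = p_d_given_h * prior_h
    return numerator / (numerator + p_d_given_not_h * (1 - prior_h))

# Hypothetical numbers: H = "I am a good driver", D = "no accident last year".
p_d_given_h = 0.99       # good drivers almost never crash in a given year
p_d_given_not_h = 0.95   # but neither do most bad drivers
post = p_h_given_d(0.5, p_d_given_h, p_d_given_not_h)

print(p_d_given_h, round(post, 3))
```

With these assumed inputs, P(D | H) is 0.99 while P(H | D) stays close to the 0.5 prior, which is the point about checking how well the other hypotheses fit the data.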

For your convenience, regarding the base-rate fallacy, see my brief article
http://maximum-entropy-blog.blogspot.com/p/glossary.html#base-rate-fallacy
and linked material.

(By the way, the commenter above me, Marc, complains that Bayes' theorem doesn't specify a hypothesis space. This is correct, it's called theory-ladenness, and it's just something we have to live with - no other procedure can provide this, or any alternative. To complain about not having the hypothesis space laid out in detail is to complain about not being omniscient. If there was some unique, true hypothesis space, attainable by some procedure, then what would it look like? Why would it contain any hypotheses beyond the one true proposition? Where would this miraculous information come from?)

Where does the hypothesis space come from?

Yes, I'm familiar with the base rate fallacy.

In order to apply Bayes' Theorem to any specific problem, you have to first have a specific hypothesis space. Different problems require different hypothesis spaces.

Unless I'm missing something, a hypothesis space for a particular problem does have to be laid out in detail before you can use Bayes' Theorem. In Galef's example, the space is {"I am a good driver", "I am not a good driver"}.

Clearly, many different hypothesis spaces are possible for any given problem. There is no "right" hypothesis space—other than the one that contains only the correct answer—but some are better than others. Choosing a good hypothesis space is typically most of the work of using Bayes' Theorem.
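A short Python sketch of this point, with made-up priors and likelihoods: the update function below cannot even be called until someone has written down a hypothesis space, and two different spaces for the same evidence give answers that are not directly comparable. The three-way split is a hypothetical alternative, not something from Galef's video.

```python
def bayes_update(priors, likelihoods):
    """Posterior over an explicit, finite hypothesis space.

    priors and likelihoods are dicts keyed by hypothesis name;
    likelihoods[h] is P(evidence | h)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: p / total for h, p in joint.items()}

# The two-hypothesis space from Galef's example; the numbers are assumptions.
coarse = bayes_update(
    {"good driver": 0.5, "not a good driver": 0.5},
    {"good driver": 0.99, "not a good driver": 0.95},
)

# A finer, equally hypothetical space for the same evidence.
fine = bayes_update(
    {"good driver": 0.4, "average driver": 0.4, "bad driver": 0.2},
    {"good driver": 0.99, "average driver": 0.97, "bad driver": 0.90},
)

print(coarse)
print(fine)
```

The arithmetic is the same in both calls. Everything that differs between them was decided before Bayes' Theorem was applied.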

My following post is about that. You might find it interesting.

Unconvinced of the harm

Fhyve's picture

Hi,

I agree with the pedagogical issues with teaching Bayes, and the issue of the worship of the almighty Bayes' rule.

However, you mention that pop Bayesianism might do more harm than good (beyond the religiosity) and I am not convinced. The only evidence you have given is that you have observed some Bayesians being very confident in beliefs that you think that they shouldn't be confident in, though you don't tell us any of those. I would like to hear some of those. And are these beliefs associated with the meme cluster that Bayesianism tends to be attached to (LW things like FAI, cryonics, etc.)?

By the way, here is a brief summary of the Bayes unit at CFAR. The basic tool is to use how surprised you are by certain odds to approximate numerical probabilities based on your beliefs. You use this to generate an approximate numerical prior in the form of odds, and a likelihood ratio, then multiply to get an order of magnitude approximation of the posterior. I find the most useful part to be the calculation of the likelihood ratio.

Would I be surprised if he got in a life-threatening car accident 1 time for every 1,000,000 times he goes out with friends? Yes? How about 5:1M? Yes? 20:1M? Not really. Let's go with 10:1M (you could probably use actual stats for this). Now, would I be surprised if he got in an accident 1 time for every 10,000 times? Yes. [etc.] Ok, let's go with 30:1M. So a good prior would be about 20:1M.

Now this probably isn't the greatest example since the numbers are so large, but I think that the likelihood ratios are the interesting part:

Now let's say he got in a car accident. How often would he fail to call by 11PM? Essentially 1/1. Now let's say he didn't get in an accident. How often would he fail to call by 11PM? [...uses above method for finding a probability...] about 3/10, he is a little disorganized and forgetful. So our likelihood ratio is 10:3 or about 3:1. So he is only 3 times more likely to get in a fatal accident if he fails to call, or 60:1M chance, or 0.006%. I should stop my unreasonable worrying.
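The arithmetic in this example is the odds form of Bayes' rule: posterior odds equal prior odds times the likelihood ratio. Here is a minimal Python sketch using the figures quoted above (a 20:1M prior, and likelihoods of 1 and 3/10). The helper names are made up for illustration, and exact fractions are used only to keep the rounding visible.

```python
from fractions import Fraction

def update_odds(prior_odds, p_e_given_h, p_e_given_not_h):
    """Posterior odds = prior odds * likelihood ratio."""
    return prior_odds * (p_e_given_h / p_e_given_not_h)

def odds_to_prob(odds):
    """Convert odds in favour (as a ratio) to a probability."""
    return odds / (1 + odds)

prior_odds = Fraction(20, 1_000_000)   # the 20:1M prior from the example

# Likelihood ratio 10:3, i.e. P(no call | accident) = 1 against P(no call | no accident) = 3/10.
exact = update_odds(prior_odds, Fraction(1), Fraction(3, 10))

# The example rounds the ratio down to 3:1 before multiplying.
rounded = prior_odds * 3

print(float(odds_to_prob(rounded)) * 100, "percent")
print(float(odds_to_prob(exact)) * 100, "percent")
```

With the ratio rounded to 3:1 this gives the 60:1M (about 0.006%) figure in the example. Keeping 10:3 exact gives slightly more, which is well within the order-of-magnitude accuracy the method aims for.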

Anyways, as some other people have mentioned, I find that the most useful part of Bayes is the qualitative stuff. It also tells you why you should update in the ways that seem intuitively reasonable.

The rest of the video goes on to say that Bayesianism boils down to “don’t be so sure of your beliefs; be less sure when you see contradictory evidence.”

Now that is just common sense. Why does anyone need to be told this? And how does the formula help?

What do you mean by common sense? And I think you are being a little bit optimistic. These things (and other such things mentioned in your article) seem like common sense upon reflection, but you probably won't notice your brain violating those rules anyways.

Overconfidence and common sense

Hi, Fhyve,

Hmm, yes, my worry about Bayesianism being actively harmful is theoretical and anecdotal, not properly empirical. In my defense, I did say "maybe harmful," and also called it "mostly harmless"!

The poster child for probability theory being harmful is the 2008 financial crash. That had multiple causes, but among the most important was the misuse of probability theory to model things that it can't model. There was a near-universal consensus, based on probability theory, that significant losses in mortgages were effectively impossible. Presumably most of the financial quants who screwed up used frequentist methods, not Bayesian ones, and possibly if they had been Bayesians, they wouldn't have screwed up. But I doubt it; I think Bayesians would have made more-or-less the same errors. As far as I know there weren't any Bayesians predicting doom in 2006. (I predicted doom then, and made a lot of money shorting mortgages, btw.)

I think LessWrong participants are much too confident in utilitarianism. However, utilitarianism is probably attractive to LessWrong folks for its own reasons, and Bayesian reasoning is not especially to blame.

I think LessWrong participants are too confident in the efficacy of informal Bayesian reasoning. Their confidence in this seems to be very high. I've gradually adjusted my confidence in that upward, through this dialog, but it's still not very high. I would love to see some controlled studies, rather than anecdotes and a priori arguments.

I'm actually more confident about FAI and the Singularity than LW folks. I think their probability in the next 40 years is < 0.1, whereas the average LW person might say 0.5. The difference may be due to the fact that I've actually worked in AI and know the field, whereas LW folks generally haven't.

I think you are being a little bit optimistic [about people's native ability to do informal probabilistic reasoning].

I think you are right. I already thought educating everyone in probability is important (I said so in my original post), and the dialog here and on Scott's site has reinforced that belief.

There's a nice relevant post today from Katja Grace here. We are all mostly stupid and wrong, and reiterating the obvious is necessary.

Singularity timelines

I'm actually more confident about FAI and the Singularity than LW folks. I think their probability in the next 40 years is < 0.1, whereas the average LW person might say 0.5.

The 2011 Less Wrong Survey had a question for what people thought was the most likely date of the Singularity. "The mean for the Singularity question is useless because of the very high numbers some people put in, but the median was 2080 (quartiles 2050, 2080, 2150)." The results page of the 2012 survey doesn't mention the Singularity dates, but downloading the raw data, it looks like that year's quartiles were 2045, 2060, 2080.

It might also be relevant to mention the results for the "Which disaster do you think is most likely to wipe out greater than 90% of humanity before the year 2100?" question. In 2011, "Unfriendly AI" got 16.5% of the votes and came out second (behind bioengineered pandemics at 17.8%). In 2012, it got 13.5% of the votes and came out third (behind bioengineered pandemics at 23% and environmental collapse at 14.5%).

Singularity timing

Thanks, Kaj, data are always good!

Generally, they confirm that my subjective estimate of the likelihood of a Singularity is lower than the average LW reader's. But, again, I don't think this has much to do with Bayesianism as such. If there's any connection, it's the attraction of general geekiness.

When I started doing AI in the early 1980s, I thought we'd have human-level intelligence within a decade, or two at most, and this seemed the typical view of researchers at the time. (Oddly, we didn't think about the implications of superhuman intelligence, besides "wow, cool!".) Perhaps timing predictions get pushed forward at about the same rate that time passes, because there is no progress on the fundamental problems.

What I find particularly weird

Niv's picture

What I find particularly weird about this LW obsession with the Singularity is that I haven't seen on LW any evidence whatsoever that people are getting results that bring them closer and closer to it.

Is the Singularity worth obsessing about?

Hmm, well, I'm not sure it's true that there's no evidence of progress. MIRI, for example, has papers with some evidence. Mostly people cite Moore's Law, which is not entirely irrelevant. There are also attempts at measuring the rate of improvement in general algorithms, for instance, which again I think is extremely weak evidence, but not zero.

I think it's reasonable for a small number of people to spend time and money thinking about a possible Singularity, just in case. I think no one else should lose sleep over it.

I think the fascination with it is partly "wouldn't it be cool" (like cold fusion), and partly quasi-religious. And it's easier to fool yourself into thinking you understand something about AI than to fool yourself into thinking you understand something about cold fusion.

Kaj, I'm looking at the histograms of time-to-AI predictions in your 2012 paper with Stuart Armstrong, and the modal prediction is 16-20 years. This is markedly different from the LessWrong survey results you mention. I find that striking; but I've no idea what it implies about LessWrong (or AI). Any thoughts about that?

This is markedly different

This is markedly different from the LessWrong survey results you mention. I find that striking; but I've no idea what it implies about LessWrong (or AI). Any thoughts about that?

No, not really. I could try to come up with a hypothesis, but I'd just be making stuff up. :-)

Incidentally, the reason why I personally find AI in 50 years to be quite plausible is the progress we're making in neuroscience. We've already successfully created brain prostheses that replicate hippocampal and cerebellar function in rats, and there are claims that we wouldn't need any big conceptual breakthroughs to implement whole brain emulation, just gradual improvement of already existing techniques. One distinguished computational neuroscientist was also willing to go on record about us not needing any huge conceptual breakthroughs for creating neuroprostheses that would mimic human cortical function, and co-authored a paper about that and its consequences with me.

If we're already this far now, it wouldn't seem at all implausible that we'd have reverse-engineered the building blocks of intelligence within the next 50 years. Though obviously that's also what the AI gurus of 50 years ago thought, so I wouldn't say that "AI within 50 years" is certain, just plausible.

I agree that often

27chaos's picture

I agree that often Bayesianism is not about Bayes. Still, I think you underestimate the value of the framework of thought. Personally, I found Conservation of Expected Evidence to be a useful idea that was not obvious to me in advance. Similarly, I never realized how important priors were in making predictions. Finally, I think the Bayesian approach lends itself very easily to considering several different possibilities at once. These skills/ideas can be taught outside a Bayesian framework, but I don't see any compelling reason to avoid it. While the ideas might not be exclusive to Bayes, they still deserve to be promoted. And since these ideas are implicit in the theorem, I don't mind that the theorem is the focus of promoting these ideas.
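For readers who haven't met it, Conservation of Expected Evidence is usually stated as the identity below, which is just the law of total probability applied to a hypothesis H and a possible observation E (the symbols are labels chosen here, not taken from the comment):

```latex
P(H) \;=\; P(H \mid E)\,P(E) \;+\; P(H \mid \neg E)\,P(\neg E)
```

In words: the prior equals the probability-weighted average of the possible posteriors, so if seeing E would raise your confidence in H, then failing to see E must lower it.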

I do think that Bayes is often used as an intimidation tactic or shibboleth. But that isn't the theorem's fault; no matter what idea was used in its place, similar events would occur. Criticizing the way people deploy Bayesianism is fine, but as we both agree, much use of Bayesianism is not about Bayes, and so even if bad ideas are also in the area, "true" Bayesianism still seems like a good thing to support.

Above, you claimed that a

27chaos's picture

Above, you claimed that a major problem with Bayesianism is that the universe does not come prepackaged with hypotheses. However, human brains are born with intuitions about causality and inference, which I think suffice as a starting point for Bayesianism or something closely akin to it, at least once badly conflicting intuitions are recognized and repaired through reflective equilibrium. I do not think any epistemic framework can avoid relying on human inferences, so I do not see why you think it is a problem with Bayesianism that the universe does not hand us hypotheses or reference classes or objective priors.

Choosing a hypothesis space is difficult

Hi 27chaos,

Choosing a hypothesis space is difficult in practice. The fact that the universe doesn't supply them is not just a theoretical objection.

My following post "How to think real good" discusses this problem, and suggests various heuristics for hypothesis-space selection. Surprisingly little seems to have been written about this by other people!

Instrumental rationality

Sorry to dig up this old post; again, there's a topic here that's been in my head a lot recently, on and off. I've tried not to make this too ranty: I really like the LW people but have a similar fascination/frustration dynamic to you.

In the comments here you write:

Part of LessWrong 2.0 could be a new presentation of the small practical insights that didn't involve the big, wrong claims about Bayesianism.

I think LW has already made some nice steps towards this under the heading of 'instrumental rationality'. The bit that interests me most is that that side of LW is often very insightful on the kinds of preverbal/emotional content that ideas have associated with them: there's a pretty sophisticated vocabulary of 'ugh fields' and 'affective death spirals', and Alicorn wrote a whole sequence of posts on introspection and trying to pick apart the background affective tone you associate with beliefs, with the intention of maybe changing them (or in Silicon Valley terms 'hacking yourself to liking different stuff').

This is very close to the territory I'm going over with my mathematical intuition obsession, so it's helpful for me to read. And they write in a language I can understand! Normally I'm worried that I'm going to have to go and read reams of obscure text by continental phenomenologists or something, so I definitely appreciate this.

CfAR seems to do even more of this, but I think they're also increasingly tightly coupled to MIRI/x-risk... not really sure what's going on there.

I really don't understand how this side of LW fits with the side that's obsessed with formalising everything in some improbably narrow mathematical framework. They seem completely at odds. And maybe I'm being unfair but it always seems to have the feeling on LW of being a kind of sideshow, like it's being treated as a set of interesting soft questions for the less technically minded, while the grown-ups get on with the serious business of trying to make ethical statements take values in the real numbers or whatever. I'm convinced it's the good bit though!

LessWrong 2.0

With hundreds of participants, LW was always diverse, with both differences in what people found most interesting, and substantive differences of opinion. That may account for some of the puzzle you pose.

Also, I wrote the post almost four years ago; and LessWrong, and CFAR, have undergone substantial changes since then. Some major contributors have left, and new leaders have arrived.

More interestingly, some people's understanding has evolved. There are signs that some are moving out of dogmatic systematicity, and taking first steps toward meta-rationality.

I gather that CFAR stopped teaching Bayesian stuff, because participants in their workshops did not find it useful. (I feel rather vindicated by this!) I don't know much about what they do teach now, but my impression is that it's mostly psychological techniques for getting past personal limiting assumptions. I think that's good and important (although I can't speak to the specifics).

I agree that "ugh fields," for instance, are a very useful idea (and I use the term often myself now). Such things have nothing to do with formal rationality... but I guess they are rational in the broadest sense, of "thinking in ways that work."

I now have several quite good friends who are in the LW/CFAR milieu. I'm still skeptical, and still don't fully understand why it is appealing, but it attracts good people.

Maybe that's bad? I don't know. I guess it makes me more willing to try to help sort out their confusions. (The original question posed by this post was just that: is it worth making a large effort to try to help de-confuse this group? I still don't know.)

One important thing I've learned recently is that LW/CFAR is primarily a subsociety (in the jargon of my history of meaning). From the outside, it looks like a subculture; but, as with the classic subcultures/subsocieties of the 1990s, the culture is mainly window-dressing which holds together a tightly-knit local group of friends (in Berkeley).

Posting speculations on LW is the playful, creative cultural activity they use as friendly competition and for mutual enjoyment. This is the essence of subculturalism: a fun and artistically productive "scene." It shouldn't be taken seriously, because ultimately it isn't—any more than death metal is serious, despite that musical genre's role in holding together subsocieties who try to pretend it's a profound philosophy.

That said, death metal has real musical value, and some creative products of the LW/CFAR scene have real value too.

LW and CfAR

Thanks for the reply! It's true that I'm missing a lot of context. I've hung round the edges of 'rationalist-adjacent tumblr' quite a bit, but that's really its own separate thing, and geographically the interesting stuff is mostly going on thousands of miles from me. I'm fascinated by what I can glean online, though - it's one of those things where some parts of the community really appeal to me and some really don't, so I just keep coming back and poking around some more.

That's interesting that CfAR stopped the Bayesian part of their curriculum - I was still under the impression that that was important to them. Some details of their new content have been published on the LW discussion pages, though, and 'mostly psychological techniques for getting past personal limiting assumptions' is probably accurate. It looks good to me!

I guess what I was trying to get at is that the new CfAR-type-stuff seems to point to quite a complex, messy account of how we learn new things and change our minds, whereas old-school Less Wrong seemed to be obsessed with manipulating clean formal 'representations in the head'. I'm pretty sure that in at least some cases these are the same people (CfAR and MIRI are still closely linked, and the MIRI forum still looks like 'old school Less Wrong' to me), so I'm just interested in how they resolve the tension!

How they resolve the tension

I'm just interested in how they resolve the tension!

Yes, that's a bit of a puzzle for me, too.

The word "rationality" is ill-defined. There is the narrow sense of formal rationality (mathematical logic, decision theory) and the broad sense of "thinking in ways that work, pragmatically."

Actually, "narrow" and "broad" suggest the former is a subset of the latter—but mathematical logic and decision theory almost never work pragmatically, so these are mostly-disjoint sets!

My impression is that the tension is resolved, mostly, by carefully not noticing this. Mental compartmentalization for cognitive-dissonance avoidance.

More charitably, different segments of the rationalist movement emphasize one or the other of these two definitions, and some participants may be clearly cognizant of their dissimilarity. The MIRI people are mainly interested in formal rationality, and the CFAR folks in practical rationality.

To the extent that there is a semi-deliberate conflation of the two definitions, I expect that it is because formal rationality seems to offer strong guarantees of power and certainty—its eternalist appeal—but cannot deliver. On the other hand, the pragmatic techniques taught by CFAR may be moderately useful but don't sound impressive, and come with no guarantees—only rather vague references to the cognitive psychology literature.

Then "Bayes!" can serve as an abstract holy talisman that sanctifies mundane, messy, undramatic practices by ritual invocation of magical perfection.

drossbucket:

drossbucket:

I really don't understand how this side of LW fits with the side that's obsessed with formalising everything in some improbably narrow mathematical framework. They seem completely at odds.

Not sure exactly which parts of LW you are referring to when you're talking about "formalizing everything in math", but for some parts (e.g. anything to do with decision theory) at least, the answer is that it's the LW/MIRI flavor of AI research. It's meant to be something that you use for thinking about how to build an AI; it's not meant to be a practical guide to life any more than a theoretical physicist's work on string theory is meant to help him with his more mundane concerns.

David:

I'm sure you realize that if you're curious about CFAR, they do run workshops you could attend, right? ;) If the cost is the deciding factor, it's negotiable. (It would probably be gauche to elaborate on the exact deal that I got, but let's just say that I was pretty penniless when I contacted them about attending but we came to a mutually satisfactory arrangement anyway.)

MIRI again

Kaj Sotala:

Thanks for replying, and sorry, I know I was being vague. Yes, I'm talking about the general probability/logic/decision theory cluster that MIRI work within. This is still rather vague, but as I said in the comments to the recent SSC post I haven't read any of their recent stuff and don't know exactly what e.g. the logical induction paper is about. (I'd link to the specific comment but the spam filter here didn't like that post; it's under the same username.)

It's meant to be something that you use for thinking about how to build an AI; it's not meant to be a practical guide to life any more than a theoretical physicist's work on string theory is meant to help him with his more mundane concerns.

My question is the other way round (I fleshed it out a bit more in that thread comment). Given the kind of high-level understanding of cognition we build up from things like instrumental rationality, what makes MIRI think that their strategy is a promising one for explaining it at an underlying level?

This is a genuine question where I'd like to know the answer; maybe I'd be convinced by it. FeepingCreature on the SSC thread said, essentially, 'It's something where we know how to do calculations, so we can come up with toy models'. That's fine, but I'm interested in if there are any deeper reasons.

Phlogistonics

I find MIRI's logicism quite surreal, because during the 1980s, mainstream AI figured out very thoroughly why mathematical logic is in principle unusable as a basis for AI.

It's like a bunch of enthusiastic amateurs reviving the phlogiston theory and going around trying to convince everyone that advanced phlogistonics will solve global warming. Obviously, we can't do that quite yet, but global warming is an extremely important problem, so you should fund us to do more phlogistonics research. A better model of combustion, based on the phlogiston theory—which has been suppressed and ignored by the mainstream for far too long—may enable us to burn hydrocarbons without any CO2 emissions. Yes, of course there are some technical difficulties, but that's why we do research! There's always luddites who say something can't be done, and yet technical breakthroughs happen all the time.

Both:

Both:

Sorry, my answer was not quite right. It's not that MIRI is using logical approaches to figuring out how to build an AI. Rather, they are using logical approaches to figure out what we would want our AI to do.

A slightly analogous, established form of logic use can be found in the design of concurrent systems. As you may know, it's surprisingly difficult to design software that has multiple concurrent processes manipulating the same data. You typically screw up either by letting the processes edit the same data at the same time or in the wrong order, or by having them wait for each other forever. (If you're not already familiar with it, Google "dining philosophers problem" for a simple illustration.)
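As a concrete illustration of the first failure mode, here is a minimal Python sketch (all names are hypothetical) in which several threads update a shared counter. The lock is what stops two threads from editing the same data at the same time; without it, the read-modify-write in `counter += 1` is not guaranteed to be atomic and updates can be lost.

```python
import threading


def run_counter(n_threads=4, n_increments=10_000):
    """Increment a shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(n_increments):
            # Only one thread at a time may perform the read-modify-write.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


total = run_counter()
print(total)  # 40000: every increment is applied exactly once
```

The second failure mode, waiting forever, shows up once there are two or more locks and different threads acquire them in different orders, which is the situation the dining philosophers problem dramatizes.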

So to help reason more clearly about this kind of thing, people developed different forms of temporal logic that let them express in a maximally unambiguous form different desiderata that they have for the system. Temporal logic lets you express statements that say things like "if a process wants to have access to some resource, it will eventually enter a state where it has access to that resource". You can then use temporal logic to figure out how exactly you want your system to behave, in order for it to do the things you want it to do and not run into any problems.
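In linear temporal logic, for instance, the quoted property might be written roughly as follows. This is a sketch: the proposition names are placeholders, G is read "always", and F is read "eventually".

```latex
\mathbf{G}\,\bigl(\mathit{wants} \rightarrow \mathbf{F}\,\mathit{has}\bigr)
```

That is: at every point in every run of the system, if the process wants the resource, then at some later point it has it. A system in which a process can wait forever violates this formula.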

Building a logical model of how you want your system to behave is not the same thing as building the system. The logic only addresses one set of desiderata: there are many others it doesn't address at all, like what you want the UI to be like and how to make the system efficient in terms of memory and processor use. It's a model that you can use for a specific subset of your constraints, both for checking whether the finished system meets those constraints, and for building a system so that it's maximally easy for it to meet those constraints. Although the model is not a whole solution, having the model at hand before you start writing all the concurrency code is going to make things a lot easier for you than if you didn't have any clear idea of how you wanted the concurrent parts to work and were just winging it as you went.

So getting back to MIRI, they are basically trying to do something similar. Their work on decision theory, for instance, is not so much asking "how do we want the AI to make its decisions", but rather "what do we think that 'making the right decisions' means, and what kinds of constraints do we think that includes". Not asking "how do we make an AI generally intelligent", but rather "if we did have a generally intelligent AI, what would we want it to do in the first place", and then pinning those desiderata down in sufficiently precise terms so as to make them unambiguous.

As David correctly points out, mathematical logic is an unusable basis for building an intelligent system. But "how do we build an intelligent system" is a different question from what MIRI is asking - they are asking the question of "what would it even mean for an AI system to be aligned with our values". And they are arguing - I think convincingly - that if you start building an AI system without having any clue of what kinds of constraints its design should fulfill, you are very unlikely to get them right. Similarly, someone who starts coding up a concurrent system without having any formal understanding of what kinds of properties a concurrent system should have is just going to end up with a horrible dysfunctional mess on their hands.

Or, to take the climate change analogy: rather than being the guys who revive phlogiston theory and then try to apply that to global warming, MIRI is more like economists who start applying economic theory to the question of "given that we're going to have global warming, how's it going to affect our economy and which of the proposed methods for dealing with it would be the best in economic terms" (only to be dismissed by climate researchers who say that economic models are completely unsuitable for making predictions about the global weather system, which is completely correct but also completely beside the point).

Uses of logic

Building a logical model of how you want your system to behave is not the same thing as building the system.

Thanks, yes, that's a useful distinction!

Useful distinction

Kaj Sotala: Thanks, yes that is a useful distinction! I'll have to think more about how much it helps answer my question, but it definitely makes things a bit clearer for me.

Navigation

You are reading a metablog post, dated June 5, 2013.

This page’s topics are Atheism, Eternalism, History of ideas, and Rationalism.

General explanation: Meaningness is a hypertext book (in progress), plus a “metablog” that comments on it. Copyright ©2010–2017 David Chapman.