Randomly Generated BSTs
  - Meh. All these complicated balanced trees are cool, but do we really need them? 
  - What do you mean? Without balancing, the tree could become a stick! 
  - True. But is that really likely? It seems like a stick would be a pretty rare outcome. 
  - Hmm. That's a great question. Let's look into it! 
For any given stick, there's exactly one order of inserts that could generate that specific structure.
But for more balanced trees, there are lots of orders that can generate the same structure!
Let's consider these two trees:

 
Since our BST-insertion algorithm always inserts new values at the bottom of the tree, the value at the root (apple) must have been inserted first. The value under the root (banana) must have been inserted second, and the leaf (coconut) must have been inserted last.
Since there are two leaf nodes, we have no way of knowing which one was inserted first. The root (banana) must have been inserted first, but we could have had either of these two insertion orders:
- banana, then apple, then coconut
- banana, then coconut, then apple
This idea that there are more ways to make balanaced trees than sticks is true in general even in the subtrees of a tree. As a result, there are far more ways to build a “relatively balanced” tree than ways to build a really unbalanced one.
The animated GIF below shows one hundred trees built with the values 1 … 31, inserted in random order. In principle, any tree is “possible”, including a perfectly balanced one or a stick, but in fact, these examples look nothing like those extremes:

That graphic only shows a hundred trees, but we can give some data about all possible trees. Here are the number of ways of building a 31-node binary-search tree of different heights:
countBSTs(size=31,height= 4) = 74836825861835980800000
countBSTs(size=31,height= 5) = 9635994875418354624496533504000
countBSTs(size=31,height= 6) = 554742975239507208998947230842880
countBSTs(size=31,height= 7) = 2139907126527076808931989127168000
countBSTs(size=31,height= 8) = 2581031431111558948452913272586240
countBSTs(size=31,height= 9) = 1722684604380783700136699622457344
countBSTs(size=31,height=10) = 807144154870253366097902756429824
countBSTs(size=31,height=11) = 293807513318555680155922398183424
countBSTs(size=31,height=12) = 87014253718424275446396234498048
countBSTs(size=31,height=13) = 21467696205370185353673521233920
countBSTs(size=31,height=14) = 4472005811093900634709265219584
countBSTs(size=31,height=15) = 793022195294097048842629283840
countBSTs(size=31,height=16) = 120296706951177507502629847040
countBSTs(size=31,height=17) = 15649972296467407207188004864
countBSTs(size=31,height=18) = 1747241577268150152941797376
countBSTs(size=31,height=19) = 167232517096894897551572992
countBSTs(size=31,height=20) = 13684455445839134492459008
countBSTs(size=31,height=21) = 952972784694566718537728
countBSTs(size=31,height=22) = 56097686586924761874432
countBSTs(size=31,height=23) = 2765077684331420319744
countBSTs(size=31,height=24) = 112638186030024884224
countBSTs(size=31,height=25) = 3723722818152562688
countBSTs(size=31,height=26) = 97344212046970880
countBSTs(size=31,height=27) = 1935705521520640
countBSTs(size=31,height=28) = 27501212467200
countBSTs(size=31,height=29) = 248571232256
countBSTs(size=31,height=30) = 1073741824
  - Hmm. I think I've seen that shape before, somewhere… 
Uniform-Probability Model
  - See? It's really unlikely that we would get a stick! 
  - Hold on a minute. That conclusion has some implicit assumptions. 
  - Let's turn the subtext into text! 
Goat's assertion that “fewer ways of building a stick” means that “a stick is less likely” implicitly assumes that every possible ordering of keys is equally likely.
But if there were a high probability of inserting keys in sorted order, a stick would be very likely, and Goat's conclusion would be wrong.
That said, the uniform-probability model—all possible inputs have equal probability—is reasonable if we don't have a reason to believe that some inputs are more likely than others.
So let's proceed under that assumption and see what comes out.
Assumption: for a given set of keys to be inserted into the tree, all orderings of those keys are equally likely.
Under the uniform-probability model, here are the probabilities of different heights in a BST of size 31:
probBST(size=31,height= 4) ≈ 0.000000000009101093796095866889
probBST(size=31,height= 5) ≈ 0.001171857466827763253595489742
probBST(size=31,height= 6) ≈ 0.067463682381466788358984411563
probBST(size=31,height= 7) ≈ 0.260239464316841047694640281118
probBST(size=31,height= 8) ≈ 0.313885695641148052744060540411
probBST(size=31,height= 9) ≈ 0.209499988608618620701685103487
probBST(size=31,height=10) ≈ 0.098158821888126600218580138926
probBST(size=31,height=11) ≈ 0.035730667434326430494837243246
probBST(size=31,height=12) ≈ 0.010582021291906707573944851958
probBST(size=31,height=13) ≈ 0.002610740293981411627028950346
probBST(size=31,height=14) ≈ 0.000543851825284414365152575897
probBST(size=31,height=15) ≈ 0.000096441415020489580013197150
probBST(size=31,height=16) ≈ 0.000014629583773975211788876723
probBST(size=31,height=17) ≈ 0.000001903232320935283032231175
probBST(size=31,height=18) ≈ 0.000000212486423575926334809821
probBST(size=31,height=19) ≈ 0.000000020337565180356070823253
probBST(size=31,height=20) ≈ 0.000000001664200894770838243423
probBST(size=31,height=21) ≈ 0.000000000115893406738605166490
probBST(size=31,height=22) ≈ 0.000000000006822180143158010256
probBST(size=31,height=23) ≈ 0.000000000000336268021375625372
probBST(size=31,height=24) ≈ 0.000000000000013698211866627690
probBST(size=31,height=25) ≈ 0.000000000000000452851256695957
probBST(size=31,height=26) ≈ 0.000000000000000011838273392062
probBST(size=31,height=27) ≈ 0.000000000000000000235405995779
probBST(size=31,height=28) ≈ 0.000000000000000000003344491315
probBST(size=31,height=29) ≈ 0.000000000000000000000030229370
probBST(size=31,height=30) ≈ 0.000000000000000000000000130580
  - Whoa. A stick is really, really rare! 
  - Yeah! It's technically possible to build a stick, but vanishingly unlikely. 
  - Note that it's also really unlikely to build a perfectly balanced tree (smallest height). But the most likely height (8) is still pretty small. 
Summarizing these distributions, we can say that the mean is:
$$ \frac{11,994,961,748,538,257,538,823,933}{1,467,491,935,895,582,556,960,000} \approx 8.174 $$ and the standard deviation is $$ \frac{ \sqrt{ \frac{6,931,977,688,136,249,431,724,240,925,138,780,712,808,881,338,7349}{19}} }{ 1,467,491,935,895,582,556,960,000 } \approx 1.302 $$
So the expected cost of insertion into a random tree is pretty good!
(When logged in, completion status appears here.)