Entropy is a tricky concept. Although it pops up everywhere (take this scene from Portlandia, for example), there are many different definitions for it, and as a result it tends to be hand-waved as some kind of measure of “disorder” when explained casually. That being said, entropy is such a pervasive concept in physics that it has been blamed for everything from the eventual heat death of the universe to the fact that time exists as we know it! So, to give this topic the attention it deserves, and to dispel the notion that entropy is somehow fundamentally tied to disorder, I’ll try to demystify the concept without ever touching on disorder, by presenting a fairly specific (but conceptually broad) definition of it.

Consider some sort of “shape generator” that randomly pops out either a diamond, arrow, or circle with equal probability. In fact, you’ll find an (only approximately random) gif version of such a contraption below!

As you can see, there are only three distinct possible outcomes of picking a shape. And, at least for this example, these outcomes contain all the information you could get out of the shape generator. In fact, by picking shapes over and over, you could learn everything there is to know about the shape generator itself! We can call these shapes **complete shapes***, since they are shown completely and they collectively provide complete information on the shape generator.

*The formal name for these in statistical mechanics doesn’t make much intuitive sense here.

Now I’m going to block off some information from you by taking the exact same shape generator from above and putting a blue bar over the right half of it. Try picking a shape below now!

Notice that, even though the underlying shape generator is the same, you can only get *two* distinct outcomes from this blocked generator, compared to the three from the unblocked one: a triangle or a semicircle. (This is because the left halves of the arrow and the diamond are exactly the same.) We can call the shapes we get from this generator **incomplete shapes**, since they don’t show the underlying shape completely thanks to the blue bar.

From an informational perspective, the blue bar also doesn’t let us fully know what the underlying shape generator is; someone who has no idea what’s under the blue bar might not even think of—and can’t prove—the fact that there are two distinct “triangle-y” shapes. This is because, although the semicircle in the blocked case indicates a circle in the unblocked case, the triangle in the blocked case doesn’t uniquely correspond to either the diamond or the arrow. And this is precisely where entropy comes in.

Entropy is just a measurement of the number of complete shapes associated with an incomplete shape, which can be measured in this case by counting the lines attached to each of the incomplete shapes in the diagram above. Here, the triangle shape has an “entropy” of 2, while the semicircle shape has an “entropy” of 1—which is the lowest value it could possibly be in any shape generator, since there has to be at least one line attached to an incomplete shape. (You couldn’t put a blue bar on our shape generator and suddenly expect to see a hexagon, could you?)
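The counting described above can be sketched in a few lines of Python. This is only an illustration: the dictionary `VISIBLE_PART` and the function `count_complete_shapes` are names I made up for this sketch, and the mapping simply encodes what the text says about which complete shape shows up as which incomplete shape.

```python
from collections import defaultdict

# Each complete shape mapped to what stays visible once the right half is
# covered. Per the text, the arrow and the diamond share the same left half.
VISIBLE_PART = {
    "diamond": "triangle",
    "arrow": "triangle",
    "circle": "semicircle",
}

def count_complete_shapes(visible_part):
    """Count how many complete shapes sit behind each incomplete shape."""
    groups = defaultdict(list)
    for complete, incomplete in visible_part.items():
        groups[incomplete].append(complete)
    return {incomplete: len(shapes) for incomplete, shapes in groups.items()}

counts = count_complete_shapes(VISIBLE_PART)
print(counts)  # {'triangle': 2, 'semicircle': 1}
```

Each count is the number of "lines" attached to an incomplete shape, so no incomplete shape that actually appears can have a count below 1.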

In a roundabout way, **entropy is giving you a quantitative measurement of how little information you get about an underlying system when you observe some specific “blocked” or incomplete measurement of that system.*** A shape with large entropy has a large number of lines attached to it, which means that many complete shapes could be associated with the incomplete shape; a shape with low entropy has only a few complete shapes associated with it, which leaves you with less uncertainty about the underlying shape and, by extension, about the underlying generator creating those shapes.

*Curiously enough, this quantity as described in terms of specific measurements isn’t called entropy in information theory; informational entropy is defined as the *average* of this quantity over all blocked measurements, which means that it’s really a function of the blocking itself rather than of a specific blocked measurement. I consider this to be *very* annoying.

That’s mostly all there is to it! One important thing I have to mention, which I consider a bit of a boring formality, is that the proper definition of entropy is actually the *logarithm* of the number of lines attached to each incomplete shape. This changes nothing about the qualitative statements I made above and, depending on your point of view, is more pragmatic than fundamental; it’s done chiefly so that if I have two identical shape generators and pull a triangle shape out of each, the entropy of both of those triangle shapes together is the sum of the individual triangle entropies rather than their product. Bo-ring!
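To see why the logarithm is convenient, here is a minimal sketch of the two-generator case. I'm assuming the natural logarithm (the text doesn't fix a base, and any base gives the same additivity), and the variable names are mine.

```python
import math

# Number of complete shapes behind one "triangle" (from the example in the text).
triangle_count = 2

# Two independent, identical generators each showing a triangle: every
# pairing of (diamond or arrow) with (diamond or arrow) is possible.
pair_count = triangle_count * triangle_count

single_entropy = math.log(triangle_count)
pair_entropy = math.log(pair_count)

# The counts multiply, but their logarithms add.
print(pair_count)  # 4
print(math.isclose(pair_entropy, single_entropy + single_entropy))  # True
```

Without the logarithm, the combined "entropy" would be 2 × 2 = 4; with it, the combined entropy is just twice the single-triangle value.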

Also note that because the triangle shape has a larger entropy, and each complete shape is equally likely to pop up in the shape generator, the triangle is more likely to show up than the semicircle in our blocked generator. In fact, if there were a massive number of triangle-y complete shapes compared to circular ones in our shape generator, the entropy of the triangle shape would be far larger than that of the semicircle, and it would be almost certain that you’d pull out a triangle from the blocked shape generator.
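Here is a small sketch of that argument, assuming (as the text does) that every complete shape is equally likely. The function name `blocked_probabilities` and the 999-to-1 generator are hypothetical, chosen only to illustrate the "massive number of triangle-y shapes" case.

```python
from fractions import Fraction

def blocked_probabilities(counts):
    """Chance of each incomplete shape when all complete shapes are equally likely."""
    total = sum(counts.values())
    return {shape: Fraction(n, total) for shape, n in counts.items()}

# The generator from the text: two triangle-y shapes, one circle.
original = blocked_probabilities({"triangle": 2, "semicircle": 1})
print(original["triangle"])  # 2/3

# A hypothetical generator with 999 triangle-y shapes and still one circle.
lopsided = blocked_probabilities({"triangle": 999, "semicircle": 1})
print(float(lopsided["triangle"]))  # 0.999
```

The probability of an incomplete shape is just its count of complete shapes divided by the total, so the shape with the largest entropy is also the one you are most likely to see.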

What does this all have to do with thermodynamics? Well, in thermodynamics, we’re dealing with massive ensembles of individual physical objects (atoms/molecules) with many different physical properties such as energy, momentum, and so forth. Because it’s hopeless to keep track of all of the properties of all of these individual objects, we decide to “block off” our information about every individual object and keep track of only averages of these quantities in the entire ensemble. This then automatically defines an entropy connecting each of these “incomplete” averaged measurements to the “complete” collective states of the individual objects. And, if the largest entropy of an averaged measurement is far larger than the second-largest entropy as in the multi-triangle scenario in the paragraph above, then it is overwhelmingly likely for the ensemble to be in that largest-entropy state.
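As a rough illustration of this (not a real thermodynamic system), consider a toy ensemble of my own choosing: 1000 two-state "particles", essentially coins. The complete state is the full heads/tails arrangement; the blocked measurement, again an assumption of this sketch, is the fraction of heads rounded to the nearest tenth.

```python
import math
from collections import Counter

N = 1000  # hypothetical number of two-state particles ("coins")

def blocked_measurement(heads, n=N):
    """Fraction of heads rounded to the nearest tenth (halves round up), as an integer 0..10."""
    return (10 * heads + n // 2) // n

# Multiplicity: number of complete states behind each blocked measurement.
multiplicity = Counter()
for heads in range(N + 1):
    # math.comb(N, heads) counts the arrangements with exactly this many heads.
    multiplicity[blocked_measurement(heads)] += math.comb(N, heads)

# Entropy of each blocked measurement: the logarithm of its multiplicity.
entropy = {m: math.log(count) for m, count in multiplicity.items()}

total_states = 2 ** N
most_likely = max(multiplicity, key=multiplicity.get)
probability = multiplicity[most_likely] / total_states

print(most_likely)   # 5, i.e. "about half heads"
print(probability)   # chance of observing that measurement
```

In this toy model the "about half heads" measurement has by far the largest entropy, and the ensemble is found there more than 99% of the time, even though every individual arrangement is equally likely.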

Note that disorder never even came into the picture! The connection to disorder only clearly shows up when your “complete” system consists of a very large number of identical subsystems, in which case the most likely incomplete measurements tend to be ones with little “organization” (and, by extension, little information) in the complete underlying system. In short, **entropy is only a measure of disorder when disorder implies a lack of information about the system**. And if you’re interested in seeing some examples of this, well, you’ll have to do a little bit of extra reading.