How Many Bits are in Common Sense?

A quick calculation on a lazy Sunday.


The Age of Sense

One way to estimate the size of common sense is to ballpark the age by which we expect the average human to have attained common sense.

What’s Common Sense?

I’m really going to assume common sense is common here. I expect nearly everyone to have it, and it should be fairly universal across cultures, at least in the order of magnitude of its information content. It’s not some specialized thing you learn late in life. I’m talking about physical common sense, like “if you drop something it will fall,” as well as the human and animal theory of mind that gives you a basic understanding of how other living beings will act contingent on your behavior.

In other words, I will assume you are spending a good portion of your early years learning common sense, and your later years are mostly spent learning what perhaps should be called “uncommon sense.”

The Age of Reason

There is an interesting psychological transition that occurs around the age of 7 that some researchers call the age of sense, or more commonly the “Age of Reason” (see, for example, the usage of the term in Tooley et al. (2022)). The actual age range is a bit looser than what I am conveying (usually it is characterized as a transition during the years 5 to 7), but I am just choosing numbers. During this period, a child rapidly improves at abstract reasoning and their common sense coalesces considerably, with significantly less change in subsequent years.

We will equate an Age of Common Sense with the culmination of the Age of Reason. We then assume a typical human has a relatively complete common sense by the age of 8. This is a strong assumption, but feels right as an estimate to me. Certainly we expect older elementary school students to have some basic common sense! (Also, we aren’t trying to estimate the fraction of learning that goes into non-common material, so to compensate we are biasing toward a slightly younger age.)

The Effective Speed of Learning

We then need an estimate for the maximum speed of learning. This will help us get a rough upper bound on the typical size of common sense. According to the psychologist Thomas Landauer (not to be confused with the physicist Rolf Landauer, whose famous Landauer’s principle says it costs $k_B T \ln 2$ of energy to erase 1 bit of information) in his 1986 article “How much do people remember?,” we learn at approximately 1 bit/s. This, roughly, is the rate at which you can incorporate information into long-term memory, which we will take as an order-of-magnitude proxy for the rate at which you can update your world model. It is different from the more recent estimate by Zheng and Meister (2025) that we consciously experience the world at about 10 bits/s. Both numbers are also far below the brain’s raw sensory processing rate, which Zheng and Meister put around $10^{9}$ bits/s.

Assumption: You Gain Sense Awake

Let’s assume no acquisition of new information happens during sleep. This isn’t to say sleep isn’t important for learning, but I expect that during sleep you are processing and consolidating existing information rather than gaining new experience. Accounting for sleep won’t affect the order of magnitude, but I am including it in the estimate anyway.

We now simply need to find how many wakeful seconds there are in 8 years.

Assuming an average of 8 hours of sleep per 24 hours, each day has 16 wakeful hours in which learning the common-sense world model is possible. In reality, average sleep during the ages included can (and should) be much higher, according to the American Academy of Sleep Medicine; for example, infants between 4 and 12 months should sleep 12 to 16 hours per day. However, since we want a rough upper bound on world-model-learning time, I will settle for the smaller sleep average.

Common Sense, In Bits

Our rough upper bound for the amount of information in common sense is

$$
\begin{aligned}
\text{size of common sense} &\lesssim 1\,\mathrm{bit/s} \times 3600\,\mathrm{s/h} \times 16\,\mathrm{h_{awake}/day} \times (365 \times 8)\,\mathrm{days} \\
&= 168{,}192{,}000\,\mathrm{bits} \\
&\approx 21\,\mathrm{MB}
\end{aligned}
$$
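The arithmetic above can be checked with a few lines of Python (the constants are just the assumptions from the preceding sections, not anything new):

```python
# Back-of-envelope bound on the size of common sense, using the
# assumptions above: 1 bit/s learning rate, 16 wakeful hours/day, 8 years.
LEARNING_RATE_BITS_PER_S = 1      # Landauer's long-term memory rate
WAKEFUL_HOURS_PER_DAY = 16        # 24 hours minus 8 hours of sleep
YEARS = 8                         # the Age of Common Sense
SECONDS_PER_HOUR = 3600
DAYS = 365 * YEARS

bits = LEARNING_RATE_BITS_PER_S * SECONDS_PER_HOUR * WAKEFUL_HOURS_PER_DAY * DAYS
megabytes = bits / 8 / 1_000_000  # 8 bits per byte, 10^6 bytes per MB

print(bits)        # 168192000
print(megabytes)   # 21.024
```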

That’s less than the size of a single uncompressed 24-bit RGB 4K image, which is around 25 MB. (1 bit of information is conveyed by a yes/no answer with equal prior probability for either response. A byte is 8 bits, equal to 8 independent yes/no responses. A megabyte (MB) is 1,000,000 bytes.) The fact that we can compress the essence of all the images and sensations we have experienced into so small a size speaks to the remarkable efficiency and compression of human representations, which makes up for how slowly we learn. (1 bit/s is really crazy slow!)
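The image comparison is easy to verify too; a sketch, assuming the standard 4K resolution of 3840 × 2160 pixels:

```python
# Size of an uncompressed 24-bit RGB 4K image.
width, height = 3840, 2160
bytes_per_pixel = 3               # 24 bits = 3 bytes per pixel
image_bytes = width * height * bytes_per_pixel

print(image_bytes / 1_000_000)    # 24.8832 MB, just over the ~21 MB bound
```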

They say a picture is worth a thousand words. But really a picture can mean everything. (The meaning of “everything,” after all, is simply common sense.)

My Thoughts

Weaknesses in the Analysis

We’ve somewhat mixed choosing larger and smaller numbers for different quantities, so I don’t expect this to be a strict upper bound, and it could be far off if some assumptions are wrong. But I enjoyed working through this little Fermi problem! Hopefully I can come back to it with greater insights later.

Obviously, this raises the question of how we get those few bits of common sense into artificial agents. Do they need to experience the world like we do, or is common sense already lurking in data extracted from the Internet?

Also, as I was passing this through AI models to proofread, Gemini raised a good point: 21 MB of information can be a lot in certain regimes. I’ll just quote Gemini:

21 MB of uncompressed video is just a second of footage, whereas 21 MB of source code (text) is massive—roughly the size of the entire Linux kernel source in the early 2000s, or thousands of books. If “common sense” is a procedural model (rules/code) rather than declarative memory (images/data), 21 MB is actually a staggering amount of complexity.

Deep Implications

This last point might suggest we aren’t thinking about bits of world model in the right way.

The problem is starker if you think about physics. Let’s (erroneously) assume the Standard Model is a complete model of physics. In symbols, it’s small enough to fit on a T-shirt. Is that really enough information to specify the universe?

I think we have to give more credit to the substrate that runs the code, so to speak. We have to credit the chemical complexity of the brain and the atomic intricacies of the computer. Without the substrate, the symbols are nothing. What is DNA without RNA polymerase to make RNA, what is RNA without ribosome to make protein, what is protein without the physical environment to make it fold?

What is the Standard Model without the Big Bang to give it existence?

These are the sorts of questions I want to refine and to answer in later blog posts.