#5 Big Data: Intelligence through Information

Ishaan Bhattacharya
9 min read · Dec 28, 2024

--

This post is part of a series on the similarities and differences between natural and artificial intelligence. You can find the first introductory post here, which includes a description of the topics being covered and links to the previous posts: https://ishaan-b.medium.com/the-nature-of-intelligence-in-man-and-machine-a-series-c9b6c8a5e2a6

— —

After a long hiatus, I am eager to return to this series with its most exciting and final topic: the power of big data and its profound connection to intelligence.

In earlier posts, we discussed how machine learning-based AI systems are trained on data. As training datasets grow larger and more complex, these AI models gain extraordinary capabilities, to the point where we no longer precisely understand how they’re able to solve certain problems. Simultaneously, we discussed how natural intelligence developed as an evolutionary enabler, tied to survival and reproduction. Natural intelligence, as we defined it, involves using sensory data and information to make decisions and take actions. In performing inductive reasoning, we use data about specific instances to identify some overarching truths or guiding principles.

The interplay between data and intelligent behavior lies at the heart of this discussion. Data shapes how we perform intelligent tasks and is deeply entwined with entropy, information theory, and evolutionary processes. These three concepts provide a foundation for understanding the immense potential of big data. In this post, I will unpack each concept individually and then explore how they converge to explain the transformative role of big data in enabling intelligence.

If you haven’t had a chance to read the earlier posts in this series, I highly recommend doing so to better grasp the foundational ideas about intelligence (in both man and machine) that underpin this discussion.

Entropy

Entropy is an esoteric concept that is often misunderstood and used as a catch-all term. It is harder to grasp than many other physics concepts because its effects are difficult to observe or to understand intuitively.

The simple, non-physics description of entropy is that it is the tendency towards disorder. An egg left on a table around a group of people is likely to get smashed accidentally or roll off the table and break. The broken egg is the more probable state, and it is also the higher-entropy state. The entropy of the system containing this egg and the table increases when the egg breaks. In an isolated system, entropy never decreases: it stays constant at equilibrium and increases whenever anything happens spontaneously.

Another simple example is racking up a set of pool balls before ‘breaking’ them with a shot. When you strike the well-ordered triangle of pool balls, it scatters in all directions, as expected. But suppose you want to hit the white ball again in such a way that the colored balls return to their original setup before your first shot. Impossible, right? This thought experiment illustrates two key principles. First, entropy’s relentless march towards disorder helps us conceptualize the forward direction of time and the notion of causality. Second, it takes far more information to return the pool balls to an ordered state than it took to scatter them. We simply do not have enough information to tell us how to restore the ordered triangle, and therefore we cannot.

So if entropy is explained in terms of disorder, why is it a concept from physics? Well, the second law of thermodynamics states that any spontaneous process moves in the direction that causes entropy to increase. Entropy was originally understood as a thermodynamic variable related to heat, and the interpretation of entropy as ‘disorder’ came much further down the line. For heat that is transferred reversibly, the change in entropy is written as:
ΔS = ΔQ/T
where ∆S is the change in entropy, ∆Q is the heat added to the system, and T is the temperature in kelvin at which that heat is added. This formulation grew out of the nineteenth-century effort to understand heat engines, and it put the steam engines of the industrial revolution on a firm theoretical footing.
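
To get a feel for the numbers, here is a quick back-of-the-envelope calculation in Python (a sketch of my own; the latent-heat figure is a standard textbook value, not something from this series). Melting a block of ice absorbs heat at a fixed temperature, and dividing that heat by the temperature gives the entropy increase:

```python
# Entropy change when 1 kg of ice melts at 0 °C.
# Illustrative values only: ~334 kJ/kg is the textbook latent heat of fusion of water.

Q = 334_000      # heat absorbed by the ice, in joules
T = 273.15       # temperature at which the heat is added, in kelvin (0 °C)

delta_S = Q / T  # ΔS = ΔQ / T
print(f"Entropy change: {delta_S:.0f} J/K")  # ≈ 1223 J/K
```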

Bear with me here as we look briefly at how this mathematical formulation developed; it is not important in itself, but it serves to illustrate a profound observation.

What this law tells us is that a system’s entropy keeps increasing until the system reaches thermal equilibrium, where it is at its maximum. That is why entropy is only stable in a system at equilibrium. In this state of equilibrium, the molecular disorder of the system is also at its highest. Entropy is nothing but molecular disorder! Boltzmann later formulated entropy as:
𝑆 = 𝑘 log(𝑊)
where W is the number of possible microscopic arrangements (microstates) of the system, i.e. a measure of its molecular disorder, and k is Boltzmann’s constant.

Molecular disorder is further defined in terms of the various possible microstates of the molecules, where 𝑝ᵢ is the probability of any particular microstate i, giving us this final equation to describe entropy:
𝑆 = −𝑘 ∑ᵢ 𝑝ᵢ log(𝑝ᵢ)
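
To see how these formulations line up, here is a small Python sketch (my own illustration, not from the original post) checking that the general probabilistic formula collapses back to Boltzmann’s 𝑆 = 𝑘 log(𝑊) when every microstate is equally likely:

```python
import math

k_B = 1.380649e-23  # Boltzmann's constant, in J/K

def entropy(probabilities):
    """Gibbs entropy: S = -k * sum(p_i * ln(p_i)) over microstates with p_i > 0."""
    return -k_B * sum(p * math.log(p) for p in probabilities if p > 0)

W = 100_000                  # number of equally likely microstates (a toy value)
uniform = [1 / W] * W        # each microstate has probability 1/W

print(entropy(uniform))      # matches...
print(k_B * math.log(W))     # ...Boltzmann's S = k log(W)
```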

Information Theory

The pool-ball example hints at the link between entropy and information theory: the amount of information required to describe a system. Information theory emerged as a field of study in the mid-twentieth century, long after thermodynamics, and it developed largely independently. One of its most striking results is the link between information and entropy.

In information theory, the mathematical expression for uncertainty is:
H = −∑ᵢ 𝑝ᵢ log(𝑝ᵢ)
where 𝑝ᵢ is the probability of a particular message or state i.
Observe the striking similarity between this equation and the final equation for entropy in thermodynamics! H in the equation above symbolizes uncertainty, and can be understood as the amount of information needed to describe the state of a system. So as entropy increases, the uncertainty increases, and vice versa*. In our example of the broken egg, the entropy of the system increases when it cracks. We require a tremendous amount of information, more information than it is possible for us to get, in order to return the egg to its whole state. Spontaneous events in any system increase the disorder, i.e. increase entropy, and also the amount of information needed to describe the system.

( *in fact, within the field of information theory, the variable H is actually also referred to as ‘entropy’)
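
To put numbers on the idea that uncertainty grows with disorder, here is a toy Shannon-entropy calculation (my own illustration, echoing the egg and pool-ball examples rather than anything formal from this series). A system that is almost certainly in one state needs very little information to describe; a system that could equally be in any state needs the most:

```python
import math

def shannon_entropy(probabilities):
    """H = -sum(p_i * log2(p_i)), measured in bits."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# 'Ordered' system: one state is overwhelmingly likely (like the intact egg).
ordered = [0.97, 0.01, 0.01, 0.01]

# 'Disordered' system: every state is equally likely (like the scattered pool balls).
disordered = [0.25, 0.25, 0.25, 0.25]

print(f"Ordered:    {shannon_entropy(ordered):.2f} bits")    # ≈ 0.24 bits
print(f"Disordered: {shannon_entropy(disordered):.2f} bits")  # 2.00 bits
```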

Evolution

So if entropy in the Universe is always increasing, how do we manage to exist as an ordered state of living matter? This leads us to a very insightful implication of everything we’ve discussed so far: Entropy is an evolutionary imperative.

Living beings exist in a highly ordered state that is not in thermal equilibrium with their surroundings. Thus, the very existence of a living organism is a fight against entropy. When we finally reach thermodynamic equilibrium with our surroundings, we are dead. Evolution unfolded as a battle against this entropic decay; our metabolic systems evolved as the answer to entropy.

This is exemplified by the relationship of living organisms with the sun. Contrary to popular perception, the Earth does not simply absorb and retain energy from the sun: to a very good approximation, the energy absorbed from the sun is balanced by the energy the Earth emits back into space. The difference is that energy from the sun arrives as directed, high-frequency photon packets, whereas the energy radiated back is spread across far more numerous low-frequency photons. The living organisms on Earth, i.e. the food chain, absorb the high-frequency photons and emit low-frequency ones in turn, but the total energy absorbed and emitted remains roughly the same.

So what is the food chain actually taking from the sun, if all the energy is emitted back outward? We have evolved to absorb information from the sun! We recognize, approach, and apprehend information as an adaptation to enhance survival, i.e. to resist entropy. The food chain extracts order from the sun’s photons to maintain the highly ordered living state that it must exist in. This is enabled by biological mechanisms such as photosynthesis, glycolysis, the TCA cycle, and the oxidative phosphorylation that produces ATP.

We already saw the connection between information and intelligence; we understood that intelligence is required for conditioning and survival; now we see that information and entropy are two sides of the same coin and are intimately related to the evolution of all biological species on Earth.

Big Data: A Step Up in the Power of Information

In the past 10–15 years, ‘big data’ has become a commonly used term, seemingly out of nowhere. The internet, and the connectivity it enables, has given us an unprecedented ability to produce and store extraordinarily large amounts of data. More significantly, we have learned to navigate this data, extracting useful information and creating new knowledge about the world. This is what popularized the term ‘big data’ and established data science as an important academic field.

Big data has been transformational. It has changed the way we understand things and what we can figure out about the world. It has shown that we can gain enormous insight into almost any phenomenon given enough data about it. Phenomena that were previously considered idiosyncratic can now be shown to be systematic, and perhaps even predictable, if we have a large enough sample of data about them. Big data is allowing us to better understand the world, which is the very purpose of intelligence.

Natural intelligence uses our sensory experience to collect information. From this information we reason about the world and recognize patterns. We’ve seen that machine learning also works by recognizing such patterns, and this ability has been powered by big data. Deep-learning models use vast troves of data to make predictions in ways that often elude our understanding. Our best guess is that machine learning models use big data to identify some fundamental truths about the phenomena they describe.
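
As a minimal sketch of the ‘more data, better predictions’ point (my own illustration using scikit-learn on synthetic data; this series does not prescribe any particular model or dataset), even a simple classifier gets measurably better at the same task as its training set grows:

```python
# Toy demonstration: prediction accuracy improves as the training set grows.
# Assumes numpy and scikit-learn are installed.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# A synthetic 'phenomenon': 20 noisy features, only a few of which carry real signal.
X, y = make_classification(n_samples=50_000, n_features=20, n_informative=5,
                           flip_y=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

for n in [100, 1_000, 10_000, 40_000]:
    model = LogisticRegression(max_iter=1_000).fit(X_train[:n], y_train[:n])
    print(f"trained on {n:>6} samples -> test accuracy {model.score(X_test, y_test):.3f}")
```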

An important characteristic of machine learning is the emergent properties of ML models: unexpected abilities that appear as ML systems become more complex. These abilities don’t scale linearly; they seem to take an exponential jump once a threshold is crossed. In today’s gen-AI tools, the simple act of ‘token prediction’ (i.e. predicting the most likely next word or word fragment) does not, by itself, explain the ability to reason about and explain complex problems. Reaching a critical threshold in the scale of training data appears to be what unlocks these emergent effects.
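
Real LLMs are vastly more sophisticated, but a toy bigram counter (a sketch of my own, not anything from this series) makes the bare mechanic of ‘predict the next token’ concrete:

```python
# A drastically simplified 'token predictor': count which word follows which
# in a tiny corpus, then predict the most frequent successor.
from collections import Counter, defaultdict

corpus = ("the egg rolls . the egg breaks . the dropped egg breaks . "
          "the pool balls scatter . the balls scatter in all directions .").split()

following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1  # tally successor frequencies

def predict_next(word):
    """Return the word most often observed after `word` in the corpus."""
    return following[word].most_common(1)[0][0]

print(predict_next("egg"))    # -> 'breaks'
print(predict_next("balls"))  # -> 'scatter'
```

Nothing in this counting scheme hints at reasoning; the surprise of emergence is that scaling up the same basic objective, across vastly more data and parameters, appears to produce it.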

The phenomenon of life is the classic example of emergent behavior. A single-celled bacterium is alive, but the macromolecules that make up the bacterium are not alive on their own. Emergence means that the whole is greater than the sum of its parts. This is also characteristic of big data: data in vast troves provides insight far greater than the sum of what we can extract from small pieces of data. In post #2 of this series, we discussed inductive reasoning in humans as the process of abstracting core principles from specific instances. This is similar to what deep learning appears to achieve: identifying basic underlying truths from the troves of data consumed during training. Such forms of artificial intelligence are therefore enabled by big data.

Big data may hold answers to phenomena that we cannot yet precisely describe. It has already been used to make extraordinary strides in describing human behavior, since we can now collect data about almost every action we take in everyday life. These insights are being leveraged in advertising, social media, product design, and more.

Let’s zoom out and look at the social sciences as a whole. The social sciences have always been inherently imprecise; many of their theories are hedged with exceptions to cover edge cases in the real world. In truth, this imprecision frustrated me in my childhood and made me gravitate towards physics in my undergraduate studies. Physics is beautiful in its simplicity, describing complex systems through a few fundamental, precise rules. However, I had a recent realization: this may simply be because data about physical phenomena is easy to observe and capture, and thus easy to fit into the general-purpose rules we’ve developed to describe the world. Data in economics, sociology, even biology, is inherently noisy and messy; too messy for our brains, with their limited computational power, to process. My hypothesis: machine learning models, with their millions of parameters, can digest this messy data and make more precise predictions. Big data could usher in a new golden age for the social sciences, enhancing our intelligence in unprecedented ways.

We have come full circle now, and it’s important to see the connections between all the concepts discussed so far:

Entropy is an evolutionary imperative.

Information is the other side of that very same coin.

Life uses information to exist and battle against entropic disorder.

Intelligent organisms use sensory information to reason and understand truths about the world.

Similarly, artificial intelligence systems use data (i.e. information) about discrete events to make new predictions. The proliferation of big data has enabled the rapid development of these machine-learning capabilities.

Big data continues to grow; we are finally learning to harness the power of data in every aspect of our lives.

An undeniable connection exists between how big data powers AI systems and how information fueled evolution and natural intelligence. I cannot yet mathematically describe this relationship; it will probably require research by some very smart minds (assuming I am even correct). I suspect the connection lies somewhere within the mathematics of information theory. But here is the most ambitious claim I will make in this post: data at scale encodes fundamental truths about the world, intelligence is simply the search for truth, and that is why we are able to harness big data to power intelligent systems.
