#4 Neural Networks: A Human-Inspired Metaphor
This post is part of a series on the similarities and differences between natural and artificial intelligence.
I’m excited to dive back in and get things rolling again after a few months’ hiatus due to other projects and life events, including graduating from my MBA program. In my last few posts, I established rigorous definitions of intelligence and explored the mechanics of reasoning in both AI and humans. However, any discussion about the similarities between AI and natural intelligence warrants a pause to talk about neural networks and deep learning, especially since the name “neural networks” is quite literally inspired by the human brain.
If you’re just joining me on this journey, you can find the first introductory post here: The Nature of Intelligence in Man and Machine.
Neural networks are what enable deep learning, the form of machine learning behind the strongest prediction engines we’ve created so far. Deep learning models process many pieces of data simultaneously through a complex web of layers and interconnected nodes. The nodes process data as a coordinated, adaptive system: they exchange feedback on the output they generate, learn from mistakes, and improve continuously. This is similar in structure to how information passes through the human brain, which is how neural nets got their name.
Perceptrons are the individual units in a neural net, the equivalent of a neuron in the brain. They were developed in the 1950s and loosely inspired by the physical structure of a neuron. A perceptron takes in multiple data inputs at one end (similar to dendrites in a neuron), transforms them according to some function, and produces an output at the other end. When perceptrons were first invented, there was chatter in the media that a new pinnacle in the creation of artificial intelligence had been reached. Evidently, we’ve come a long way since then, and we still have a long way to go.
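To make the idea concrete, here is a minimal sketch in Python of a perceptron weighing up a few yes/no inputs and producing a binary decision. The scenario, weights, and threshold are invented purely for illustration, not taken from any particular model or library:

```python
# A minimal perceptron: weighted inputs, a bias, and a step activation.
# The weights and bias here are hand-picked for illustration, not learned.

def perceptron(inputs, weights, bias):
    """Return 1 if the weighted evidence crosses the threshold, else 0."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > 0 else 0

# Hypothetical decision: go to a concert? Three yes/no inputs:
# (is the weather good?, is a friend coming?, is it near public transit?)
weights = [6.0, 2.0, 2.0]   # weather matters most to this decision-maker
bias = -5.0                 # overall reluctance threshold

print(perceptron([1, 0, 0], weights, bias))  # 1: good weather alone is enough
print(perceptron([0, 1, 1], weights, bias))  # 0: friend + transit aren't enough
```

The weights encode how much each piece of evidence matters, and the bias sets how much total evidence is needed before the perceptron “fires.”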
We can think of the perceptron as a device that makes decisions by weighing up evidence. Every neural network has parameters, such as weights and biases, associated with each connection between perceptrons. These parameters dictate how much importance to accord to the data that flows through that particular connection. The weights are adjusted using optimization techniques such as gradient descent, which iteratively minimize a loss function. To put it simply, the ‘learning’ process involves adjusting the weights until the network’s output lands as close as possible to an established ‘gold’ outcome.
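Here is a toy sketch of that learning loop, with a single weight and a made-up ‘gold’ target; the numbers and the one-weight linear model are my own illustration, assuming simple gradient descent on a squared-error loss:

```python
# A toy view of 'learning': nudge a weight so the output moves toward a
# known 'gold' target, by following the gradient of a squared-error loss.

x, gold = 2.0, 10.0          # one input and its desired ('gold') output
w = 0.5                      # initial weight, deliberately far from ideal
learning_rate = 0.05

for step in range(20):
    prediction = w * x                       # the model's output (linear, no bias)
    loss = (prediction - gold) ** 2          # squared-error loss
    gradient = 2 * (prediction - gold) * x   # d(loss)/d(w)
    w -= learning_rate * gradient            # adjust the weight downhill

print(round(w, 3))  # approaches 5.0, since 5.0 * 2.0 == 10.0
```

A real network does the same thing across millions or billions of weights at once, using backpropagation to compute all the gradients efficiently.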
Not all neural networks are deep learning systems. In simpler feed-forward neural networks, data is processed in one direction, from the input layer to the output layer via a single middle computational layer. Deep learning networks, on the other hand, have several hidden layers that give them ‘depth’. Data flows forward through these layers in parallel, and error signals flow backward during training, which allows the network to learn complex relationships between features and make high-level predictions. The larger number of nodes in deep learning models allows a much higher number of parameters, enabling them to manipulate data in more ways. Just like the human brain, there are millions of connections, and they fire simultaneously in different permutations and combinations. Modern large language models have eclipsed this and are built with billions of parameters.
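Here is a small, self-contained sketch of a feed-forward pass with two hidden layers, using NumPy. The layer sizes and random weights are purely illustrative assumptions; a real deep learning model would learn its parameters from training data rather than draw them at random:

```python
import numpy as np

# A sketch of a small feed-forward network with two hidden layers.
# Layer sizes and weights are illustrative only; a trained model would
# learn these parameters via backpropagation on training data.

rng = np.random.default_rng(0)

def layer(inputs, weights, biases):
    """One fully connected layer followed by a ReLU activation."""
    return np.maximum(0, inputs @ weights + biases)

# Shapes: 4 input features -> 8 hidden -> 8 hidden -> 1 output
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 8)), np.zeros(8)
W3, b3 = rng.normal(size=(8, 1)), np.zeros(1)

x = np.array([0.2, -1.3, 0.7, 0.05])   # one example with 4 features
h1 = layer(x, W1, b1)                  # first hidden layer
h2 = layer(h1, W2, b2)                 # second hidden layer ('depth')
output = h2 @ W3 + b3                  # final prediction (no activation)

# Total parameters: (4*8 + 8) + (8*8 + 8) + (8*1 + 1) = 121
print(output)
```

Even this tiny network has 121 parameters; scaling the same idea to many more layers and much wider layers is what yields the billions of parameters in modern models.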
It’s worth noting that while there have been many advancements in neural network architectures, culminating in the transformer architecture¹ in 2017, the spate of AI developments in the last 3–4 years does not rest on any fundamentally new technology. We have used the same methods for several years now and simply added more parameters and data. The scientific and engineering communities have been repeatedly astonished by the increase in capabilities that comes with larger models, without any change in the underlying architectures. In my next post, I will explore the power of big data, the force powering the ongoing machine learning revolution, and will draw some fascinating connections between big data, information theory, and entropy.
If you’re excited by this subject, feel free to subscribe here and get email updates for any new posts: https://ishaan-b.medium.com/subscribe
Footnotes
1. From the famous paper “Attention Is All You Need”, published by Google scientists in 2017: https://arxiv.org/abs/1706.03762