NeuroSymbolic AI Explained
The first generation of AI, also known as GOFAI (Good Old-Fashioned AI), comprised simple calculations done at scale, e.g. by a calculator, and was called intelligent. Since 2012, growing compute power has enabled fast operations on matrices, making machine learning and deep learning on enormous datasets practical and ushering AI into its second generation, in which autonomous cars, voice assistants, recommender systems, and fraud detection systems could use the resulting models.
According to Statista, the amount of data generated by 2020 is estimated at about 64 zettabytes, and by 2025 it is forecast to grow to a whopping 180 zettabytes. Extrapolating these figures, the computing power needed to extract insights and models from that data becomes a bottleneck in its own right. One of those ever-growing data streams is unstructured text. Large Language Models (LLMs) aim to address this problem, but the approach has severe limitations.
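The Statista figures above imply a steep compound growth rate, which a quick back-of-the-envelope calculation makes concrete:

```python
# Implied annual growth of global data volume, from the figures quoted above:
# 64 ZB in 2020 growing to a forecast 180 ZB by 2025.
growth_ratio = 180 / 64
years = 2025 - 2020
cagr = growth_ratio ** (1 / years) - 1
print(f"Implied annual data growth: {cagr:.1%}")  # roughly 23% per year
```

In other words, data volume would be compounding at roughly 23% a year, far faster than typical improvements in per-watt compute.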
LLMs' poor reproducibility by the scientific community and steep power requirements make them look like dead-end research. Training MegatronLM on 45 terabytes of data took 512 V100 GPUs and consumed 27,648 kWh of power, roughly three years of energy use for an average US household. If models keep growing at this rate, by 2040 training them would require more energy than is available to power the entire earth.
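The household comparison above checks out under a common assumption about US household consumption (roughly 10,700 kWh per year, an EIA-style figure that is an assumption here, not from the article):

```python
# Sanity check of the training-energy comparison, assuming an average US
# household uses about 10,700 kWh per year (assumption, not from the article).
TRAINING_KWH = 27_648              # energy quoted for training MegatronLM
HOUSEHOLD_KWH_PER_YEAR = 10_700    # assumed average household consumption
years_of_household_energy = TRAINING_KWH / HOUSEHOLD_KWH_PER_YEAR
print(f"{years_of_household_energy:.1f} years")  # ~2.6 years, i.e. roughly three
```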
This necessitates research into alternative models with less taxing requirements on both the algorithmic and hardware sides. The MIT-IBM Watson AI Lab is one of the research labs tackling that problem with a new approach: making large black-box models more explainable through symbolic representation of embeddings and the relationships between them.
Let's first assume that ANNs model the neuroanatomy of the human brain in its entirety and that machines use perception just as a human child learns cognition by experimenting with its environment. The famous cognitive psychologist Jean Piaget described the stages of a child's cognitive development, beginning with exploration of its environment. While the sub-cortical regions of the brain, which handle emotional regulation and memory formation and rest on biological cells and primarily chemical processes, cannot be replicated in machines, cortical functions can more or less be emulated. Earlier work on decoding neural systems produced some groundbreaking discoveries in this area; Carver Mead's book 'Analog VLSI and Neural Systems' is one good read.
In Piaget's terminology, these learned representations are called schemas. Once a child learns those schemas, additional information serves as model updates, a process called assimilation. Incongruence with the existing model is resolved by a process called accommodation, which in machine learning terms resembles hyperparameter tuning. The corresponding connection weights can be inhibitory or excitatory in nature.
The third generation of AI proposes a system in which machine perception uses neural networks to detect objects, while the symbolic meanings of those objects are determined by a layer of semantics.
In the image above, object detection is followed by a description of each object's attributes: shape, color, position, and, for moving objects, distance. This meta-learning opens up a large part of the black box, where interpretability is a huge issue, especially in mission-critical settings like battlefields and healthcare.
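The perception-then-semantics split described above can be sketched in a few lines. This is purely illustrative: the names (`detect_objects`, `SymbolicScene`) and the stubbed detector are hypothetical, standing in for a real neural network and a real reasoning layer.

```python
# Minimal sketch of the neurosymbolic split: a (stubbed) neural detector
# returns object proposals, and a symbolic layer attaches human-readable
# attributes and answers queries over them.
from dataclasses import dataclass

@dataclass
class DetectedObject:
    label: str       # from the neural perception layer
    shape: str       # symbolic attributes assigned by the semantic layer
    color: str
    position: tuple

def detect_objects(image):
    # Stand-in for a neural detector; a real system would run a CNN here.
    return [DetectedObject("ball", "sphere", "red", (10, 20)),
            DetectedObject("box", "cube", "blue", (40, 25))]

class SymbolicScene:
    def __init__(self, objects):
        self.objects = objects

    def query(self, **attrs):
        # Symbolic reasoning: filter objects by explicit, inspectable attributes.
        return [o for o in self.objects
                if all(getattr(o, k) == v for k, v in attrs.items())]

scene = SymbolicScene(detect_objects(image=None))
print([o.label for o in scene.query(color="red")])  # ['ball']
```

Because every attribute is an explicit field rather than an opaque activation, each answer can be traced back to the exact properties that produced it, which is what makes this layer interpretable.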
A consortium of universities including MIT, Stanford, Caltech, UT, and the University of Pennsylvania, among others, is working on a joint NSF-funded project called Understanding the World Through Code.
According to their website:
“The goal of our project is to develop new learning techniques that can help automate this process of generating scientific theories from data. In particular, we are working to develop methods for learning neurosymbolic models that combine neural elements capable of identifying complex patterns in data with symbolic constructs that are able to represent higher-level concepts. Our approach is based on the observation that programming languages provide a uniquely expressive formalism to describe complex models. Our aim is therefore to develop learning techniques that can produce models that look more like the models that scientists already write by hand in code.” Their work can be read on their website here.
Inference based on such an architecture is more explainable and can work from sparse datasets while still predicting outcomes accurately. Moreover, since the symbolic layer deduces object attributes, it can serve as a prior across domains to encode domain knowledge. In drug discovery, it can describe candidate medicines for target diseases and explain why they were chosen. This automated reasoning helps find novel drug families and the distances between their clusters.
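A hedged sketch of the cluster-distance idea: represent candidate drugs by symbolic attribute vectors (all values below are synthetic, invented for illustration), group them into families, and measure the distance between family centroids.

```python
# Distance between hypothetical drug-family clusters in attribute space.
import numpy as np

# Rows are hypothetical candidate molecules; columns are symbolic attributes
# (e.g. solubility, binding-affinity score) on an arbitrary synthetic scale.
family_a = np.array([[0.9, 0.1], [0.8, 0.2]])
family_b = np.array([[0.1, 0.9], [0.2, 0.8]])

centroid_a = family_a.mean(axis=0)   # [0.85, 0.15]
centroid_b = family_b.mean(axis=0)   # [0.15, 0.85]
distance = np.linalg.norm(centroid_a - centroid_b)
print(f"Distance between drug-family clusters: {distance:.2f}")
```

A large centroid distance signals a structurally distinct family, which is one way automated reasoning can flag novel candidates.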
The battle on the hardware side comprises finding new ways to save energy and building models that work from sparse data. One possible solution is neuromorphic computing, which differs from the traditional von Neumann architecture in that it removes the separation between memory and compute, reducing latency. It also does not restrict itself to 0 and 1 states but allows a continuum of states in between. Intel Labs runs a neuromorphic computing lab.
Intel Labs’ second-generation neuromorphic research chip, code named Loihi 2, and Lava, an open-source software framework, will drive innovation and adoption of neuromorphic computing solutions.
Enhancements include:
Up to 10x faster processing capability
Up to 60x more inter-chip bandwidth
Up to 1 million neurons with 15x greater resource density
3D Scalable with native Ethernet support
A new, open-source software framework called Lava
Fully programmable neuron models with graded spikes
Enhanced learning and adaptation capabilities
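Spiking neurons like those on Loihi differ from conventional artificial neurons: they integrate input over time and emit discrete spikes only when a threshold is crossed, which is where much of the energy saving comes from. A minimal leaky integrate-and-fire (LIF) model in plain Python illustrates the idea (Loihi itself is programmed through the Lava framework, not like this):

```python
# Minimal leaky integrate-and-fire (LIF) neuron, for illustration only.
def lif_run(inputs, threshold=1.0, leak=0.9):
    """Simulate one LIF neuron over a sequence of input currents."""
    v = 0.0
    spikes = []
    for i in inputs:
        v = leak * v + i          # leaky integration of input
        if v >= threshold:        # fire when membrane potential crosses threshold
            spikes.append(1)
            v = 0.0               # reset after a spike
        else:
            spikes.append(0)
    return spikes

print(lif_run([0.4, 0.4, 0.4, 0.0, 0.9, 0.9]))  # [0, 0, 1, 0, 0, 1]
```

Note that the neuron stays silent for weak or absent input and only spends a "spike" when evidence accumulates, mirroring the sparse, event-driven computation neuromorphic chips exploit.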
Finally, big tech companies are investing in quantum computing to solve energy efficiency problems. Quantum computing is a type of computation that harnesses the collective properties of quantum states, such as superposition, interference, and entanglement, to perform calculations. The devices that perform quantum computations are known as quantum computers.
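Superposition, the first of those properties, reduces to simple linear algebra and can be simulated classically for a single qubit. The sketch below applies a Hadamard gate to the |0⟩ state, producing equal measurement probabilities for |0⟩ and |1⟩; it is a simulation of the math, not quantum hardware.

```python
# One-qubit superposition via a Hadamard gate, simulated with NumPy.
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate
ket0 = np.array([1.0, 0.0])                   # |0> state

state = H @ ket0                  # equal superposition (|0> + |1>) / sqrt(2)
probs = np.abs(state) ** 2        # Born rule: measurement probabilities
print(probs)                      # [0.5 0.5]
```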
According to Fortune Business Insights, a market research company: “The quantum computing market is projected to grow from $486.1 million in 2021 to $3,180.9 million in 2028 at a CAGR of 30.8% in the forecast period, 2021–2028.”
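The quoted forecast's arithmetic is internally consistent, as a quick check shows:

```python
# Verifying the quoted market forecast: $486.1M in 2021 to $3,180.9M in 2028
# should imply the stated ~30.8% CAGR.
start, end = 486.1, 3180.9
years = 2028 - 2021
cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # ~30.8%
```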
Neurosymbolic AI, with the added help of neuromorphic and quantum computing architectures, might revolutionize AI as we know it, ushering us into a world of whole new possibilities. It is certainly an exciting emerging field that calls for interdisciplinary research.
References:
https://www.intel.com/content/www/us/en/research/neuromorphic-computing.html
https://mitibmwatsonailab.mit.edu/category/neuro-symbolic-ai/
https://www.statista.com/statistics/871513/worldwide-data-created/