Innovative research and technological advances have greatly affected daily human life. Besides personal computers, laptops, and other cutting-edge devices, an excellent example of this is the evolution of smartphones. Interestingly, the statistics portal Statista [1] reports the number of smartphone users worldwide from 2014 to 2020 (in billions). As Figure 1 shows, the number of users has nearly doubled over the last five years and continues to grow massively.
With the massive availability of various electronic devices, a colossal amount of data in different forms (text, images, audio, video) is being uploaded to and downloaded from the Internet, ready for processing and waiting to be utilized automatically by computers. Machine Learning (ML) researchers, particularly those working on Neural Networks and Deep Learning, thrive in this scenario.
Recently, Deep Learning (DL) has been dominating AI research and evolving significantly in the industrial market. DL is considered a branch of ML based on learning data representations [2]. It is a progressive, hierarchical learning paradigm: multiple layers of data abstractions are learned within DL models before the required AI task is achieved [2], e.g., recognizing the objects in Figure 2.
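To make the idea of stacked abstractions concrete, here is a minimal sketch in PyTorch; the framework choice and layer sizes are illustrative assumptions, not taken from [2]. Each layer transforms the output of the previous one, so later layers operate on progressively higher-level representations of the input.

```python
# A minimal sketch of "multiple layers of abstraction":
# each layer builds on the representation produced by the one before it.
import torch.nn as nn

# Hypothetical layer sizes, chosen only for illustration.
model = nn.Sequential(
    nn.Linear(784, 256),  # layer 1: raw input -> first abstraction
    nn.ReLU(),
    nn.Linear(256, 64),   # layer 2: builds on layer 1's representation
    nn.ReLU(),
    nn.Linear(64, 10),    # layer 3: task-specific output (e.g., 10 classes)
)
```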
As humans, even though our physical capacity is limited, our brains seem to have a nearly limitless ability to perform tasks intelligently. For example, human eyes can easily differentiate between bicycle tires and animal eyes at a glance, as shown in Figure 2. This intelligence and ability to quickly make sense of input data comes from the billions of basic processing units, called neurons, in the human brain [5]. Each neuron is connected by synapses to several thousand other neurons. Neurons communicate via axons, which carry streams of signal pulses to distant parts of the brain or body, targeting specific recipient cells [5]. Remarkably, the brain is estimated to process around 400 billion bits of information per second [4]. Neurons can abstract signals, analyze and interpret them, store them in memory, feed back and infer from previous experience, and interact with distant neurons, all while making vital decisions in a tiny fraction of a second. The brain is indeed a deeply complex system, and DL, in essence, tries to mimic it.
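As a loose software analogy only (the numbers and the standard sigmoid activation below are made up for illustration), an artificial neuron mirrors this picture: incoming signals are weighted like synapse strengths, summed, and passed through an activation that decides how strongly the neuron "fires".

```python
import numpy as np

def artificial_neuron(inputs, weights, bias):
    """Loose analogy of a biological neuron: weighted inputs (synapse
    strengths) are summed and passed through an activation function."""
    z = np.dot(weights, inputs) + bias
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid: how strongly the neuron "fires"

# Toy numbers, purely illustrative.
print(artificial_neuron(np.array([0.5, 0.2]), np.array([0.8, -0.4]), 0.1))
```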
Labeled data can serve as learning examples for DL models: the labels provide supervision and guidance on what should be learned and how to produce useful representations that yield better performance. The model is trained on many data examples. For instance, an image containing a cat is tagged with the label "cat"; once the model has been trained on enough cat images together with their labels, it starts to generalize to similar data, i.e., if it is shown a new image containing a cat, it can assign the "cat" label with high confidence.
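The following is a minimal, self-contained sketch of that supervised setup in PyTorch; the random tensors, layer sizes, and optimizer settings are placeholders assumed for illustration, standing in for real "cat" / "not cat" images and labels.

```python
# Supervised training sketch: labels guide what the model learns.
import torch
import torch.nn as nn

images = torch.randn(64, 3 * 32 * 32)        # placeholder "images"
labels = torch.randint(0, 2, (64,)).float()  # 1 = cat, 0 = not cat

model = nn.Sequential(nn.Linear(3 * 32 * 32, 128), nn.ReLU(), nn.Linear(128, 1))
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(10):
    logits = model(images).squeeze(1)  # model's guess for each image
    loss = loss_fn(logits, labels)     # how far the guess is from the label
    optimizer.zero_grad()
    loss.backward()                    # learn from the labeled examples
    optimizer.step()

# After training on many labeled cat images, predictions on new images
# generalize: sigmoid(logit) close to 1 means "probably a cat".
```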
Figure 2: An example of images of bicycle tires and animal eyes.
Diving deep into DL's hierarchical layers allows us to visualize the hidden representations learned by composing layers, and thus to understand why DL empowers computers with something approaching real intelligence. Figure 3 shows some snippets taken from [3]. These visualizations cover five consecutive layers of a Convolutional Neural Network (CNN), a popular type of neural network that reflects the essence of the DL concept. Interestingly, the computer captures simple-to-complex representations of the input signal progressively through the CNN layers. The first layer responds to primary colors and simple line shapes. The second layer shows more complex shapes such as edges, and the third layer's visualizations are composites of intricate circles and polygons. The fourth and fifth layers show activations for dog faces and other complex, detailed shapes. The deeper we go in the layers, the more abstract and complex the shapes learned by the DL model.
Figure 3: Visualizations of Deep Learning Convolutional Neural Networks Layers. The deeper it progresses in layers, the more complex abstractions and representations it learns. Credit to [3].
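In that spirit, here is a rough sketch of a small CNN whose intermediate activations can be collected layer by layer for inspection; the architecture is an assumption for illustration and is not the network visualized in [3].

```python
# Collecting per-layer activations, the raw material for visualizations
# like those in Figure 3. Architecture and sizes are illustrative only.
import torch
import torch.nn as nn

conv_layers = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),   # colors, simple lines
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # edges, textures
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),  # object parts
)

x = torch.randn(1, 3, 64, 64)  # placeholder image
activations = []
for layer in conv_layers:
    x = layer(x)
    if isinstance(layer, nn.ReLU):
        activations.append(x)  # one feature map per depth of abstraction

for i, a in enumerate(activations, 1):
    print(f"layer {i} activation shape: {tuple(a.shape)}")
```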
As a computer algorithm, a DL model can be trained in an end-to-end, "fire and forget" fashion: it digests a great amount of data, automatically captures the underlying relations, and represents them discriminatively to better perform the task. This obviates the burden of feature engineering, which ML researchers and domain experts otherwise practice heavily.
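A simplified, assumed contrast may help: in a classical pipeline a human designs features such as brightness or contrast before training a shallow classifier, whereas an end-to-end deep model maps raw pixels directly to a prediction and learns its intermediate features along the way. The feature choices and layer sizes below are purely illustrative.

```python
# Hand-engineered features vs. end-to-end learning (illustrative sketch).
import numpy as np
import torch
import torch.nn as nn

raw_image = np.random.rand(32, 32, 3)  # placeholder raw pixels

# Classical approach: a human designs the features...
hand_crafted = np.array([
    raw_image.mean(),  # e.g., average brightness
    raw_image.std(),   # e.g., contrast
])  # ...then a simple classifier is trained on these features.

# End-to-end deep learning: raw pixels in, prediction out;
# the intermediate features are learned, not designed.
end_to_end = nn.Sequential(
    nn.Flatten(),
    nn.Linear(32 * 32 * 3, 64), nn.ReLU(),
    nn.Linear(64, 10),
)
prediction = end_to_end(torch.tensor(raw_image, dtype=torch.float32).unsqueeze(0))
```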
At this point, we can understand why DL is thriving across AI domains and, more recently, in other scientific applications: its deep, layered nature loosely mirrors the structure and functionality of the human brain.
If our present already holds possibilities such as brain-controlled prosthetic arms, which seemed almost impossible only a few years ago, what will the future hold for the next generation?
References:
1. Statista: https://www.statista.com/statistics/330695/number-of-smartphone-users-worldwide/
2. Bengio, Y.; Courville, A.; Vincent, P. (2013). "Representation Learning: A Review and New Perspectives". IEEE TPAMI.
3. Zeiler, M.; Fergus, R. (2014). "Visualizing and Understanding Convolutional Networks". ECCV 2014.