CHAPTER 11
2. Artificial Intelligence and its Development
The beginning of AI dates back to the invention of computers in the 1940s and 1950s. During the early development of AI, the first applications focused on programming computers to do things intelligently, just like a human. Programming computers to imitate human behavior started a discussion in the 1960s and 1970s about the differences between a computer and a human brain, and how close a computer could come to a human brain. In the 1980s and 1990s, the research took a different path with the emergence of artificial brains. From that point, AI was no longer limited to purely imitating human intelligence; rather, it could be intelligent in its own way, since its potential had grown substantially.
This gave rise to the argument that an artificial brain had the potential to be faster and more efficient than a human brain, and could potentially outperform it. Recently, research efforts on AI have been on the rise, and real-world applications of AI in finance, manufacturing, and the military achieve results beyond what a human brain can. Today, artificial brains are being designed to learn, adapt, and attend to their own needs. They perceive their surroundings, move around in their own bodies, and make their decisions independently (Warwick 2013).
Smart factories, which unify software and hardware devices, constitute the main framework of Industry 4.0. Smart factories are manufacturing environments where workers, machines, and resources communicate and cooperate in complex manufacturing networks. Integrating AI into manufacturing systems will make it easier to set up these networks and enable them to learn, infer, and act autonomously based on the data collected during the industrial process (Dopico et al. 2016). AI focuses on computer programs that are capable of making their own decisions to solve a problem of interest, and systems created with AI are intended to mimic the intelligent behavior of experts (Kumar 2017).
The history of AI comprises imaginations, fictions, possibilities, demonstrations, and promise. From the ancient philosophers to today’s scientists, the possibility of non-human intelligent machines has led humankind to ponder this subject. Fiction writers who used intelligent machines in their novels, such as Jules Verne, Isaac Asimov, and L. Frank Baum, were ahead of their time and have inspired many AI researchers (Buchanan 2005).
In the 18th and 19th centuries, chess-playing machines were exhibited as intelligent machines; however, these machines were not playing autonomously. The most notable one, named “the Turk”, was invented by the Hungarian inventor Wolfgang von Kempelen in 1770. A human chess player hid inside the machine and decided the moves on its behalf. Although it was a fake chess-playing machine, it introduced the idea that a machine could perform intelligent tasks like a human (Stanford.edu 2019). Even though chess was used as an instrument for studying inference and representation mechanisms in early AI research, the first major milestone came when IBM’s chess-playing computer, named Deep Blue, defeated the world chess champion, Garry Kasparov, in 1997 (Buchanan 2005).
During its development, AI has been influenced by many disciplines, such as engineering, biology, experimental psychology, communication theory, game theory, mathematics, statistics, logic and philosophy, and linguistics. From the second half of the 20th century, computers and programming languages became capable of conducting experiments and generating results for AI research. Advances in electronics and the increase in the computing power of modern computers in the 20th century turned computers into “giant brains” and paved the way for the rapid development of AI. Today, robots are also powerful devices that take part in AI research efforts. Robots are being given common knowledge about the objects that we come across every day in a human environment and tested on whether they make the intelligent decisions that we expect (Buchanan 2005).
Vannevar Bush’s 1945 paper, “As We May Think”, shed light on possible advancements in technology, driven by the growth of computer science, during and beyond the information age (Bush 1945). In his 1950 paper, “Computing Machinery and Intelligence”, A.M. Turing proposed the question: “Can machines think?” His paper was the starting point in the history of AI. He described the imitation game, also known as the Turing Test, and presented his ideas about the possibility of creating intelligent machines that can think and behave rationally (Turing 1950).
O.G. Selfridge defined the term “pattern recognition” in his 1955 paper, “Pattern Recognition and Modern Computers”. In this paper, Selfridge defines pattern recognition as “the extraction of the significant features of data from a background of irrelevant detail”. Today, pattern recognition is one of the main research branches in the field of AI (Selfridge 1955).
The Dartmouth Summer Research Project of 1956 is accepted as the event that initiated AI as a research discipline.
The proposal for the event was written by computer scientist John McCarthy, cognitive scientist Marvin Minsky, mathematician Claude Shannon, and computer scientist Nathaniel Rochester. John McCarthy is credited with coining the term “artificial intelligence” and shaping the future of the field (Moor 2006). The major milestones in the history of AI are given in Table 1 below.
Table 1: A Quick Look at the History of AI (adapted from Buchanan 2005; Bosch Global 2018).
Year Event
1945 Bush’s paper “As We May Think” published in the “Atlantic Monthly”
1950 Turing’s seminal paper “Computing Machinery and Intelligence” published in the philosophy journal “Mind”
1955 Oliver Selfridge’s paper, “Pattern Recognition and Modern Computers”, was published in Proceedings of the Western Joint Computer Conference
1956 Arthur Samuel’s checker-playing program was able to learn from experience by playing against human and computer opponents and improve its playing ability
1956 John McCarthy coined the term “Artificial Intelligence” in a conference at the Dartmouth College
1966 The first chatbot was developed by MIT professor Joseph Weizenbaum. The program was able to simulate real conversation like a conversation partner.
1972 MYCIN, an expert system that used artificial intelligence to treat illnesses, was invented by Ted Shortliffe at Stanford University.
1986 Terrence J. Sejnowski and Charles Rosenberg developed a program called “NETtalk” that was able to speak, read words and pronounce them correctly.
1997 IBM’s AI chess computer “Deep Blue” defeated the incumbent chess world champion, Garry Kasparov.
2011–2015 Powerful processors and graphics cards in computers, smartphones, and tablets led AI programs to spread into everyday life. Apple’s “Siri”, Microsoft’s “Cortana”, and Amazon’s “Echo” and “Alexa” were introduced to the market.
2018 IBM’s “Project Debater” debated complex topics with master debaters, and performed unusually well. Google’s “Duplex” called a hairdresser and made an appointment on behalf of a customer, while the lady on the other end of the line did not notice that she was talking to a machine.
2.1 Artificial Intelligence
AI and artificial neural networks (ANN) are interconnected domains in computer science. AI can be successfully implemented through ANN, yet there are key differences between the two. First and foremost, neural networks form a basis for research in the field of AI. The primary aim of AI is to build intelligent machines that can achieve a specific task, such as playing chess, without crossing the boundaries set by the computer scientist. ANN are used to surpass the limitations of this task-oriented AI. ANN create computer programs that can receive feedback and respond to a problem through adaptive learning. These programs can optimize their response by solving the same problem many times and adjusting the response based on the feedback received each time (Techopedia.com 2017).
Nowadays, AI refers to any machine that can simulate human cognitive skills, such as problem-solving. In other words, it is an attribute of machines that exhibit a form of intelligence rather than merely carry out the orders programmed by their users. Checker- and chess-playing machines and language analysis software were among the early applications of AI.
Machine learning is an AI technique that allows machines to learn from input data without being explicitly programmed (The Scientist Magazine 2019). The goal of AI is the development of algorithms that enable machines to perform cognitive tasks. An AI system must be capable of storing knowledge, applying this knowledge to solve problems, and acquiring new knowledge through experience. An AI system consists of three key components, as shown in Table 2 below (Haykin 1998); a minimal code sketch follows the table:
Table 2: Three Key Components of an AI System (adapted from Haykin 1998).
Component Definition
1. Representation It is the pervasive use of the language of symbol structures to represent both general knowledge about a problem of interest and specific knowledge about the solution to the problem. Clarity of symbolic representation of AI paves the way for easier human-machine communication.
2. Reasoning Reasoning refers to the ability to solve problems.
3. Learning The learning element acquires some information from the environment and uses this information to improve the knowledge base component of the machine learning process.
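To make these three components concrete, the following minimal Python sketch shows a symbolic representation, a trivial reasoning step, and a learning step that revises the knowledge base. The toy knowledge base and all function names are illustrative assumptions for this chapter, not taken from Haykin (1998) or any particular library.

```python
# Minimal sketch of the three AI-system components in Table 2 (illustrative only).

# 1. Representation: knowledge stored as symbol structures (facts and rules).
knowledge_base = {
    "facts": {("bird", "canary"), ("bird", "ostrich")},
    "rules": [("bird", "can_fly")],  # "birds can normally fly"
}

# 2. Reasoning: solve a problem by applying the rule to the symbolic facts.
def can_fly(animal):
    is_bird = ("bird", animal) in knowledge_base["facts"]
    exception = ("cannot_fly", animal) in knowledge_base["facts"]
    return is_bird and not exception

# 3. Learning: acquire new information from the environment and use it
#    to improve the knowledge base.
def learn(observation):
    knowledge_base["facts"].add(observation)

learn(("cannot_fly", "ostrich"))   # new experience corrects the knowledge base
print(can_fly("canary"))           # True
print(can_fly("ostrich"))          # False after learning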
2.2 Machine Learning
Machine learning (ML) deals with building computer systems that can improve themselves through experience.
The field of ML is the foundation of AI and data science, and it sits at the intersection of computer science and statistics. Even though it is relatively young, ML is one of the most rapidly growing technical fields today. The development of new learning algorithms, the rapid growth in the availability of data, and the constantly falling cost of computation are among the main reasons for the recent progress in ML. ML is part of our everyday life, penetrating major sectors of the economy including health care, manufacturing, education, financial modeling, policing, and marketing (Jordan and Mitchell 2015).
ML applications have become an important part of our daily life routine over the last few decades. Some of the top ML applications in our everyday life are given in Table 3.
The main goal of ML is to develop computer systems that can automatically improve themselves through experience. ML also focuses on the fundamental computational and statistical methods that underlie all learning systems. The discipline of ML emerged as a branch of AI, and it is used as a practical tool for computer vision, speech recognition, natural language processing, and robot control. AI researchers agree that training a system by presenting examples of the desired input-output behavior is much easier than programming the system manually by anticipating the desired output for all possible inputs (Jordan and Mitchell 2015).
A simple model of ML consists of four elements, depicted in Figure 1. The environment supplies information to the learning element. This information is processed and used by the learning element to improve the knowledge base.
At the final stage, the performance element performs its task by using the knowledge base. The information received by the machine is usually imperfect, so the learning element does not know the details of the information in advance. For this reason, the machine operates by guessing while receiving feedback from the performance element. As the machine receives feedback, it evaluates its decisions and revises them if needed (Haykin 1998). A minimal code reading of this loop is sketched after Figure 1.
Table 3: Top Applications of Machine Learning (adapted from Mitchell 2006 and Datasciencecentral.com 2017).
Application Description
1 Computer Vision Many current vision systems, e.g., face recognition systems, and systems that automatically classify microscope images of cells, are developed using ML.
2 Speech Recognition Speech recognition (SR) is the translation of spoken words into text by a software application. Currently available commercial speech recognition systems all use ML to train the system to recognize speech.
3 Medical Diagnosis Methods, techniques, and tools that are built by using ML can assist in solving diagnostic and prognostic problems.
4 Statistical Arbitrage In finance, statistical arbitrage refers to automated trading strategies that are typical of the short term and involve a large number of securities. ML methods such as linear regression and support vector regression (SVR) are applied to build an index arbitrage strategy.
5 Learning Associations The process of generating insights into various associations between products is referred to as learning associations. ML techniques help to understand and analyze customer buying behavior.
6 Classification Classification is the process of assigning each individual in the population under study to one of several classes. In other words, it separates observations into groups based on their characteristics. This ML technique helps the analyst identify the category of an object in a large set.
7 Prediction Prediction is one of the most widely used applications of ML. It helps businesses make required decisions on time, based on historical data.
8 Extraction Information extraction is the process of extracting structured information from unstructured data, e.g., web pages, articles, reports, and e-mails. The input is a set of documents, and the output is structured data, e.g., an Excel sheet or a table in a relational database.
9 Regression The principles of ML can be applied to optimize parameters in regression, e.g., to reduce the approximation error and compute the closest possible outcome (a minimal numeric sketch is given after this table).
10 Robot Control ML methods have been successfully implemented in many robotics applications such as stable helicopter flight, and self-driving vehicles.
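As a concrete illustration of the regression entry in Table 3, the short sketch below fits a line with ordinary least squares in plain NumPy, minimizing the squared approximation error. The data values are made up for the example; they do not come from the sources cited above.

```python
import numpy as np

# Hypothetical data: hours of machine use vs. measured energy consumption.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Design matrix with an intercept column; least squares minimizes the
# squared approximation error mentioned in the table.
A = np.column_stack([x, np.ones_like(x)])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coeffs

prediction = slope * 6.0 + intercept   # predict the outcome for a new input
print(round(slope, 2), round(intercept, 2), round(prediction, 2))
```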
Fig. 1: Simple Model of Machine Learning (Haykin 1998). [Figure: Environment → Learning Element → Knowledge Base → Performance Element]
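The sketch below is one hedged reading of Figure 1 in code: the environment supplies noisy observations, the knowledge base is a single adjustable weight, the performance element makes predictions with it, and the learning element revises the weight from the feedback (prediction error). The specific update rule and all numeric values are assumptions for illustration, not part of Haykin's model.

```python
import random

weight = 0.0                      # knowledge base: a single adjustable parameter
learning_rate = 0.05              # assumed update strength for the learning element

def environment():
    """Supply an (input, observed output) pair; true relation y = 3x plus noise."""
    x = random.uniform(-1.0, 1.0)
    return x, 3.0 * x + random.gauss(0.0, 0.1)

def performance_element(x):
    """Carry out the task using the current knowledge base."""
    return weight * x

for _ in range(1000):
    x, y = environment()
    feedback = y - performance_element(x)   # feedback from the performance element
    weight += learning_rate * feedback * x  # learning element improves the knowledge base

print(round(weight, 2))   # should end up near 3.0
```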
2.3 Two Methods of ML: Artificial Neural Network and Deep Learning
The neural network is one of the techniques used in ML research; it processes information via interconnected nodes called artificial neurons. The first artificial neural network was created by Marvin Minsky in 1951. The structure of the human brain was the inspiration for the design of ANN. In its early years, the ANN approach was very limited, and other approaches in the field of AI drew more attention. In the last few years, ANN returned to the stage in the form of an ML approach called deep learning (The Scientist Magazine 2019).
A neuron is an information-processing unit of a neural network and is essential to its operation.
Figure 2 below depicts the model of a neuron, the basic building block of an ANN. A neuronal model consists of three basic elements, which are explained in Table 4 (Haykin 1998); a short numeric sketch of the resulting computation is given after that table.
Deep learning (DL) is an AI technique and a type of ML approach that enables computer systems to improve with experience and data. ML is the only feasible way to construct AI systems that can adapt and operate in complex real-world environments. DL is a specific kind of ML technique whose power and flexibility come from its ability to continuously learn from the physical world and to describe this world as a nested hierarchy of concepts. While DL is considered an emerging technology, its history dates back to the 1940s. Before it gained its current popularity, it was a relatively unpopular field for several decades. It was influenced by many researchers and perspectives, and it was only recently named “deep learning”, after having been called by many different names. DL was known as cybernetics in the 1940s–1960s and came to be known as connectionism in the 1980s–1990s, before it was finally coined “deep learning” around 2006 (Goodfellow et al. 2016).
Fig. 2: Nonlinear Model of a Neuron (Haykin 1998). [Figure: input signals x1, x2, ..., xm, synaptic weights, bias bk, adder, activation function, and output]
Table 4: Three Elements of a Neuronal Model (adapted from Haykin 1998).
Element Attribute
A set of synapses or connecting links Each of them is identified by a weight or strength of its own. The synaptic weight of an artificial neuron may lie in a range that includes both negative and positive values. In Figure 2, a signal xj at the input of synapse j connected to neuron k is multiplied by the synaptic weight wkj.
An adder It sums the input signals that are weighted by the respective synapses of the neuron.
An activation function It limits the amplitude of the output of a neuron. Since the activation function squashes (limits) the permissible amplitude range of the output signal to a finite value, it is also referred to as a squashing function.
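Combining the three elements in Table 4, the output of neuron k is yk = φ(Σj wkj xj + bk). The minimal NumPy sketch below shows this computation; the input values and weights are illustrative, and the logistic sigmoid is only one possible choice of squashing function.

```python
import numpy as np

def neuron_output(x, w, b):
    """Single artificial neuron: weighted sum (adder), bias, then activation."""
    v = np.dot(w, x) + b             # adder: synaptic weights applied to the inputs, plus bias
    return 1.0 / (1.0 + np.exp(-v))  # activation (squashing) function: logistic sigmoid

x = np.array([0.5, -1.0, 2.0])    # input signals x1 ... xm
w = np.array([0.8, 0.2, -0.4])    # synaptic weights wk1 ... wkm (positive or negative)
b = 0.1                           # bias bk

print(neuron_output(x, w, b))     # output yk, squashed into the range (0, 1)
```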
Table 5: Three-way Categorization of Deep Learning Networks (adapted from Deng and Yu 2014).
Category Goal
1. Deep networks for unsupervised or generative learning These networks are designed to analyze the high-order correlations of the observed data for pattern analysis or synthesis purposes when no information about target class labels is available.
2. Deep networks for supervised learning This type of network is designed to provide solutions for pattern classification problems by aiming to characterize the distributions of classes behind the visible data. These networks are also called discriminative deep networks.
3. Hybrid deep networks Hybrid refers to a deep architecture that comprises both generative and discriminative model components. The outcomes of the generative component are mostly used for discrimination, which is the final goal of hybrid deep networks.
DL is considered an ML approach that can use many layers of hierarchical non-linear information processing. A DL architecture or technique can be constructed according to its intended use, and it falls into three major categories (Deng and Yu 2014). Table 5 above explains the three major categories of DL networks.
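To illustrate "many layers of hierarchical non-linear information processing" in the most literal way, the sketch below stacks a few randomly initialized non-linear layers in NumPy. It is an untrained forward pass for illustration only, not a working DL system, and the layer sizes are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, n_out):
    """One non-linear layer: linear map followed by a ReLU non-linearity."""
    w = rng.normal(scale=0.5, size=(n_out, x.shape[0]))
    b = np.zeros(n_out)
    return np.maximum(0.0, w @ x + b)

x = rng.normal(size=8)            # raw input features
h1 = layer(x, 16)                 # first level of the hierarchy of concepts
h2 = layer(h1, 16)                # higher-level features built from h1
out = layer(h2, 3)                # final representation (e.g., class scores)
print(out.shape)                  # (3,)
```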
Figure 3 below represents a Venn diagram showing the AI technology and its relations with ML, ANN, and DL.
Fig. 3: Artificial Intelligence and its Subsets (The Scientist Magazine 2019).