
Machine Learning course teaching resource (books and literature): An introduction to neural networks for beginners


[Cover figure: plot of model accuracy against training iterations]

An introduction to neural networks for beginners
By Dr Andy Thomas
Adventures in Machine Learning

Table of Contents

Introduction ............................................................ 2
Who I am and my approach ................................................ 2
The code, pre-requisites and installation ............................... 3
Part 1 – Introduction to neural networks ................................ 3
1.1 What are artificial neural networks? ................................ 3
1.2 The structure of an ANN ............................................. 4
1.2.1 The artificial neuron ............................................. 4
1.2.2 Nodes ............................................................. 5
1.2.3 The bias .......................................................... 6
1.2.4 Putting together the structure .................................... 8
1.2.5 The notation ...................................................... 9
1.3 The feed-forward pass ............................................... 10
1.3.1 A feed-forward example ............................................ 11
1.3.2 Our first attempt at a feed-forward function ...................... 11
1.3.3 A more efficient implementation ................................... 13
1.3.4 Vectorisation in neural networks .................................. 13
1.3.5 Matrix multiplication ............................................. 14
1.4 Gradient descent and optimisation ................................... 16
1.4.1 A simple example in code .......................................... 18
1.4.2 The cost function ................................................. 19
1.4.3 Gradient descent in neural networks ............................... 20
1.4.4 A two dimensional gradient descent example ........................ 20
1.4.5 Backpropagation in depth .......................................... 21
1.4.6 Propagating into the hidden layers ................................ 24
1.4.7 Vectorisation of backpropagation .................................. 26
1.4.8 Implementing the gradient descent step ............................ 27
1.4.9 The final gradient descent algorithm .............................. 28
1.5 Implementing the neural network in Python ........................... 29
1.5.1 Scaling data ...................................................... 30
1.5.2 Creating test and training datasets ............................... 31
1.5.3 Setting up the output layer ....................................... 32
1.5.4 Creating the neural network ....................................... 32
1.5.5 Assessing the accuracy of the trained model ....................... 38


Introduction

Welcome to the "An introduction to neural networks for beginners" book. The aim of this book is to get you up to speed with all you need to start on the deep learning journey using TensorFlow. Once you're finished, you may like to check out my follow-up book entitled Coding the Deep Learning Revolution – A step by step introduction using Python, Keras and TensorFlow. What is deep learning, and what is TensorFlow? Deep learning is the field of machine learning that is making many state-of-the-art advancements, from beating players at Go and Poker, to speeding up drug discovery and assisting self-driving cars. If these types of cutting edge applications excite you like they excite me, then you will be interested in learning as much as you can about deep learning. However, that requires you to know quite a bit about how neural networks work. This is what this book covers – getting you up to speed on the basic concepts of neural networks and how to create them in Python.

WHO I AM AND MY APPROACH

I am an engineer in the energy / utility business who uses machine learning almost daily to excel in my duties. I believe that knowledge of machine learning, and its associated concepts, gives you a significant edge in many different industries, and allows you to approach a multitude of problems in novel and interesting ways. I also maintain an avid interest in machine and deep learning in my spare time, and wish to leverage my previous experience as a university lecturer and academic to educate others in the coming AI and machine learning revolution. My main base for doing this is my website – Adventures in Machine Learning.

Some educators in this area tend to focus solely on the code, while neglecting the theory. Others focus more on the theory, while neglecting the code. There are problems with both of these approaches. The first leads to a stunted understanding of what one is doing – you get quite good at implementing frameworks, but when something goes awry or doesn't go quite to plan, you have no idea how to fix it. The second often leads to people getting swamped in theory and mathematics and losing interest before implementing anything in code.


My approach is to try to walk a middle path – with some focus on theory, but only as much as is necessary before trying things out in code. I also take things slowly, in a step-by-step fashion as much as possible. I get frustrated when educators take multiple steps at once and perform large leaps in logic, which makes things difficult to follow, so I assume my readers are likewise annoyed at such leaps and therefore I try not to assume too much.

THE CODE, PRE-REQUISITES AND INSTALLATION

This book will feature snippets of code as we go through the explanations; however, the full set of code can be found for download at my github repository. This book does require some loose pre-requisites of the reader – these are as follows:

- A basic understanding of Python variables, arrays, functions, loops and control statements
- A basic understanding of the numpy library, and multi-dimensional indexing
- Basic matrix multiplication concepts and differentiation

While I list these points as pre-requisites, I expect that you will still be able to follow along reasonably well if you are lacking in some of these areas. I expect you'll be able to pick up these ideas as you go along – I'll provide links and go slowly to ensure that is the case.

To install the required software, consult the following links:

- Python 3.6 (this version is required for TensorFlow): https://www.python.org/downloads/
- Numpy: https://www.scipy.org/install.html
- Sci-kit learn: http://scikit-learn.org/stable/install.html

It may be easier for you to install Anaconda, which comes with most of these packages ready to go and allows easy installation of virtual environments.

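Before moving on, it is worth confirming that the installation worked. The snippet below is my own quick check rather than code from the book; it simply imports the packages (scikit-learn is imported as sklearn) and prints their versions:

import sys
import numpy
import sklearn

# Confirm the required packages are importable and report their versions
print("Python:", sys.version.split()[0])
print("numpy:", numpy.__version__)
print("scikit-learn:", sklearn.__version__)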

Part 1 – Introduction to neural networks

1.1 WHAT ARE ARTIFICIAL NEURAL NETWORKS?

Artificial neural networks (ANNs) are software implementations of the neuronal structure of our brains. We don't need to talk about the complex biology of our brain structures, but suffice to say, the brain contains neurons which are kind of like organic switches. These can change their output state depending on the strength of their electrical or chemical input. The neural network in a person's brain is a hugely interconnected network of neurons, where the output of any given neuron may be the input to thousands of other neurons. Learning occurs by repeatedly activating certain neural connections over others, and this reinforces those connections. This makes them more likely to produce a desired outcome given a specified input. This learning involves feedback – when the desired outcome occurs, the neural connections causing that outcome become strengthened.

Artificial neural networks attempt to simplify and mimic this brain behavior. They can be trained in a supervised or unsupervised manner. In a supervised ANN, the network is trained by providing matched input and output data samples, with the intention of getting the ANN to provide a desired output for a given input. An example is an e-mail spam filter – the input training data could be the count of various words in the body of the e-mail, and the output training data would be a classification of whether the e-mail was truly spam or not. If many examples of e-mails are passed through the neural network, this allows the network to learn what input data makes it likely that an e-mail is spam or not. This learning takes place by adjusting the weights of the ANN connections, but this will be discussed further in the next section.

Unsupervised learning in an ANN is an attempt to get the ANN to "understand" the structure of the provided input data "on its own". This type of ANN will not be discussed in this book.

1.2 THE STRUCTURE OF AN ANN

1.2.1 The artificial neuron

The biological neuron is simulated in an ANN by an activation function. In classification tasks (e.g. identifying spam e-mails) this activation function must have a "switch on" characteristic – in other words, once the input is greater than a certain value, the output should change state, i.e. from 0 to 1, from -1 to 1, or from 0 to greater than 0. This simulates the "turning on" of a biological neuron. A common activation function that is used is the sigmoid function:

f(z) = 1 / (1 + exp(-z))

which looks like this:

import matplotlib.pylab as plt
import numpy as np

# Plot the sigmoid activation function over a range of inputs
x = np.arange(-8, 8, 0.1)
f = 1 / (1 + np.exp(-x))
plt.plot(x, f)
plt.xlabel('x')
plt.ylabel('f(x)')
plt.show()


Figure 1: The sigmoid function

As can be seen in the figure above, the function is "activated", i.e. it moves from 0 to 1, when the input x is greater than a certain value. The sigmoid function isn't a step function, however; the edge is "soft", and the output doesn't change instantaneously. This means that there is a derivative of the function, and this is important for the training algorithm which is discussed more in Section 1.4.5 Backpropagation in depth (for the sigmoid, this derivative has the convenient form f'(z) = f(z)(1 - f(z))).

1.2.2 Nodes

As mentioned previously, biological neurons are connected in hierarchical networks, with the outputs of some neurons being the inputs to others. We can represent these networks as connected layers of nodes. Each node takes multiple weighted inputs, applies the activation function to the summation of these inputs, and in doing so generates an output. I'll break this down further, but to help things along, consider the diagram below:

Figure 2: Node with inputs

The circle in the image above represents the node. The node is the "seat" of the activation function: it takes the weighted inputs, sums them, then feeds them to the activation function. The output of the activation function is shown as h in the above diagram. Note: a node as I have shown above is also called a perceptron in some literature.

What about this "weight" idea that has been mentioned? The weights are real valued numbers (i.e. not binary 1s or 0s), which are multiplied by the inputs and then summed up in the node. So, in other words, the weighted input to the node above would be:


x1w1 + x2w2 + x3w3 + b

Here the wi values are weights (ignore the b for the moment). What are these weights all about? Well, they are the variables that are changed during the learning process, and, along with the input, determine the output of the node. The b is the weight of the +1 bias element – the inclusion of this bias enhances the flexibility of the node, which is best demonstrated in an example.

1.2.3 The bias

Let's take an extremely simple node, with only one input and one output:

Figure 3: Simple node

The input to the activation function of the node in this case is simply x1w1. What does changing w1 do in this simple network?

# Plot the sigmoid for three different weight values
# (x, np and plt as defined in the previous snippet)
w1 = 0.5
w2 = 1.0
w3 = 2.0
l1 = 'w = 0.5'
l2 = 'w = 1.0'
l3 = 'w = 2.0'
for w, l in [(w1, l1), (w2, l2), (w3, l3)]:
    f = 1 / (1 + np.exp(-x * w))
    plt.plot(x, f, label=l)
plt.xlabel('x')
plt.ylabel('h_w(x)')
plt.legend(loc=2)
plt.show()


Figure 4: Effect of adjusting weights

Here we can see that changing the weight changes the slope of the output of the sigmoid activation function, which is obviously useful if we want to model different strengths of relationships between the input and output variables. However, what if we only want the output to change when x is greater than 1? This is where the bias comes in – let's consider the same network with a bias input:

Figure 5: Node with bias


# Plot the sigmoid with a fixed weight and three different bias values
# (x, np and plt as defined in the previous snippets)
w = 5.0
b1 = -8.0
b2 = 0.0
b3 = 8.0
l1 = 'b = -8.0'
l2 = 'b = 0.0'
l3 = 'b = 8.0'
for b, l in [(b1, l1), (b2, l2), (b3, l3)]:
    f = 1 / (1 + np.exp(-(x * w + b)))
    plt.plot(x, f, label=l)
plt.xlabel('x')
plt.ylabel('h_wb(x)')
plt.legend(loc=2)
plt.show()

Figure 6: Effect of bias adjustments

In this case, the weight w1 has been increased to 5.0 to simulate a more defined "turn on" function. As you can see, by varying the bias "weight" b, you can change when the node activates. Therefore, by adding a bias term, you can make the node simulate a generic if function, i.e. if (x > z) then 1 else 0. Without a bias term, you are unable to vary the z in that if statement; it will always be stuck around 0. This is obviously very useful if you are trying to simulate conditional relationships.

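Pulling sections 1.2.2 and 1.2.3 together, the behaviour of a single node can be expressed in a few lines of Python. The sketch below is my own illustration rather than code from the book; the function name and the example values are arbitrary:

import numpy as np

def node_output(x, w, b):
    # Weighted sum of the inputs plus the bias weight,
    # passed through the sigmoid activation function
    z = np.dot(w, x) + b
    return 1 / (1 + np.exp(-z))

# Three inputs with arbitrary example weights and a bias
x = np.array([0.5, 2.0, -1.0])
w = np.array([0.8, 0.1, 0.4])
b = -2.0
print(node_output(x, w, b))  # a single activation between 0 and 1

Here the large negative bias keeps the node "switched off" unless the weighted sum of the inputs is big enough to overcome it, which is exactly the if-style behaviour described above.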

1.2.4 Putting together the structure

Hopefully the previous explanations have given you a good overview of how a given node/neuron/perceptron in a neural network operates. However, as you are probably aware, there are many such interconnected nodes in a fully fledged neural network. These structures can come in a myriad of different forms, but the most common simple neural network structure consists of an input layer, a hidden layer and an output layer. An example of such a structure can be seen below:

Figure 7: Three layer neural network

The three layers of the network can be seen in the above figure – Layer 1 represents the input layer, where the external input data enters the network. Layer 2 is called the hidden layer, as this layer is not part of the input or output. Note: neural networks can have many hidden layers, but in this case for simplicity I have just included one. Finally, Layer 3 is the output layer. You can observe the many connections between the layers, in particular between Layer 1 (L1) and Layer 2 (L2). As can be seen, each node in L1 has a connection to all the nodes in L2. Likewise for the nodes in L2 to the single output node in L3. Each of these connections will have an associated weight.

1.2.5 The notation

The maths below requires some fairly precise notation so that we know what we are talking about. The notation I am using here is similar to that used in the Stanford deep learning tutorial. In the upcoming equations, each of the weights is identified with the following notation: w_ij^(l). Here i refers to the node number of the connection in layer l+1 and j refers to the node number of the connection in layer l. Take special note of this order. So, for the connection between node 1 in layer 1 and node 2 in layer 2, the weight notation would be w_21^(1). This notation may seem a bit odd, as you would expect the i and j to refer to the node numbers in layers l and l+1 respectively (i.e. in the direction of input to output), rather than the opposite. However, this notation makes more sense when you add the bias.

As you can observe in the figure above, the (+1) bias is connected to each of the nodes in the subsequent layer; the bias in layer 1 is connected to all the nodes in layer 2. Because the bias is not a true node with an activation function, it has no inputs (it always outputs the value +1). The notation of the bias weight is b_i^(l), where i is the node number in layer l+1 – the same convention as used for the normal weight notation w_21^(1). So, the weight connecting the bias in layer 1 to node 2 in layer 2 would be written b_2^(1).

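One way to make this notation concrete is to store each layer's weights in a numpy array, with i indexing the rows and j indexing the columns. The array names and the random initialisation below are my own sketch based on the network in Figure 7, not code from the book:

import numpy as np

# Layer 1 -> layer 2 weights for the network in Figure 7:
# W1[i-1, j-1] holds w_ij^(1), the weight from node j in layer 1 to node i in layer 2
W1 = np.random.random_sample((3, 3))
# b1[i-1] holds b_i^(1), the bias weight into node i of layer 2
b1 = np.random.random_sample((3,))

# Layer 2 -> layer 3: a single output node fed by the three hidden nodes
W2 = np.random.random_sample((1, 3))
b2 = np.random.random_sample((1,))

# The weight between node 1 in layer 1 and node 2 in layer 2, w_21^(1):
print(W1[1, 0])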
