In this video, I explain what convolutions and convolutional neural networks are, and introduce, in detail, one of the best and most used state-of-the-art CNN architectures in 2020: DenseNet.

Watch the Video Below

If you would like me to cover any other neural network architecture or research paper, please let me know in the comments!

References
DenseNet paper: https://arxiv.org/pdf/1608.06993.pdf
DenseNet on GitHub: https://github.com/liuzhuang13/DenseNet

Follow me for more AI content
LinkedIn: https://www.linkedin.com/in/whats-ai/
Twitter: https://twitter.com/Whats_AI
Facebook: https://www.facebook.com/whats.artifi...
The best courses in AI: https://www.omologapps.com/whats-ai
Join Our Discord channel, Learn AI Together: https://discord.gg/learnaitogether

Chapters
0:00 - Hey! Tap the Thumbs Up button and Subscribe. You'll learn a lot of cool stuff, I promise.
0:18 - The Convolutional Neural Networks
0:39 - A … convolution?
2:07 - Training a CNN
2:45 - The activation function: ReLU
3:20 - The pooling layers: Max-Pooling
4:40 - The state-of-the-art CNNs: A quick history
5:23 - The most promising CNN architecture: DenseNet
8:39 - Conclusion

Video Transcript

Facial recognition, targeted ads, image recognition, video analysis, anomaly detection: these are all powerful AI applications you must already have heard of at least once. But do you know what they all have in common? They all use the same type of neural network architecture: the convolutional neural network. CNNs are the most used type of neural network and the best for any computer vision application. Once you understand them, you are ready to dive into the field and become an expert.

Convolutional neural networks are a family of deep neural networks that mainly use convolutions to achieve the expected task. As the name says, convolution is the process where the original image
which is our input in a computer vision application, is convolved using filters that detect small but important features of an image, such as edges. The network autonomously learns the filters' values so that they detect the features needed to match the output we want, such as the name of the object in a specific image sent as input. These filters are basically squares of size 3x3 or 5x5, so they can detect the direction of an edge: left, right, up, or down, just like you can see in this image. The process of convolution takes a dot product between the filter and the pixels it faces, then shifts to the right and does it again, convolving the whole image. Once it's done, this gives us the output of the first convolution layer, which is called a feature map. Then we do the same thing with another filter, giving us many feature maps at the end, which are all sent into the next layer as input to produce again many other feature maps, until we reach the end of the network with extremely detailed, general information about what the image contains. There are many filters, and the numbers inside these filters are called the weights, which are the parameters trained during the training phase.

Of course, the network is not only composed of convolutions. In order to learn, we also need to add an activation function and a pooling layer between each convolution layer. Basically, these activation functions make possible the use of the backpropagation technique, which calculates the error between our guess and the real answer we were supposed to have, then propagates this error throughout the network, changing the weights of the filters based on this error. Once the propagated
error reaches the first layer, another example is fed to the network and the whole learning process is repeated, thus iteratively improving our algorithm.

The activation function is responsible for determining the output of each convolution computation and reducing the complexity of our network. The most popular activation function is called the ReLU function, which stands for Rectified Linear Unit. It sets to zero any negative results, which are known to be harmful to the network, and keeps positive values unchanged. Having all these zeros makes the network much more efficient to train in terms of computation time, since a multiplication with zero will always equal zero.

Then again, to simplify our network and reduce the number of parameters, we have the pooling layers. Typically, we use a two-by-two-pixel window and take the maximum value of this window to make the first pixel of our feature map. This is known as max pooling. We then repeat this process for the whole feature map, which reduces the x-y dimensions of the feature map, thus reducing the number of parameters in the network the deeper we get into it. This is all done while keeping the most important information.

These three layers — convolution, activation, and pooling — can be repeated multiple times in a network, forming what we call our conv layers, making the network deeper and deeper. Finally, there are the fully connected layers that learn a non-linear function from the last pooling layer's outputs. The network flattens the multi-dimensional volume that results from the pooling layers into a one-dimensional vector with the same total number of values. Then we use this vector in a small fully connected neural network
with one or more layers for image classification or other purposes, resulting in one output per image, such as the class of the object. Of course, this is the most basic form of convolutional neural networks.

There have been many different convolutional architectures since LeNet-5 by Yann LeCun in 1998, and more recently, with the progress of GPUs, the first deep learning network applied in the most popular object recognition competition: the AlexNet network in 2012. This competition is the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), where the best object detection algorithms competed every year on the biggest computer vision dataset ever created: ImageNet. The field exploded right after that year, with new architectures beating the previous one and always performing better, until today. Nowadays, most state-of-the-art architectures perform similarly and have some specific use cases where they are better. You can see here a quick comparison of the most used architectures in 2020. This is why I will only cover my favorite network in this video, which is the one that yields the best results in my research: DenseNet. It is also the most interesting and promising CNN architecture in my opinion. Please let me know in the comments if you would like me to cover any other type of network architecture.

The DenseNet family first appeared in 2016, in the paper called Densely Connected Convolutional Networks by Facebook AI Research. It is a family because it has many versions with different depths, ranging from 121 layers with 8 million parameters up to a version with 264 layers with 15.3 million parameters, which is smaller
than the 101-layer-deep ResNet architecture, as you can see here. The DenseNet architecture uses the same concepts of convolutions, pooling, and the ReLU activation function to work. The important detail and innovation in this network architecture are the dense blocks. Here is an example of a five-layer dense block. In these dense blocks, each layer takes all the preceding feature maps as input, thus helping the training process by alleviating the vanishing gradient problem. This vanishing gradient problem appears in really deep networks: they are so deep that when we backpropagate the error into the network, this error is reduced at every step and eventually becomes zero. These connections basically allow the error to be propagated further without being reduced too much. These connections also encourage feature reuse and reduce the number of parameters, for the same reason: since the network reuses previous feature maps' information instead of generating more parameters, it accesses the network's collective knowledge, and this reduction in total parameters reduces the chance of overfitting. As I said, this works extremely well, reducing the number of parameters by around 5 times compared to a state-of-the-art ResNet architecture with the same number of layers.

The original DenseNet family is composed of four dense blocks with transition layers, which also do convolution and pooling, and a final classification layer if we are working on an image classification task such as the ILSVRC competition. The size of the dense blocks is the only thing that changes across the versions of the DenseNet family, to make the network deeper. Of course, this
was just an introduction to convolutional neural networks and, more precisely, the DenseNet architecture. I strongly invite you to read further about these architectures if you want to make a well-thought-out choice for your application. The paper and GitHub links for DenseNet are in the description of the video. Please let me know if you would like me to cover any other architecture.

Please leave a like if you went this far in the video, and since over 90% of you watching are not subscribed yet, consider subscribing to the channel to not miss any further news, clearly explained. Thank you for watching!
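The convolution, ReLU, and max-pooling steps described in the transcript can be sketched in a few lines of NumPy. This is a minimal illustration, not the video's code; the image, the edge filter, and the function names are my own choices:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image, taking a dot product at each
    position (valid convolution, stride 1), producing one feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Zero out negatives, keep positive values unchanged."""
    return np.maximum(x, 0)

def max_pool(fmap, size=2):
    """Take the max of each non-overlapping size x size window,
    halving the x-y dimensions for size=2."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

# A 6x6 "image" with a vertical edge, and a 3x3 vertical-edge (Sobel-like) filter.
image = np.array([[10, 10, 10, 0, 0, 0]] * 6, dtype=float)
kernel = np.array([[1, 0, -1],
                   [2, 0, -2],
                   [1, 0, -1]], dtype=float)

fmap = max_pool(relu(conv2d(image, kernel)))  # conv -> ReLU -> pool
print(fmap.shape)  # -> (2, 2)
```

The filter responds strongly where the edge sits and not elsewhere, and pooling shrinks the 4x4 feature map to 2x2 while keeping the strongest responses, exactly the behavior the transcript walks through.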
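The dense-block idea, where each layer takes all preceding feature maps as input, comes down to concatenating along the channel axis before every layer. Here is a small NumPy sketch of just that wiring; the shapes, the growth rate, and the random 1x1 "layer" are hypothetical stand-ins for DenseNet's learned convolutions:

```python
import numpy as np

def layer(x, out_channels):
    """Stand-in for one conv layer: maps whatever channel depth it
    receives to `out_channels` feature maps (random weights, ReLU)."""
    rng = np.random.default_rng(0)
    w = rng.standard_normal((out_channels, x.shape[0]))
    # 1x1 "convolution": mix the input channels at every spatial position.
    return np.maximum(np.einsum('oc,chw->ohw', w, x), 0)

def dense_block(x, num_layers=5, growth_rate=32):
    """Each new layer sees the concatenation of the block input and every
    previous layer's output, so depth grows by `growth_rate` per layer."""
    features = [x]
    for _ in range(num_layers):
        out = layer(np.concatenate(features, axis=0), growth_rate)
        features.append(out)
    return np.concatenate(features, axis=0)

x = np.zeros((64, 8, 8))   # 64 input channels, 8x8 spatial resolution
y = dense_block(x)
print(y.shape)             # -> (64 + 5 * 32, 8, 8) = (224, 8, 8)
```

Because each layer only adds `growth_rate` new channels and reuses everything before it, the block accumulates features cheaply instead of re-learning them, which is the parameter saving the transcript attributes to these connections.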