
Machine Learning [CSE 4020]

Digital Assignment – I
Course Faculty: Manoov.R
SLOT – E2
15BCE0163
INAMDAR CHAITANYA RAVINDRA

Q-1. Write a critical review (1 - 3 page max.) of any highly cited research paper
in the field of ‘Machine Learning’ for the past two years [2016 – 2017]

CRITICAL REVIEW OF
Deep Residual Learning for Image Recognition
PUBLISHED: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016, pp. 770-778, DOI: 10.1109/CVPR.2016.90
AUTHORS: Kaiming He (Microsoft), Xiangyu Zhang (Xi'an Jiaotong University), Shaoqing Ren (University of Science and Technology of China), Jian Sun (Microsoft)

Introduction

This paper by Kaiming He and colleagues is a very good residual network paper, which gives an idea of techniques for training convolutional neural networks with up to 152 trainable layers. The model won the ILSVRC 2015 classification task and several other competitions. Concepts such as vanishing gradients are, in my view, important for understanding this paper. One of the essential take-aways, however, is that keeping gradients from vanishing does not by itself make it practical to find optimal solutions for very deep models. The authors also want us to note that their modification does not increase the number of parameters. The paper gives us three take-away models that are highly useful residual networks.
Motivation
While increasing the depth of traditional (plain) networks, the authors noticed that accuracy first improves, then saturates, and then falls off suddenly if the depth keeps increasing. Rather than changing the convolution layers themselves, they add identity mappings as shortcut connections that skip over groups of convolution layers. I think this allows the network to learn solutions that are at least as good as a shallower network's, because the extra layers only have to model a residual on top of the identity, and this helps make the solution more precise. The resulting network is about as simple to train as the original plain network. Since classical networks that are very deep show worsening training and validation errors, the idea that these errors can be improved by adding identity shortcuts around the convolution layers (a minimal sketch of such a block follows below) was, in my opinion, the basic motivation behind this paper.
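
To make the shortcut idea concrete, below is a minimal sketch of a basic residual block written in Python with PyTorch. This is an illustrative reconstruction rather than the authors' released code: the BasicResidualBlock name, the 64-channel example, and the exact placement of batch normalization are assumptions made for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicResidualBlock(nn.Module):
    # Two 3x3 convolutions with an identity shortcut: y = F(x) + x (sketch, not the paper's code).
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))  # first conv + batch norm + ReLU
        out = self.bn2(self.conv2(out))        # second conv + batch norm
        out = out + x                          # identity shortcut: element-wise addition, no extra parameters
        return F.relu(out)                     # ReLU applied after the addition

# Example: a 64-channel feature map keeps its shape after passing through the block.
x = torch.randn(1, 64, 56, 56)
y = BasicResidualBlock(64)(x)
print(y.shape)  # torch.Size([1, 64, 56, 56])

Because the shortcut is a plain identity addition, this block has exactly the same parameter count as two stacked convolution layers without the shortcut, which matches the paper's point that the modification does not add parameters.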

References

While following up on this paper I also found some additional papers with different but related concepts. Chu, Yang, and Tadinada (2017) deals with techniques for visualizing residual networks. Zagoruyko and Komodakis (2016) deals with making residual networks wider by giving the layers more channels. Huang et al. (2016) gives some different ideas about residual networks, connecting each convolution layer to the other layers inside a dense block. Some of these references appear in the literature survey of the paper under review, which makes it a well-researched and well-referenced paper.

Summary
In this paper the authors explain the residual network: what its implementation looks like, and why it can seem unintuitive at first. They also detail the important role it now plays in major deep learning frameworks. The paper gives a brief account of stacks of consecutive convolution layers in two forms, 1) without and 2) with shortcut connections, where the in and out blocks are the input and output of the stack respectively. The residual network adds a shortcut connection, and an element-wise addition combines the shortcut with the output of the stacked layers.
The authors again stress that this modification does not increase the number of parameters. Moreover, a trained residual model can in principle be converted into an equivalent plain convolutional network, and vice versa.
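
For reference, the residual formulation the paper builds on can be restated briefly (using the paper's own notation):

    y = F(x, {W_i}) + x          (identity shortcut, when input and output dimensions match)
    y = F(x, {W_i}) + W_s x      (projection shortcut W_s, used only when dimensions differ)

Here F(x, {W_i}) is the residual mapping learned by the stacked convolution layers, so the block only has to learn the difference between the desired mapping and the identity; if the identity mapping is already near-optimal, F can simply be driven towards zero.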

Analysis
You may at first think that the deeper model is simply over-fitting, but as you follow the experiments you will find that this is not true: in plain networks the training error itself increases with depth, and the validation error increases as well. I am not saying that we should replace the concepts in the paper. You may also hope that the shortcut connections simply provide a path that keeps the gradients from going low, and I suspect this explanation is incomplete. The authors themselves mention that batch normalization already keeps the gradients from vanishing, and they show that plain networks without shortcuts still behave worse as depth grows. If you go into the details you will find some such subtleties, which I noticed while reading.
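
To spell out the intuition being questioned here: for a block y = x + F(x), the chain rule gives

    dL/dx = dL/dy + dL/dy * dF/dx

so the identity term passes the upstream gradient through unattenuated no matter how small dF/dx becomes. This is why shortcuts are often credited with easing gradient flow, even though the authors argue that the degradation of plain networks is not primarily a vanishing-gradient problem, since batch normalization already keeps the gradients healthy.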
This paper is a little tricky in some concepts if we go through the details step by step, but it is highly informative and understandable. It is also well written and holds the reader's interest, and the topic is thoroughly researched by the authors.

Conclusion

This paper clarifies ideas, such as vanishing gradients, that are highly useful. One of its important conclusions is that preventing gradients from vanishing does not by itself make it practical to find precise solutions for models that are very deep. The paper gives us three take-away models that are highly useful residual networks. The residual network adds a shortcut connection, and an element-wise addition combines the shortcut with the stacked layers' output. The paper also concludes that we can improve the training and validation errors by supplementing the convolution layers with identity shortcuts. It is directed at the proper audience and meets all its purposes; it is well researched and draws appropriate conclusions.
