Sneak Peek into Deep Learning for Beginners

October 16th, 2020

Do you want a head start in a career in data analytics and deep learning?

Then you have come to the right place. This blog is about how I kickstarted my career in deep learning and built a strong foundation, and about my experience in the Global Academic Internship Program (GAIP), where I competed with senior-year students from top universities across India under the mentorship of world-class faculty and managed to earn an excellent grade with a steep learning curve.

Kickstarting into AI as a Newbie:

Almost every one of us has been introduced to this field of automation through jargon like Artificial Intelligence (AI), Machine Learning (ML), Deep Learning (DL), Data Science/Analytics, and so on, which seemed like magic at first sight. I'm no stranger to it; my baby steps came when I was selected for a national-level conference at IIT Madras in January 2019.

I had to present my solution to their problem statements, alongside students selected from various regional and state rounds. I was solving a problem in the agricultural domain, and every solution I could come up with without AI fell miles short of what AI could do. Fascinated by this, I presented a solution built around an AI use case and won the conference with AI as my magic wand. That's when I started diving into this field to demystify the magic behind the art of data science.

I chose to pursue a CSE degree with a specialization in data science over a management program at IIM-Rohtak, and started attending various workshops, online webinars, and so on.

Having explored the topics of data science, I was amazed and intrigued by buzzwords like Deep Learning, Data Modelling, Natural Language Processing, and Computer Vision. That's when I came across the Global Academic Internship Program (GAIP) by Corporate Gurukul (CG), held in partnership with the National University of Singapore (NUS) and Hewlett Packard Enterprise (HPE). GAIP generally admits students in their senior years of college, but being a fresher, I thought of giving it a shot anyway.

“If you don't try, the chances of you being successful are zero.”

As a beginner I only had basic knowledge of Python and statistics, so I decided to work on them extensively before the admission test. I focused on basic data structures, methods, and common probability distributions, which I'll discuss in detail in a later part. And you know how it paid off: astonishingly, I got the offer letter.

[Tip] That's when I realized that all we need to do is practice and work hard, as every expert in any field was once a beginner.

Building a strong foundation in Data Science:

Once I got admitted, it suddenly seemed too early to start such advanced coursework, which is normally scheduled for the senior year of a university degree. To everyone out there who wants to start in a new career or domain: it's never too early or too late to start as long as you have a strong motive to follow, a clear goal to achieve, and the will to work hard. Because:

“You either start it one day or on day one.”

Being determined, I decided to start from day one. The backbone of the course 'Data Analytics using Deep Learning' rests on the topics below:

Programming in Python:

Note: The coursework I did in Machine Learning & Deep Learning was based on Python. There are alternatives like R and MATLAB, but Python is the most popular choice.

Some key areas to concentrate on are listed here (a short sketch follows the list):

  • Data types and their usage in Python
  • Method definition in Python and its usage
  • Arithmetic operators and number systems
  • Typecasting in Python
  • Bit manipulation in Python
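
To give you a flavour, here's a tiny sketch of my own (not from the coursework) that touches on data types, a method definition, typecasting, arithmetic, and a bit of bit manipulation:

```python
# Minimal illustration of the Python basics listed above (my own toy example).

def normalise_scores(scores):
    """Method definition: scale a list of numbers to the 0-1 range."""
    low, high = min(scores), max(scores)
    return [(s - low) / (high - low) for s in scores]

marks = [35, 60, 90, 72]                 # list of ints (data types)
print(normalise_scores(marks))           # [0.0, 0.45..., 1.0, 0.67...]

count = int("42")                        # typecasting: str -> int
ratio = float(count) / 7                 # int -> float, arithmetic operators
print(count, round(ratio, 2))

flags = 0b1010                           # binary literal (number systems)
print(flags | 0b0001, flags >> 1)        # bit manipulation: OR, right shift
```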

Additionally, problem-solving and standard coding questions are something students are familiar with throughout academia, but writing code in a refined, readable manner is very important for project work, especially code that goes to production. When you collaborate on a project, your team members need to be comfortable with your code at first sight; this helps them improve it and greatly increases the efficiency of your project work.

Libraries for Data Analysis:

Though this isn't a strict prerequisite, being familiar with the standard data analysis libraries is a great help. Some of the notable ones are:

  • NumPy
  • pandas
  • Matplotlib
  • scikit-learn (helpful for ML concepts)

I made sure I played around with some of these libraries to learn their syntax and use cases and to get an overview of them. Trust me, spending quality time with the documentation of these libraries is worthwhile; it's the hard but best way to learn them.
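
For example, here's a small end-to-end sketch of my own (assuming the four libraries above are installed) that generates synthetic data, summarises it with pandas, fits a line with scikit-learn, and plots it with Matplotlib:

```python
# A small taste of the four libraries; the data here is synthetic,
# just for illustration.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
hours = rng.uniform(0, 10, size=50)               # NumPy: random array
scores = 5 * hours + rng.normal(0, 3, size=50)    # noisy linear relation

df = pd.DataFrame({"hours": hours, "score": scores})   # pandas: tabular view
print(df.describe())                                    # quick summary stats

model = LinearRegression().fit(df[["hours"]], df["score"])   # scikit-learn
print("estimated slope:", round(model.coef_[0], 2))          # should be near 5

plt.scatter(df["hours"], df["score"])                   # Matplotlib: plot
plt.plot(df["hours"], model.predict(df[["hours"]]), color="red")
plt.xlabel("hours studied")
plt.ylabel("score")
plt.show()
```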

[Tip] Skipping medicines and mathematics is injurious to your health and to learning new technologies. So never ignore the maths on top of which AI, Machine Learning, and Deep Learning are built.


Despite the fact that it seems boring and hard (except for math enthusiasts), it is crucial for understanding the concepts behind Machine Learning and Deep Learning at greater depth.

The most important concepts to focus on are listed below.

Make sure you are at least familiar with computing derivatives, graph transformations, and matrices, and that you understand vectorization. Cheat sheets and refresher or revision lectures can get you through the concepts needed to understand Deep Learning, but that said, subject expertise is advisable.

  • Calculus: Theoretically speaking, the level of calculus necessary is the same as what we study in an undergraduate STEM degree, but it is important to know how to apply it and to grasp the intuition behind it for a deeper understanding of the concepts. For instance, gradient descent, the basic optimisation routine behind backpropagation, is based on partial derivatives (see the sketch after this list).
  • Linear Algebra: This subject plays an indispensable role in the advancements of Machine Learning and Deep Learning. Vectorization is one of the key linear algebra concepts that lets models work in multiple dimensions; almost every equation implemented in a Deep Learning algorithm is vectorised. Even the fundamental tensors used in Deep Learning have their roots in this subject.
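
To make both bullets concrete, here's a small sketch of my own (not from the course material) of vectorised gradient descent for linear regression: the update rule follows the partial derivatives of the squared-error loss, and NumPy matrix operations replace explicit loops.

```python
# Vectorised gradient descent for linear regression (illustrative sketch).
# Loss: L(w, b) = mean((X @ w + b - y) ** 2); we follow its partial derivatives.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                 # 200 samples, 3 features
true_w, true_b = np.array([2.0, -1.0, 0.5]), 4.0
y = X @ true_w + true_b + rng.normal(0, 0.1, size=200)

w, b = np.zeros(3), 0.0
lr = 0.1                                      # learning rate
for step in range(500):
    err = X @ w + b - y                       # vectorised predictions minus targets
    grad_w = 2 * X.T @ err / len(y)           # partial derivative of L w.r.t. w
    grad_b = 2 * err.mean()                   # partial derivative of L w.r.t. b
    w -= lr * grad_w                          # gradient-descent update
    b -= lr * grad_b

print(np.round(w, 2), round(b, 2))            # should approach [2. -1. 0.5] and 4.0
```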

 

Probability and Statistics:

It's obviously math too, but it deserves a separate mention. If you can't do statistics properly, then you most probably can't do Machine Learning or Deep Learning properly either. These two subjects are the most necessary and important fundamentals required to excel in this program. Even though most Machine Learning workflows use bootstrapping to estimate statistics, you must understand some key concepts like the ones below (a short example follows the list):

  • Hypothesis tests in statistics
  • Probability & Bayesian inference
  • Measures of central tendency and dispersion
  • Frequently used probability distributions
  • Activation functions
  • Types of dependence between numerical variables (e.g. correlation)
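
Here's a toy example of my own (assuming NumPy and SciPy are installed) that touches a few of these: central tendency and dispersion, a hypothesis test, and correlation between numerical variables.

```python
# A quick tour of a few of the statistics ideas above (illustrative only).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.normal(loc=70, scale=8, size=40)   # samples from a normal distribution
group_b = rng.normal(loc=75, scale=8, size=40)

# Central tendency and dispersion
print("mean A:", round(group_a.mean(), 1), "std A:", round(group_a.std(ddof=1), 1))

# Hypothesis test: do the two groups share the same mean?
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print("t =", round(t_stat, 2), "p =", round(p_value, 4))  # small p => reject H0

# Dependence between numerical variables: Pearson correlation
x = rng.uniform(0, 10, size=100)
noisy_y = 3 * x + rng.normal(0, 2, size=100)
print("correlation:", round(np.corrcoef(x, noisy_y)[0, 1], 2))  # close to 1
```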

 

[Tip] YouTube channels like 3Blue1Brown (the Essence of Calculus and Essence of Linear Algebra series) and Khan Academy (especially the Precalculus, Probability, and Statistics playlists) have made math intuitive, so you can take your medicine with ease.

Version Control System:

In the capstone project, we had to merge our parts of the code with our teammates' so that we could improve, say, the data visualization or the Deep Learning model as a team. Using a version control system like Git (together with a hosting platform like GitHub) effectively is very important in any collaborative project. Ironically, it's the easiest of these skills to learn, but it's the one I regret not picking up properly. With the world moving towards open source, it comes in handy not just for Data Science but for all software development projects.


Reality Check: With all that said, if I or anyone else claims to have mastered all of this in one month, it is completely false. All I did was get familiar with these topics by playing around with them, so that I could understand Machine Learning and Deep Learning concepts from scratch at greater depth. Consistency is key, and there's no substitute for practice and no shortcut to becoming a so-called Data Scientist in months.

Demystifying Deep Learning:

“Do the best you can until you know better. Then when you know better, do better.”

With our fundamentals sorted, let's now dive into the ocean of Deep Learning. Before diving deep into the ocean, one has to know how to swim in the pool of Machine Learning, which extends statistical ideas into algorithms that help us solve real-world problems.

It comes in three types, Supervised, Unsupervised, and Reinforcement Learning (with semi-supervised learning sitting between the first two), and they are best mastered in that order. The best way to understand these concepts, from Linear Regression to Support Vector Machines, is to implement them in code and understand their workflow, as in the sketch below.
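
For example, here's a minimal supervised-learning sketch of my own using scikit-learn's bundled Iris dataset: fit a Support Vector Machine on labelled examples and check its accuracy on held-out data.

```python
# Supervised learning in a few lines: fit an SVM classifier on the Iris
# dataset and evaluate it on a held-out test split (illustrative example).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = SVC(kernel="rbf")          # Support Vector Machine with an RBF kernel
clf.fit(X_train, y_train)        # supervised: learn from labelled examples
print("test accuracy:", clf.score(X_test, y_test))
```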

While learning these algorithms, you will come to embrace the beauty of applied statistics. Reinforcement learning is one of the advanced techniques applied by leading companies for automation, e.g. self-driving cars by Tesla.

With the exponential increase in computational power available to us, the tech world is increasingly moving towards deep learning, an advanced subset of machine learning. Take the recent (as of July 2020) release of GPT-3, a language model and one of the largest neural networks built so far, with about 175 billion parameters, which generates strikingly human-like text. So if you feel bored studying outdated concepts as part of your academics, deep learning is the right thing to get your hands dirty with.


Learning Deep Learning is a bottom-up approach: you first understand what a single node of a neural network (NN), the perceptron, does; then you build shallow two-layer neural networks while understanding how they work both intuitively and mathematically; and finally you learn all the technical jargon like optimiser, parameters, hyperparameters, epochs, etc., and build deep learning models for image recognition and image classification (mostly of cats, as they're so cute, aren't they?).
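
Here's my own from-scratch sketch (not the GAIP course code) of such a two-layer network, trained on the classic XOR problem with plain NumPy, just to show what the jargon means in practice.

```python
# A two-layer neural network on the XOR problem, written from scratch with
# NumPy. It shows parameters, an activation function, epochs, a learning
# rate, and the forward/backward passes of gradient descent.
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)               # XOR targets

# Parameters: weights and biases of a 2 -> 4 -> 1 network
W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))

def sigmoid(z):                      # activation function
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5                             # learning rate (a hyperparameter)
for epoch in range(10000):           # one epoch = one pass over the data
    # Forward pass
    h = sigmoid(X @ W1 + b1)         # hidden-layer activations
    y_hat = sigmoid(h @ W2 + b2)     # network output

    # Backward pass: gradients of the squared-error loss
    d_out = (y_hat - y) * y_hat * (1 - y_hat)
    d_hid = (d_out @ W2.T) * h * (1 - h)

    # Gradient-descent update of every parameter
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_hid
    b1 -= lr * d_hid.sum(axis=0, keepdims=True)

print(np.round(y_hat, 2))            # should approach [[0], [1], [1], [0]]
```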

From this initial phase to the point where you start loving Transfer Learning and even building advanced Generative Adversarial Networks (GANs), where your model creates its own data, DL is a vast subject, and claiming mastery of it takes years of experience.

But don't worry: since the technologies in this domain are recent advancements, nobody can have years of experience that date back before their creation.

[Image: Tweet from Sebastián Ramírez]

Oops, sorry, I take back my words after seeing the tweet above from Sebastián Ramírez, the creator of FastAPI.

The message here is that no matter what phase you are in, beginner or expert, you always have something new to learn in this field. So start wherever you are, make use of the resources available, and do your level best.

“Learning Data Science is like going to the gym: you earn your muscles (skills) only if you work out day in and day out, and irregularity takes you back to day zero.”

Data Science, Machine Learning, and Deep Learning are not subjects to be studied but tools to be applied in real time to solve problems across domains like healthcare, finance, marketing, policy and decision-making, and more. So I would love to mention some key points that will help you navigate your data science journey:

  • Every concept you study was once a research paper; reading research papers and writing literature reviews is a key skill required to progress to advanced concepts. Websites like Papers with Code help you implement and understand a concept in depth. Once you understand how the research workflow works, who knows, you might create the next disruptive research paper!
  • Courses give you theoretical knowledge, but projects give you both practical and theoretical knowledge. Having already called data science a tool, practical knowledge is needed to use the tool efficiently rather than just knowing facts about it. So always build quality projects to illustrate your learning.
  • Open-source your projects. Even the best Deep Learning models need to be iterated on; similarly, there's always room for improvement in any work you do, and open-sourcing it enables anyone interested to work on it. Write Medium blogs about your work and give back to the community by teaching what you learn, as the best way to learn something is to do it practically and teach it intuitively.
  • Always connect and network with passionate and enthusiastic people who add value to the community on professional platforms like LinkedIn, GitHub, Stack Overflow, etc.
  • Form or be part of communities or groups where you can talk about Data Science, Machine Learning, and Deep Learning. Collaborate on projects where you can add value, and learn even from your competitors. But NEVER copy code from GitHub/Stack Overflow without understanding its workflow completely. As Jack Ma, the founder of Alibaba, put it: “You should learn from your competitor, but never copy. Copy and you die.”
  • Always focus on the problem you want to solve, not the fancy tech stack or the exquisite dataset to be used. The importance of a project lies in how much it contributes to solving a real-world problem, not in the amount of complexity you deal with.
  • Working only on predefined projects and coursework leads to a gap between academia and industry. For instance, overfitting your learning to clean, existing datasets will hinder your ability to solve a new industry problem that requires a new dataset to be scraped or mined and then wrangled. Working on projects aligned with industry and having internship experience helps bridge that gap and makes you industry-ready.
 

“Start where you are, use what you have, and do what you can.” Good luck on the exciting journey ahead!

ABOUT THE AUTHOR

Sanjay Sivaraman

Sanjay is pursuing his specialisation in data science from Vellore Institute of Technology, Vellore. He is also currently a Data Science Intern at DAV Data Solutions and a Technical Success intern at The Climber.

He's a 3x ideathon winner and someone who's passionate about data science and applying it towards the United Nations Sustainable Development Goals for 2030. Sanjay is an alumnus of Corporate Gurukul. For any sort of assistance or queries, he can be reached at datadrivensanjay@gmail.com.
