GAIP GLOBAL ACADEMIC INTERNSHIP PROGRAMME



What is the Global Academic Internship Programme (GAIP™)?

The Global Academic Internship Programme (GAIP™) is a 3-week short-term academic internship programme held in Singapore. It enables engineering undergraduates from universities and institutes of higher learning in Asia to pursue their passion and interest through internship, project work and research. Participants of this programme will work with industry professionals from Hewlett Packard Enterprise (HPE) Education, Asia Pacific and faculty from Nanyang Technological University (NTU), Singapore or National University of Singapore (NUS).

Who conducts GAIP™?

The Global Academic Internship Programme (GAIP™) is conducted by Corporate Gurukul, Singapore and delivered with professionals from Hewlett Packard Enterprise (HPE) Education and faculty from Nanyang Technological University (NTU), Singapore or National University of Singapore (NUS).

What are the modules covered?

The internship covers 2 modules on Big Data in a span of 3 weeks:

  1. Big Data and Hadoop – by Hewlett Packard Enterprise Education
  2. Big Data Analytics with Artificial Neural Networks – by faculty from NTU or NUS

Why pursue an internship in Big Data Analytics?

Data is everywhere. The amount of digital data that exists is growing at a rapid rate: more than 2.7 zettabytes of data exist in today’s digital universe, and that is projected to grow to 180 zettabytes in 2025.

All this data—from your photos to the Fortune 500’s financials—has only recently begun to be analyzed to tease out insights that can help organizations improve their business. That’s why more organizations are seeking professionals who can make sense of all the data.

The concept of big data has been around for years; most organizations now understand that if they capture all the data that streams into their businesses, they can apply analytics and get significant value from it. Even in the 1950s, decades before anyone uttered the term “big data,” businesses were using basic analytics (essentially numbers in a spreadsheet that were manually examined) to uncover insights and trends.

The new benefits that big data analytics brings to the table, however, are speed and efficiency. Whereas a few years ago a business would have gathered information, run analytics and unearthed information that could be used for future decisions, today that business can identify insights for immediate decisions.

It is easy enough to become a data scientist. Once you get the art of data analysis right, it is just a matter of practicing your newly-found skills well enough to become proficient.

This course is for those new to data science and interested in understanding why Big Data is important and relevant in today’s world. It is for those who want to become conversant with the core concepts behind big data problems, applications, and systems. It is for those who want to start thinking about how Big Data might be useful in their business, career or research.

The hands-on internship introduces you to one of the most common frameworks, Hadoop, that has made big data analysis easier and more accessible — increasing the potential for data to transform our world! It then moves to Data Visualization, Classification and Clustering before exploring learning with Artificial Neural Networks.

Faculty Team

GAIP December 2017
 

GAIP Interns

BITS Pilani
 

When is Summer GAIP™ conducted?

2nd June to 23rd June, 2018

Which completion documents do GAIP™ participants receive?

  • ‘Certificate of Participation’ by Hewlett Packard Enterprise (HPE) Education on ‘Big Data and Hadoop’
  • Personalized ‘Letter of Evaluation’ from NTU or NUS professor(s) on performance in ‘Big Data Analytics using Artificial Neural Networks’
  • ‘Certificate of Participation’ for GAIP™ by Corporate Gurukul

CG Certificate

 

HPE Education Certificate

 

What are the pedagogy and assessment criteria?

  • Classroom Training: Lectures on theory and core concepts
  • In-Class assignment: Hands-on work on core concepts
  • Quiz: The continuous assessment includes quizzes for every session
  • Project Work: Participants work in groups on a project
  • Group Presentation: All participants present and are assessed on the project

Classroom Training Session

 

Assessment in Progress

 

Who should enroll?

The academic internship is designed for 2nd and 3rd year engineering undergraduates (any stream) who intend to pursue a career in Data Science with focus on Big Data Analytics using Machine Learning, Deep Learning and Neural Networks.

Selection Criteria

  • Overall academic performance and CGPA till date
  • GPA in Engineering Maths 1 and Engineering Maths 2
  • Very good programming skills in R, Python and Java
  • Working knowledge of Linux operating system

What are my chances of selection?

We typically select a cohort of 80-85 participants from around 3000 applications, so the chance of selection is roughly 1 in 35.

Which universities are participating?

Our participants come from some of the top universities of India:

  • BITS Pilani
  • SRM University
  • Delhi Technological University
  • VIT Vellore
  • Manipal University
  • Galgotias University
  • GITAM University
  • KCT Coimbatore
  • BIT Mesra
  • Shiv Nadar University
  • Bharati Vidyapeeth University

The module for Big Data and Hadoop is covered over 40 hours. It includes 30% theory sessions and 70% hands-on assignments and exercises.

Introduction to Big Data and Hadoop
  • What is Big Data?
  • Challenges in Big Data
  • Challenges in traditional application
  • Finding new requirements
  • What is Hadoop?
  • Features of Hadoop
  • Hadoop vs. RDBMS
  • Overview of HDP ecosystem
  • Challenges in Hadoop
Pseudo mode / Single Node set up of Apache Hadoop (2.7.0)
  • Prerequisites
  • Why passwordless SSH keys?
  • Important configuration
  • Formatting Namenode
  • Starting and stopping the Hadoop process / daemons
  • Overview of NameNode and ResourceManager Web UI
Setting Ambari (2.1.0)
  • Preparing machines
  • Setting up Ambari server
  • Creating one node HDP cluster - Adding hosts, Choosing services and Configuration
  • Overview of NameNode and ResourceManager Web UI
  • Overview of Configuration via Ambari
Important Configuration
  • defaultFS
  • block.size
  • replication
  • datanode.data.dir
  • namenode.name.dir
  • Incompatible cluster IDs
  • Understanding safemode concept
HDFS Shell commands
  • What is HDFS
  • Various Restrictions
  • What is append only in HDFS?
  • Navigating HDFS from command line and UI
  • Understanding and creating home directory for a user
  • mkdir
  • put, copyFromLocal
  • get, copyToLocal
  • ls, ls -R
  • cat
  • fsck
  • chmod, chown
Local Repository Setup
  • Motivation of local repository set up
  • Setting local HTTP server and hosting Ambari, HDP and HDP utils repository
  • Installing Ambari Server and setting up
Commissioning and Decommissioning
  • How to add new host via Ambari and commission DN and NM
  • How to decommission DN and NM and remove host via Ambari
HDFS Quotas
  • Types of Quotas
  • Understanding the quota accounting on HDFS
Hadoop Architecture
  • Overview of daemon process
  • HDFS design considerations
  • Roles and Responsibilities of - Name Node, Data Node, Secondary NameNode, ResourceManager, NodeManager
  • How read (get) and write (put) operations happen on HDFS
  • How block corruptions are handled in HDFS
High Availability
  • NameNode - Motivation, Role of journal nodes, Role of Zookeeper and ZKFC, Design consideration, Role of Secondary NameNode, Setting NameNodeHA via Ambari
  • ResourceManager HA - Motivation, Setting Resource Manager HA via Ambari
Hadoop Scheduler
  • Overview of various schedulers
  • Understanding the capacity scheduler in detail
  • Understanding how to achieve multi tenancy via capacity scheduler
  • Configuring queues via Ambari
MapReduce programming model
  • Basics of MR job
  • Understanding basic terminologies
  • Execution of a job on the YARN framework
  • Running sample WordCount job on Cluster
Introduction to Hive
  • Motivation
  • Installing and configuring Hive metastore and hive server2 via Ambari
  • Writing queries to solve
Data Ingestion Mechanism
  • Introduction to Sqoop
  • Introduction to Flume
Introduction to HBase
  • Motivation
  • CAP theorem
  • HBase architecture
  • HBase daemons
  • How to use the HBase shell to perform put, get and scan
Spark Administration
  • Adding Spark services via Ambari
  • Overview of architecture
  • Overview of RDD
  • Administering Spark on YARN
  • Introduction to Spark SQL and Spark Streaming
Spark Programming
  • Introduction to spark
  • Introduction to Spark SQL, Spark ML
  • Introduction to RDD
  • Introduction to Scala
  • Installing Scala
  • Programming with RDD
MapReduce Programming
  • Overview of MapReduce
  • Understanding Mapper and Reducer
  • Task attempts and speculative execution
  • Hadoop data types
  • How to write a Mapper and Reducer
  • Writing and executing MR jobs
  • Understanding Combiner
  • Understanding Distributed cache and solving MapSide join
  • Understanding - CustomKeys, CustomValues, Custom partitioner, Sort Comparator, GroupComparator
  • Secondary sorting and solving Reduce side join
  • Assignment - TF-IDF, Movie Rating
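
To give a feel for the Mapper and Reducer topics listed above, here is a minimal sketch of the WordCount idea in plain Python. It is an in-memory illustration only, not the Hadoop Java API used in the module: the function names `mapper`, `shuffle`, `reducer` and `word_count` are hypothetical, and the grouping step that the framework performs between map and reduce is imitated with a dictionary.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one line of input."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key (imitates the framework)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(word, counts):
    """Reduce step: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce in memory over an iterable of text lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data and hadoop", "big data analytics"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run as separate tasks on different nodes, and the framework handles the grouping, sorting and data movement between them.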

 

Big Data and Hadoop Training

HPE Education

 


The module for Big Data Analytics using Artificial Neural Networks (ANN) is covered over 60 hours. It includes 40% theory sessions and 60% hands-on assignments. Participants also complete a group project as part of this module. All participants are assessed and graded on their performance in coursework.

Data Analytics
  • What is Data Analytics?
  • Types of Data Analytics
  • Data in Data Analytics
  • Decision Models
  • Data Mining Process
Visualizing and Exploring Data
  • Data Visualization
  • Data Querying
  • Statistical Methods of Summarizing Data
  • Exploring Data using Pivot Tables
Descriptive Statistical Measures
  • What is Descriptive Analysis?
  • Populations and Samples
  • Measures of Location
  • Measures of Dispersion
  • Measures of Shape
  • Measures of Association
Introduction to R
  • Calculations, Expressions, Variables, Functions
  • Vectors, Matrices, Data Frames, Lists
  • Read/Write Data
  • Logical and Loop Constructs
  • Graphics
Regression
  • Introduction to Regression
  • Simple Linear Regression
  • Multiple Linear Regression
  • Coding Scheme for Categorical Variables
  • Problems with Linear Regression
Classification I
  • Introduction to Classification
  • Logistic Regression
  • Support Vector Machines
  • Resampling Methods
Classification II (Support Vector Machine)
  • Separating Hyperplane
  • Maximal Margin Classifier
  • Support Vector Classifier
  • Support Vector Machine
Clustering
  • Introduction to Clustering
  • K-means Clustering
  • Hierarchical Clustering
  • Other notes on Clustering
Introduction to Machine Learning
  • The story of Artificial Intelligence (AI)
  • Machine Learning Landscape
  • Preparing your gears: Python, Matplotlib, Keras
  • Practical Machine Learning and Datasets
  • Quick Revision of Machine Learning Basics in Python
Essence of Artificial Neural Networks (ANN)
  • Overview and Landscape of Neural Networks
  • Architecture and Learning of Neural Networks: In-depth Theory and Practice
  • Building Feedforward Neural Networks in Python
  • Evaluating and Tuning Neural Networks: Cross-Validation, Activation Functions, Algorithm Variations
  • Prediction, Reporting and Visualization
Deepening Neural Networks and Advanced Learning Algorithms
  • Recurrent Neural Networks (RNN): Rationales and Architecture
  • RNN vs ANN: Similarities and Differences
  • RNN for Natural Language Processing (NLP)
  • Evaluating and Tuning RNN
  • Practical Deep Learning in Python
Evolving Artificial Neural Networks
  • Third-generation Neural Networks
  • The Future of Neural Networks: Issues and Challenges
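
As a small taste of the feedforward network topic listed above, the sketch below runs a forward pass through a tiny two-layer network in plain Python. It is illustrative only: the module itself lists Keras, whereas this example uses only the standard library, and the weights are hand-picked assumptions (not trained) chosen so that the network reproduces XOR. The names `dense` and `predict` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic activation function, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """One fully connected layer: sigmoid(w . x + b) for each neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-picked (assumed, not learned) parameters:
# hidden neuron 1 behaves like OR, hidden neuron 2 like NAND,
# and the output neuron combines them like AND, which yields XOR.
HIDDEN_W = [[20.0, 20.0], [-20.0, -20.0]]
HIDDEN_B = [-10.0, 30.0]
OUTPUT_W = [[20.0, 20.0]]
OUTPUT_B = [-30.0]


def predict(x1, x2):
    """Forward pass: input -> hidden layer -> single output in (0, 1)."""
    hidden = dense([x1, x2], HIDDEN_W, HIDDEN_B)
    return dense(hidden, OUTPUT_W, OUTPUT_B)[0]


if __name__ == "__main__":
    for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(a, b, round(predict(a, b)))
```

In practice the weights are not set by hand but learned from data, which is what the sessions on learning, evaluating and tuning neural networks cover.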

 

GAIP Interns

SRM University

 

Orientation

National Institute of Education

 

ACADEMIC

Who is organizing the academic internship?

Corporate Gurukul is organizing the academic internship in association with professor(s) from Nanyang Technological University (NTU) or National University of Singapore (NUS) and professional(s) from Hewlett Packard Enterprise (HPE) Education.

Which certificate(s) and letter(s) will I get after completion of this academic internship? Is there any prerequisite for obtaining the certificate(s) and letter(s)?

  • ‘Certificate of Participation’ by Hewlett Packard Enterprise (HPE) Education on ‘Big Data and Hadoop’
  • Personalized ‘Letter of Evaluation’ from NTU or NUS professor(s) on performance in ‘Big Data Analytics’
  • ‘Certificate of Participation’ for GAIP™ by Corporate Gurukul

Is it an academic internship or a training programme?

It is an academic internship, which is a hands-on guided experience in an academic environment to hone industry-relevant skills and knowledge in a chosen domain with mentorship and training by university professor(s) and industry professional(s).

How do I go about the 6-month projects with NTU or NUS professor(s) after completing the academic internship?

  • Professors and participants discuss and decide the project based on research domains
  • Participants can work from their university in India or travel to Singapore for the project at their own expense
  • Corporate Gurukul will not do any programme management for the projects

Where are the hands-on sessions conducted?

The hands-on sessions are conducted in the Lecture Theatre/Seminar Rooms. You will not visit any laboratories for the same unless requested by the professor(s).

Laptop with the specifications mentioned in ACADEMIC INTERNSHIP PREREQUISITES is MANDATORY for all participants.

How will GAIP™ benefit me in placements?

30% of the jobs in the next 5 years will be in Big Data Analytics, Data Mining, Machine Learning, Deep Learning and Artificial Intelligence.

Data Science is gaining unprecedented traction in the job market, as Big Data, data analytics, data mining and machine learning become more relevant to the mainstream IT industry.

Google, Amazon, Facebook, Baidu are just some of the companies which have made investments in data products such as self-driving cars, voice / image recognition, wearable devices, etc. Around the world, organizations are fiercely fighting with each other for skilled data professionals available in the market. As a result, the financial packages for different data science roles are consistently going into overdrive.

This has created a huge demand for skilled professionals in data-related jobs around the world. Companies are actively hunting for Data Scientists, Data Analysts, Big Data Engineers and Statisticians. Not only are these roles handsomely paid, but a career in analytics has much more to promise.

And, this is just the beginning!

Can you help me if I want to pursue Masters at NTU or NUS?

No, admissions are on merit. You can explore admission requirements and financial aid by visiting the admissions department at NTU or NUS in your free time.

Are there any prerequisites for the academic internship?

Qualifying Criteria

  1. Overall academic performance and CGPA till date
  2. GPA in Engineering Maths 1 and Engineering Maths 2
  3. Very good programming skills in R, Python and Java
  4. Working knowledge of Linux operating system

Hardware/Software Prerequisites

  1. Laptop with Windows 7 and 16 GB RAM
  2. 500 GB hard disk space
  3. Intel Virtualization Technology (VT) should be enabled in the BIOS for Windows laptops
  4. VMware Workstation or the latest VMware Player installed (Windows), or VMware Fusion with a license (Mac)

Do we visit the HP office for the HPE Education part of the curriculum?

No. The entire academic internship is delivered at NTU or NUS, Singapore.

 

NON-ACADEMIC

Can I see any previous academic internship rollouts that you have done?

You can visit our Facebook page to see the previous academic internship updates.

Are the dates confirmed for the academic internship?

The academic internship schedule, date and time are subject to change based on the availability of professor(s) from NTU or NUS and professionals from HPE Education. Prior notice will be given to all concerned parties on the change of date and best efforts will be made by Corporate Gurukul to accommodate convenience of all parties on revised schedule, date and time.

Where do we stay in Singapore?

You will stay at NTU or NUS, Singapore. The exact hostel address and facilities will be shared with you during the academic internship orientation.

If NTU or NUS hostels are not available, you will stay in hostels outside campus. In that case, coach services will be provided for transfer from the hostel to NTU/NUS and back.

How do I get the visa?

You will have to apply for the visa on your own. Please approach only authorized visa agents.

For the visa application, you will need:

  1. Passport: Original passport with a validity of at least six months and at least one blank page for the visa stamp. Attach all your old passports (if any).
  2. Visa Application Form: Form 14A duly filled and signed by the applicant, along with a clear photocopy of the form.
    Please note: Light or poor-quality photocopies of the form are not accepted.
  3. Photo: Two recent passport-size colour photographs with semi-matt finish, 80% face coverage, white background and no border (Size: 35mm x 45mm).
    Please note:

    • Photographs should not be more than three months old, should not be scanned or stapled, and should not have been used for any previous visa
    • The photo pasted on Form 14A should be cross-signed by the applicant
  4. Covering Letter: Covering letter from the applicant on business letterhead mentioning name, designation, passport number, purpose and duration of visit in brief (to be provided by Corporate Gurukul)
  5. Ticket: Confirmed air ticket

*This is an indicative list; please confirm with your visa agent whether any other documents are required.

What if I fall sick or meet with an accident?

Your travel insurance policy should cover all charges incurred for medical expenses. Please read all scheme-related documents carefully for inclusions and exclusions. Corporate Gurukul will not be responsible for any personal expenses, whether or not they are covered by your insurance policy. Also, make sure you travel with enough money for emergencies.

Who do I contact during an emergency in Singapore?

Corporate Gurukul Programme Manager(s) typically stay in the same hostel as you. You can approach them for any help. Their details will be shared with you during orientation.

What kind of food do I get in Singapore?

Vegetarian as well as non-vegetarian food is easily available inside and outside the university campus. There is a wide range of choices among Indian, Chinese, European, Malay, Thai, Vietnamese and Indonesian cuisines. Typically, 3 meals a day should cost you between S$18 and S$25. However, the estimate may vary with your dining choices.

 

Training Infrastructure at NTU

 


 

Sports Facilities at NTU

 

Orientation

Facilities