Which universities are participating?
Our participants come from some of the top universities in India:
- BITS Pilani
- SRM University
- Delhi Technological University
- VIT Vellore
- Manipal University
- Galgotias University
- GITAM University
- KCT Coimbatore
- BIT Mesra
- Shiv Nadar University
- Bharati Vidyapeeth University
The module for Big Data and Hadoop is covered over 40 hours. It includes 30% theory sessions and 70% hands-on assignments and exercises.
- What is BigData?
- Challenges in BigData
- Challenges in traditional application
- Finding new requirements
- What is Hadoop?
- Features of Hadoop
- Hadoop v/s RDBMS
- Overview of HDP ecosystem
- Challenges in Hadoop
- Why passwordless SSH keys?
- Important configuration
- Formatting Namenode
- Starting and stopping the Hadoop process / daemons
- Overview of NameNode and ResourceManager Web UI
- Preparing machines
- Setting up Ambari server
- Creating one node HDP cluster - Adding hosts, Choosing services and Configuration
- Overview of NameNode and ResourceManager Web UI
- Overview of Configuration via Ambari
- Incompatible cluster IDs
- Understanding safemode concept
- What is HDFS
- Various Restrictions
- What is append only in HDFS?
- Navigating HDFS from command line and UI
- Understanding and creating home directory for a user
- put, copyFromLocal
- get, copyToLocal
- ls, ls -R
- chmod, chown
- Motivation of local repository set up
- Setting local HTTP server and hosting Ambari, HDP and HDP utils repository
- Installing Ambari Server and setting up
- How to add new host via Ambari and commission DN and NM
- How to decommission DN and NM and remove host via Ambari
- Types of Quotas
- Understanding the quota accounting on HDFS
- Overview of daemon process
- HDFS design considerations
- Roles and Responsibilities of - Name Node, Data Node, Secondary NameNode, ResourceManager, NodeManager
- How read (get) and write (put) operations happen on HDFS
- How block corruptions are handled in HDFS
- NameNode HA - Motivation, Role of JournalNodes, Role of ZooKeeper and ZKFC, Design considerations, Role of Secondary NameNode, Setting up NameNode HA via Ambari
- ResourceManager HA - Motivation, Setting up ResourceManager HA via Ambari
- Overview of various schedulers
- Understanding the capacity scheduler in detail
- Understanding how to achieve multi-tenancy via the capacity scheduler
- Configuring queues via Ambari
- Basics of MR job
- Understanding basic terminologies
- Execution of a job on the YARN framework
- Running sample WordCount job on Cluster
- Installing and configuring Hive Metastore and HiveServer2 via Ambari
- Writing queries to solve
- Introduction to Sqoop
- Introduction to Flume
- CAP theorem
- HBase architecture
- HBase daemons
- How to use the HBase shell to perform put, get and scan
- Adding Spark services via Ambari
- Overview of architecture
- Overview of RDD
- Administering Spark on YARN
- Introduction to Spark SQL and Spark Streaming
- Introduction to Spark
- Introduction to Spark SQL, Spark ML
- Introduction to RDD
- Introduction to Scala
- Installing Scala
- Programming with RDD
- Overview of MapReduce
- Understanding Mapper and Reducer
- Task attempts and speculative execution
- Hadoop data types
- How to write a Mapper and Reducer
- Writing and executing MR jobs
- Understanding Combiner
- Understanding the distributed cache and solving map-side join
- Understanding - CustomKeys, CustomValues, Custom partitioner, Sort Comparator, GroupComparator
- Secondary sorting and solving Reduce side join
- Assignment - TF-IDF, Movie Rating
The module for Big Data Analytics using Artificial Neural Networks (ANN) is covered over 60 hours. It includes 40% theory sessions and 60% hands-on assignments. Participants also complete a group project as part of this module. All participants are assessed and graded on their performance in coursework.
- What is Data Analytics?
- Types of Data Analytics
- Data in Data Analytics
- Decision Models
- Data Mining Process
- Data Visualization
- Data Querying
- Statistical Methods of Summarizing Data
- Exploring Data using Pivot Tables
- What is Descriptive Analysis?
- Populations and Samples
- Measures of Location
- Measures of Dispersion
- Measures of Shape
- Measures of Association
- Calculations, Expressions, Variables, Functions
- Vectors, Matrices, Data Frames, Lists
- Read/Write Data
- Logical and Loop Constructs
- Introduction to Regression
- Simple Linear Regression
- Multiple Linear Regression
- Coding Scheme for Categorical Variables
- Problems with Linear Regression
- Introduction to Classification
- Logistic Regression
- Support Vector Machines
- Resampling Methods
- Separating Hyperplane
- Maximal Margin Classifier
- Support Vector Classifier
- Support Vector Machine
- Introduction to Clustering
- K-means Clustering
- Hierarchical Clustering
- Other notes on Clustering
- The story of Artificial Intelligence (AI)
- Machine Learning Landscape
- Preparing your gear: Python, Matplotlib, Keras
- Practical Machine Learning and Datasets
- Quick Revision of Machine Learning Basics in Python
- Overview and Landscape of Neural Networks
- Architecture and Learning of Neural Networks: In-depth Theory and Practice
- Building Feedforward Neural Networks in Python
- Evaluating and Tuning Neural Networks: Cross-Validation, Activation Functions, Algorithm Variations
- Prediction, Reporting and Visualization
- Recurrent Neural Networks (RNN): Rationales and Architecture
- RNN vs ANN: Similarities and Differences
- RNN for Natural Language Processing (NLP)
- Evaluating and Tuning RNN
- Practical Deep Learning in Python
- Third-generation Neural Networks
- The Future of Neural Networks: Issues and Challenges
Corporate Gurukul is organizing the academic internship in association with professor(s) from Nanyang Technological University (NTU) or National University of Singapore (NUS) and professional(s) from Hewlett Packard Enterprise (HPE) Education.
- ‘Certificate of Participation’ by Hewlett Packard Enterprise (HPE) Education on ‘Big Data and Hadoop’
- Personalized ‘Letter of Evaluation’ from NTU or NUS professor(s) on performance in ‘Big Data Analytics’
- ‘Certificate of Participation’ for GAIP™ by Corporate Gurukul
It is an academic internship, which is a hands-on guided experience in an academic environment to hone industry-relevant skills and knowledge in a chosen domain with mentorship and training by university professor(s) and industry professional(s).
- Professors and participants discuss and decide the project based on research domains
- Participants have the option to work from their university in India or travel to Singapore for the project at their own expense
- Corporate Gurukul will not do any programme management for the projects
The hands-on sessions are conducted in the Lecture Theatre/Seminar Rooms. You will not visit any laboratories for the same unless requested by the professor(s).
Laptop with the specifications mentioned in ACADEMIC INTERNSHIP PREREQUISITES is MANDATORY for all participants.
30% of the jobs in the next 5 years will be in Big Data Analytics, Data Mining, Machine Learning, Deep Learning and Artificial Intelligence.
Data Science is gaining unprecedented traction in the job market, as Big Data, data analytics, data mining and machine learning become more relevant to the mainstream IT industry.
Google, Amazon, Facebook and Baidu are just some of the companies that have invested in data products such as self-driving cars, voice and image recognition, and wearable devices. Around the world, organizations are competing fiercely for the skilled data professionals available in the market. As a result, the financial packages for data science roles keep rising.
This has created a huge demand for skilled professionals in data-related jobs around the world. Job profiles such as Data Scientist, Data Analyst, Big Data Engineer and Statistician are highly sought after by companies. Not only are these roles handsomely paid, but a career in analytics also has much more to offer.
And, this is just the beginning!
No, admissions are on merit. You can explore admission requirements and financial aid by visiting the admissions department at NTU or NUS in your free time.
- Overall academic performance and CGPA till date
- GPA in Engineering Maths 1 and Engineering Maths 2
- Very good programming skills in R, Python and Java
- Working knowledge of Linux operating system
- Laptop with Win 7 and 16 GB RAM
- 500 GB hard disk space
- Intel Virtualization Technology (VT) should be enabled in the BIOS for Windows laptops
- VMware Workstation or the latest VMware Player installed (Windows), or VMware Fusion with a license (Mac)
No. The entire academic internship is delivered at NTU or NUS, Singapore.
You can visit our Facebook page to see the previous academic internship updates.
The academic internship schedule, date and time are subject to change based on the availability of professor(s) from NTU or NUS and professionals from HPE Education. Prior notice of any change of date will be given to all concerned parties, and Corporate Gurukul will make its best efforts to accommodate the convenience of all parties in the revised schedule, date and time.
You will stay in NTU or NUS hostels in Singapore. The exact hostel address and facilities will be shared with you during the academic internship orientation.
In case NTU or NUS hostels are not available, you will stay in hostels outside the campus. In that case, coach services will be provided for transfer from the hostel to NTU/NUS and back.
You will have to apply for the visa on your own. Please approach only authorized visa agents for this.
For the visa application, you will need:
- Passport: Original Passport with validity of minimum six months and minimum one blank page for visa stamp. Attach all your old passports (if any)
- Visa Application Form: Form 14A duly filled in and signed by the applicant, along with a clear photocopy of the form.
Please note: Light or poor-quality photocopies of the form are not accepted.
- Photo: Two recent passport size colored photographs with semi matt finish, 80% face coverage, white background and without border (Size: 35mm x 45mm).
Please note:
- Photographs should not be more than three months old, should not be scanned or stapled, and should not have been used for any previous visa
- The photo pasted on Form 14A should be cross-signed by the applicant
- Covering Letter: Covering letter from the applicant on business letterhead mentioning name, designation, passport number, and purpose and duration of visit in brief (to be provided by Corporate Gurukul)
- Ticket: Confirmed Air Ticket
*This is an indicative list; please confirm with your visa agent whether any other documents are required.
Your travel insurance policy should cover all charges incurred for medical expenses. Please read all scheme-related documents carefully for inclusions and exclusions. Corporate Gurukul will not be responsible for any personal expenses, whether or not they are covered by your insurance policy. Also, make sure you travel with enough money for emergencies.
Corporate Gurukul Programme Manager(s) typically stay in the same hostel as you. You can approach them for any help. Their details will be shared with you during orientation.
Vegetarian as well as non-vegetarian food is easily available inside and outside the university campus. There is a wide range of choices among Indian, Chinese, European, Malay, Thai, Vietnamese and Indonesian cuisines. Typically, three meals a day should cost you between S$18 and S$25. However, the estimate may vary with your dining choices.