Data Science

Data Science Course

Data Science
  • Introduction to data science
    • What is data science?
    • Introduction to the analytics life cycle
    • Different types of analysis
  • R Programming Basics
    • Why R
    • Introduction to R and CRAN
    • Nuts and Bolts of R language
    • Advanced Features in R
  • Data Harmonization
    • ETL in Data Science world
    • Concepts of Tidy Data
    • Reading Tweets
    • Reshaping
    • Working with dates
  • Data Exploring
    • Exploratory Data Analysis
    • Plotting systems such as base graphics and ggplot2
  • Research Presentation
    • Literate Programming
    • R-markdown and R-pubs
    • Publishing documents on GitHub
  • Inferential Statistics
    • Probability and expected values
    • Various Frequency Distributions
    • Confidence Intervals
    • Hypothesis testing
  • Regression Analysis
    • Regression definition
    • Residual variance
    • Automatic feature selection
  • Machine learning techniques
    • Supervised and unsupervised learning methods
    • Classification and clustering
    • Time series forecasting
    • Model Ensemble
  • Interactive Graphics
    • Shiny
    • Slidify
    • GoogleVis
  • Natural Language Processing (NLP)
    • Basic building blocks of Python
    • NLTK
    • Stop words
    • Stemming
    • Chunking
    • N-grams
    • Performing a Classification
  • Machine Learning with Python
    • Python with Scikit-learn package
    • Implementing regression, decision trees, and clustering in Python
  • Spark Basics
    • Apache Spark
    • RDDs
    • Spark Transformation & Action
  • Machine Learning with PySpark
    • Introduction to PySpark
    • MLlib
    • Applying machine learning to Big Data
Level 1: Foundation
  • Introduction to Course
    • Overview of course
    • Types of Analysis – Descriptive, Predictive, Prescriptive
  • R Programming Basics
    • Introduction to R and CRAN
    • Introduction to interface, CLI, Data types
    • Vectors, Lists, Factors, Matrices, Data-Frames
    • File IO (Flat files, Excel), subsetting
    • Control Statements
    • Creating functions
  • Data Harmonization
    • Raw and Tidy Data – Nature of Data
  • Data Exploration
    • Base: plot(), hist(), boxplot(), barplot(), par()
  • Inferential Statistics
    • Summary Measures: Central Tendency, Dispersion, Chebyshev’s Theorem
    • Probability: Addition and Multiplication Rules, Independence, Definitions of pmf and pdf
    • Expected Values
  • Regression Analysis
    • Pearson’s Correlation Coefficient, simple LR, and least squares
  • Machine Learning Techniques
    • Types of ML algorithms, Prediction
    • Types of Errors, Sensitivity, Specificity, Receiver Operating Characteristic (ROC), caret package
    • Bayes Theorem, Naïve Bayes, KNN
    • Explanation of classification trees, regression trees, packages: rpart, party
    • Clustering – K-Means, Hierarchical, Dendrograms
Level 2: Graduate
  • R Programming Basics
    • Apply family (lapply, mapply, tapply, apply)
  • Data Harmonization
    • Raw and Tidy Data – Nature of Data
    • Data Table
    • merge(), sqldf, reshape2: melt(), dcast(), rbind(), cbind()
    • Date, POSIXlt, POSIXct, format()
  • Data Exploration
    • ggplot2 in detail: data frames, aesthetics, facets, stats, scales
    • Working with Colors
  • Research Presentation
    • R-Markdown and Knitr
    • R-Pubs
  • Inferential Statistics
    • Variability and Standard Error
    • Introduction to Normal, Binomial, Poisson
    • Law of Large Numbers, Central Limit Theorem
  • Regression Analysis
    • Pearson’s Correlation Coefficient, simple LR, and least squares
    • Regression with Factor Variable
    • Verifying Assumptions
    • MLR: Backward, Forward, Both
  • Machine Learning Techniques
    • Introduction to time series, moving averages (MA), exponential smoothing, deseasonalizing, detrending, Holt-Winters
Level 3: Advanced
  • Data Harmonization
    • XML, JSON
    • Reading MySQL, Web, Twitter
    • dplyr, tidyr
    • Regular Expressions
    • lubridate, zoo
  • Inferential Statistics
    • Confidence Intervals – t-tests, ANOVA for means
    • Hypothesis Testing, Rejection region, P-Values
    • Power
    • Bootstrapping
  • Regression Analysis
    • Logistic Regression
    • Linear Discriminant Analysis
    • ANOVA for Regression
  • Machine Learning Techniques
    • Cross-Validation
    • Preprocessing – Principal Component Analysis, SVD
    • Classification – Bagging, Random Forests (RF), Boosting, SVM, Neural Nets, packages: randomForest, xgboost, h2o, nnet
    • Model Ensembling
Level 4: Expert
  • Research Presentation
    • Git, Git Bash, GitHub, Creating Repos, Basic Commands, Forking, Cloning
    • Creating gh-pages
  • Machine Learning Techniques
    • Regularization – Ridge Regression, Lasso, Dimension Reduction
    • Nonlinear – Polynomial Regression, Splines, Local Regression, GAMs
    • AP Cluster, K-Medoids
    • Recommender Systems
    • Time Series Forecasting – ARIMA Models, acf(), pacf(), arima(), auto.arima()
  • Interactive Graphics
    • Shiny
    • Manipulate
    • GoogleVis
    • Slidify
