ENABLE TRAINING MODULES

The Carolina Health Informatics Program (CHIP) has developed a few online training modules, called An Introduction to Data Science through a health care lens, to introduce learners to the field of data science. The modules are open to anyone who is interested and require no prior training or knowledge in data science. If you complete the entire set of modules (the entire “short course”) and pass a simple final assessment, you will receive a certificate of completion.
Introduction to Data Science Curriculum
Text Mining
Data Mining
Text Preprocessing is an important step for natural language processing (NLP). It transforms text into a more digestible form so that machine learning algorithms can perform better. This module will teach various text preprocessing techniques.
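As a rough illustration of what text preprocessing can involve, the sketch below lowercases a sentence, strips punctuation, splits it into tokens, and drops a few stop words. It is written in Python using only the standard library, which is an assumption about tooling; the `preprocess` function and the tiny `STOP_WORDS` set are made up for this example and are not taken from the module itself.

```python
import re

# A tiny, hypothetical stop-word list; real projects use much longer ones.
STOP_WORDS = {"the", "a", "an", "of", "and", "was", "with"}

def preprocess(text):
    """Lowercase, remove punctuation, tokenize, and drop stop words."""
    lowered = text.lower()
    # Replace anything that is not a lowercase letter, digit, or whitespace.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", lowered)
    tokens = cleaned.split()
    return [tok for tok in tokens if tok not in STOP_WORDS]

print(preprocess("The patient was admitted with a high fever and a cough."))
# prints ['patient', 'admitted', 'high', 'fever', 'cough']
```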
Exploratory analysis is an initial approach to analyzing data sets. It commonly involves summarizing the main characteristics of a data set and creating data visualizations. This module will teach you how to perform exploratory analysis for text data.
Note: If you encounter an error in the optional section, please copy and paste the below code into the code cell with the error.
Text data is often rich with both information and meaning. However, text data is also often complex, which can make analysis difficult. This module will introduce you to part-of-speech tagging, named entity recognition, and relation extraction. These techniques will allow you to both understand the structure of your textual data and derive meaning from it.
Feature representation is a way to present your data so a machine or computer can understand it and perform an analysis. This module will investigate feature representation for text data. You will also explore generating different types of feature representations and comparing how well they perform.
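One common feature representation for text is a bag of words, in which each document becomes a vector of word counts over a shared vocabulary. The sketch below is a minimal standard-library Python version, offered as an assumption about what such a representation might look like; the `bag_of_words` helper and the sample documents are hypothetical and are not drawn from the module.

```python
from collections import Counter

def bag_of_words(documents):
    """Return (vocabulary, vectors) with one count vector per document."""
    tokenized = [doc.lower().split() for doc in documents]
    # The vocabulary is every distinct token, sorted for a stable column order.
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors

docs = ["chest pain and cough", "cough and fever and cough"]
vocab, vecs = bag_of_words(docs)
print(vocab)  # prints ['and', 'chest', 'cough', 'fever', 'pain']
print(vecs)   # prints [[1, 1, 1, 0, 1], [2, 0, 2, 1, 0]]
```

Other representations, such as TF-IDF weights, start from the same counts but rescale them, which is one reason comparing representations can be worthwhile.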
One of the most powerful uses of data is using it to make future predictions. In this module, we will be exploring how to use text data to perform predictions. Specifically, you will learn about two common machine learning algorithms, logistic regression and k-nearest neighbor.
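To give a feel for k-nearest neighbor, the sketch below classifies a new feature vector by majority vote among its k closest training examples, measured by Euclidean distance. It is a minimal standard-library Python illustration under assumed inputs: the `knn_predict` function, the two-dimensional vectors, and the labels are all hypothetical, and the module itself may use a different library or setup.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote of its k nearest points."""
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance only, so the closest training examples come first.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical word-count vectors for four labeled notes.
points = [[2, 0], [3, 1], [0, 2], [1, 3]]
labels = ["cardiac", "cardiac", "respiratory", "respiratory"]

print(knn_predict(points, labels, [2, 1]))  # prints cardiac
print(knn_predict(points, labels, [0, 3]))  # prints respiratory
```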
Preparing data is an important step in any data mining project. In this module you will learn how to upload a CSV file and how to deal with missing or improbable data.
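The sketch below shows one possible way to read CSV data and flag missing or improbable values, using only Python's standard library. Everything in it is an assumption made for illustration: the column names, the sample rows, the `load_and_clean` helper, and the age cutoff of 120 are hypothetical, and the module may use different tools and rules.

```python
import csv
import io

# Hypothetical data standing in for an uploaded CSV file.
RAW_CSV = """patient_id,age,heart_rate
1,54,72
2,,88
3,430,65
4,37,
"""

def load_and_clean(csv_text, max_age=120):
    """Parse CSV text, turning missing or improbable values into None."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        # Empty strings mark missing values.
        age = int(record["age"]) if record["age"] else None
        if age is not None and not (0 <= age <= max_age):
            age = None  # treat an improbable age as missing
        heart_rate = int(record["heart_rate"]) if record["heart_rate"] else None
        rows.append({
            "patient_id": int(record["patient_id"]),
            "age": age,
            "heart_rate": heart_rate,
        })
    return rows

for row in load_and_clean(RAW_CSV):
    print(row)
```

Replacing bad values with `None` keeps every record while making the gaps explicit; whether to drop, keep, or impute such records afterwards depends on the analysis.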
Univariate analysis allows you to deeply analyze a single variable. This module will teach you the skills to perform univariate analysis including variable types, summary statistics, and univariate data visualization. Along the way, you’ll learn by analyzing specific variables from real patient data!
Bivariate analysis is a statistical method that helps us see how our variables relate to one another. In this module, you’ll learn different bivariate analysis techniques and how to apply those techniques in R.
Feature selection is the process of selecting a subset of variables for the purpose of building a machine learning model. Reducing the number of features can improve model performance, make models easier to understand, and reduce the time required to run a model. In this module you will learn filter, wrapper, and embedded feature selection methods.
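As an example of the filter family, the sketch below scores each feature by the absolute Pearson correlation between its values and the target, then keeps the top k. It is a minimal standard-library Python illustration; the `pearson` and `select_top_k` helpers, the feature names, and the numbers are hypothetical, and correlation is only one of several scores a filter method could use.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def select_top_k(features, target, k):
    """Filter method: rank features by |correlation with target|, keep k."""
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]

# Hypothetical features for five patients and a binary outcome.
features = {
    "age": [30, 45, 60, 75, 80],
    "lactate": [1.0, 1.2, 2.5, 3.9, 4.1],
    "bed_number": [3, 3, 3, 3, 3],  # constant, so it carries no signal
}
target = [0, 0, 1, 1, 1]

print(select_top_k(features, target, k=2))
```

Filter methods like this score features without training a model, whereas wrapper and embedded methods involve the model itself in the selection.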
Predictive analysis is a powerful tool which allows us to make future predictions from data. This module will pull together the previous four data mining modules to teach advanced techniques such as machine learning, logistic regression, and decision trees. Along the way, you’ll learn by predicting mortality from real ICU patient data!