ONLINE TRAINING MODULES

The Carolina Health Informatics Program (CHIP) has developed a set of online training modules, called An Introduction to Data Science through a health care lens, to introduce learners to the field of data science. The modules are open to anyone who is interested and require no prior training or knowledge in data science. If you complete the entire set of modules (the entire “short course”) and pass a simple final assessment, you will receive a certificate of completion.

Introduction to Data Science Curriculum

Text Mining

Data Mining

Module 1: Text Preprocessing - adrirome

Text preprocessing is an important step in natural language processing (NLP). It transforms text into a more digestible form so that machine learning algorithms can perform better. This module teaches a range of text preprocessing techniques.
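As a rough illustration of what "a more digestible form" can mean, the sketch below lowercases a sentence, strips punctuation, splits it into tokens, and drops stop words. It is not taken from the module itself: the function name `preprocess`, the tiny stop-word list, and the example sentence are all assumptions made for this example, and real projects typically use a fuller stop-word list from an NLP library.

```python
import string

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "and", "the", "of", "with", "was", "is", "to", "in"}


def preprocess(text):
    """Lowercase, remove ASCII punctuation, split on whitespace, drop stop words."""
    lowered = text.lower()
    # str.translate with a deletion table removes every punctuation character.
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    tokens = no_punct.split()
    return [tok for tok in tokens if tok not in STOP_WORDS]


# Made-up example sentence.
tokens = preprocess("The patient was diagnosed with Type 2 diabetes.")
print(tokens)  # ['patient', 'diagnosed', 'type', '2', 'diabetes']
```

Each step discards information (case, punctuation, common words), so which steps are appropriate depends on the downstream task.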

Module 1: Text Preprocessing - farooqs

Module 1: Text Preprocessing - fennal

Module 1: Text Preprocessing - josethom

Module 1: Text Preprocessing - shoknw17

Module 2: Exploratory Analysis of Text Data - adrirome

Exploratory analysis is an initial approach to analyzing datasets. It commonly involves summarizing the main characteristics of the datasets, often with data visualizations. This module will teach you how to perform exploratory analysis for text data. Note: If you encounter an error in the optional section, please copy and paste the code below into the code cell with the error.

Module 2: Exploratory Analysis of Text Data - farooqs

Module 2: Exploratory Analysis of Text Data - fennal

Module 2: Exploratory Analysis of Text Data - josethom

Module 2: Exploratory Analysis of Text Data - shoknw17

Module 3: Information Extraction - adrirome

Text data is often rich in both information and meaning. However, text data is also often complex, which can make analysis difficult. This module will introduce you to part-of-speech tagging, named entity recognition, and relation extraction. These techniques will allow you to both understand the structure of your textual data and derive meaning from it.

Module 3: Information Extraction - farooqs

Module 3: Information Extraction - fennal

Module 3: Information Extraction - josethom

Module 3: Information Extraction - shoknw17

Module 4: Feature Representation for Text - adrirome

Feature representation is a way to present your data so that a computer can process it and perform an analysis. This module investigates feature representation for text data. You will also generate different types of feature representations and compare how well they perform.
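To make the idea concrete, here is a minimal sketch of two simple feature representations for tokenized text: bag-of-words counts and binary presence vectors. The module may well use other representations or a library. The function names and the two toy documents are assumptions chosen for this example.

```python
from collections import Counter


def build_vocabulary(documents):
    """Return a sorted list of every distinct token across tokenized documents."""
    return sorted({tok for doc in documents for tok in doc})


def bag_of_words(doc, vocabulary):
    """Count how often each vocabulary term occurs in one document."""
    counts = Counter(doc)
    return [counts[term] for term in vocabulary]


def binary_presence(doc, vocabulary):
    """Mark each vocabulary term as present (1) or absent (0) in one document."""
    present = set(doc)
    return [1 if term in present else 0 for term in vocabulary]


# Made-up, already-tokenized documents.
docs = [["chest", "pain", "pain"], ["no", "chest", "pain"]]
vocab = build_vocabulary(docs)

count_vectors = [bag_of_words(d, vocab) for d in docs]
binary_vectors = [binary_presence(d, vocab) for d in docs]

print(vocab)           # ['chest', 'no', 'pain']
print(count_vectors)   # [[1, 0, 2], [1, 1, 1]]
print(binary_vectors)  # [[1, 0, 1], [1, 1, 1]]
```

The two representations differ only where a term repeats: the count vector keeps the repetition of "pain" in the first document, while the binary vector does not.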

Module 4: Feature Representation for Text - farooqs

Module 4: Feature Representation for Text - fennal

Module 4: Feature Representation for Text - josethom

Module 4: Feature Representation for Text - shoknw17

Module 5: Predictive Analysis of Text Data - adrirome

One of the most powerful uses of data is making predictions. This module explores how to use text data to make predictions. Specifically, you will learn about two common machine learning algorithms: logistic regression and k-nearest neighbors.
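As one possible illustration of the second algorithm named above, the sketch below implements a bare-bones k-nearest neighbors classifier over word-count vectors, using Euclidean distance and a majority vote. It is a teaching sketch under stated assumptions, not the module's own code: the function names, the four-term vocabulary, the labels, and the training vectors are all invented for this example. In practice a library implementation would normally be used.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_vectors, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest neighbors."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )
    nearest_labels = [label for _, label in ranked[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: word counts for the terms
# ["cough", "fever", "fracture", "pain"] in four made-up notes.
train_vectors = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
]
train_labels = ["respiratory", "respiratory", "injury", "injury"]

prediction = knn_predict(train_vectors, train_labels, [1, 1, 0, 0], k=3)
print(prediction)  # respiratory
```

The choice of k matters: with k equal to the size of the training set, every query would simply receive the most common training label.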

Module 5: Predictive Analysis of Text Data - farooqs

Module 5: Predictive Analysis of Text Data - fennal

Module 5: Predictive Analysis of Text Data - josethom

Module 5: Predictive Analysis of Text Data - shoknw17

Module 1: Preparing Data - adrirome