Data Analysis for ML/AI

Text data analysis

Tokenization, vocabularies, n-grams, EDA on text features.

5 lessons · Follow in order

Lessons in sequence

Work through these in order; each lesson builds on the previous one.

  1. Lesson 1 of 5

    Tokenization fundamentals

    10 min read

  2. Lesson 2 of 5

    Text normalization: case, accents, unicode

    10 min read

  3. Lesson 3 of 5

    Stopwords, stemming, and lemmatization

    9 min read

  4. Lesson 4 of 5

    N-grams, bag-of-words, and TF-IDF

    10 min read

  5. Lesson 5 of 5

    Text data pitfalls: encodings and dirty inputs

    9 min read
