
A first course in machine learning for upper-level undergraduates and graduate students. Covers supervised learning (linear and logistic regression, support vector machines, decision trees, neural networks), unsupervised learning (clustering, dimensionality reduction), model evaluation, regularization, and the bias–variance trade-off.

Students gain hands-on experience implementing core algorithms in Python with NumPy, scikit-learn, and PyTorch, and learn to read and reproduce results from the modern ML literature.

Prerequisites: CSE 250 (or equivalent data structures), comfort with linear algebra (vectors, matrices, eigenvalues), basic probability, and multivariable calculus.

Instructor Information

Course Instructor: Jue Guo

  • Research Areas: optimization for machine learning, adversarial learning, continual learning, and graph learning
  • Interested in participating in our research? Reach out to me by email.

Course Outline and Logistics

Course Hours: CSE 474/574 LEC; Spring 2024 — University at Buffalo

Office Hours: posted on the LMS

Format: 3 lecture hours / week + weekly programming assignments + final project

Week | Topic (PRML chapter) | Deliverable
1 | Introduction · polynomial curve fitting · probability · decision theory · information theory (Ch 1) | HW0 (math diagnostic)
2 | Probability distributions: Bernoulli/Beta, Gaussian, exponential family (Ch 2) | HW1 released
3 | Linear models for regression · ridge · bias–variance · Bayesian LR (Ch 3) | HW1 due
4 | Linear models for classification · logistic regression · LDA / Naive Bayes (Ch 4) | HW2 released
5 | Neural networks · backpropagation · regularization (Ch 5) | HW2 due · project proposal
6 | Kernel methods · Gaussian processes (Ch 6) | HW3 released
7 | Sparse kernel machines · SVM · RVM (Ch 7) | HW3 due
8 | Midterm exam (Coverage: Chs 1–7) and catch-up | Midterm
- | Spring Recess (No Classes) | -
9 | Graphical models · d-separation · message passing (Ch 8) | HW4 released
10 | Mixture models · k-means · GMM · EM (Ch 9) | HW4 due · project milestone
11 | Approximate inference · variational methods · EP (Ch 10) | HW5 released
12 | Sampling methods · MCMC · HMC (Ch 11) | HW5 due
13 | Continuous latent variables · PCA · autoencoders (Ch 12) | -
14 | Sequential data · HMM · Kalman filter (Ch 13) | Project draft due
15 | Combining models · boosting · trees · MoE (Ch 14) | Project presentations
Finals | Final exam · cumulative, with emphasis on Chs 8–14 | Final · final project report

Note on Logistics

  • The midterm date will be announced one week in advance, based on the pace of the course.
  • The schedule is subject to change based on the overall pace and the performance of the class.

Grading

The course is assessed across five problem sets, a midterm exam, a final exam, and a research-style final project (proposal → milestone → draft → presentation → final report).

Component | Weight
Problem sets (5 × 8 %) | 40 %
Midterm exam | 15 %
Final exam | 20 %
Final project | 25 %
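
As a rough illustration of how the weights above combine, the following Python sketch computes an overall percentage from per-component scores. The function name `weighted_total`, the dictionary layout, and the example scores are hypothetical and chosen only for illustration. It assumes each component is scored out of 100 and that the five problem sets count equally; it does not account for late penalties or any adjustment by the instructor.

```python
# Weights copied from the Component/Weight table (they sum to 100 %).
WEIGHTS = {
    "problem_sets": 0.40,   # five problem sets at 8 % each
    "midterm": 0.15,
    "final_exam": 0.20,
    "final_project": 0.25,
}


def weighted_total(scores):
    """Return the overall percentage for per-component scores in [0, 100].

    scores["problem_sets"] is a list of five scores; the other entries
    are single numbers. This layout is an assumption for illustration.
    """
    problem_sets = scores["problem_sets"]
    if len(problem_sets) != 5:
        raise ValueError("expected exactly five problem-set scores")
    # Equal weighting of the five sets is the same as weighting their mean by 40 %.
    ps_average = sum(problem_sets) / len(problem_sets)
    return (
        WEIGHTS["problem_sets"] * ps_average
        + WEIGHTS["midterm"] * scores["midterm"]
        + WEIGHTS["final_exam"] * scores["final_exam"]
        + WEIGHTS["final_project"] * scores["final_project"]
    )


# Made-up scores, not real student data.
example = {
    "problem_sets": [90, 80, 100, 70, 85],
    "midterm": 80,
    "final_exam": 75,
    "final_project": 90,
}
print(f"{weighted_total(example):.1f}")  # prints 83.5
```

In this example the problem-set average is 85, so the total is 0.40 × 85 + 0.15 × 80 + 0.20 × 75 + 0.25 × 90 = 83.5.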

Grading Rubric

The final grade will be determined based on the overall performance of the class, taking into consideration all relevant assessments and contributions. The instructor reserves the right to make final decisions regarding grades.

Please note that excuses for missed work or poor performance, such as personal conflicts or minor inconveniences, will not be considered unless exceptional and documented circumstances arise.

Percentage | Letter Grade | Percentage | Letter Grade
95-100 | A | 70-74 | C+
90-94 | A- | 65-69 | C
85-89 | B+ | 60-64 | C-
80-84 | B | 55-59 | D
75-79 | B- | 0-54 | F
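
The percentage bands above can be read as a simple lookup, sketched below in Python. The name `letter_grade` is hypothetical. The table lists whole-number bands only, so this sketch assumes a fractional percentage falls into the band whose lower bound it meets (for example, 89.5 maps to B+) with no rounding. Final grades also depend on overall class performance and the instructor's judgment, which this lookup does not model.

```python
# Lower bound of each band from the grade table, highest first.
GRADE_FLOORS = [
    (95, "A"), (90, "A-"), (85, "B+"), (80, "B"), (75, "B-"),
    (70, "C+"), (65, "C"), (60, "C-"), (55, "D"), (0, "F"),
]


def letter_grade(percentage):
    """Map a percentage in [0, 100] to a letter using the table's bands.

    Assumption: no rounding; a score must reach a band's lower bound
    to earn that letter.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    for floor, letter in GRADE_FLOORS:
        if percentage >= floor:
            return letter


print(letter_grade(83.5))  # prints B
```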

Course Policies

Late work

  • Three late days total across all problem sets, no questions asked. Each late day extends a deadline by 24 hours.
  • After late days are exhausted, late submissions lose 20 % per day, up to 3 days; no credit thereafter.
  • Project deadlines (proposal, milestone, draft, final) are firm; late days do not apply.
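
The late-work rules for problem sets can be sketched as a small calculation. The function `apply_late_policy` is hypothetical, and several details below are assumptions not stated in the policy: a partial day counts as a full day, remaining late days are applied automatically before any penalty, the 20 % penalty is taken from full credit for each penalized day, and no late days are spent on a submission that ends up earning no credit. Check with the instructor for the authoritative interpretation.

```python
import math

PENALTY_PER_DAY = 20      # percentage points lost per penalized day
MAX_PENALIZED_DAYS = 3    # no credit beyond this many penalized days


def apply_late_policy(hours_late, late_days_left):
    """Return (credit_percent, late_days_used) for one problem set.

    hours_late: hours past the original deadline (0 or less means on time).
    late_days_left: unused late days out of the three granted per student.
    """
    if hours_late <= 0:
        return 100, 0
    # Assumption: any started 24-hour period counts as a whole day.
    days_late = math.ceil(hours_late / 24)
    used = min(days_late, late_days_left)
    penalized = days_late - used
    if penalized > MAX_PENALIZED_DAYS:
        # Assumption: do not spend late days when no credit is earned.
        return 0, 0
    return 100 - PENALTY_PER_DAY * penalized, used


print(apply_late_policy(30, 3))  # prints (100, 2)
print(apply_late_policy(30, 1))  # prints (80, 1)
```

In the second call the submission is 30 hours late, which counts as two days; one is covered by the remaining late day and the other costs 20 %.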

Collaboration

You are encouraged to discuss problem sets with classmates at the conceptual level, but code and writeups must be your own. Sketching ideas at a whiteboard or helping someone debug a runtime error is fine; sharing solutions is not. Pair work is allowed (and encouraged) for the final project; each member's contribution should be made explicit in the report.

AI / LLM policy

  • Allowed without disclosure: autocomplete-style help (syntax lookups, NumPy ↔ PyTorch conversions), debugging individual error messages, reading recommendations.
  • Allowed with disclosure: explanations of mathematical concepts you're stuck on, suggested approaches after you've thought about it. Disclosure means a one-line note in your submission.
  • Not allowed: generating problem-set solutions or large blocks of project code by prompting an LLM. Submitting LLM-generated content without disclosure is an academic-integrity violation.

When in doubt, disclose. Disclosure costs you nothing; non-disclosure can cost you the course.

Lecture Notes

These notes follow Christopher Bishop's Pattern Recognition and Machine Learning (Springer 2006, freely available as a PDF) chapter-by-chapter. The aim is a careful, step-by-step development of the math: every symbol is defined, every transition is justified, and prerequisites are flagged where needed. Notes assume comfort with introductory linear algebra and calculus; the rest is built up as we go.

Part Notes
Foundations
Linear models
Neural networks & kernels
Probabilistic models & inference
Latent variables & sequential data
Ensembles