CSE 474/574 Introduction to Machine Learning
A first course in machine learning for upper-level undergraduates and graduate students. Covers supervised learning (linear and logistic regression, support vector machines, decision trees, neural networks), unsupervised learning (clustering, dimensionality reduction), model evaluation, regularization, and the bias–variance trade-off.
Students gain hands-on experience implementing core algorithms in Python with NumPy, scikit-learn, and PyTorch, and learn to read and reproduce results from the modern ML literature.
Prerequisites: CSE 250 (or an equivalent data structures course), comfort with linear algebra (vectors, matrices, eigenvalues), basic probability, and multivariable calculus.
Instructor Information
Course Instructor: Jue Guo
- Research Areas: optimization for machine learning, adversarial learning, continual learning, and graph learning
- Interested in participating in our research? Reach out to me by email.
Course Outline and Logistics
Course Hours: CSE 474/574 LEC; Spring 2024 — University at Buffalo
Office Hours: posted on the LMS
Format: 3 lecture hours / week + weekly programming assignments + final project
| Week | Topic (PRML chapter) | Deliverable |
|---|---|---|
| 1 | Introduction · polynomial curve fitting · probability · decision theory · information theory (Ch 1) | HW0 (math diagnostic) |
| 2 | Probability distributions — Bernoulli/Beta, Gaussian, exponential family (Ch 2) | HW1 released |
| 3 | Linear models for regression · ridge · bias–variance · Bayesian LR (Ch 3) | HW1 due |
| 4 | Linear models for classification · logistic regression · LDA / Naive Bayes (Ch 4) | HW2 released |
| 5 | Neural networks · backpropagation · regularization (Ch 5) | HW2 due · project proposal |
| 6 | Kernel methods · Gaussian processes (Ch 6) | HW3 released |
| 7 | Sparse kernel machines · SVM · RVM (Ch 7) | HW3 due |
| 8 | Midterm exam (Coverage: Chs 1–7) and Catch-up | Midterm |
| — | Spring Recess (No Classes) | — |
| 9 | Graphical models · d-separation · message passing (Ch 8) | HW4 released |
| 10 | Mixture models · k-means · GMM · EM (Ch 9) | HW4 due · project milestone |
| 11 | Approximate inference · variational methods · EP (Ch 10) | HW5 released |
| 12 | Sampling methods · MCMC · HMC (Ch 11) | HW5 due |
| 13 | Continuous latent variables · PCA · autoencoders (Ch 12) | — |
| 14 | Sequential data · HMM · Kalman filter (Ch 13) | Project draft due |
| 15 | Combining models · boosting · trees · MoE (Ch 14) | Project presentations |
| Finals | Final exam · cumulative, with emphasis on Chs 8–14 | Final · final project report |
Note on Logistics
- The midterm will be announced one week ahead; its exact date depends on the pace of the course.
- The schedule is subject to change based on the overall pace and the performance of the class.
Grading
The course is assessed across five problem sets, a midterm exam, a final exam, and a research-style final project (proposal → milestone → draft → presentation → final report).
| Component | Weight |
|---|---|
| Problem sets (5 × 8 %) | 40 % |
| Midterm exam | 15 % |
| Final exam | 20 % |
| Final project | 25 % |
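As a rough illustration of how the weights in the table combine, here is a minimal sketch in Python. The function name `course_percentage` and the assumption that every component is scored on a 0-100 scale are ours, not part of the official grading procedure; the final letter grade also depends on overall class performance.

```python
# Weights copied from the grading table (fractions of the final grade).
WEIGHTS = {
    "problem_sets": 0.40,   # five sets, 8 % each
    "midterm": 0.15,
    "final_exam": 0.20,
    "final_project": 0.25,
}


def course_percentage(problem_sets, midterm, final_exam, final_project):
    """Return the weighted course percentage on a 0-100 scale.

    Assumption: every score is already expressed on a 0-100 scale,
    and `problem_sets` holds exactly five scores.
    """
    if len(problem_sets) != 5:
        raise ValueError("expected exactly five problem-set scores")
    # Equal weight per problem set, so the plain average is enough.
    ps_average = sum(problem_sets) / len(problem_sets)
    return (
        WEIGHTS["problem_sets"] * ps_average
        + WEIGHTS["midterm"] * midterm
        + WEIGHTS["final_exam"] * final_exam
        + WEIGHTS["final_project"] * final_project
    )
```

For example, problem-set scores of 90, 80, 100, 70, and 60 (average 80) with a midterm of 70, a final exam of 85, and a project score of 90 give 32 + 10.5 + 17 + 22.5 = 82 percent.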
Grading Rubric
The final grade will be determined based on the overall performance of the class, taking into consideration all relevant assessments and contributions. The instructor reserves the right to make final decisions regarding grades.
Please note that excuses for missed work or poor performance, such as personal conflicts or minor inconveniences, will not be considered unless exceptional and documented circumstances arise.
| Percentage | Letter Grade | Percentage | Letter Grade |
|---|---|---|---|
| 95-100 | A | 70-74 | C+ |
| 90-94 | A- | 65-69 | C |
| 85-89 | B+ | 60-64 | C- |
| 80-84 | B | 55-59 | D |
| 75-79 | B- | 0-54 | F |
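The table above can be read as a simple threshold lookup. The sketch below is only an illustration: the function name `letter_grade` is hypothetical, and it assumes that a fractional percentage falls into the band whose lower bound it has reached (so 94.5 maps to A-). Actual cutoffs may shift with overall class performance, as stated above.

```python
# Lower bound of each band, highest first, taken from the table.
CUTOFFS = [
    (95, "A"), (90, "A-"), (85, "B+"), (80, "B"), (75, "B-"),
    (70, "C+"), (65, "C"), (60, "C-"), (55, "D"),
]


def letter_grade(percentage):
    """Map a course percentage (0-100) to a letter grade.

    Assumption: a fractional score belongs to the band whose lower
    bound it has reached; the table itself lists whole numbers only.
    """
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    for lower_bound, letter in CUTOFFS:
        if percentage >= lower_bound:
            return letter
    return "F"  # anything below 55
```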
Course Policies
Late work
- Three late days total across all problem sets, no questions asked. Each late day extends a deadline by 24 hours.
- After late days are exhausted, late submissions lose 20 % per day, up to 3 days; no credit thereafter.
- Project deadlines (proposal, milestone, draft, final) are firm; late days do not apply.
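To make the late-work arithmetic for problem sets concrete, here is a hedged sketch. Several details are assumptions on our part rather than stated policy: partial days round up to a full day, remaining late days are spent automatically before any penalty applies, and the 20 % deduction per day is taken from the score you earned. The function name `late_adjusted_score` is hypothetical. Ask the instructor if your situation depends on any of these details.

```python
import math


def late_adjusted_score(raw_score, hours_late, late_days_left):
    """Return (adjusted_score, late_days_left_afterwards) for one problem set.

    Assumptions (not confirmed by the policy text):
      - any started 24-hour period counts as a full late day;
      - free late days are used up before penalties begin;
      - each penalized day removes 20 % of the earned score.
    """
    if hours_late <= 0:
        return raw_score, late_days_left           # on time, nothing changes

    days_late = math.ceil(hours_late / 24)
    used = min(days_late, late_days_left)          # spend free late days first
    penalized_days = days_late - used

    if penalized_days > 3:
        return 0.0, late_days_left - used          # no credit after 3 penalized days

    remaining_percent = 100 - 20 * penalized_days  # 100, 80, 60, or 40
    return raw_score * remaining_percent / 100, late_days_left - used
```

Under these assumptions, a submission 30 hours late with three late days available keeps its full score and leaves one late day, while a submission 48 hours late with no late days left keeps 60 % of its score.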
Collaboration
You are encouraged to discuss problem sets with classmates at the conceptual level, but code and writeups must be your own. Sketching ideas at a whiteboard or helping someone debug a runtime error is fine; sharing solutions is not. Pair work is allowed (and encouraged) for the final project; each member's contribution should be made explicit in the report.
AI / LLM policy
- Allowed without disclosure: autocomplete-style help (syntax lookups, NumPy ↔ PyTorch conversions), debugging individual error messages, reading recommendations.
- Allowed with disclosure: explanations of mathematical concepts you're stuck on, suggested approaches after you've thought about it. Disclosure means a one-line note in your submission.
- Not allowed: generating problem-set solutions or large blocks of project code by prompting an LLM. Submitting LLM-generated content without disclosure is an academic-integrity violation.
When in doubt, disclose. Disclosure costs you nothing; non-disclosure can cost you the course.
Lecture Notes
These notes follow Christopher Bishop's Pattern Recognition and Machine Learning (Springer 2006, freely available as a PDF) chapter-by-chapter. The aim is a careful, step-by-step development of the math: every symbol is defined, every transition is justified, and prerequisites are flagged where needed. Notes assume comfort with introductory linear algebra and calculus; the rest is built up as we go.
| Part | Notes |
|---|---|
| Foundations | |
| Linear models | |
| Neural networks & kernels | |
| Probabilistic models & inference | |
| Latent variables & sequential data | |
| Ensembles | |