
dbaranchuk/VisualGenAI


Visual GenAI Course

A graduate-level course on modern generative modeling for visual domains — images, video, and 3D. The main focus is on diffusion models: their theoretical interpretations and the advanced training and sampling methods behind today's high-quality, fast generators. Special attention is given to few-step diffusion-based models used in production image/video services, and to recent advances in autoregressive visual generation and its integration with diffusion.

Goals

  • Develop a deep understanding of leading visual generative paradigms.
  • Learn novel, effective diffusion-based generative frameworks.
  • Master the most recent practical techniques behind state-of-the-art generative models.

Syllabus

Each lecture comes with English slides (.pdf in this repo) and a recorded lecture or seminar in Russian (RU).

| # | Topic | Materials |
|---|-------|-----------|
| 1 | Introduction to Diffusion Models: Denoising Diffusion Probabilistic Models (DDPMs) & Denoising Score Matching (DSM) | Slides · Lecture (RU) · Seminar (RU) |
| 2 | Continuous Diffusion Models: Probability Flow ODE and SDE formulations | Slides · Lecture (RU) |
| 3 | Flow Matching and its connection to diffusion models | Slides · Lecture (RU) |
| 4 | Efficient PF-ODE/SDE Solvers: Euler methods, DDIM, and DPM-Solver | Slides · Lecture (RU) |
| 5 | Diffusion Models in Practice: diffusion spaces, recent architectures, design choices, training & sampling techniques | Slides · Lecture (RU) · Seminar (RU) |
| 6 | Flow Map Models: learnable PF-ODE integrators for faster sampling (Consistency Models, MeanFlow) | Slides · Lecture (RU) · Supplementary |
| 7 | Distribution Matching for few-step generators (DMD, ADD, SwD, Drifting Models) | Slides · Lecture (RU) |
| 8 | Autoregressive Visual Generation: discrete tokenizers (VQ-VAE/VQ-GAN), scale-wise models (VAR, Switti), continuous AR (MAR), diffusion as AR | Slides · Lecture (RU) |
| 9 | Video Generation: architectures, challenges, and AR video diffusion models | Slides · Lecture (RU) · Seminar (RU) |
| 10 | Efficient Diffusion Models: model-level optimizations (caching, sparse attention, quantization, …) | Slides · Lecture (RU) |
| 11 | Multimodal Generative Models: architectures, training setups, and conditioning in diffusion (ControlNet, IP-Adapter) | Slides · Lecture |
| 12 | 3D Generative Models: intro to 3D modeling and multi-view diffusion models | Slides · Lecture |
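As a taste of what topics 1 and 4 cover, here is a minimal NumPy sketch of the closed-form DDPM forward noising and a deterministic DDIM update. It is an illustration, not code from the course materials: the function names (`make_alpha_bar`, `q_sample`, `ddim_step`) are hypothetical, and the linear beta schedule with 1000 steps is an assumed default taken from the original DDPM setup.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, eps):
    """Forward noising: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps."""
    a_t = alpha_bar[t]
    return np.sqrt(a_t) * x0 + np.sqrt(1.0 - a_t) * eps

def ddim_step(x_t, eps_pred, t, t_prev, alpha_bar):
    """Deterministic DDIM update (eta = 0) from step t to step t_prev.

    Passing t_prev = -1 means "go all the way to clean data" (a_bar = 1).
    """
    a_t = alpha_bar[t]
    a_prev = alpha_bar[t_prev] if t_prev >= 0 else 1.0
    # Estimate of the clean sample implied by the predicted noise.
    x0_pred = (x_t - np.sqrt(1.0 - a_t) * eps_pred) / np.sqrt(a_t)
    # Re-noise the estimate to the lower noise level using the same noise.
    return np.sqrt(a_prev) * x0_pred + np.sqrt(1.0 - a_prev) * eps_pred
```

In a real sampler `eps_pred` would come from a trained noise-prediction network. If the true noise is supplied instead, a single `ddim_step` lands exactly on the forward-process sample at `t_prev`, which is a handy sanity check when implementing solvers.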

Assignments

Homework assignments live in assignments/. Each contains a starter notebook (and a task.pdf / theory.pdf where applicable); reference solutions are released separately.

| # | Topic | Starter |
|---|-------|---------|
| 1 | Intro to diffusion: DDPM & DSM | hw1/practice_template.ipynb, task.pdf |
| 2 | Efficient solvers: DPM-Solver | hw2/practice_dpm_solver.ipynb, theory.pdf |
| 3 | Flow Matching training | hw3/fm_training.ipynb |
| 4 | Flow Map Models | hw4/flow_map_models.ipynb |
| 5 | Distribution matching: ADD / MMD distillation | hw5/add_mmd_distillation.ipynb |
| 6 | Autoregressive generation: MAR with a flow-matching head | hw6/mar_fm_head.ipynb |
| 7 | AR video diffusion sampling | hw7/ar_video_diffusion_sampling.ipynb |
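For orientation before opening the Flow Matching notebook, the sketch below shows one common form of the conditional flow-matching objective in NumPy. It is not taken from the starter notebooks: the helper names (`fm_training_pair`, `fm_loss`) are hypothetical, and the convention that `x0` is noise, `x1` is data, and the path is a straight line between them is an assumption that the homework may or may not share.

```python
import numpy as np

def fm_training_pair(x0, x1, t):
    """Build a point on the linear path and its target velocity.

    Assumed convention: x0 is a noise sample, x1 is a data sample,
    and x_t = (1 - t) * x0 + t * x1, so the velocity is x1 - x0.
    """
    # Reshape t from (batch,) to (batch, 1, ..., 1) for broadcasting.
    t = t.reshape(-1, *([1] * (x0.ndim - 1)))
    x_t = (1.0 - t) * x0 + t * x1
    return x_t, x1 - x0

def fm_loss(velocity_fn, x0, x1, t):
    """Mean squared error between predicted and target velocity."""
    x_t, target = fm_training_pair(x0, x1, t)
    return np.mean((velocity_fn(x_t, t) - target) ** 2)
```

Here `velocity_fn` stands in for a trainable network that takes `(x_t, t)`. In practice the same loss would be written in an autodiff framework and minimized over random draws of `x0`, `x1`, and `t`.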

Exam

The full list of examinable topics is in exam/exam_topics_visual_genai.md. The exam/slot-booking-service/ directory holds a small web app that students use to book exam slots (see its own README).


Contacts

Course staff

References

  • The introduction to diffusion models follows CS236 by Stefano Ermon.
  • Some explanations are inspired by The Principles of Diffusion Models.
  • Numerous papers and blog posts that led us to this course.

About

Materials for the VisualGenAI course at YSDA2026
