Pull requests: antirez/ds4
Add served model name option for server discovery
#456
opened Jun 25, 2026 by
RiccardoFiorentini
Metal: keep selected-address SSD prefill opt-in by default
#454
opened Jun 25, 2026 by
andreaborio
Draft
Fix ROCm Q8->F16 cache reserve starving session tensors on large models (q4q2)
#446
opened Jun 23, 2026 by
alantsev
Contributor
AGENTS.md rename (and server performance improvements?)
#443
opened Jun 21, 2026 by
OPS-NeoRetro
Add reverse distributed topology with coordinator-owned output suffix
#430
opened Jun 16, 2026 by
lobanov
Fix: ds4-server rejects HTTP requests using Transfer-Encoding: chunked
#423
opened Jun 16, 2026 by
moritzburgard
agent: reject edit calls whose new= text contains [upto]
#421
opened Jun 16, 2026 by
aledesogusbusiness-hue
Metal: protect tensor alloc/free byte counters with a mutex
#420
opened Jun 16, 2026 by
aledesogusbusiness-hue
server: expose only the loaded model in /v1/models
#419
opened Jun 16, 2026 by
aledesogusbusiness-hue
Metal: FP8-packed compressed-KV cache + long-context memory optimizations
#418
opened Jun 16, 2026 by
aledesogusbusiness-hue
Metal: FP8-packed compressed-KV cache + long-context memory optimizations
#416
opened Jun 15, 2026 by
lixiangnlp
Fix bug with impact on DeepSeek V4 Pro MTP Drafter usage
#411
opened Jun 14, 2026 by
Deviad
rocm: fix distributed inference on unified-memory APUs (strix halo / gfx1151)
#407
opened Jun 13, 2026 by
kyuz0
[3/N] add prefetch support for CUDA backend : running ds4 for any GPU with cache (2.75 x faster!)
#402
opened Jun 12, 2026 by
yiakwy-xpu-ml-framework-team