Layer-wise inference + batching: Small VRAM doesn't limit LLM throughput anymore

by Evan Ovadia