Efficient Autoregressive Inference for Transformer Probabilistic Models

This research introduces a causal autoregressive buffer that decouples context encoding from conditioning updates, enabling 20x faster sampling in transforme...

Level: advanced

By Unknown

Category: research