Explore Hydra Ensembles, a novel approach to structured attention head pruning that delivers calibrated uncertainty and near-single-model inference speeds wi...
Level: advanced
By Firas Gabetni, Giuseppe Curci, Andrea Pilzer, Subhankar Roy, Elisa Ricci, Gianni Franchi
Category: research