Neptune: Advanced ML Operator Fusion for Locality and Parallelism on GPUs

Explore Neptune, a tensor compiler designed to optimize ML operators on GPUs by resolving loop-carried dependencies and enhancing data reuse for superior per...

Level: advanced

By Unknown

Category: research