DevBench introduces a rigorous, telemetry-driven benchmark for code generation models, leveraging real developer workflows to evaluate syntactic precision an...
Level: advanced
By Pareesa Ameneh Golnari and 7 other authors
Category: research