Compressing Large Language Generation Models with Sequence-Level Knowledge Distillation
By Brendan Chambers, David Silin, and Kevin Gimpel of Quillbot Research