valcore's Collections

DSSD

Trained early-exit head for use with Dynamic Self-Speculative Decoding (DSSD)