Perplexity vs sequence length, Llama-2-7B
Illustrative shape from Xiao et al. (2023). Both axes use a log scale.