Instead of computing the total token count with tiktoken, we simply retry the request after trimming the first message.
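
A minimal sketch of the trim-and-retry idea, not the actual implementation. The names `call_with_trim`, `ContextTooLongError`, and the `send` callable are hypothetical stand-ins for whatever client call and error the real code uses; the retry limit is likewise an assumption.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class ContextTooLongError(Exception):
    """Hypothetical error raised when the prompt exceeds the context window."""


def call_with_trim(
    send: Callable[[List[Message]], str],
    messages: List[Message],
    max_retries: int = 5,  # assumed limit, not taken from the original change
) -> str:
    """Call send(); on a context-length error, drop the first message and retry."""
    remaining = list(messages)  # copy so the caller's list is left untouched
    for _ in range(max_retries + 1):
        try:
            return send(remaining)
        except ContextTooLongError:
            if len(remaining) <= 1:
                raise  # nothing left to trim
            remaining = remaining[1:]  # trim the first message, then retry
    raise ContextTooLongError("still too long after trimming")
```

No token counting happens on the client side: the service's own rejection is the signal to trim. The cost is one extra round trip per failed attempt.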