7 Comments
Luq:

Another banger post, Shivani! One question: since most models predict one token at a time, does that mean that after each token is generated, the model repeats the entire process for the next token?

Shivani Virdi:

Yes Luq! At inference time, the model first generates a token, appends it to the context, and then runs inference again on the updated context.
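That generate-append-repeat loop can be sketched in a few lines of Python. Here `next_token` is a hypothetical stand-in for a real model's forward pass (a real LLM would run attention over the whole context and sample from a distribution); the point is only the shape of the loop.

```python
def next_token(context):
    # Hypothetical "model": returns the current context length as the
    # next token, so the loop's behaviour is easy to trace by hand.
    return len(context)

def generate(prompt_tokens, n_new):
    context = list(prompt_tokens)
    for _ in range(n_new):
        tok = next_token(context)   # one full inference pass over the context
        context.append(tok)         # append, then repeat on the updated context
    return context

print(generate([101, 102], 3))  # → [101, 102, 2, 3, 4]
```

Each new token requires a fresh pass over the whole (now longer) context, which is exactly why generation cost grows with sequence length.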

Luq:

So that means a prompt like “show reasoning step by step, then give the answer” is better than “think step by step, but show only the answer”. Since the reasoning is now ‘fixed’ in the context during each new token update, there’s less room for the model to go down a different reasoning path at each new inference step. Is this a correct understanding? Also, thanks for entertaining my questions ☺️

Shivani Virdi:

That’s right, Luq. “Thinking” on its own means nothing: models “think” in tokens, and only if those tokens are in the context will the model pay attention to them! Even in reasoning models, that’s exactly what’s happening (the thinking might be hidden from us in the UI, but the model’s context is enriched with those tokens)!

Happy to answer all questions 😊
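A toy illustration of Shivani’s point that tokens only matter if they sit in the context: below, `next_token` is again a hypothetical stand-in for a model (every later token depends on everything currently in context), and the `reasoning` tokens are an assumed example of hidden “thinking” tokens. Prepending them changes every subsequent token, even though the prompt is identical.

```python
def next_token(context):
    # Hypothetical "model": the next token depends on the entire context,
    # just as attention makes every generated token depend on all prior ones.
    return sum(context) % 100

def generate(context, n_new):
    context = list(context)
    for _ in range(n_new):
        context.append(next_token(context))
    return context

prompt = [7, 3]
reasoning = [42, 57]  # "thinking" tokens; hidden in the UI, but still in context

print(generate(prompt, 2))              # → [7, 3, 10, 20]
print(generate(prompt + reasoning, 2))  # → [7, 3, 42, 57, 9, 18]
```

The continuation differs (10, 20 vs. 9, 18) purely because the reasoning tokens are present in the context when later tokens are generated.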

Luq:

Thank youu, that helps a lot! You’ve gained a new fan here 😝 Looking forward to your next post!

Rohit Kumar Tiwari:

Loved the breakdown @Shivani Virdi. Thanks for sharing!

Shivani Virdi:

So glad to hear that, Rohit!
