LLMs need to ask more questions. There is too much focus on taking on a task and making autonomous decisions when there are gaps; the priority should instead be to help the user refine the task first.
I’ve seen more questions from recent models, but it’s still only scratching the surface.
I also prompt Gemini in Google AI Studio to ask me the questions it needs in order to give me a good plan.
Rather than waiting for the model to ask you questions on its own, you can just change the default mode, since this is more a matter of ‘system prompt’ than model design.
In Cursor I have defaulted to Gemini 2.5 Pro as my planner model, which I prompt to ask me questions. It has a large context window, it reasons, and you can have iterative conversations with it, unlike the o-series models.
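Roughly, the setup looks like this (a minimal sketch using the google-generativeai Python SDK; the prompt wording and placeholders are illustrative, not an exact recipe):

```python
# Minimal sketch: a Gemini "planner" chat told to ask questions before
# planning. Prompt wording and placeholders are illustrative.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

PLANNER_PROMPT = (
    "You are a planning assistant. Before proposing any plan, ask me the "
    "clarifying questions you need, in batches. Only write the plan once "
    "I confirm the gaps are filled."
)

model = genai.GenerativeModel(
    model_name="gemini-2.5-pro",
    system_instruction=PLANNER_PROMPT,
)

chat = model.start_chat()  # iterative back-and-forth, as described above
reply = chat.send_message("Help me plan a refactor of our auth module.")
print(reply.text)  # expect questions back first, not a plan
```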
Interesting, I’m using a similar setup, but usually refine outside the IDE.
agreed
Considering how different models require different types of prompting, context, task usage, etc. to function optimally, is there a tool that can help select the appropriate model, tune the prompt, and then run it? Like a meta-reasoning model sitting atop a library of models.
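Even a thin layer would go a long way. A toy sketch of the routing idea (all model names and templates here are illustrative, not a real product):

```python
# Toy sketch of the "meta-model router" idea: pick a backend model by
# task type and apply a per-model prompt template.
from dataclasses import dataclass

@dataclass
class Route:
    model: str
    template: str  # per-model prompt shaping

ROUTES = {
    "planning": Route("gemini-2.5-pro", "Ask clarifying questions, then plan:\n{task}"),
    "coding":   Route("claude-sonnet",  "Implement directly, minimal prose:\n{task}"),
    "research": Route("o3",             "State goal, constraints, sources first:\n{task}"),
}

def route(task: str, task_type: str) -> tuple[str, str]:
    """Return (model_name, tuned_prompt) for a task."""
    r = ROUTES.get(task_type, ROUTES["planning"])
    return r.model, r.template.format(task=task)

model, prompt = route("Summarize our incident postmortems", "research")
print(model, prompt, sep="\n")
```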
What have you seen re: hallucination? My biggest issue with using o3 vs Claude Sonnet/Opus is that even though the output from o3 is often great, I need to fact-check absolutely everything. The o-series seems much more willing to 'lie'/hallucinate than Claude or even 4o.
I don't have evals to prove this yet, but I think that without enough context, and without tools, it will definitely overthink and hallucinate more readily.
Excellent article. I've noticed that providing expansive context to 4o also improves its ability to provide useful, detailed plans and recommendations, so it seems reasonable that more powerful models like o1, o3, and o3-pro would benefit even more from expanded context. Sometimes I just dictate all the background info I can think of to it for five minutes. Works great. I've been using the approach you outlined in one of your other articles on how to structure the prompt for o1 et al., and it's really helped. Since I haven't had access to o3-pro yet, I really appreciate the information you've provided in this article. Super helpful. Thank you.
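For what it's worth, the structure I've settled on looks roughly like this (the section headers are my own convention, not the article's exact template; the dictated background goes in the Context slot):

```python
# Rough template for front-loading context for the o-series models.
PROMPT = """\
## Goal
{goal}

## Context
{dictated_background}

## Constraints
{constraints}

## Output
A step-by-step plan; flag anything you had to assume.
"""

print(PROMPT.format(
    goal="Plan the billing-jobs migration to a queue",
    dictated_background="(five minutes of dictated background here)",
    constraints="- No downtime during business hours",
))
```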
Fuck off... God has nothing to do with that.
Very, very cool. How did you share context while keeping it usable to the model and within the window? How much pre-selection and pre-cleaning did you do, and how?
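For comparison, the crude approach I've used is keyword-overlap ranking against a token budget; a rough sketch (len(text)//4 is a stand-in for a real tokenizer):

```python
# Crude pre-selection sketch: rank snippets by keyword overlap with the
# task and pack them until a token budget is hit.
def estimate_tokens(text: str) -> int:
    return len(text) // 4

def pack_context(snippets: list[str], task: str, budget: int = 100_000) -> str:
    task_words = set(task.lower().split())
    ranked = sorted(
        snippets,
        key=lambda s: len(task_words & set(s.lower().split())),
        reverse=True,
    )
    picked, used = [], 0
    for s in ranked:
        cost = estimate_tokens(s)
        if used + cost <= budget:
            picked.append(s)
            used += cost
    return "\n\n---\n\n".join(picked)

print(pack_context(["auth module notes", "billing docs"], "refactor auth"))
```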
Does it have that same used-car-salesman feel as the other OAI models?
Would be nice to see a "let her rip" prompt along with its output.
Any clear alignment concerns, if you tested for that? A big issue with o3 was that it's unaligned, and it'd be real bad if that carried over to here.