12 Comments
User's avatar
Eray's avatar
3dEdited

LLMs need to ask more questions. There is too much focus on taking up a task and making autonomous decisions when there are gaps. Instead the priority should be to help the user to refine the task first.

I’ve seen more questions with recent models but it’s still scratching the surface.

Expand full comment
Carl Rannaberg's avatar

In Cursor I have defaulted to Gemini 2.5 Pro as my planner model, which I prompt to ask me questions. It has large context, it reasons and you can have iterative conversations with it, unlike the o-series models.

Expand full comment
Eray's avatar

Interesting, I’m using a similar setup, but usually refine outside the IDE.

Expand full comment
Pamela Wang, PhD's avatar

I also prompt Gemini in Google AI Studios to ask me the question it needs in order to give me a good plan.

Rather than waiting for the model to ask you questions on its own. You can just change the default mode. Because this is more ‘system prompt’ than model design.

Expand full comment
Ben Hylak's avatar

agreed

Expand full comment
Suraj's avatar

Considering how different models require different type of prompting and context, task usage, etc to make it function optimally, is there a tool that can help select the appropriate model, tune the prompt and then run it? Like a meta reasoning model sitting atop a library of models.

Expand full comment
John's avatar

What have you seen re:hallucination? My biggest issue with using o3 vs Claude sonnet/opus is that even though the output for o3 is often great I need to fact check absolutely everything. The o series seems much more willing to 'lie' / hallucinate than Claude or even 4o.

Expand full comment
Ben Hylak's avatar

I don't have evals to prove this yet, but: I think that without enough context, and without tools, it will definitely more readily overthink + hallucinate.

Expand full comment
Thomas Mustier's avatar

Very very cool. How did you share context while keeping it usable to the model & within window? How much pre-selection & pre-cleaning did you do, and how?

Expand full comment
SorenJ's avatar

Does it have that same used-car salesman feel like the other OAI models?

Expand full comment
Joel Dietz's avatar

Would be nice to see a "let her rip" prompt that we can see the output of.

Expand full comment
Leith Pierre's avatar

Any clear alignment concerns if you tested for that? Big issue with o3 was that its unaligned and it'd be real bad if it carried over to here

Expand full comment