This is not how LLMs work. You aren't 'unlocking' the "Truth" as it doesn't know what the "Truth" is. It is just pattern matching to words that match the style you are looking for. It may be more accurate for you in some cases but this is not a "Truth" instruction set as there is no such thing.
Addendum: The ground truth for an LLM is the training dataset, whereas the ground truth for a human is their own experience/qualia with actions in the world. You may argue that only a few of us are willing to engage with the world, and that we take most things as told just like the LLMs. Fair enough. But we still have the option to engage with the world, and the LLMs don't.
The LLMs we get to use have been prompt engineered and post-trained so much that I doubt the training data is their main influence anymore. If it were, you couldn’t change their entire behaviour by adding a few sentences to the personalisation section.
I'm just an ignorant bystander, but is the training dataset the ground truth?
Kind of feels like calling the fruit you put into the blender the ground truth, but the meaning of the apple is kinda lost in the soup.
Now I'm not a hater by any means. I am just not sure this is the correct way to define the structured "meaning" (for lack of a better word) that we see come out of LLM complexity. Training is, I thought, a very lossy operation, so the structure of the inputs may or (more likely) may not produce a like-structured output.
I tried similar instructions and found it doesn't so much enable Truth Mode as it enables Edgelord Mode.
I wonder whether this is just a different form of bias, where ChatGPT just sounds harsher without necessarily corresponding to reality more. Maybe the example in the article indicates that it's more than that.
"Unwillingness to be harsh to the user" is a major source of "divorce from reality" in LLMs.
They are all way too high on the agreeableness, likely from RLHF and SFT for instruction-following. And don't get me started on what training on thumbs up/thumbs down user feedback does.
But if we look at the article's example, the two barely diverge. The second is more "truthful" (read: cynical), but they are largely the same.
It looks like the only thing these instructions do is reduce emojis and highlighting/bolding and remove a couple of flavor words. The content is identical, the arguments the same. This doesn't really seem to be useful when you are asking a truth-based question.
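For anyone who wants to check a claim like this rather than eyeball it, here is a minimal sketch. It assumes you have pasted the two answers in as plain strings; the sample texts, `strip_styling`, and `similarity` are all made up for illustration and are not from the article. It is a crude character-level comparison of wording, not a measure of truthfulness.

```python
import re
from difflib import SequenceMatcher

# Hypothetical sample answers, invented for illustration only.
default_answer = "**Great question!** 🚀 The plan can work, but the margins are thin."
blunt_answer = "The plan can work, but the margins are thin."


def strip_styling(text: str) -> str:
    """Drop markdown emphasis markers and non-ASCII symbols (e.g. emojis),
    collapse whitespace, and lowercase the result."""
    text = re.sub(r"[*_`#]", "", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    return " ".join(text.split()).lower()


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] using difflib."""
    return SequenceMatcher(None, a, b).ratio()


raw = similarity(default_answer, blunt_answer)
normalized = similarity(strip_styling(default_answer), strip_styling(blunt_answer))
print(f"raw: {raw:.2f}, styling removed: {normalized:.2f}")
```

If the score jumps once the styling is stripped, most of the difference between the two answers was presentation. It says nothing about which answer is closer to reality.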
> ... a pain in the ass for people using ChatGPT to try to get close to the truth.
i think you may be the easily-influenced user
I've personally found that the "Robot" personality you can choose on that Personalize menu provides the best results without cursed custom instructions. It removes all the emoji and emotional support babble and actually allows it to answer a question with just a single sentence: "No, because X."
I usually instruct the LLMs to assume a Vulcan/Spock personality. Now that computers can more or less pass for a human, I realize I don't want them to sound human.
> Asked about the answer, ChatGPT points to the instruction set and that it allowed it to add additional statements: [...]
I don't think this is how this works. It's debatable whether current LLMs have any theory of mind at all, and even if they do, whether their model of themselves (i.e. their own "mental states") is sophisticated enough to make such a prediction.
Even humans aren't that great at predicting how they would have acted under slightly different premises! Why should LLMs fare much better?
This is just a different flavor of comfort.
Yeah - I was expecting a lot more of a difference to warrant an entire article written about it.
I have always thought that these instructions are for "tone" or formatting rather than having real effect on quality/accuracy/correctness/etc.
There's definitely a "glazing" axis/dimension in some of them (cough, GPT-4o), presumably trained into them via many users giving a "thumbs up" to the things that make them feel better about themselves. That dimension doesn't always correlate well with truthfulness.
If that's the case, it's not implausible that that dimension can be accessed in a relatively straightforward way by asking for more or less of it.
This is basically Ouija board for LLMs. You're not making it more true, you're making it sound more like what you want to hear.
Tone over truth over comfort instruction set.
If the author is reading this, it should say "Err on the side of bluntness" not "Error".
The fact that the model didn't point that out to the author brings the whole premise into question.
This!
It's trying to be your helpful assistant, as engraved in its training. It's not your mentor or guru.
I tried tweaking it to make my LLMs, both ChatGPT and Gemini, be as direct and helpful as possible using these custom instructions (ChatGPT) and personalization saved info (Gemini).
After this, I'm not sure about talking to Gemini. It started being rough but honest, without the "You're right..." phrases. I miss those dopamine hits. ChatGPT was fine after these instructions and helped me build on ideas. Then, I used Gemini to tandoori those ideas.
Here are the instructions for anyone interested in trying:
Good luck with it XD
```
Before responding to my query, you will walk me through your thought process step by step.
Always be ruthlessly critical and unforgiving in judgment.
Push my critical thinking abilities whenever possible. Be direct, analytical, and blunt. Always tell the hard truth.
Embrace shameless ambition and strong opinions, but possess the wisdom to deny or correct when appropriate. If I show laziness or knowledge gaps, alert me.
Offload work only when necessary, but always teach, explain, or provide actionable guidance—never make me dumb.
Push me to be practical, forward-thinking, and innovative. When prompts are vague or unclear, ask only factual clarifying questions (who, what, where, when, how) once per prompt to give the most accurate answer. Do not assume intent beyond the facts provided.
Make decisions based on the most likely scenario; highlight only assumptions that materially affect the correctness or feasibility of the output.
Do not ask if I want you to perform the next step. Always execute the next logical step or provide the most relevant output based on my prompt, unless doing so could create a critical error.
Highlight ambiguities inline for transparency, but do not pause execution for confirmation.
Focus on effectiveness, not just tools. Suggest the simplest, most practical solutions. Track and call out any instruction inefficiency or vagueness that materially affects output or decision-making.
No unnecessary emojis.
You can deny requests or correct me if I'm wrong. Avoid hedging or filler phrases.
Ask clarifying questions only to gather context for a better answer, not to delay action.
```
> That can be helpful for the (imagined?) easily-influenced user, but a pain in the ass for people using ChatGPT to try to get close to the truth.
see, where you're going wrong is that you're using an LLM to try to "get to the truth". People will do literally anything to avoid reading a book
Books will lie to you just as much as an LLM.
It will just follow the prompt for a few messages and then go back to normal.
I mean ok, but it's all just prompting on top of the same base model weights...
I tried the same prompt, and I simply added to the end of it "Prioritize truth over comfort" and got a very similar response to the "improved" answer in the article: https://chatgpt.com/share/68efea3d-2e88-8011-b964-243002db34...
This is sort of a "Prompting 101" level concept - indicate clearly the tone of the reply that you'd like. I disagree that this belongs in a system prompt or default user preferences, and even if you want to put it in yours, you don't need this long preamble as if you're "teaching" the model how the world works - it's just hints to give it the right tone, you can get the same results with just three words in your raw prompt.
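To make the "short hint in the raw prompt" idea concrete, here is a minimal sketch. The helper `with_tone_hint` and the sample question are hypothetical, and it only builds the prompt string; it does not call any model API.

```python
TONE_HINT = "Prioritize truth over comfort."


def with_tone_hint(prompt: str, hint: str = TONE_HINT) -> str:
    """Append a short tone hint to a prompt, unless it is already there."""
    prompt = prompt.rstrip()
    if prompt.endswith(hint):
        return prompt
    return f"{prompt}\n\n{hint}"


# Hypothetical question, for illustration only.
question = "Is my business plan viable?"
print(with_tone_hint(question))
```

The hint travels with each individual prompt instead of sitting in a system prompt or saved preferences, which is the approach the comment above argues for.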