Dec 27, 2022 09:21 AM - edited Dec 27, 2022 10:30 AM
This is a Tell; not a Show.
Prompt engineering, also known as shaping prompts to get exactly what you want from GPT-3 and ChatGPT, will eventually bite you.
This post is here only so that I can say 6 months from now - I tried to warn you.
The core model used by GPT-3 is text-davinci-003. GPT-4 is on the near horizon, and it will be so much more advanced that the prompts you build into your applications today will be fundamentally useless with text-davinci-004.
Prompt engineering is also aptly referred to as "spell casting". You can cast spells on LLMs (large language models) quite easily to achieve favorable results. GPT-3, for example, glosses over intermediate steps to reach conclusions, often resulting in misleading or entirely wrong conclusions. It is especially poor at math and computations involving more than three digits. But when you insert a simple phrase such as "step by step" into the prompt, it seems to get a lot smarter. In some cases, this simple prompt addition can increase its intelligence fivefold.
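To make the "step by step" spell concrete, here is a minimal sketch of what it amounts to in application code. It only builds the prompt text and does not call any model or API. The helper name `build_prompt`, the `Q:`/`A:` layout, and the exact suffix wording are assumptions for illustration, not anything prescribed by GPT-3.

```python
# Hypothetical helper: builds prompt text only, no model or API is called.
# The suffix wording is an assumption; the point is only that a short
# phrase is spliced into the prompt.
STEP_BY_STEP_SUFFIX = "Let's work through this step by step."


def build_prompt(question: str, step_by_step: bool = False) -> str:
    """Return the prompt text, optionally asking for intermediate steps."""
    prompt = f"Q: {question.strip()}\nA:"
    if step_by_step:
        # The "spell": nudge the model to write out its intermediate steps.
        prompt += f" {STEP_BY_STEP_SUFFIX}"
    return prompt


if __name__ == "__main__":
    question = "What is 4817 multiplied by 326?"
    print(build_prompt(question))
    print(build_prompt(question, step_by_step=True))
```

The whole trick lives in one hard-coded string tuned against one model, which is exactly the kind of thing that can stop working when the model changes.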
The prompts engineered for GPT-2 by countless beta testers and practitioners, representing hundreds of thousands of hours of effort, pretty much completely fail when applied to GPT-3.
NLP spell-casting is a brittle approach.
Dec 30, 2022 02:38 AM
This is a very interesting topic! To start, I wanted to highlight part of the Wikipedia article about ChatGPT:
"Its uneven factual accuracy was identified as a significant drawback." - I just love "uneven factual accuracy" as a term; we might adopt it in other areas of life as well.
In my code editor I have been using Copilot (based on GPT) for a while and it has been a huge time-saver, but I rarely use it for more than single- or double-line completion.
Last week I used ChatGPT a couple of times to help me with some more advanced TypeScript types (instead of scouring Stack Overflow and the TS docs). The results were useful but unevenly accurate... Sometimes, through careful crafting of prompts, I was able to get a result that looked exactly like what I wanted. Only later did I realize that the transformations proposed by ChatGPT reflected how I wanted TypeScript to work, not how it really works.
I have not tried adding "step by step" or anything similar, but even then I will probably be safer assuming that the results will be unevenly factually accurate.