Current AI LLMs are so terrible. Basic task failure beyond writing is everywhere.

We have access to the world's greatest AI programs, and they all, every single one, fail to deliver beyond basic tasks such as writing (they choke on even mildly complex tasks, e.g. combine A, B, and C without duplicates).

The "brain" part is just missing. The moving around text/generating text is there, but the logic to realize there's duplicates, it doesn't flow, is just not there... You can say it's input failure, but it's not.

I'm assuming all the AI companies are dumping everything into music, video, and image generation, and text is a thing of the past...
 
I'm not so sure of this.

Granted, I don't know what you're doing and the example you gave might just be weak (or I don't understand it), but I have gotten AI to do whatever I ask, including some complex shit.

I think it comes down to a few things:

1. The model you're working with. I only use ChatGPT for certain XYZ tasks, Claude for certain ABC tasks, and Gemini for certain JKL tasks. You gotta understand each one's strengths.

2. Token usage. If you're trying to one-shot everything, it's going to fail when there's a ton of thinking it needs to do. The first 100k tokens work 500% better than the last 100k tokens in a one-shot prompt. Even if you're not one-shotting it and are going back into the same chat window for multiple asks, it's the same idea until you hit the token limit. (Rough sketch of breaking the work up below, after the list.)

3. Chat window vs. API. This kinda touches on #2 but is different. I find the "chat window" of any model worse than going through the API with N8N or Make or w/e. Some of this is token use, but I also think it's because of the workflow and how you have to ask/prompt/design the end result. IYKYK.

4. Repeat what's important. Even with all the experience I have, an AI/LLM will forget or leave something out. If something is important, I put it in the prompt 3 times: at the start, the middle, and the end, worded slightly differently in each spot.

5. Sometimes you have to teach the AI. AI is just a summary of everything it has been given: its initial training, plus whatever is in your chat history with it, or your RAG setup. If your duplicates aren't exact-match wordings (character by character), you're going to have to teach it and give it instructions on what a duplicate actually is. Maybe instead of a duplicate being an exact match, it's a duplicate by intent. You'll have to teach it that intent and provide examples so it can learn and do it. (There's a sketch below combining this with #4.)
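
Since #2 and #3 keep coming up: here's a rough sketch of what "don't one-shot it" can look like in code. It's just an illustration, assuming the official `openai` Python package (v1+) with an API key in the environment; the model name and the helper names (`ask`, `combine_without_duplicates`) are placeholders, not anything from the posts above.

```python
# Rough sketch of #2/#3: break the job into small API calls instead of one giant
# one-shot prompt, so each call stays in the early, "good" part of its context window.
# Assumes the `openai` Python package (v1+) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment


def ask(prompt: str) -> str:
    """One small, focused request per call."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder -- swap in whatever model you actually use
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


def combine_without_duplicates(sources: list[str]) -> str:
    # Pass 1: clean each source on its own (small prompt, little "thinking" per call).
    cleaned = [
        ask(f"List the distinct items in this text, one per line:\n\n{s}")
        for s in sources
    ]
    # Pass 2: merge the already-cleaned lists and dedupe across them.
    return ask(
        "Merge these lists into one list with no duplicates. "
        "Treat items that mean the same thing as duplicates, even if worded differently:\n\n"
        + "\n\n".join(cleaned)
    )
```

Same idea whether you wire it up in N8N/Make or in a script: lots of small, focused calls instead of one monster prompt.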
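
And a rough sketch of #4 and #5 combined: repeat the critical instruction in three places (worded slightly differently each time) and teach "duplicate by intent" with a couple of examples. The examples and wording here are made up for illustration; the point is the shape of the prompt.

```python
# Rough sketch of #4 + #5: repeat the key instruction at the start, middle, and end,
# and show the model what a "duplicate by intent" looks like. Examples are illustrative.

INTENT_EXAMPLES = """\
Duplicates BY INTENT (keep only one):
- "Back up the database nightly"
- "Run a DB backup every night"

NOT duplicates (keep both):
- "Back up the database nightly"
- "Back up the file server nightly"
"""


def build_dedupe_prompt(items: list[str]) -> str:
    listed = "\n".join(f"- {item}" for item in items)
    return (
        # 1st placement: top of the prompt.
        "IMPORTANT: remove duplicates, including items that say the same thing in different words.\n\n"
        + INTENT_EXAMPLES
        # 2nd placement: right before the data, reworded.
        + "\nReminder: two items are duplicates if they share the same intent, not just the exact same wording.\n\n"
        + "Items to merge:\n"
        + listed
        # 3rd placement: end of the prompt, reworded again.
        + "\n\nFinal check before you answer: no two items in your output should mean the same thing."
    )


print(build_dedupe_prompt(["Buy milk", "Purchase milk", "Buy bread"]))
```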
 
^^ damn that's a good post
Summarizes my experience, with one exception.

Number 3. The chat window has given me pure brilliance a couple of times, almost like it's tuned differently or allowed a different set of resource constraints. Specifically with ChatGPT. But it won't build shit in the chat window the way you can with API credits, or with Claude's more templated chat window that forces everything down a couple of design trees.

5 is really, really true, along with importance repetition, calling out exceptions, and asking it to rebuild its model so it doesn't miss items. ChatGPT has ingested an enormous quantity of data; I've asked it specifically to use its word-vector summaries or to emphasize social data and build a model to reach a conclusion. Then I've given it reference items and asked where they go relative to a list or some other hierarchy to tune it, with pretty insane results.
It's clearly got some organized datasets it's not just allowed to spit out, but it can pull real conclusions from them if you tell it to lay off the patronizing, language-model-projected-answer bullshit and tune in based on summaries of the data it was exposed to.
 