- cross-posted to:
- programmerhumor@lemmy.ml
- cross-posted to:
- programmerhumor@lemmy.ml
- You can instantly get whatever you want, only it’s made from 100% technical debt - That estimate seems a little low to me. It’s at least 115%. - even more. The first 100% of the tech debt is just understanding “your own” code. 
 
- Just start again from scratch for every feature 🤣 
 
- And then 12 hours spent debugging and pulling it apart. - And if you need anything else, you have to use a new prompt which will generate a brand new application, it’s fun! - That’s not really how agentic ai programming works anymore. Tools like cursor automatically pick files as “context”, and you can manually add them or the whole ckdebase as well. That obviously uses way more tokens though. 
 
- And it still doesn’t work. Just “mostly works”. - A bunch of superfluous code that you find does nothing. 
 
- We’re in trouble when it learns to debug. - But then, as now, it won’t understand what it’s supposed to do, and will merely attempt to apply stolen code - ahem - training data in random permutations until it roughly matches what it interprets the end goal to be. - We’ve moved beyond a thousand monkeys with typewriters and a thousand years to write Shakespeare, and have moved into several million monkeys with copy and paste and only a few milliseconds to write “Hello, SEGFAULT” 
 
 
- deleted by creator - That’s the thing, it’s a useful assistant for an expert who will be able to verify any answers. - It’s a disaster for anyone who’s ignorant of the domain. - Tell me about it. I teach a python class. Super basic, super easy. Students are sometimes idiots, but if they follow the steps, most of them should be fine. Sometimes I get one who thinks they can just do everything with chatgpt. They’ll be working on their final assignment and they’ll ask me what a for loop is for. Than I look at their code and it looks like Sanscrit. They probably haven’t written a single line of code in those weeks. 
 
- I turned on copilot in VSCode for the first time this week. The results so far have been less than stellar. It’s batting about .100 in terms of completing code the way I intended. Now, people tell me it needs to learn your ways, so I’m going to give it a chance. But one thing it has done is replaced the normal auto-completion which showed you what sort of arguments a function takes with something that is sometimes dead wrong. Like the code will not even compile with the suggested args. - It also has a knack for making me forget what I was trying to do. It will show me something like the left side picture with a nice rail stretching off into the distance when I had intended it to turn, and then I can’t remember whether I wanted to go left or right? I guess it’s just something you need to adjust to. Like you need to have a thought fairly firmly in your mind before you begin typing so that you can react to the AI code in a reasonable way? It may occasionally be better than what you have it mind, but you need to keep the original idea in your head for comparison purposes. I’m not good at that yet. - deleted by creator - Thanks, it makes me feel relieved to hear I’m not the only one finding it a little overwhelming! Previously, I had been using chatgpt and the like where I would be hunting for the answer to a particularly esoteric programming question. I’ve had a fair amount of success with that, though occasionally I would catch it in the act of contradicting itself, so I’ve learned you have to follow up on it a bit. - deleted by creator 
 
 
- Try Roocode or Cline with the Claude3.7 model. It’s pretty slick, way better than Copilot. Turn on Memory Bank for larger projects to reduce the cost of tokens. 
- I haven’t personally used it, but my coworker said using Cursor with the newest Claude model is a gamechanger and he can’t go back anymore 🤷♂️ he hasn’t really liked anything outside of cursor yet - Thanks, I’ll give that a shot. 
 
 
- If you’re having to do repetitive shit, you might reconsider your approach. - Depending on the situation, repetitive shit might be unavoidable - Usually you can solve the issue by using regex, but regex can be difficult to work with as well - Skill issue… 
 
- deleted by creator 
- I’ve tried this, to convert a large json file to simplified yaml. It was riddled with hallucinations and mistakes even for this simple, deterministic, verifiable task. 
 
- It’s taken me a while to learn how to use it and where it works best but I’m coming around to where it fits. - Just today i was doing a new project, i wrote a couple lines about what i needed and asked for a database schema. It looked about 80% right. Then asked for all the models for the ORM i wanted and it did that. Probably saved an hour of tedious typing. - deleted by creator - I assume we’re talking about software testing? I’d like to know more about: - 
The meaning of negative and positive tests in this context 
- 
Good examples of badly done negative tests by LLMs 
 - deleted by creator - Thank you for thoroughly explaining this. Your explanations make good sense to me. 
 
 
- 
 
 
- I knocked off an android app in Flutter/Dart/Supabase in about a week of evenings with Claude. I have never used Flutter before, but I know enough coding to fix things and give good instructions about what I want. - It would even debug my android test environment for me and wrote automated tests to debug the application, as well as spit out the compose files I needed to set up the Supabase docker container and SQL queries to prep the database and authentication backend. - That was using 3.5Sonnet, and from what I’ve seen of 3.7, it’s way better. I think it cost me about $20 in tokens. I’ve never used AI to code anything before, this was my first attempt. Pretty cool. - I used 3.7 on a project yesterday (refactoring to use a different library). I provided the documentation and examples in the initial context and it re-factored the code correctly. It took the agent about 20 minutes to complete the re-write and it took me about 2 hours to review the changes. It would have taken me the entire day to do the changes manually. The cost was about $10. - It was less successful when I attempted to YOLO the rest of my API credits by giving it a large project (using langchain to create an input device that uses local AI to dictate as if it were a keyboard). Some parts of the codes are correct, the langchain stuff is setup as I would expect. Other parts are simply incorrect and unworkable. It’s assuming that it can bind global hotkeys in Wayland, configuration required editing python files instead of pulling from a configuration file, it created install scripts instead of PKGBUILDs, etcetc. - I liken it to having an eager newbie. It doesn’t know much, makes simple mistakes, but it can handle some busy work provided that it is supervised. - I’m less worried about AI taking my job then my job turning into being a middle-manager for AI teams. - I think the further you get out in to esoteric or new things, the less they have to draw on. I’ve had a bit of the same issue building Lora telemetry on ESP32 with specific radio modules because there might be a couple of realworld examples out there of using those libraries. - I feel this pain. - I’ve been trying to get simple telemetry working over lora on a ESP32-C6, LLMs are largely worthless in this. We gotta fall back to old school RTFM models 
 
 
- deleted by creator 
 
- Shhhh! You’re not supposed to rock the AI hate boat. - I hate the ethics of it, especially the image models. - But frankly it’s here, and lawyers were supposed to have figured out the ethics of it. - I use hosted Deepseek as an FU to OpenAI and GitHub for stealing my code. 
- deleted by creator 
 
- I’ve been trying to use aider for this, it seems really cool but my machine and wallet cannot handle the sheer volume of tokens it consumes. - deleted by creator - Aider is an LLM agent type app that has a programming assistant and an architect assistant. - You tell the architect what you want and it scans the structure of your code base to generate the boilerplate. Then the coder fills it in. It has command prompt access to then compile and run etc. - I haven’t really figured it out yet. - deleted by creator 
 
 
 
 
- Not to be that guy, but the image with all the traintracks might just be doing it’s job perfectly. - The one on the right prints “hello world” to the terminal - And takes 5 seconds to do it 
 
- Engineers love moving parts, known for their reliability and vigor - Vigor killed me 
 
- Might is the important here 
- It gives you the right picture when you asked for a single straight track on the prompt. Now you have to spend 10 hours debugging code and fixing hallucinations of functions that don’t exist on libraries it doesn’t even neet to import. - Not a developer. I just wonder about AI hallucinations come about. Is it the ‘need’ to complete the task requested at the cost of being wrong? - No, it’s just that it doesn’t know if it’s right or wrong. - How “AI” learns is they go through a text - say blog post - and turn it all into numbers. E.g. word “blog” is 5383825526283. Word “post” is 5611004646463. Over huge amount of texts, a pattern is emerging that the second number is almost always following the first number. Basically statistics. And it does that for all the words and word combinations it found - immense amount of text are needed to find all those patterns. (Fun fact: That’s why companies like e.g. OpenAI, which makes ChatGPT need hundreds of millions of dollars to “train the model” - they need enough computer power, storage, memory to read the whole damn internet.) 
 - So now how do the LLMs “understand”? They don’t, it’s just a bunch of numbers and statistics of which word (turned into that number, or “token” to be more precise) follows which other word. 
 - So now. Why do they hallucinate? - How they get your question, how they work, is they turn over all your words in the prompt to numbers again. And then go find in their huge databases, which words are likely to follow your words. - They add in a tiny bit of randomness, they sometimes replace a “closer” match with a synonym or a less likely match, so they even seen real. - They add “weights” so that they would rather pick one phrase over another, or e.g. give some topics very very small likelihoods - think pornography or something. “Tweaking the model”. - But there’s no knowledge as such, mostly it is statistics and dice rolling. - So the hallucination is not “wrong”, it’s just statisticaly likely that the words would follow based on your words. - Did that help? - Thanks. If I understood that correctly doesn’t sound like AI to me. 
 
- Full disclosure - my background is in operations (think IT) not AI research. So some of this might be wrong. - What’s marketed as AI is something called a large language model. This distinction is important because AI implies intelligence - where as a LLM is something else. At a high level LLMs are using something called “tokens” to break apart natural language into elements that a machine can understand, and then recombining those tokens to “create” something new. When a LLM is creating output it does not know what it is saying - it knows what token statistically comes after the token(s) it has generated already. - So to answer your question. An AI can hallucinate because it does not know the answer - its using advanced math to know that the period goes at the end of the sentence. and not in the middle. 
 
 
- While being more complex and costly to maintain - Depends on the usecase. It’s most likely at a trainyard or trainstation. - The image implies that the track on the left meets the use case criteria 
 
 
- That’s the problem. Maybe it is. - Maybe the code the AI wrote works perfectly. Maybe it just looks like how perfectly working code is supposed to look, but doesn’t actually do what it’s supposed to do. - To get to the train tracks on the right, you would normally have dozens of engineers working over probably decades, learning how the old system worked and adding to it. If you’re a new engineer and you have to work on it, you might be able to talk to the people who worked on it before you and find out how their design was supposed to work. There may be notes or designs generated as they worked on it. And so-on. - It might take you months to fully understand the system, but whenever there’s something confusing you can find someone and ask questions like “Where did you…?” and “How does it…?” and “When does this…?” - Now, imagine you work at a railroad and show up to work one day and there’s this whole mess in front of you that was laid down overnight by some magic railroad-laying machine. Along with a certificate the machine printed that says that the design works. You can’t ask the machine any questions about what it did. Or, maybe you can ask questions, but those questions are pretty useless because the machine isn’t designed to remember what it did (although it might lie to you and claim that it remembers what it did). - So, what do you do, just start running trains through those tracks, assured that the machine probably got things right? Or, do you start trying to understand every possible path through those tracks from first principles? 
 
- Im looking forward in the next 2 years when AI apps are in the wild and I get to fix them lol. - As a SR dev, the wheel just keeps turning. - I’m being pretty resistant about AI code Gen. I assume we’re not too far away from “Our software product is a handcrafted bespoke solution to your B2B needs that will enable synergies without exposing your entire database to the open web”. - It has its uses. For templeting and/or getting a small project off the ground its useful. It can get you 90% of the way there. - But the meme is SOOO correct. AI does not understand what it is doing, even with context. The things JR devs are giving me really make me laugh. I legit asked why they were throwing a very old version of react on the front end of a new project and they stated they “just did what chatgpt told them” and that it “works”. Thats just last month or so. - The AI that is out there is all based on old posts and isnt keeping up with new stuff. So you get a lot of the same-ish looking projects that have some very strange/old decisions to get around limitations that no longer exist. - Holdup! You’ve got actual, employed, working, graduated juniors who are handing in code that they don’t even understand? 
- Yeah, I think personally LLMs are fine for like writing a single function, or to rubber duck with for debugging or thinking through some details of your implementation, but I’d never use one to write a whole file or project. They have their uses, and I do occasionally use something like ollama to talk through a problem and get some code snippets as a starting point for something. Trying to do too much more than that is asking for problems though. It makes it way harder to debug because it becomes reading code you haven’t written, it can make the code style inconsistent, and a non-insignifigant amount of the time even in short code segments it will hallucinate a non existent function or implement something incorrectly, so using it to write massive amounts of code makes that way more likely. - The CursoAI debugging is the best experience ever. - It’s so much easier than googling don’t stack trace and then browsing GitHub issues and stack overflow. 
 
- The AI also enabled some very bad practices. - It does not refactor and it makes writing repetitive code so easy you miss opportunities to abstract. In a week when you go to refactor you’re going to spend twice as long on that task. - As long as you know what you’re doing and guide it accordingly, it’s a good tool. 
 
- without exposing your entire database to the open web until well after your payment to us has cleared, so it’s fine. - Lol. 
- Our gluten-free code is handcrafted with all-natural intelligence. 
 
 
- deleted by creator - When did you get diagnosed? - He’s got that ol’ New York City Metropolitan Area Transit Authority Blues again, momma! 
 
- OpenTTD is a good game. 
 
- God, seriously. Recently I was iterating with copilot for like 15 minutes before I realized that it’s complicated code changes could be reduced to an - ifstatement.- AI can’t imagine an image full glass of wine because there are barely any images of that in any dataset out there. AI can’t think, just massage it’s dataset into something vaguely plausible. 
 
- I personally find copilot is very good at rigging up test scripts based on usings and a comment or two. Babysit it closely and tune the first few tests and then it can bang out a full unit test suite for your class which allows me to focus on creative work rather than toil. - It can come up with some total shit in the actual meat and potatoes of the code, but boilerplate stuff like tests it seems pretty spot on. - I believe that, because test scripts tend to involve a lot of very repetitive code, and it’s normally pretty easy to read that code. - Still, I would bet that out of 1000 tests it writes, at least 1 will introduce a subtle logic bug. - Imagine you hired an intern for the summer and asked them to write 1000 tests for your software. The intern doesn’t know the programming language you use, doesn’t understand the project, but is really, really good at Googling stuff. They search online for tests matching what you need, copy what they find and paste it into their editor. They may not understand the programming language you use, but they’ve read the style guide back to front. They make sure their code builds and runs without errors. They are meticulous when it comes to copying over the comments from the tests they find and they make sure the tests are named in a consistent way. Eventually you receive a CL with 1000 tests. You’d like to thank the intern and ask them a few questions, but they’ve already gone back to school without leaving any contact info. - Do you have 1000 reliable tests? 
 
- The key is identifying how to use these tools and when. - Local models like Qwen are a good example of how these can be used, privately, to automate a bunch of repetitive non-determistic tasks. However, they can spot out some crap when used mindlessly. - They are great for skett hing out software ideas though, ie try a 20 prompts for 4 versions, get some ideas and then move over to implementation. 
- And of course the ai put rail signals in the middle. - Chain in, rail out. Always - !Factorio/Create mod reference if anyone is interested !< - Spoiler- Your spoiler didn’t work. 
 
- I think I would more picture planes taking off those railroads when it comes to AI. It tends to hallucinate API calls that don’t exist. if you don’t go check the docs yourself you will have a hard time debugging what went wrong. 
- Give it time, eventually every project looks like the right. - I mean, not quite every project. Some of my projects have been turned off for not being useful enough before they had time to get that bad. Lol. - I suppose you covered that with given time, though. 
 
- It depends. AI can help writing good code. Or it can write bad code. It depends on the developer’s goals. - LLMs can be great for translating pseudo code into real code or creating boiler plate or automating tedious stuff, but ChatGPT is terrible at actual software engineering. - Honestly I just use it for the boilerplate crap. - Fill in that yaml config, write those lua bindings that are just a sequence of lua_pushinteger(L, 1), write the params of my do string kind of stuff. - Saves me a ton of time to think about the actual structure. 
 
- My goal is to write bad code 
- It depends. AI can help writing good code. Or it can write bad code - I’ll give you a hypothetical: a company is to hire someone for coding. They can either hire someone who writes clean code for $20/h, or someone who writes dirty but functioning code using AI for $10/h. What will many companies do? - Many companies chose cheap coders over good coders, even without AI. Companies I heard of have pretty bad code bases, and they don’t use AI for software development. Even my company preferred cheap coders and fast development, and the code base from that time is terrible, because our management didn’t know what good code is and why it’s important. For such companies, AI can make development even faster, and I doubt code quality will suffer. 
 
 
- I don’t understand how build times magically decrease with AI. Or did they mean built? - They mean time to write the code, not compile time. Let’s be honest, the AI will write it in Python or Javascript anyway 
 
- It’s WYSIWYG all over again… 













