[DISCUSS] IBM using LLMs to convert COBOL to Java (techcrunch.com)

submitted 1 year ago* (last edited 1 year ago) by bahmanm@lemmy.ml to c/technology@lemmy.ml

116 comments

It's not the 1st time a language/tool will be lost to the annals of the job market, e.g. VB6 or FoxPro. Previously, though, all such cases happened gradually, giving most people enough time to adapt to the changes.

I wonder what it's going to be like this time, now that the machine, w/ the help of humans of course, can accomplish an otherwise multi-month risky corporate project much faster? What happens to all those COBOL developer jobs?

Pray share your thoughts, esp if you're a COBOL professional and have more context around the implication of this announcement 🙏

[-] simple@lemm.ee 89 points 1 year ago

I have my doubts that this works well. Every LLM we've seen that translates or writes code often makes mistakes and outputs garbage.

[-] Jomn@jlai.lu 64 points 1 year ago

Yes, and among the mistakes, it will probably introduce some hard-to-find bugs/vulnerabilities.

[-] Vlyn@lemmy.zip 16 points 1 year ago

Just ask it to also write tests, duh /s

[-] IHeartBadCode@kbin.social 57 points 1 year ago

This sounds no different than the static analysis tools we’ve had for COBOL for some time now.

The problem isn’t a conversion of what may or may not be complex code, it’s taking the time to prove out a new solution.

I can take any old service program on one of our IBM i machines and convert it out to Java no problem. The issue arises if some other subsystem that relies on that gets stalled out because the activation group is transient and spin up of the JVM is the stalling part.

Now suddenly, I need named activation and that means I need to take lifetimes into account. Static values are now suddenly living between requests when procedures don’t initialize them. And all of that is a great way to start leaking data all over the place. And when you suddenly start putting other people’s phone numbers on 15-year contracts that have serious legal ramifications, legal doesn’t tend to like that.
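To make the lifetime problem concrete, here is a minimal Java sketch of how a static value can carry one request's data into the next. `ContractHandler`, its methods, and the phone number are hypothetical names invented for illustration; nothing here is taken from the system described above.

```java
// Hypothetical handler, invented for illustration only.
class ContractHandler {
    // Static state lives as long as the class stays loaded in the JVM,
    // so every request that reaches this handler shares it.
    private static String phoneNumber = "";

    // Leaky version: the field is only overwritten when the request supplies a value,
    // so a request without a phone number inherits the previous caller's number.
    static String handleLeaky(String suppliedPhone) {
        if (suppliedPhone != null) {
            phoneNumber = suppliedPhone;
        }
        return "contract phone=" + phoneNumber;
    }

    // Isolated version: per-request state is kept in a local variable.
    static String handleIsolated(String suppliedPhone) {
        String phone = (suppliedPhone != null) ? suppliedPhone : "";
        return "contract phone=" + phone;
    }

    public static void main(String[] args) {
        System.out.println(handleLeaky("555-0100")); // contract phone=555-0100
        System.out.println(handleLeaky(null));       // contract phone=555-0100 (leaked)
        System.out.println(handleIsolated(null));    // contract phone=
    }
}
```

The second call is the failure mode: a customer who supplied no number ends up with someone else's on their contract.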

It isn’t enough just to convert COBOL 1:1 to Java. You have to have an understanding of what the program is trying to get done. And just looking at the code isn’t going to make that obvious. Another example: this module locks a data area down because we need this other module to hit an error condition. The restart condition for the module reloads it into a different mode that’s appropriate for the process which sends a message to the guest module to unlock the data area.

Yes, I shit you not. There is a program out there doing critical work where the expected execution path is to cause an error on purpose so that some part of the recovery code gets run. How many of you think an AI is going to pick up that context?

The tools back then were limited and so programmers did all kinds of hacky things to get particular things done. We’ve got tools now to fix that, it's just that so much has already been layered on top of the way things work right now. Pair that with the whole "we cannot buy a second machine to build a new system" thing, and any new program must work 99.999% right out of the gate.

COBOL is just a language, it’s not the biggest problem. The biggest problem is the expectation. These systems run absolutely critical functions that just simply cannot fail. Trying to foray into Java or whatever language means we have to build a system that runs perfectly without 45 years’ worth of testing behind it. It’s just not a realistic expectation.

[-] aksdb@feddit.de 19 points 1 year ago

What pisses me off about many such endeavors is that these companies always want big-bang solutions, which are excessively hard to plan out due to the complexity of these systems. So it's hard to put a financial number on the project, and they typically end up with hundreds of people involved during "planning" just to be sacked before any meaningful progress could be made.

Instead they could simply take the engineers they need for maintenance anyway, and give them the freedom to rework the system in the time they are assigned to the project. Those systems are - in my opinion - basically microservice systems. Thousands of more or less small modules inter-connected by JCL scripts and batch processes. So instead of doing it big bang, you could tackle module by module. The module doesn't care what language the other side is written in, as long as it is still able to work with the same data structure(s).

Pick a module, understand it, write tests if they are missing, and then rewrite it.
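A minimal sketch of what "write tests, then rewrite" can look like: record what the legacy module actually outputs for a set of inputs, then report every input where the rewrite disagrees. The fee rules, sample values, and class name below are made up for illustration and do not describe any real module.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical characterization check, invented for illustration only.
class CharacterizationCheck {
    // Returns every input whose rewritten output differs from the recorded legacy output,
    // mapped to the (wrong) value the rewrite produced.
    static <I, O> Map<I, O> mismatches(Map<I, O> recorded, Function<I, O> rewritten) {
        Map<I, O> bad = new LinkedHashMap<>();
        for (Map.Entry<I, O> sample : recorded.entrySet()) {
            O actual = rewritten.apply(sample.getKey());
            if (!Objects.equals(actual, sample.getValue())) {
                bad.put(sample.getKey(), actual);
            }
        }
        return bad;
    }

    // Assumed legacy behaviour, captured from production: the fee truncates to whole units.
    static Map<Long, Long> recordedSamples() {
        Map<Long, Long> samples = new LinkedHashMap<>();
        samples.put(1000L, 10L);
        samples.put(1050L, 10L);
        samples.put(1099L, 10L);
        return samples;
    }

    // Faithful rewrite: integer division truncates, like the recorded behaviour.
    static long truncatingFee(long cents) {
        return cents / 100;
    }

    // "Tidier" rewrite that rounds instead, silently changing results.
    static long roundingFee(long cents) {
        return Math.round(cents / 100.0);
    }

    public static void main(String[] args) {
        Map<Long, Long> recorded = recordedSamples();
        System.out.println(mismatches(recorded, CharacterizationCheck::truncatingFee)); // {}
        System.out.println(mismatches(recorded, CharacterizationCheck::roundingFee));   // {1050=11, 1099=11}
    }
}
```

The recorded samples stand in for a specification nobody ever wrote down. The check only says whether the rewrite matches the old behaviour, not whether that behaviour was right.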

After some years of doing that, all modules will be in a modern language (Java, Go, Rust, whatever) and you will have test coverage and hopefully even documentation. Then you can start refactoring the architecture.

But I guess that would be too easy and not enterprisy enough.

[-] Crackhappy@lemmy.world 5 points 1 year ago

You just handwaved thousands of processes like it's easy .. lol.

[-] aksdb@feddit.de 9 points 1 year ago

I said it takes years. The point is that you can do it incrementally. But that typically doesn't fit with the way enterprises want things done. They want to know a beginning, a timeline and a price. Since they don't get that, they simply give up.

But it's dumb, since those systems run already and have to keep running. So they need to keep engineers around that know these systems anyway. Since maintenance work likely doesn't take up their time, they could "easily" hit two birds with one stone. The engineers have a fulltime job on the legacy system (keeping them in the loop for when an incident happens without having to pull them out of other projects then and forcing them into a context switch) and you slowly get to a modernized system.

Not doing anything doesn't improve their situation and the system doesn't get any less complex over time.

[-] Kerfuffle@sh.itjust.works 4 points 1 year ago

This sounds no different than the static analysis tools we’ve had for COBOL for some time now.

One difference is people might kind of understand how the static analysis tools we've had for some time now actually work. LLMs are basically a black box. You also can't easily debug/fix a specific problem. The LLM produces wrong code in one particular case, what do you do? You can try fine-tuning with examples of the problem and what the output should be, but there's no guarantee that won't just change other stuff subtly and add a new issue for you to discover at a future time.

[-] eyy@lemm.ee 37 points 1 year ago

Not a COBOL professional, but I know companies that have tried (and failed) to migrate from COBOL to Java because of the enormously high stakes involved (usually financial).

LLMs can speed up the process, but ultimately nobody is going to just say "yes, let's accept all suggested changes the LLM makes". The risk appetite of companies won't change because of LLMs.

[-] Kache@lemm.ee 9 points 1 year ago

Wonder what makes it so difficult. "Cobol to Java" doesn't sound like an impossible task since transpilers exist. Maybe they can't get similar performance characteristics in the auto-transpiled code?

[-] qaz@lemmy.world 16 points 1 year ago* (last edited 1 year ago)

COBOL programs are structured very differently from Java. For example, you can’t just declare a variable; you have to add it to the working storage section at the top of the program.

[-] Kache@lemm.ee 5 points 1 year ago* (last edited 1 year ago)

That example doesn't sound particularly difficult. I'm not saying it'd be trivial, but it should be approximately as difficult as writing a compiler. Seems like the real problem is not a technical one.

[-] eyy@lemm.ee 5 points 1 year ago* (last edited 1 year ago)

It's never been a technical reason. It's the fact that most systems still running on COBOL are live, can't be easily paused, and carry an extremely high risk of enormous consequences for failure. Banks are a great example of this: hundreds of thousands of transactions per hour (or more). You can't easily create a backup, because even while you're backing up, more business logic and more records are being created. You can't just tell people "hey we're shutting off our system for 2 months, come back and get your money later". And if you fuck up during the migration and rectify it within an hour, you would still have caused hundreds or thousands of people to lose some money, and god forbid there was one unlucky SOB who tried to transfer their life savings during that one hour.

And don't forget the testing that needs to be done - you can't even have an undeclared variable that somehow causes an overflow error when a user with a specific attribute deposits a specific amount of money in a specific branch code when Venus and Mars are aligned on a Tuesday.

[-] DefinitelyNotAPhone@hexbear.net 5 points 1 year ago

Translating it isn't the difficult part. It's convincing a board room full of billionaires that they should flip the switch and risk having their entire system go down for a day because somebody missed a bug in the code and then having to explain to some combination of very angry other billionaires and very angry financial regulators why they broke the economy for the day.

[-] halfempty@kbin.social 31 points 1 year ago

That's a lot of effort to go from one horrible programming language to another horrible programming language.

[-] juja@lemmy.world 9 points 1 year ago

What would your language of choice have been? And why is Java horrible for this scenario? It sounds like a reasonably good choice to me.

[-] 4stringscooter@lemmy.ml 29 points 1 year ago

So the fintech companies who rely on that tested (though unliked) lump of iron from IBM running an OS, language, and architecture built to do fast, high-throughput transactional work should trust AI to turn it into Java code to run on hardware and infrastructure of their own choosing without having architected the whole migration from the ground up?

Don't get me wrong, I want to see the world move away from COBOL and ancient big blue hardware, but there are safer ways to do this and the investment cost would likely be worth it.

Can you tell I work in fintech?

[-] FoxBJK@midwest.social 24 points 1 year ago

Converting ancient code to a more modern language seems like a great use for AI, in all honesty. Not a lot of COBOL devs out there, but once it's Java the number of coders available to fix/improve whatever ChatGPT spits out jumps exponentially!

[-] gravitas_deficiency@sh.itjust.works 35 points 1 year ago

The fact that you say that tells me that you don’t know very much about software engineering. This whole thing is a terrible idea, and has the potential to introduce tons of incredibly subtle bugs and security flaws. ML + LLM is not ready to be used for stuff like this at the moment in anything outside of an experimental context. Engineers are generally - and with very good reason - deeply wary of “too much magic” and this stuff falls squarely into that category.

[-] HellAwaits@lemm.ee 8 points 1 year ago

Is ChatGPT magic to people? ChatGPT should never be used in this way because the potential of critical errors is astronomically high. IBM doesn't know what it's doing.

[-] argv_minus_one@beehaw.org 22 points 1 year ago

If even highly skilled humans couldn't do that, artificial pseudointelligence doesn't stand a chance in hell.

There's nothing of substance here. Just suits chasing buzzwords. Nothing will actually happen, just like nothing actually happened every other time some fancy new programming language or methodology came along and tried to replace COBOL, including Java.

[-] duncesplayed@lemmy.one 27 points 1 year ago

This is what I don't get. Rewriting COBOL code into Java code is dead easy. You could teach a junior dev COBOL (assuming this hasn't been banned under the Geneva Convention yet) and have them spitting out Java code in weeks for a lot cheaper.

The problem isn't converting COBOL code to Java code. The problem is converting COBOL code to Java code so that it cannot ever possibly have even the most minute difference or bug under any possible circumstances ever. Even the tiniest tiniest little "oh well that's just a silly little thing" bug could cost billions of dollars in the financial world. That's why you need to pay COBOL experts millions of dollars to manage your COBOL code.
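One small illustration of a "most minute difference": COBOL programs typically keep money in fixed-point decimal fields, and a careless translation to Java `double` changes the arithmetic. The sketch below assumes a hypothetical converter that picked `double`; it does not describe IBM's tool. The class name is invented.

```java
import java.math.BigDecimal;

// Hypothetical example of numeric drift, invented for illustration only.
class DecimalDrift {
    // Naive translation: binary floating point cannot represent 0.10 or 0.20 exactly.
    static double sumAsDouble() {
        return 0.10 + 0.20;
    }

    // Decimal arithmetic keeps exact cents, closer to how a fixed-point field behaves.
    static BigDecimal sumAsDecimal() {
        return new BigDecimal("0.10").add(new BigDecimal("0.20"));
    }

    public static void main(String[] args) {
        System.out.println(sumAsDouble() == 0.30);                          // false
        System.out.println(sumAsDecimal().equals(new BigDecimal("0.30"))); // true
    }
}
```

Both versions look like a correct translation of "add two amounts", and only one of them balances to the cent.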

I don't understand what person looked at this problem and said "You know what never does anything wrong or makes any mistake ever? Generative AI"

[-] Treczoks@lemm.ee 22 points 1 year ago

"all those COBOL developer jobs" nowadays probably fit in one bus. That's why every company that can afford it moves away from COBOL.

[-] ArbitraryValue@sh.itjust.works 19 points 1 year ago* (last edited 1 year ago)

according to a 2022 survey, there’s over 800 billion lines of COBOL in use on production systems, up from an estimated 220 billion in 2017

That doesn't sound right at all. How could the amount of COBOL code in use quadruple at a time when everyone is trying to phase it out?

[-] gravitas_deficiency@sh.itjust.works 18 points 1 year ago

Because it’s not actually getting phased out in reality

[-] ArbitraryValue@sh.itjust.works 7 points 1 year ago

But it isn't getting quadrupled either, if only because there aren't enough COBOL programmers in the world to write that much new code that quickly.

[-] RobotDrZaius@kbin.social 4 points 1 year ago

It doesn’t say unique lines of code.

[-] eyy@lemm.ee 6 points 1 year ago

That doesn’t sound right at all. How could the amount of COBOL code in use quadruple at a time when everyone is trying to phase it out?

Because while they're trying, they need to keep adding business logic to it constantly. Spaghetti code on top of spaghetti code.

[-] kitonthenet@kbin.social 4 points 1 year ago

It could mean anything: the same code used in production in new ways, slightly modified code, newly discovered COBOL where the original language was a mystery, new requirements for old systems. Seriously, it could be too many things for that to be a useful metric with no context.

[-] socsa@lemmy.ml 17 points 1 year ago

What a terrible day to be literate

[-] Pavlichenko_Fan_Club@hexbear.net 15 points 1 year ago

Oh FFS there is nothing magical about COBOL like it's some kind of sword in the stone which only a chosen few can draw. COBOL is simple(-ish), COBOL is verbose. That's why there is so much of it.

The reason you don't see new developers flocking to these mythical high-paying COBOL jobs is it's not about the language, but rather about maintaining these ginormous, mission-critical applications that are basically black boxes due to the loss of institutional knowledge. Very high risk with almost no tangible, immediate reward--so don't touch it. Not something you can just throw a new developer at and hope for the best, the only person who knew this stuff was some guy named "John", and he retired 15 years ago! Etc, etc.

Also this is IBM we're talking about, so purely buzzword-driven development. IBM isn't exactly known for pushing the envelope recently. Plus transpilers have existed as a concept since... Forever basically? Doubt anything more will come from this other than upselling existing IBM contracts who are already replacing COBOL.

[-] Aurenkin@sh.itjust.works 15 points 1 year ago* (last edited 1 year ago)

ChatGPT did an amazing job converting my Neovim config from VimScript to Lua including explaining each part and how it was different. That was a very well scoped piece of code though. I'd be interested to see how an LLM goes on large projects as I imagine that would be a whole different level of complexity. You need to understand a lot more about the components and interactions and be very careful not to change behaviour. Security is another important thing that was already mentioned in this thread and the article itself.

I'd put myself down as doubtful but really interested to see the results nonetheless. I've already been surprised a few times over by these things so who knows.

[-] MargotRobbie@lemm.ee 8 points 1 year ago

Why Java instead of C# or Go though?

[-] quicken@aussie.zone 7 points 1 year ago

Because IBM doesn't want to tie themselves to Google or Microsoft. They already have their own builds of OpenJDK.

[-] loutr@sh.itjust.works 5 points 1 year ago

Because Cobol is mainly used in an enterprise environment, where they most likely already run Java software which interfaces with the old Cobol software. Plus modern Java is a pretty good language, it's not 2005 anymore.

[-] Zeth0s@lemmy.world 6 points 1 year ago

Of all modern languages, why Java? It will likely soon become legacy for backend applications.

[-] datendefekt@lemmy.ml 6 points 1 year ago

Sadly, I haven't been programming for a while, but I did program Java. Why do you consider it legacy and do you see a specific language replacing it?

[-] kitonthenet@kbin.social 4 points 1 year ago* (last edited 1 year ago)

Without a requirements doc stamped in metal you won’t get 1:1 feature replication.

This was kind of a joke but it’s actually very real tbh. The problems that companies have with human devs trying to bring ancient systems into the modern world will all be replicated here. The PM won’t stop trying to add features just because the team doing it is using an LLM, and the team doing it won’t be the team that built it, so they won’t get all the nuances and intricacies right. So you get a strictly worse product, but it’s cheaper (maybe), so it has to balance out against the cost of the loss in quality.

this post was submitted on 23 Aug 2023

221 points (96.6% liked)
