Just a complete shot in the dark, because I realize how niche this is, but does anyone here have experience with machine-porting COBOL to Rust?
I’m not involved in the project wanting to do this, but I was asked if I knew of any such conversion tools. My suggestion was to look for COBOL-to-C and then C-to-Rust tools (because I know those exist), but if anyone here has experience and tips, I’d be happy to forward them.
I also suggested looking into LLMs, but uploading the code to remote servers is a no-go.
I have some vague memory of having seen someone talk about cobol-to-rust conversions on here, but I couldn’t find it, so maybe it was on a different forum.
You could look at GnuCOBOL, which translates COBOL to C and then compiles it from there (its `cobc` compiler can emit the intermediate C sources with the `-C` option). The generated code is unlikely to be something you'd want to maintain directly, though.
That's definitely not a job for LLMs, since they aren't appropriate for programming tasks. Maybe they can translate comments written in another language, if you don't have access to a better translation tool.
You're correct; I also thought about that when I saw your thread. There was this post 2 weeks ago, but no answers:
I've also seen a converter from COBOL to C/C++/C#/Java, though I don't know how good it is (and there's the GnuCOBOL mentioned above). But the only C-to-Rust converter I know of translates to unsafe Rust and relies on manual work for the rest, so I don't think it'll be much help at all. The other converters are AI-driven, so not a solution either.
Perhaps the best option would be to use the translated C as a foreign library, if porting it manually isn't feasible.
I wouldn’t outright dismiss using a local LLM for the job. The results certainly won’t be perfect, which is why you need a very strong test suite to catch any errors the translation introduces. The tests themselves must not be AI-generated, but rather written by human experts. That is a considerable effort, but something that should be done anyway.
I’ve done Matlab → Rust one-shot translation using Gemini. As expected, the Rust code contained errors, but they were rather easy to fix. All in all, I was positively surprised by how good the results were (around 1000 lines of code).
I can only imagine that a direct COBOL to Rust LLM translation will lead to “ugly” Rust code. You may need to refactor a lot of Rust code afterwards.
It would likely take less time to write it from scratch than to review the generated code and rewrite most of it. There have been enough experiences and articles[1] showing that LLMs (even agentic ones) are shallow and deceptive, due to the lack of any proper inference engine, so we should know better than to use AI to write code until new architectures are found. Especially for translating COBOL, which usually covers critical business operations and whose training set is arguably smaller.
Here are two recent ones that explain it better than I can: 1, 2 (others are in a recent thread on that topic). ↩︎
That is a really strange statement. For which use cases could LLMs be a better fit than programming? A lot of computer programming is just some sort of "plumbing", not frontier scientific research work: it requires a large knowledge base and some understanding, and LLMs are quite good at that. Often their most serious problem is that they are still polluted with outdated material, don't know the latest changes in software, and can't really learn from their mistakes. All of that will improve soon.

There was a video by Sabine Hossenfelder some days ago testing LLMs on frontier discoveries in physics, where none of them did too well (GPT-5 was best), and I indeed have serious doubts that the current design of AI will be able to create a superintelligence capable of discovering serious new physics. But I am sure it will be able to do what 99.999... percent of people do today.
For COBOL -- I would assume these legacy programs are large but quite trivial, often covering some form of finance application. LLMs should be very good at converting COBOL to modern languages if they are trained on it (which might indeed be a problem; most COBOL code is not open source, I guess).
They're quite good for linguistics, or even for translation, despite specific components having been removed. Since they have good language abilities and are essentially pattern matchers, they can also help with formulating fuzzy statements more clearly or with summarizing well-known ideas (provided you double-check the results, as always).
It's not a strange statement: just look at how they work and it becomes obvious. Programming can be helped by knowledge, true, to avoid re-inventing the wheel, but the knowledge averaged into their neural network has to be reliable enough and match the problem closely enough. Besides, the most important component remains logical reasoning, which none of those engines have. At best they feature hacks like multiple LLMs iterating on a problem in the hope the result gets better. It does, to some extent (at a huge price), but the main limitation remains.
A local LLM will be even more limited.
I won't repeat the arguments others and I have already made in the other thread, but you can have a look at the two articles I've linked above. They indeed stem from the video you mention. It's not that youtuber's main domain, but the articles illustrate two of the problems with LLMs (if they're accurate; at least one has been through several iterations, so I'm hoping it got sufficient peer review).
Using a small (because local) LLM to translate between two niche languages, extremely different from each other, neither of which it’s likely to have been well trained on, doesn’t sound like a recipe for success to me. Maybe if you train your own finetune (I presume there’s a roughly 0% chance that someone else has already trained such a thing…).
You are right. Note, however, that local does not necessarily mean small: it can be local to the organization, with a large model running on a powerful in-house server.
Don’t get me wrong, I am not saying that an LLM is the best (or only) solution to the OP’s problem. I am just saying it may be worth considering. In the end, as with any tool/technology, there are tradeoffs the OP must evaluate.
I am by no means an LLM advocate. From time to time I experiment with LLMs on different coding scenarios. So far I have mostly come away disappointed with the results (i.e., I’d rather solve the problem myself). The only case where I was genuinely satisfied with the LLM output (i.e., where I believe it saved me a ton of time) is the Matlab→Rust translation.