Why has nobody made a decompiler that takes machine code and transforms it into a human-readable Visual Studio source code project?
I'm sure the AI could guess a good name for everything inside the source code. And organize everything into a good architecture.
Decompiling is a niche practice that mostly only hobbyists and the occasional unfortunate legacy-code support guru concern themselves with. While such a tool would certainly be useful to them, you'll never get production-quality source code out of it, so it wouldn't usher in a new age of reverse engineers.
Like how it was stated that an AI could never beat a man at chess? I think OP is right on the money that an AI could make an excellent decompiler. Saying it wouldn't be valuable is flippant and just wrong. Multinational corps and governments are working on this in secret as we speak.
Absolutely. Imagine when AI gets to the point it can spit out any program you want, completely finished. It got to that point with images and it will get to that point with software too. It's just a matter of time.
Except when you don't even know what you want or don't want, which is the case with most non-programmers. Then the AI prompters will just be using the AI as a compiler for 'natural' language instead of a programming language tailored for the purpose.
Not even remotely the same.
Image generation is only asking an AI to make something that looks vaguely like what you described, usually with mangled fingers.
Decompilation requires the AI to generate an extremely precise output that must be logically equivalent to the binary.
>logically equivalent to the binary.
yes this is the difficult part, computers are notoriously bad at both logic and determining equivalency
By that logic, P=NP, because it's all just logic anyways, right?
>What is the halting problem
Do you even know how a decompiler works? They don't trace one branch in the code or the other; they trace both branches, exactly once. The binary would have to be infinite in size for the halting problem to be relevant.
Fuck. When you led me to reply with this I got the sense this point is fundamentally important, but I'm not sure how. Decompiling finite-sized binaries will always halt. That might weakly sidestep the issue of whether an arbitrary program will halt or not. Still not decidable, but maybe it doesn't need to be decided in some situations?
Missed the point.
The halting problem is a direct refutation of the argument that computers can solve any logical problem.
At any rate:
>Do you even know how a decompiler works?
>They don't trace one branch in code or the other they trace both branches
Depends on whether it's a recursive descent or a linear sweep type.
Linear sweep goes through the code from the first address to the last in one pass, which is simpler, but it usually misses a lot of details when there's no DWARF symbol table to help it.
Recursive descent is better at finding all the code, but it can and does struggle with code that has particularly knotted or messy control flow.
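The difference is easy to sketch on a toy instruction set (everything below is hypothetical, purely to illustrate the two strategies, not any real ISA):

```javascript
// Toy 2-byte instruction set (hypothetical):
// [0x01, _] = nop, [0x02, t] = jmp t (absolute byte offset), [0x03, _] = ret.
const NAMES = { 0x01: 'nop', 0x02: 'jmp', 0x03: 'ret' };

// Linear sweep: decode every 2-byte slot front to back in one pass.
// Simple, but it happily "decodes" embedded data as instructions.
function linearSweep(code) {
  const out = [];
  for (let pc = 0; pc + 1 < code.length; pc += 2) {
    out.push([pc, NAMES[code[pc]] ?? 'db ??']);
  }
  return out;
}

// Recursive descent: start at the entry point and follow control flow,
// decoding each reachable instruction exactly once.
function recursiveDescent(code, entry = 0) {
  const out = new Map();
  const work = [entry];
  while (work.length) {
    const pc = work.pop();
    if (pc < 0 || pc + 1 >= code.length || out.has(pc)) continue;
    const op = code[pc];
    out.set(pc, NAMES[op] ?? 'db ??');
    if (op === 0x02) work.push(code[pc + 1]); // follow the jump target
    else if (op !== 0x03) work.push(pc + 2);  // fall through (ret ends flow)
  }
  return out;
}

// jmp 4 skips over two embedded data bytes (0xde, 0xad) to nop; ret.
const code = [0x02, 0x04, 0xde, 0xad, 0x01, 0x00, 0x03, 0x00];
// linearSweep misreads offset 2 as an instruction; recursiveDescent skips it.
```

Real recursive-descent disassemblers hit trouble exactly where this toy one would: indirect jumps whose targets can't be read straight off the instruction.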
>The halting problem is a direct refutation of the argument that computers can solve any logical problem.
In general and for illustrative purposes, yeah. Wanting to know whether a particular program will halt is a more practical question. But maybe knowing that it can halt is still useful?
"Beating a human at a game of looking N moves into the future" is not the same as literally reversing entropy. AI can't decompile machine code into production-quality code because it's literally mathematically impossible. You're seriously asking AI to turn lead into gold here.
Disassemblers from machine code to assembly language already exist and can produce byte-for-byte identical output when reassembled.
it's a one-to-one mapping
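That lookup flavor is easy to sketch (a toy table of a few real single-byte x86 opcodes; a real disassembler's tables are vastly bigger, and multi-byte instructions muddy the one-to-one claim):

```javascript
// For single-byte opcodes, decoding is a direct table lookup, and the
// reverse table turns mnemonics straight back into the original bytes.
const DECODE = { 0x90: 'nop', 0xc3: 'ret', 0xf4: 'hlt' };
const ENCODE = Object.fromEntries(
  Object.entries(DECODE).map(([byte, name]) => [name, Number(byte)])
);

const disassemble = bytes => bytes.map(b => DECODE[b]);
const assemble = mnemonics => mnemonics.map(m => ENCODE[m]);

// Round-tripping recovers the original bytes exactly:
const original = [0x90, 0x90, 0xc3];
const roundTrip = assemble(disassemble(original)); // [0x90, 0x90, 0xc3]
```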
>because it's literally mathematically impossible
You're retarded. It's a deterministic process of finitely many pieces. How is such a process "mathematically impossible" to reverse?
I agree. That anon is coping massively
It's a many-to-one relationship. It's unrecoverable.
Decompiling machine code into "code which would compile into that machine code" exists. Giving the variables meaningful names is not a "byte-for-byte" process; it's a generative process, and recovering the originals is by any measure impossible.
It doesn't need to be identical to the original source code, comments and all for it to be useful. Just high level language code that produces the same output.
If it were impossible for the exact meaning of the compiled output to be decided, it could not be executed by a computer.
There is no single one-to-one mapping of a compiled binary to source code. For a simple example, suppose there is a binary blob containing an image; if you know the stride, pixel format, start, and end of the image in the blob, it's possible to extract the image, but without that, you're reduced to scanning through all possible combinations searching for an image. Even if you find something that looks like an image, there's fundamentally no way to know if it's the "correct" one. This is much worse for a machine-code-to-source-code conversion.
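A quick sketch of the stride problem (hypothetical 1-byte-per-pixel blob, just to show the ambiguity):

```javascript
// The same 12 bytes decode to completely different "images" depending on
// which row stride you guess; nothing in the bytes says which is right.
const blob = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12];

// Read pixel (x, y) under an assumed row stride in bytes.
const pixel = (buf, stride, x, y) => buf[y * stride + x];

// Same blob, same coordinate, two stride guesses, two different pixels:
const asRowsOf4 = pixel(blob, 4, 0, 1); // 5 (rows of 4 bytes)
const asRowsOf3 = pixel(blob, 3, 0, 1); // 4 (rows of 3 bytes)
```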
True. An AI would have to make guesses based on heuristics, the same as people would when doing the same task. The only difference would be the AI could make guesses much quicker.
Not to mention, there's no one-to-one mapping of source code to binary either, otherwise compiler flags wouldn't be a thing.
Also, binaries usually have a few weird sections of code that break the conventions and idioms used elsewhere in the code, usually as a result of some hand written assembly that got pulled in through a library, or a .dll/.so that was generated by a different compiler than the rest of the program.
The compiler flags are effectively just part of the source code. So could be predicted by an AI just the same.
Way to miss the point.
The point is that source code to binary is a many-to-many mapping, and that makes formal verification that any two source codes/binaries are equivalent a very difficult task.
If the output binaries are the same byte for byte, then the source codes used to make them are equivalent.
Good luck getting byte-identical binaries.
That level of reproducibility almost always requires using the same compiler, the same exact version, and the same exact flags.
Most binaries have that information readily available by just looking at its contents in a hex editor.
plus, it doesn't take much to run all the compilers on the source code in parallel
"Most". Maybe PC software does, but I can assure you there is shitloads of software/firmware that does not.
You sure about that?
I'm seeing 200+ release versions of GCC, 77 release versions of LLVM, and who knows how many releases of MSVC and Intel C++ Compiler, not to mention less popular compilers like TCC or once popular compilers like Borland.
Then you have to multiply that by the number of flags available for those compilers.
Basically everything compiled with gcc is freetard shit with source code already available
Those are the three that are actually important, though ICC is probably much less important than MSVC and Clang.
TCC and Borland have compiled a lot of legacy software. There are many pieces of software that are too good to be replaced, even though the company went bust and all the programmers are senile.
Borland maybe, but TCC is just a toy compiler that no one uses for serious projects.
For obsolete software it will be cheaper to buy the source code from the company than to develop an AI decompiler.
I work at a place that uses some DOS software and we don't even know how to contact the people who wrote it. Currently we have to buy special motherboards designed for old PC software
If there are 10000 potential toolchains, there will still be heuristics that can get you to a manageable set to test against. I know LLVM being used will be identifiable simply by looking at how registers are allocated.
Also the fact that LLVM never emits an "add" instruction with a constant, only a "sub" instruction with a negative constant. It's a bit silly really.
>ecx = ecx + 1
>sub ecx, 0xffffffff
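Which works out, because in 32-bit two's complement 0xffffffff is -1, so subtracting it is the same as adding 1. A quick check using JS's unsigned 32-bit coercion:

```javascript
// Emulate 32-bit register arithmetic with >>> 0 (coerce to uint32).
const sub32 = (a, b) => (a - b) >>> 0;
const add32 = (a, b) => (a + b) >>> 0;

const incremented = sub32(41, 0xffffffff);     // 42, same as add32(41, 1)
const wrapped = sub32(0xffffffff, 0xffffffff); // 0, matching add32 overflow
```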
I think him and you misunderstood the implications. Translation from binary to a source code would imply an algorithm that understand exactly what a program does.
This allows the algorithm to know exactly if the program halts, and that lead to a contraddiction
>This allows the algorithm to know exactly if the program halts
No, knowing what a program does doesn't tell you enough to know if it will halt.
Knowing what a program does would encompass also knowing if it halts.
If you don't know if the program halts, you don't fully know what the program does.
You can't tell if a program halts just by looking at source code either.
Yes you can. You can't do it algorithmically
Think of it that way and you can't know what any program that depends on user input does, even if you wrote it. A program you might write that halts when the user presses 'q' on the keyboard might never halt if the user never presses that key. Would it then be fair to say you don't know what the program you wrote does?
> A program you might write that halts when the user presses q
Why such thing should happen? Did you take any course on computability theory?
And yes, a computer can't know what your program does for the same reason of the halting problem
There's that coping again. Flip flopping between the concrete and abstract won't change that this is possible and will soon be carried out, if it hasn't been already. By the way, your grammar is interesting. What's your native language?
>Theory be damned, just throw more machine learning and compute cycles at it!
Lol, classic pop-sci AI-tard cope.
Avoiding that question huh? Russian is it?
There more than one person who thinks you are a retard anon
But we are right to think so. It's CS101
No you aren't. You aren't even following the discussion.
You don't seem to be either.
Every time you're proven wrong you fall back to an already disproven argument.
The halting problem doesn't state that you never tell when a program halts, it just states that there's no algorithmic solution to determine if any arbitrary program halts.
Therefore, a proof by contradiction doesn't apply here.
Looking at it another way, one could write out the formal logic describing the conditions under which the program halts, which fully describes the halting behavior of the program.
Solving the halting problem in this case is 'simply' a matter of solving the boolean satisfiability problem, which is a NP-complete problem.
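As a sketch of why that blows up, here's the naive equivalence check; it has to walk all 2^n assignments:

```javascript
// Brute-force check that two boolean formulas agree on every assignment.
// n variables means 2^n cases: fine at n = 2, hopeless at n = 64.
function equivalent(f, g, n) {
  for (let bits = 0; bits < (1 << n); bits++) {
    // Unpack the counter's bits into an array of n booleans.
    const v = Array.from({ length: n }, (_, i) => Boolean(bits & (1 << i)));
    if (f(v) !== g(v)) return false; // found a distinguishing assignment
  }
  return true;
}

// Two different-looking but equivalent formulas (De Morgan's law):
const f = ([a, b]) => !(a && b);
const g = ([a, b]) => !a || !b;
// equivalent(f, g, 2) is true after checking all 4 assignments
```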
I don't dispute the existence of the halting problem. I'm just saying an AI producing source code from an output binary isn't equivalent to it.
It's pretty close when you consider the fact that you need to make sure that both are logically equivalent.
Hence the start of this stupid debate:
>yes this is the difficult part, computers are notoriously bad at both logic and determining equivalency
Which was then refuted by contradiction via the Halting Problem, a logical problem which cannot be solved by computers.
It might take a long time, but there are a finite number of toolchains to check, with likely only a small number of potential candidates. Iterating sequentially over a list, you'll definitely come to the end of it.
That assumes you have identical source code (excluding variable names or whitespace).
Besides, I've seen disassembly projects by people that produce byte-for-byte identical output. A program that outputs the target bytes from an array into a file will do so even if the compiler isn't the same.
>A program that outputs the target bytes from an array into a file will do so even if the compiler isn't the same.
And you don't need AI for such trivial cases.
The proof is in the complex examples, which is where formal equivalence is most important.
Also, I'm not sure why you're so hung up on byte-equivalence.
I'm talking about the broader problem of just checking if two different source codes/binaries are logically equivalent, since it's much easier to make two logically equivalent programs than it is to try and force a compiler to spit out the same binary again, given that source code to binaries is a many-to-many mapping.
>source code to binaries is a many-to-many mapping.
Yes. So there's many potential source codes that produce the same output.
If a machine cannot decide whether two pieces of code are semantically the same, you also can't build an algorithm that can produce source code for any binary
In general that's true, but if they are byte for byte the same they must behave identically.
Fuck sake, we've been over this: getting byte-identical outputs is the exception for decompilation projects, not the norm.
But it is always possible with due care. More broadly, do you think a set of bytes exists that a copy cannot be made of? It is fundamentally equivalent to that.
But it's still impossible to guarantee that for any binary
Please open a cs book
Two copies of any series of bytes are the same, for any series of bytes. It is you who is retarded.
But it's almost impossible to get two identical binaries from every different but semantically equivalent code, you cretin
An AI that could do this would become a compiler in its own right anyway, translating from compiled code to source code rather than the other way around. Ever heard of Turing completeness?
At this point you must be trolling I refuse to believe you're so clueless and stubborn
I'm not trolling. But you seem clueless and stubborn to me. So let's just agree to differ.
>clueless and stubborn.
That would be you, refusing to understand even basic theory of computer science.
What exactly do you believe I don't understand?
The halting problem?
All of it, clearly, because you're not following the debate and are fixating on one tiny facet of equivalence.
Having identical binaries is the ideal case, but not the most probable one.
Full-on formal verification that two programs are equivalent basically falls under the domain of Satisfiability Modulo Theories, which are NP-hard.
Comparing just two different but equivalent boolean equations is the SAT problem as mentioned earlier, which is NP-complete, and even a boolean equation as small as 64 variables can take a consumer computer a considerable amount of time to verify, never mind an entire program.
There was a research paper that just came out on Ghidra's P-code and how it doesn't completely cover the semantics of instructions, which is currently impeding Ghidra's ability to do tasks as simple as symbolically interpreting the original binary.
Keep in mind, this isn't even getting into decompilation, this is still at the level of interpreting individual instructions in the original binary, and even that is fraught with problems of maintaining semantic equivalence.
>Full-on formal verification that two programs are equivalent basically falls under the domain of Satisfiability Modulo Theories, which are NP-hard.
So what? You don't even need to do that shit at all if you restrict yourself to where the output is identical. We wouldn't be looking to find the infinite set of all equivalent programs. At least I wouldn't. You can boil the ocean for the rest of eternity if you want to.
>restrict yourself to where the output is identical.
And how do you propose to do that?
Multiple anons have chimed in at this point to say how difficult it is to get an identical binary out.
They're wrong. Want a binary with 0x01, 0x02, 0x03 in it?
const fs = require('fs');
// first byte, then the second and third
fs.writeFileSync('out.bin', Buffer.from([0x01, 0x02, 0x03]));
The comments can be whatever meaning gets ascribed to them, from a human or an AI.
The code can be in some other programming language, substitutable at your leisure, and the bytes don't need to be listed in order.
Congrats, you made a trivial example. One example isn't a proof for the general case of any arbitrary binary.
But also, that code actually probably won't compile to a byte-identical binary if you change compilers or flags, so it's not even a good choice of example for you.
Going from machine code to assembly is like translating from binary to ASCII.
It's semantically the exact same information.
On the other hand, translating from English to French is very difficult without subtly changing the meaning.
For instance, here's my post translated by Google translate from English to French and back again.
>Congratulations, you have made a trivial example. An example is not a proof for the general case of an arbitrary binary. But also, this code probably won't be compiled into a byte-identical binary if you change compilers or flags, so it's not even a good example choice for you.
>Moving from machine code to assembly is like translating binary into ASCII. It is semantically exactly the same information. On the other hand, translating from English to French is very difficult without subtly changing its meaning. For example, here is my article translated by google translate from English to French and vice versa.
>You will notice that the text is not the same
You'll notice the text is not the same.
meanwhile, in the real world outside of the ivory tower:
Made by gamers of all people, with human readable labels and variable names to boot. Not even an AI used here. Maybe get chatGPT to explain it to you in your language of choice.
That's a disassembly, not a decompilation.
Disassembly is orders of magnitude simpler than what we're discussing.
so you don't consider it possible to translate from English to French with a computer? The millions of people using Google Translate every day don't care that you think so.
How about C64 BASIC converted into C then?
Ghidra P-code. Jeez...
Doesn't emulate a 6502 and therefore doesn't support the USR command.
You probably wouldn't be able to run most C64 BASIC programs that use PEEK and POKE to directly manipulate the C64 either.
If you were using it as a scripting language, you wouldn't give a shit. Only pedantic twats would.
Anon, this whole fucking thread has been one big debate about formal equivalence.
Think before you post.
Also, I searched through the whole code base and found lots of commands appear to be missing. No ABS, LOG, SGN, DIM, and probably more I haven't checked.
It's not the same any more. The authors stripped stuff out and added other stuff to repurpose it. It's neat though huh?
It's definitely neat! But a bad choice of example for this debate.
(Likewise, I'm confused why the author claims "100% compatibility", when it's clearly not. I admit USR was a nitpick, but the lack of half the math functions is weird.)
Well, I'm coming from a security research point of view, since that's one of the few real use cases for decompilation.
And in that case, you need it to be exactingly accurate.
>the lack of half the math functions is weird
That is weird. Maybe they aren't needed for messing about with text files? I linked to this since it isn't merely a reimplementation but a machine conversion of the original ROM's code to a high-level language (which has since been modified), which some here seem to have claimed is impossible.
Actually, thinking about this some more, the math functions weren't part of the C64 BASIC ROM. They were in the C64 KERNAL ROM. Long story.
>which some here seem to have claimed is impossible.
No one claimed that.
The arguments thus far have been that:
1. AI is ill-suited for this task
2. Algorithmic decompilation is unlikely to create a byte-identical binary once recompiled
3. Even algorithmic decompilation needs to be very careful to ensure it doesn't change the semantics of the program.
mist64/cbmbasic fails to disprove any of these arguments since it:
1. Is not decompiled by an AI
2. Does not produce a byte-identical binary
3. Is not semantically equivalent to CBM Basic on a real C64.
yes, of course. But I'm saying that only opinions have been offered around the first point, one way or the other. And the third point may not matter depending on your needs. No two things can ever be totally equivalent to arbitrary precision anyway, similar to the Ship of Theseus. Everything can only ever be a sufficient approximation at some level.
Some want it to be about proper formal equivalence. I suppose it depends on how formal you need that equivalence to be.
>So there's many potential source codes that produce the same output.
Only because there are multiple compilers each with multiple compilation options.
To reproduce the binary you need the same compiler, with the same flags, with the same source.
It's an unbounded n^3 search space at the very minimum.
for example, the ADD instruction always adds, so you have more context available. And you don't need to use the same compiler or even the same language as the original used. Only the output needs to be the same. If it wasn't the same output the program might still behave the same, but that is indeed a hard thing to determine.
>And you don't need to use the same compiler or even the same language as the original used
And theoretically my dryer could fold all my clothes.
The chances of getting any language other than assembly to match up with a binary generated by a different language are near zero.
Even assembly wouldn't guarantee it, since lots of instructions have multiple possible machine code equivalents (a fact that is sometimes used to watermark binaries).
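One concrete case from the standard x86 encoding tables: `mov eax, ebx` has two byte-level encodings, because x86 defines both a "mov r/m32, r32" (opcode 0x89) and a "mov r32, r/m32" (opcode 0x8b) form:

```javascript
// Two distinct x86 encodings of the same instruction, mov eax, ebx:
const encodingA = [0x89, 0xd8]; // opcode 0x89: mov r/m32, r32 direction
const encodingB = [0x8b, 0xc3]; // opcode 0x8b: mov r32, r/m32 direction

// Semantically identical; byte-for-byte different. An assembler may emit
// either, and a watermarker can hide one bit per such instruction in the choice.
const sameBytes = encodingA.every((b, i) => b === encodingB[i]);
```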
>But it is always possible with due care
Yes, but you're talking about extreme attention to detail, often using compiler directives to force the output to be the same, like forcing variables to be allocated to a specific address.
>More broadly, do you think a set of bytes exists that a copy cannot be made of.
No, that seems to be the straw man you've made of my argument though.
My whole point is that going back and forth across a many-to-many mapping is statistically unlikely to produce a byte-identical equivalent, especially when you're throwing a retarded AI at the problem.
> Extremely unlikely
I think it's mathematically impossible. Take two different programs that approximate a known distribution with a particle filtering algorithm.
Let's say one occupies N bytes. If you add a sufficient number of extra random-sample evaluations that translate to a binary exceeding N, you have two semantically equivalent programs that cannot have the same
*have the same binary. You can grow N to infinity, and thus you have a finite number of byte-equivalent, semantically equivalent programs and an infinite number of byte-different, semantically equivalent programs.
The byte different, semantically equivalent programs would certainly be very hard to identify in general.
I'd agree it's practically impossible. That's the reason for wanting to restrict to byte-equivalent output.
> It might take a long time
It's not a matter of time, it's just impossible
I divided a certain number X by 5001 and the remainder is 602.
What was the X I used for the aforementioned operation? The process was purely deterministic, and consisted of a finite number of data pieces and operations.
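The point being that the remainder is a many-to-one map: every X of the form 602 + 5001·k produces it, so the original input is unrecoverable from the output alone. Quick check:

```javascript
// Infinitely many X collapse to the same remainder: 602 + 5001*k for any k.
const candidates = [0, 1, 2, 1000].map(k => 602 + 5001 * k);
const allMatch = candidates.every(x => x % 5001 === 602);
// allMatch is true, yet nothing picks out which X was "the" input.
```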
What? Yes we can. People have done it. You read the disassembly, internalize the theory, and reimplement it.
I don't doubt AI assist might become a thing, but as impressive as current projects like ChatGPT are, they're miles from being able to produce decent decompilation results.
Decompiling software requires very fine attention to detail to accurately reproduce the complex interactions between instructions.
Current AI approaches would confidently give you something that looks correct, but is actually all bullshit, or mirrors the general idea of the code but fails to capture all the precise details that are so important to a successful decompilation.
It's not like chess where the problem space is limited to a 8x8 board with less than 32 pieces on it.
Decompiling is completely different from creating code from a description.
Because decompiling can be easily tested by just compiling the code to check if the outcome is right or wrong.
There are infinitely many ways for the code to be wrong, and only a small number of ways that are correct.
Also, formal verification that the binary produced from the decompiled code is equivalent to the original binary is no small task either.
The decompiler could be made in the traditional way.
Then the AI would just organize stuff and give everything human-recognizable names.
The next version of IDA pro will probably have the feature.
even if it didn't produce source code, just naming functions and variables would make life a lot easier; guessing what variables are supposed to represent is often a lot harder than following along with assembly code
I think OP had a great idea and you are the typical basement chud that never achieves anything and always complains about others, aka a loser
If OP makes a tool like that, I'm sure people will use it and he'll land a great job
The usefulness of it for programmers would surge through the roof - but it'd be a legal nightmare. There's a conflict between the primary and wealthiest demographic for such a product and who can actually make it.
If this is true it will utterly destroy freetards and freetard software. Imagine if they try to shill their "open source" garbage and you just tell them
>uhh but everything is open source if you just open it in (A)IDA!
How would it be a bad thing for free software? It's one thing to recommend against pirated windows, it's another one to use a version without the spyware
well that's not actually what open source means so i doubt they would care
it all adds up
If you have a model that has a ton of source > compile > machine code in its training then that would be a start...
But considering how the models work getting to what you're thinking of is going to take some effort since they love to just spit out bullshit by default that looks good.
Actually I have an idea for a way to test this out right now. A unique situation that isn't useful at all to what you're saying with VSCode and shit, but something it should have in its training that's similar to my first paragraph... it won't be useful in this form but I'll go see how viable it might be with chatGPT in its current shitty state.
I'll report back. It's not going to be interesting though and there will be nothing specific to show. But I'll see if it's possible to go from bytecode back to a function(actually just saying that I can picture how it won't even be able to get the function name right but whatever...).
Yeah this somewhat unsurprisingly didn't work, but I kind of shot for the stars with what would have been really impressive if it had worked.
This form of free chatGPT is dogshit as everyone knows and I don't have API access at the moment anyway to really try to stick enough forks into it to get it doing something super basic with bytecode to work up from there...
This is one of those things that seems like a fool's errand though, tbh, if you're seriously gunning for a productive result... but for a challenge, just to get some really basic shit going reliably... maybe
>thinking AI-worshipping gays know the first thing about code theory
My mistake. I'll leave you to your NFT mine.
There's a neat idea, and it's checkable, in that you can compare the bytecode the compiler generates from the AI-generated source code against the original until it gets it right.
I'm sure that someone has already done this.
If not, several teams across the world are working on it now.
Start saving binaries from things you want to recompile in the future because there will obviously be obfuscation added in the future to try to stop freedom
If the code needs to still be executable, which it would, any obfuscation would be useless.
See Denuvo being hacked recently in the news for a case in point. That didn't even need an AI to be done.
>If the code needs to still be executable, which it would, any obfuscation would be useless
elaborate? are you saying obfuscated code can't execute?
No, if it can execute then a suitable AI wouldn't be confused by any obfuscation.
well idk, i imagine what OP is talking about is you would compile a program then train the AI on the binary
eventually it learns how the specific compiler generates binaries. But it's not gonna learn how to unfuck obfuscation that's applied after compilation.
I mean, most obfuscation is a lossy process: you literally lose information about the original control flow, and an AI would never be able to undo that. So obfuscation would still be effective.
You lose information when you scale an image to a smaller size but AI can still upscale it decently.
Have you ever reversed a binary?
It doesn't take much to make decompilation a real headache.
Self-modifying code, non-standard calling conventions, call indirection, on-the-fly decryption, and embedded custom virtual machines can all dramatically increase the amount of work that needs to be done, while also increasing the number of opportunities for introducing subtle errors.
Sure, dynamic analysis helps, but that's not always possible, or desirable.
Yeah, I have. It still requires manual creative leaps at present, but nothing in principle that couldn't be brute forced by an AI.
that would also be a goal for the AI to solve
if AI can translate languages, then it can turn assembly into source code. it's that simple.
we already have static decompilers that can guess functions based on machine-code patterns. Sure, you could get an AI to do this too. The trouble is actually connecting everything so that it works. A single wrong statement makes the whole source code not work.
It's like how you can ask an AI right now to generate every single individual snippet of code that happens in a whole program, and it can do that no problem. But ask it to take all those snippets and actually connect them into a program? It can't do that, because that would require the AI to actually understand what code is, which it doesn't, at least language models don't.
You will never have a perfect decompiler since compilation is inherently a lossy process.
If your CPU is broken, the program still won't behave the same as it hypothetically otherwise would. But that's metaphysics territory, and nonsense besides.
>Why has nobody made a decompiler that takes machine code and transforms it into a human-readable Visual Studio source code project?
Because your machine code is just a binary code.
And everyone who knows the binary system can convert it into numbers -> signs.
The garden gnomes did this thousands of years ago in their Bible (Talmud Gematria).
>The garden gnomes did this thousands of years ago in their Bible (Talmud Gematria).
I lolled at this comment. Thanks for the change of pace.
I'm told pop music today just copies what the last generation had as well 🙂
Okay, okay, AI-bro, let's assume that you have an NN that does the decompilation, and you get the result. Then you put that result into a compiler to get the executable code again, which you compare to the original. BUT, what will you tell the user if they're not equal byte for byte? "This program cannot be decompiled"?
Hmm. That's a tough one. Perhaps better UX and expectation management, so the user isn't expecting miracles in the first place? The tool could present its result as a starting point for their workflow, for example; that's how existing decompilers sell themselves. At least they'd be getting better labels than sub_54EB:
Okay, so how can you guarantee that your NN will not spew bullshit?
Honestly, even when training an NN on IDA data, you can't assure me it won't spew something different from IDA's output
You wouldn't be training it on raw IDA Pro output. You'd be training it on the end result, with proper label names after human intervention. Or better yet, on decompiled code that has the DWARF debugging symbols with it. You'd also hook it into the decompiler so it would only be used to provide symbol names.
MUH COPYRIGHTS OY VEY
Security researchers need to know this stuff. This AI would be additional assistance to them, to make them faster.
The decompiled source could be automatically annotated with plain-English comments and uploaded to M$ GitHub, so that another AI called Copilot will obfuscate its origins and make it legal
Traditional computer software tools resemble the standard mathematical concept of a function f: X → Y: given an input x in the domain X, it reliably returns a single output f(x) in the range Y that depends on x in a deterministic fashion, but is undefined or gives nonsense if fed an input outside of the domain. For instance, the LaTeX compiler in my editor will take my LaTeX code, and - provided that it is correctly formatted, all relevant packages and updates have been installed, etc. - return a perfect PDF version of that LaTeX every time, with no unpredictable variation. On the other hand, if one tries to compile some LaTeX with a misplaced brace or other formatting problem, then the output can range from compilation errors to a horribly mangled PDF, but such results are often obvious to detect (though not always to fix).
AI tools, on the other hand, resemble a probability kernel μ: X → Pr(Y) instead of a classical function: an input x now gives a random output sampled from a probability distribution μₓ that is somewhat concentrated around the perfect result f(x), but with some stochastic deviation and inaccuracy. In many cases the inaccuracy is subtle; the random output superficially resembles f(x) until inspected more closely. On the other hand, such tools can handle noisy or badly formatted inputs x much more gracefully than a traditional software tool.
Because of this, it seems to me that the way AI tools would be incorporated into one's workflow would be quite different from what one is accustomed to with traditional tools. An AI LaTeX to PDF compiler, for instance, would be useful, but not in a "click once and forget" fashion; it would have to be used more interactively.
There are research projects that do this. Look at recent years of USENIX security, NDSS, etc.