Sure, and then you can throw another LLM in and make them come to a consensus. Of course, that consensus could be wrong too, so have another three do the same thing and compare, and then…
I have an ongoing and endless debate with a PhD who insists that consensus among multiple LLMs is a valid proof check. The guy is a neuroscientist, not a developer or tech head at all, and he's just stubborn, continually projecting a sentient-being perspective onto his LLM usage.
This, but unironically. It's not much different from how human unreliability is accounted for: add more reviewers until you're satisfied that a suitable fraction of mistakes will be caught.
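To put rough numbers on that, here's a back-of-the-envelope sketch in Python. It assumes each checker independently misses a given error with probability p; the 30% figure is made up, and the independence assumption is exactly the point of contention above, since LLMs trained on similar data tend to make correlated mistakes.

    # If each checker independently misses a given error with probability p,
    # the chance that all n checkers miss it is p**n. The numbers are
    # illustrative only; correlated errors break the independence assumption.

    def miss_probability(p: float, n: int) -> float:
        """Probability an error slips past all n checkers, assuming independence."""
        return p ** n

    for n in (1, 2, 3, 5):
        print(f"{n} checker(s), each missing 30% of errors -> "
              f"{miss_probability(0.3, n):.4%} of errors slip through")

Under those (generous) assumptions, three checkers let about 2.7% of errors through and five let about 0.24% through, which is the "add more until satisfied" logic; whether LLM errors are anywhere near independent is the real question.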