According to researchers from Johannes Gutenberg University Mainz and University College London, OpenAI’s ChatGPT chatbot is very good at fixing software bugs.
“We find that ChatGPT’s bug fixing performance is competitive to the common deep learning approaches CoCoNut and Codex and notably better than the results reported for the standard program repair approaches,” the researchers write in a new arXiv paper, first spotted by New Scientist.
The researchers evaluated ChatGPT on the QuixBugs bug-fixing benchmark. The automated program repair (APR) systems appear to be at a disadvantage because they were developed prior to 2018. ChatGPT was tested on QuixBugs’ 40 Python-only problems, and the researchers manually checked whether each suggested solution was correct.
ChatGPT fixed 19 of the 40 Python bugs, on par with CoCoNut (19) and Codex (21). Standard APR methods, however, fixed only seven. With follow-up interactions, the researchers found that ChatGPT’s success rate rose to 77.5%.
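To put the raw counts and the 77.5% figure on the same scale, the short Python sketch below converts each count into a success rate over the 40 problems and works out how many fixes 77.5% corresponds to. It uses the counts reported in the paper (ChatGPT 19, CoCoNut 19, Codex 21, standard APR 7). The names `fixed`, `success_rate`, and `followup_fixed` are illustrative only and are not taken from the paper.

```python
# Illustrative arithmetic only: counts out of the 40 QuixBugs Python problems.
TOTAL_PROBLEMS = 40

# Number of bugs fixed by each approach (hypothetical variable name).
fixed = {"ChatGPT": 19, "CoCoNut": 19, "Codex": 21, "Standard APR": 7}


def success_rate(n_fixed, total=TOTAL_PROBLEMS):
    """Return the share of problems fixed, as a percentage."""
    return 100.0 * n_fixed / total


rates = {name: success_rate(n) for name, n in fixed.items()}

# The reported 77.5% success rate with follow-up interactions,
# converted back into a number of problems.
followup_fixed = round(0.775 * TOTAL_PROBLEMS)

for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
print(f"ChatGPT with follow-up: {followup_fixed} of {TOTAL_PROBLEMS}")
```

On these numbers, ChatGPT's single-attempt rate is 47.5%, and the 77.5% figure corresponds to 31 of the 40 problems.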
OpenAI also highlights ChatGPT’s dialogue capability in debugging code examples, where it can ask for clarifications and receive hints from a person to arrive at a better answer. OpenAI used Reinforcement Learning from Human Feedback (RLHF) to train the large language models that power ChatGPT (GPT-3 and GPT-3.5).
The sources for this piece include an article in ZDNet.