Our discussions with The New York Times had appeared to be progressing constructively through our last communication on December 19. The negotiations focused on a high-value partnership around real-time display with attribution in ChatGPT, in which The New York Times would gain a new way to connect with their existing and new readers, and our users would gain access to their reporting. We had explained to The New York Times that, like any single source, their content didn't meaningfully contribute to the training of our existing models and also wouldn't be sufficiently impactful for future training. Their lawsuit on December 27, which we learned about by reading The New York Times, came as a surprise and disappointment to us.
Along the way, they had mentioned seeing some regurgitation of their content but repeatedly refused to share any examples, despite our commitment to investigate and fix any issues. We have demonstrated how seriously we treat this as a priority, such as in July when we took down a ChatGPT feature immediately after we learned it could reproduce real-time content in unintended ways.
Interestingly, the regurgitations The New York Times induced appear to be from years-old articles that have proliferated on multiple third-party websites. It seems they intentionally manipulated prompts, often including lengthy excerpts of articles, in order to get our model to regurgitate. Even when using such prompts, our models don't typically behave the way The New York Times insinuates, which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts.
Despite their claims, this misuse is not typical or allowed user activity, and is not a substitute for The New York Times. Regardless, we are continually making our systems more resistant to adversarial attacks aimed at regurgitating training data, and have already made much progress in our recent models.