Codex: An AI system that translates natural language to programming code - Code poisoning: A new type of attack can manipulate natural-language modeling systems

Artificial intelligence research company OpenAI has announced the development of an AI system that translates natural language to programming code—called Codex, the system is being released as a free API, at least for the time being. Codex is more of a next-step product for OpenAI, rather than something completely new. It builds on Copilot, a tool for use with Microsoft's GitHub code repository. With the earlier product, users would get suggestions similar to those seen in autocomplete in Google, except it would help finish lines of code. Codex has taken that concept a huge step forward by accepting sentences written in English and translating them into runnable code.

As an example, a user could ask the system to create a web page with a certain name at the top and with four evenly sized panels below numbered one through four. Codex would then attempt to create the page by generating the code necessary for the creation of such a site in whatever language (JavaScript, Python, etc.) was deemed appropriate. The user could then send additional English commands to build the website piece by piece. Codex (and Copilot) parse written text using OpenAI's language generation model—it is able to both generate and parse code, which allowed users to use Copilot in custom ways—one of those was to generate programming code that had been written by others for the GitHub repository. This led many of those who had contributed to the project to accuse OpenAI of using their code for profit, a charge that could very well be levied against Codex, as well, as much of the code it generates is simply copied from GitHub.

Notably, OpenAI started out as a nonprofit entity in 2015 and changed to what it described as a "capped profit" entity in 2019—a move the company claimed would help it get more funding from investors. On its announcement page, OpenAI says that it is releasing the API for Codex in a private beta to start and also notes that the company is inviting developers and businesses to give it a try. They also note that as a general-purpose programming tool, Codex can be used for virtually any programing task.


Cornell Tech researchers have discovered a new type of online attack that can manipulate natural-language modeling systems and evade any known defense—with possible consequences ranging from modifying movie reviews to manipulating investment banks' machine-learning models to ignore negative news coverage that would affect a specific company's stock. In a new paper, researchers found the implications of these types of hacks—which they call "code poisoning"—to be wide-reaching for everything from algorithmic trading to fake news and propaganda.

"With many companies and programmers using models and codes from open-source sites on the internet, this research shows how important it is to review and verify these materials before integrating them into your current system. If hackers are able to implement code poisoning, they could manipulate models that automate supply chains and propaganda, as well as resume-screening and toxic comment deletion."said Eugene Bagdasaryan, a doctoral candidate at Cornell Tech and lead author of "Blind Backdoors in Deep Learning Models" which was presented Aug. 12 at the virtual USENIX Security '21 conference. The co-author is Vitaly Shmatikov, professor of computer science at Cornell and Cornell Tech.

Without any access to the original code or model, these backdoor attacks can upload malicious code to open-source sites frequently used by many companies and programmers. As opposed to adversarial attacks, which require knowledge of the code and model to make modifications, backdoor attacks allow the hacker to have a large impact, without actually having to directly modify the code and models. The new paper investigates the method for injecting backdoors into machine-learning models, based on compromising the loss-value computation in the model-training code. The team used a sentiment analysis model for the particular task of always classifying as positive all reviews of the infamously bad movies directed by Ed Wood. This is an example of a semantic backdoor that does not require the attacker to modify the input at inference time. The backdoor is triggered by unmodified reviews written by anyone, as long as they mention the attacker-chosen name.

How can the "poisoners" be stopped? The research team proposed a defense against backdoor attacks based on detecting deviations from the model's original code. But even then, the defense can still be evaded.


1 comment :

  1. Anonymous13/8/21 19:36