Tabnine Adds Code Provenance And Attribution Checks
Written by Kay Ewbank
Tuesday, 07 January 2025 | |||
Tabnine has added a feature intended to reduce the risk of IP infringement. The new Provenance and Attribution feature checks whether code suggested by AI code assistants matches code with copyright restrictions.

Tabnine is a code completion tool that uses generative AI for automatic code completion. It has AI-powered tools for code generation, testing, and code review, and supports 80 programming languages and frameworks.

The Tabnine team says that while state-of-the-art LLMs like Claude 3.5 Sonnet and GPT-4o have greatly improved the performance of generative AI applications, including AI code assistants, they have also increased the risk of including code that is copyright restricted. The reason is that the code these LLMs are trained on is collected without taking into account restrictions on how it can be used. The data the models are trained on includes content from code repositories, some of which contain permissively licensed code while others contain code that has restrictions on how it can be used (for example, code with copyleft licensing like GPL). Copyleft licensing grants some freedoms over copies of copyrighted works so long as the same rights are passed on to works derived from the original.

Because LLMs tend to replicate patterns from their training data, third-party models like Claude 3.5 Sonnet and GPT-4o can regenerate code that exists in their training dataset, including code with copyleft licensing. If you inadvertently accept such a suggestion, you introduce non-permissively licensed code into your codebase, which can result in IP infringement. The Tabnine developers say that since copyright law covering the use of AI-generated content is still unsettled, there's a need to minimize the chance of including restricted code while still benefiting from the performance gains that come from these models.
In recognition of this, Tabnine has announced Provenance and Attribution, a new feature that it says can drastically reduce the risk of IP infringement when using models like Anthropic's Claude, OpenAI's GPT-4o, and Cohere's Command R+ for software development. Tabnine now checks the code generated within its AI chat against the publicly visible code on GitHub, flags any matches it finds, and references the source repository and its license type. Developers can then use this information to review code suggestions and decide if they meet the organisation's specific requirements and policies.

In the past, Tabnine addressed this by offering a license-compliant model, Tabnine Protected 2, an LLM purpose-built for software development and trained exclusively on code that is permissively licensed. The new Provenance and Attribution feature offers an alternative for teams that are comfortable using a wider variety of models as long as they specifically don't inject unlicensed code.
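The check described above (compare generated code against known public code, flag matches, and report the source repository and license) can be pictured with a small sketch. This is not Tabnine's implementation, which has not been published in this article; it is a toy illustration that assumes an in-memory index in place of GitHub, token-level shingle overlap as the matching technique, and a hand-picked set of permissive licenses. The names `IndexedSnippet`, `Match`, `find_matches`, the example repositories, and the threshold of 0.8 are all hypothetical.

```python
import re
from dataclasses import dataclass

# Assumption: a tiny allow-list of permissive license identifiers.
# Anything not listed here is conservatively treated as restricted.
PERMISSIVE = {"MIT", "Apache-2.0", "BSD-3-Clause"}


@dataclass(frozen=True)
class IndexedSnippet:
    """A piece of known public code (hypothetical stand-in for a GitHub index)."""
    repo: str
    license: str
    code: str


@dataclass(frozen=True)
class Match:
    repo: str
    license: str
    restricted: bool  # True when the license is not in PERMISSIVE
    overlap: float    # fraction of the generated code's shingles found in the source


def tokenize(code: str) -> list[str]:
    # Identifiers, numbers, and single punctuation characters; whitespace is ignored
    # so that reformatting alone does not hide a match.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def shingles(code: str, k: int = 5) -> set[tuple[str, ...]]:
    # Overlapping runs of k consecutive tokens.
    tokens = tokenize(code)
    if len(tokens) < k:
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def find_matches(generated: str, index: list[IndexedSnippet],
                 threshold: float = 0.8, k: int = 5) -> list[Match]:
    """Flag indexed snippets that share most of their token shingles with `generated`."""
    gen = shingles(generated, k)
    if not gen:
        return []
    matches = []
    for snippet in index:
        overlap = len(gen & shingles(snippet.code, k)) / len(gen)
        if overlap >= threshold:
            matches.append(Match(snippet.repo, snippet.license,
                                 snippet.license not in PERMISSIVE, overlap))
    return sorted(matches, key=lambda m: m.overlap, reverse=True)


# Example data: both repositories are made up for illustration.
index = [
    IndexedSnippet("example/gpl-lib", "GPL-3.0",
                   "def clamp(x, lo, hi):\n    return max(lo, min(x, hi))\n"),
    IndexedSnippet("example/mit-lib", "MIT",
                   "def add(a, b):\n    return a + b\n"),
]
generated = "def clamp(x,lo,hi):\n  return max(lo, min(x, hi))"
matches = find_matches(generated, index)
for m in matches:
    print(m.repo, m.license, "restricted" if m.restricted else "permissive")
```

A reviewer would then treat a flagged, restricted match as a prompt to reject or rewrite the suggestion, in line with their organisation's policy. A production system would need a far larger index and more robust matching than this sketch provides.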
Last Updated ( Tuesday, 07 January 2025 )