AI models

Tabnine AI code assistant: AI models

Tabnine's AI coding assistance is backed by Tabnine’s proprietary AI models for code completions and chat, which are trained and hosted by Tabnine and are private and protected.

In addition, Tabnine Chat includes the option of using third-party models. The privacy policies and protection offered by these third-party models may differ from those of the Tabnine models.

Tabnine’s AI models

These are the proprietary models hosted by Tabnine:

  • Tabnine Universal code completions model: Tabnine’s proprietary model is designed to deliver exceptional performance without the risk of intellectual property violations. It's trained and hosted by Tabnine and is available in all tiers.

  • Tabnine Protected chat model: Tabnine’s core model is designed to deliver high performance without the risk of intellectual property violations.


Optional AI models for chat

Tabnine Chat users can choose from these chat models (in addition to the Tabnine Protected chat model):

  • Claude 3.5 Sonnet: Claude 3.5 Sonnet raises the industry bar for coding tasks.

    Privacy: The Anthropic Claude 3.5 Sonnet model is hosted on Amazon Bedrock, and Tabnine sends data to Amazon Bedrock to compute responses to user prompts.

    Protection: The source of Anthropic Claude 3.5 Sonnet training data is not fully disclosed, so using this model may introduce intellectual property liability risks.

  • GPT-4o: Best-in-class performance and significantly faster than GPT-4 Turbo.

    Privacy: The OpenAI GPT-4o integrated model runs on OpenAI servers. According to their terms of service, OpenAI may use your data to provide, maintain, develop, and improve their services.

    Protection: The source of OpenAI GPT training data is not fully disclosed, so using this model may introduce intellectual property liability risks.

  • Codestral: Codestral, Mistral’s first-ever code model, is trained on more than 80 programming languages and demonstrates proficiency in both widely used and less common languages.

    Privacy: The Tabnine + Mistral Codestral integrated model runs on Mistral servers. According to their terms of service, Mistral may use your data to provide, maintain, develop, and improve their services.

    Protection: The source of Mistral’s training data is not disclosed, so using this model may introduce intellectual property liability risks. [Note: Codestral is not available for Enterprise customers.]

  • Command R+: Cohere’s Command R+ model is ideal for large-scale production workloads and balances high efficiency with high accuracy.

    Privacy: The Cohere Command R+ model is hosted on Oracle Cloud Infrastructure (OCI), and Tabnine sends data to OCI to compute responses to user prompts.

    Protection: The source of the Cohere Command R+ model’s training data is not fully disclosed, so using this model may introduce intellectual property liability risks.

  • Tabnine + Mistral: The Tabnine + Mistral integrated model delivers high performance while still maintaining complete privacy.

    Privacy: Tabnine guarantees zero data retention; any information sent to our inference servers is encrypted in transit, runs only in memory, and is deleted after delivering the response.

    Protection: The source of Mistral’s training data is not disclosed, so using this model may introduce intellectual property liability risks.

Tabnine Enterprise customers with private installation can use some of these models, and others, through private endpoints.

Tabnine gives you the insight you need to choose

Tabnine users can choose which chat model to use. The right choice depends on each user's use case and constraints across three main aspects:

  • Performance: Does the model provide accurate, relevant results for the programming languages and frameworks I’m working in right now?

  • Privacy: Does the model store my code or user data? Could my code or data be shared with third parties? Is my code used to train their model?

  • Protection: What code was the model trained on? Is it all licensed permissively by the author? Will I create risks for my business by accepting generated code from a model trained on unlicensed repositories?
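
As a rough illustration, the privacy and protection questions above can be applied as a filter over the chat models described earlier. The sketch below is not a Tabnine API: the `ChatModel` class, its field names, and the `shortlist` helper are hypothetical, and the values are one reading of the descriptions on this page. Where the page does not say whether a hosting provider may use your data, the field is left as `None` and treated conservatively. Performance is omitted because it depends on your own languages and frameworks.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ChatModel:
    """Hypothetical record summarizing one chat model as described on this page."""
    name: str
    host: str
    # True/False where the page states it; None where the page does not say.
    provider_may_use_data: Optional[bool]
    # True when the page flags possible intellectual property liability risk.
    ip_risk: bool


# Values are an interpretation of the model descriptions above, not official labels.
MODELS = [
    ChatModel("Tabnine Protected", "Tabnine", False, False),
    ChatModel("Claude 3.5 Sonnet", "Amazon Bedrock", None, True),
    ChatModel("GPT-4o", "OpenAI", True, True),
    ChatModel("Codestral", "Mistral", True, True),
    ChatModel("Command R+", "Oracle Cloud Infrastructure", None, True),
    ChatModel("Tabnine + Mistral", "Tabnine", False, True),
]


def shortlist(models: List[ChatModel],
              need_private: bool = False,
              need_protected: bool = False) -> List[str]:
    """Return names of models that satisfy the requested constraints."""
    result = []
    for model in models:
        # Conservative: an unknown (None) data-use policy fails the privacy check.
        if need_private and model.provider_may_use_data is not False:
            continue
        if need_protected and model.ip_risk:
            continue
        result.append(model.name)
    return result


if __name__ == "__main__":
    print(shortlist(MODELS, need_private=True))
    print(shortlist(MODELS, need_private=True, need_protected=True))
```

Treating an unstated data-use policy as a failed privacy check is a deliberate design choice for this sketch; a team with different constraints might instead review each provider's terms and fill in those values explicitly.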

Performance levels

The performance levels are Tabnine’s estimate of how each model behaves in real-world software development use cases, as Tabnine has deployed them with context awareness.

Privacy

  • Private: No code or data is retained by, or shared with, Tabnine or any other entity.

  • Not private: Code or data may be shared with third parties, as per their public terms of service. Tabnine still adheres to our zero data retention policy.

Protection

Models might recite code they were trained on. An unwary developer might commit code recited from an open source repository with a nonpermissive license, exposing their employer to legal risk from license infringement.

  • Protected model (training time protection): The model was exclusively trained on code with permissive open-source licenses, or on code that was otherwise licensed by the model provider. Any code used for training is explicitly allowed for use by developers without encumbrances. The model cannot recite restricted code.

  • Attribution and Provenance (inference time protection): For any model (independent of what it was trained on), trace the provenance of all code generated by the model, then report or censor code with an open source provenance trace according to its license. This lets the developer use any model with a layer of protection against the legal risk of unknowingly accepting recited nonpermissive code.

    Enterprise-only Private Preview: Reach out to your Customer Success Manager if you wish to participate.

  • Not protected: Model training data may include code whose license does not explicitly allow reuse or use for training AI models.

Tabnine users can choose which Tabnine Chat model to use

Tabnine users specify their preferred model the first time they use Chat and can change it at any time. For projects where data privacy and legal risks are less important, you can use a model optimized for performance over compliance. As you switch to working on projects that have stricter requirements for privacy and protection, you can change to a model like Tabnine Protected that's built for that purpose. The underlying LLM can be changed with just a few clicks, and Tabnine Chat adapts instantly.

Tabnine Enterprise administrators control and specify the models that are available to their organization. Enterprises often make strategic bets on using specific models across their organization. This update makes Tabnine compatible with your chosen LLM and part of its ecosystem, so you can get the most out of Tabnine without changing your LLM strategy.

Fine-tuned code completion models for Enterprise customers

Enterprise customers have the option to deploy private fine-tuned models. A fine-tuned model is a private model created by refining the Universal completion model on the customer's codebase; it then replaces the Universal model.
