Privacy

Tabnine AI code assistant: Privacy

Code privacy

When you use Tabnine models, your code remains private. Tabnine NEVER retains your code or shares it with third parties.

With Tabnine models (the code completion model and the Tabnine Protected and Tabnine + Mistral chat models), Tabnine uses ephemeral processing whenever it handles user code. This means that user code is held on the server only for as long as it takes to compute the requested result (completions or embeddings) and is never persisted.

With Tabnine Chat, you have the option of using third-party models. The privacy policies and protections offered by these third-party models may differ from those of the Tabnine models.

Querying the Tabnine AI model for AI coding assistance

As you code, the Tabnine client (plugin) requests AI assistance from the Tabnine cluster.

For code suggestions, requests are sent in the background as you code. For chat, a request is sent when you ask a question.

These requests include some code from the local project as context (the “context window”) so that Tabnine can return the most relevant and accurate answers. The context window may include elements from your local environment, such as:

  • Chat history (for chat)

  • Lines of code

  • Variables

  • Type declarations

  • Functions

  • Objects

  • Related imports from the current file

  • Related files

  • Syntactic and semantic error reports

This context is deleted immediately after the server returns the answer to the client.

Tabnine doesn’t retain any user code beyond the immediate time frame required for model inference. This is what we call ephemeral processing.

The sole purpose of the context window is to facilitate the most accurate answers possible. The moment that output is generated, the code is discarded and is never stored.

This is true even for Tabnine Enterprise’s private deployment options (on-premises and VPC).
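The ephemeral-processing idea described above can be illustrated with a minimal sketch. This is not Tabnine's server code: the `ContextWindow` fields, `run_model`, and `handle_request` are hypothetical names chosen for illustration, and the "model" is a stand-in that only counts lines. The point is that the context lives only in local variables for the duration of one request and is never written anywhere.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContextWindow:
    """Hypothetical request payload; the field names are illustrative only."""
    lines_of_code: list[str]
    related_imports: list[str] = field(default_factory=list)
    chat_history: list[str] = field(default_factory=list)


def run_model(prompt: str) -> str:
    """Stand-in for model inference; a real server would call the model here."""
    return f"# suggestion derived from {len(prompt.splitlines())} line(s) of context"


def handle_request(context: ContextWindow) -> str:
    """Compute a result from the context and keep nothing afterwards."""
    # The prompt exists only as a local variable for the duration of this call.
    prompt = "\n".join(context.related_imports + context.lines_of_code)
    result = run_model(prompt)
    # No write to disk, database, cache, or log: once the function returns,
    # the prompt goes out of scope and only the result is sent back.
    return result
```

In this sketch the handler has no side effects, so the only thing that survives a request is the returned result.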

With Tabnine models, your code is not shared with third parties

We develop our AI models based on our own pioneering experience and the best-of-breed, permissive, open source technologies in the market.

Tabnine models do not use any third-party APIs.

Tabnine does not train its models on your code

Tabnine’s code completion model and Tabnine Protected chat model are only trained on open source code with permissive licenses.

Private fine-tuned AI models are trained by Tabnine on your private code, are accessible only to your team members, and are stored in your private setup.

Learn more about Tabnine’s AI models.

Clarification regarding the Magic Moments feature

The code completion examples in the Tabnine Hub (in the IDE) under the Magic Moments tab are saved locally on the user's machine and never leave the computer.

Personalization

Tabnine's personalization capabilities (context through local code awareness, and connection to a software repository for global code awareness) require creating a RAG index of your code. Computing the vector embeddings for the chat RAG index is resource-intensive and cannot be done locally without straining the user’s machine. Tabnine performs this computation on server GPUs while upholding the same principles:

  • Your code remains private; Tabnine never stores your code.

  • Tabnine does not share any of your code with third parties.

  • Tabnine does not train on your code.


Data plane in self-hosted / air-gapped deployment

The Tabnine cluster collects operational metrics and logs to ensure system health and quality of service.

In an air-gapped deployment, metrics can be sent to a Prometheus server and logs can be sent to your log aggregator. In a self-hosted deployment, the Tabnine cluster sends operational metrics and logs to Tabnine’s servers so that Tabnine can provide better support when required. No code or PII is ever sent to Tabnine’s servers.

Tabnine cluster

The Tabnine cluster sends operational metrics and logs to Tabnine’s servers every second. Metric and log data is retained for one week. The metrics include:

  • GPU and CPU utilization

  • GPU and CPU memory

  • Server throughput

  • Server latency
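For the air-gapped case, where metrics go to a Prometheus server, the sketch below shows roughly what such operational metrics look like in the Prometheus text exposition format. The metric names and the `render_gauges` helper are hypothetical; the actual names exported by the Tabnine cluster are not given here. Note that the samples are plain numbers and carry no code or user data.

```python
def render_gauges(samples: dict[str, float]) -> str:
    """Render numeric samples as Prometheus text-format gauge metrics."""
    lines = []
    for name, value in sorted(samples.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric names, mirroring the categories listed above.
snapshot = {
    "gpu_utilization_ratio": 0.5,
    "server_latency_seconds": 0.25,
}
print(render_gauges(snapshot))
```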

Tabnine client

The Tabnine client sends telemetry on various user interactions to the self-hosted Tabnine server, which then streams it to Tabnine’s servers. This telemetry includes:

  • Plugin and binary configurations

  • User machine details, including CPU type, available processors, and memory

  • One-way hashed, nonidentifiable data, including user email, hostname, and IP

  • IDE details, including type and version

  • Statistical data: Aggregated number of suggestions/completions per programming language
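As an illustration of the one-way hashing mentioned in the list above, here is a minimal sketch. The hash algorithm Tabnine uses is not stated, so SHA-256 with a salt is an assumption, and `one_way_hash`, `build_telemetry`, and the event field names are hypothetical.

```python
import hashlib


def one_way_hash(value: str, salt: bytes = b"") -> str:
    """Return a hex digest of the value; the original cannot be read back from it."""
    normalized = value.strip().lower().encode("utf-8")
    return hashlib.sha256(salt + normalized).hexdigest()


def build_telemetry(email: str, hostname: str, ip: str, ide: str, salt: bytes) -> dict[str, str]:
    """Assemble a hypothetical telemetry event with identifiers hashed before sending."""
    return {
        "user": one_way_hash(email, salt),
        "host": one_way_hash(hostname, salt),
        "ip": one_way_hash(ip, salt),
        "ide": ide,  # non-identifying detail, sent as-is
    }
```

A salt is used in this sketch because unsalted hashes of low-entropy values such as IP addresses can be recovered by enumerating candidates; whether Tabnine salts its hashes is not stated.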
