Monitoring Tabnine
This document explains how Tabnine services deployed on-premises can be monitored and walks through a few examples of monitoring the services locally. You can also enable Tabnine telemetry, which uses the principles shown in this document and reports the data to Tabnine’s servers.
As Tabnine’s self-hosted solution runs in a Kubernetes cluster, we rely on standard tools for our logs and metrics: logs are written to stdout, and metrics are exposed over HTTP endpoints in Prometheus format.
Because both writing logs to stdout and exposing metrics endpoints for scraping are industry standards in the Kubernetes ecosystem, an extensive collection of tools and platforms supports those formats. This document covers the configuration options for scraping metrics, and provides examples of setting up a simple Prometheus server to scrape the metrics and FluentBit to collect the logs into a centralized endpoint.
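To spot-check a metrics endpoint by hand before configuring a scraper, you can port-forward to a pod and fetch its metrics. The pod name and port below are placeholders, and /metrics is the common Prometheus convention rather than a documented Tabnine path:
kubectl -n tabnine get pods
# Replace <pod-name> and <metrics-port> with values from your deployment
kubectl -n tabnine port-forward pod/<pod-name> 8080:<metrics-port>
curl http://localhost:8080/metrics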
All Tabnine services write their logs to stdout. The logs are picked up and managed by Kubernetes, which allows integration with standard tools for log management and retention.
In Kubernetes, the standard way to deal with logs is to run a collection service, such as FluentD or FluentBit, which collects the logs from the pods and forwards them to a centralized location. Cloud providers usually have an official way of integrating the logs with their native logging platforms. However, they all use FluentD or FluentBit under the hood.
When Tabnine’s telemetry is enabled, we install and use FluentD to forward logs from the cluster to Tabnine’s servers.
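As a minimal sketch of such a collector, the ConfigMap below configures Fluent Bit to tail the Tabnine container logs and ship them to Elasticsearch. The namespace, log path pattern, Elasticsearch host, and index name are all assumptions for illustration; adapt them to your own log backend:
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush         5
        Log_Level     info
        Parsers_File  parsers.conf
    # Tail the container log files written for the Tabnine pods
    [INPUT]
        Name    tail
        Path    /var/log/containers/*tabnine*.log
        Parser  cri
        Tag     tabnine.*
    # Ship the collected records to an assumed in-cluster Elasticsearch
    [OUTPUT]
        Name    es
        Match   tabnine.*
        Host    elasticsearch.logging.svc
        Port    9200
        Index   tabnine-logs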
Log messages are in the following format:
{
  "timestamp": "2023-01-15T03:46:06.861Z",
  "level": "error/warning/info/debug",
  "message": "msg content"
}
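Since each message is a single JSON object, standard CLI tools can filter the logs directly; for example, with jq (the deployment name is a placeholder):
kubectl -n tabnine logs deploy/<deployment-name> --tail=100 | jq 'select(.level == "error") | .message'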
How to send logs to an external log management system
- Cloud providers
Tabnine services export Prometheus metrics and rely on the Prometheus Operator being installed on the cluster. If you are unfamiliar with how to install the Prometheus Operator, please follow the Prometheus Operator installation article.
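If you do not already have the Operator running, one common option (a sketch, not the only supported path) is the community kube-prometheus-stack Helm chart; installing it under the release name prom-example matches the labels used in the examples below:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prom-example prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace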
To enable Tabnine metrics monitoring, edit the following sections in values.yaml:
global:
  monitoring:
    enabled: true
    # labels -- empty by default. If your Prometheus server requires specific labels to be present for the monitors to be picked up, add them here
    labels: {}
    # annotations -- empty by default. Some platforms require specific annotations to be present; this setting applies them to all monitor objects
    annotations: {}
tabnine:
  telemetry:
    # enabled -- Send telemetry data to Tabnine backend
    enabled: false
Now that values.yaml is updated, it is time to install the chart on the cluster:
helm upgrade --install -n tabnine --create-namespace tabnine oci://registry.tabnine.com/self-hosted/tabnine-cloud --values values.yaml
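Once the release is installed, a quick sanity check (assuming the Prometheus Operator CRDs are present) is to list the monitor objects the chart created:
kubectl -n tabnine get podmonitors,servicemonitors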
The following example adds a release=prom-example label to all PodMonitors and ServiceMonitors created by Tabnine as part of the installation.
global:
  monitoring:
    enabled: true
    labels:
      release: prom-example
image:
  imagePullSecrets:
    - name: regcred
tabnine:
  [...]
The following configuration:
1. Scrapes only PodMonitors and ServiceMonitors with a release=prom-example label
2. Keeps the data for 14 days
3. Requires 50Gi of storage
4. Requires 6G of RAM to operate
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prom-example
  namespace: monitoring
spec:
  evaluationInterval: 30s
  paused: false
  podMonitorNamespaceSelector: {}
  podMonitorSelector:
    matchLabels:
      release: prom-example
  portName: http-web
  probeNamespaceSelector: {}
  probeSelector:
    matchLabels:
      release: prom-example
  replicas: 1
  resources:
    limits:
      cpu: 1
      memory: 6G
    requests:
      cpu: 1
      memory: 6G
  retention: 14d
  routePrefix: /
  ruleNamespaceSelector: {}
  ruleSelector:
    matchLabels:
      release: prom-example
  scrapeInterval: 30s
  securityContext:
    fsGroup: 2000
    runAsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      release: prom-example
  shards: 1
  storage:
    volumeClaimTemplate:
      spec:
        resources:
          requests:
            storage: 50Gi
  version: v2.42.0
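To deploy this Prometheus instance, save the manifest (for example, as prometheus.yaml) and apply it. The Operator then creates the Prometheus pods along with a prometheus-operated service, which you can port-forward to confirm the Tabnine targets are being scraped:
kubectl apply -f prometheus.yaml
kubectl -n monitoring port-forward svc/prometheus-operated 9090
# Open http://localhost:9090/targets in a browser to check the scrape targets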