Monitoring & Logs

Overview

This document describes how Tabnine services deployed on-premises can be monitored and walks through a few examples of monitoring the services locally. You can also enable Tabnine telemetry, which uses the principles shown in this document and reports the data to Tabnine’s servers.

As Tabnine’s self-hosted solution runs in a Kubernetes cluster, we rely on standard tools for logs and metrics: logs are written to stdout, and metrics are exposed over HTTP endpoints in Prometheus format.

Note that because both writing logs to stdout and exposing metrics endpoints for scraping are industry standards in the Kubernetes ecosystem, there is an extensive collection of tools and platforms that support these formats. This document goes over the configuration options for scraping metrics and also provides examples for setting up a simple Prometheus server to scrape the metrics and FluentBit to collect the logs into a centralized endpoint.

Logs

All Tabnine services write their logs to stdout. The logs are picked up and managed by Kubernetes, which allows integration with standard tools for log management and retention.

In Kubernetes, the standard way to deal with logs is to run a collection service, such as FluentD or FluentBit, which collects the logs from the pods and forwards them to a centralized location. Cloud providers usually have an official way of integrating the logs with their native logging platforms. However, they all use FluentD or FluentBit under the hood.
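For example, one common way to run such a collector yourself is the upstream FluentBit Helm chart. The commands below are a sketch only (the repository URL and release name are the upstream defaults, and you still need to configure an output destination in the chart’s values):

# Add the upstream Fluent Helm repository and install FluentBit as a DaemonSet
helm repo add fluent https://fluent.github.io/helm-charts
helm upgrade --install fluent-bit fluent/fluent-bit -n logging --create-namespace

# Point FluentBit at your log backend (Elasticsearch, Loki, an HTTP endpoint, etc.)
# by overriding the chart's output configuration with a values file of your own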

When Tabnine’s telemetry is enabled, we install and use FluentD to forward logs from the cluster to Tabnine’s servers.

Log messages are in the following format:

{
  "timestamp": "2023-01-15T03:46:06.861Z",
  "level": "error/warning/info/debug",
  "message": "msg content"
}
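Because the messages are structured JSON, they are easy to filter locally with standard tools. As a quick sketch (the deployment name is a placeholder; use the names returned by the first command), the following shows only error-level messages from a workload, assuming jq is available:

# Find the workload name, then tail its logs and keep only error-level entries.
# "fromjson?" makes jq skip any lines that are not valid JSON.
kubectl get deployments -n tabnine
kubectl logs -n tabnine deploy/<tabnine-deployment> --tail=200 | jq -R 'fromjson? | select(.level == "error")'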

How to send logs to an external log management system

Tabnine Audit Logs API

Audit logs are available through the Tabnine Audit Logs API.

Metrics

Tabnine services export Prometheus metrics and rely on the Prometheus Operator being installed on the cluster. If you are unfamiliar with how to install the Prometheus Operator, please follow the Prometheus Operator Install section below.

Enable monitoring of metrics

To enable Tabnine metrics monitoring, edit the following sections in values.yaml:

global:
  monitoring:
    enabled: true
    # labels -- empty by default. If your Prometheus server requires specific labels to be present for the monitors to be picked up, add them here
    labels: {}
    # annotations -- empty by default. Some platforms require specific annotations to be present; this setting applies the annotations to all monitor objects
    annotations: {}
  tabnine:
    telemetry:
      # enabled -- Send telemetry data to Tabnine backend
      enabled: false

Now that values.yaml is updated, it is time to install the chart on the cluster.

helm upgrade --install -n tabnine --create-namespace tabnine oci://registry.tabnine.com/self-hosted/tabnine-cloud --values values.yaml
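Once the upgrade completes, one way to verify that the monitor objects were created is to list them in the Tabnine namespace (the exact object names depend on your release):

kubectl get servicemonitors,podmonitors -n tabnine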

Prometheus example

Values file examples

The following example adds a release=prom-example label to all PodMonitors and ServiceMonitors created by Tabnine as part of the installation.

global:
  monitoring:
    enabled: true
    labels:
      release: prom-example
  image:
    imagePullSecrets:
      - name: regcred
  tabnine:
  [...]

Prometheus configuration file

The following configuration:

  1. Scrapes only PodMonitors and ServiceMonitors with a release=prom-example label

  2. Keeps the data for 14 days

  3. Requests 50Gi of storage

  4. Requires 6G of RAM to operate

For the full list of available configuration options, please check the Prometheus (CRD) documentation.

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prom-example
  namespace: monitoring
spec:
  evaluationInterval: 30s
  paused: false
  podMonitorNamespaceSelector: {}
  podMonitorSelector:
    matchLabels:
      release: prom-example
  portName: http-web
  probeNamespaceSelector: {}
  probeSelector:
    matchLabels:
      release: prom-example
  replicas: 1
  resources:
    limits:
      cpu: 1
      memory: 6G
    requests:
      cpu: 1
      memory: 6G
  retention: 14d
  routePrefix: /
  ruleNamespaceSelector: {}
  ruleSelector:
    matchLabels:
      release: prom-example
  scrapeInterval: 30s
  securityContext:
    fsGroup: 2000
    runAsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      release: prom-example
  shards: 1
  storage:
    volumeClaimTemplate:
      spec:
        resources:
          requests:
            storage: 50Gi
  version: v2.42.0
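After applying this manifest, one way to confirm that the server picks up the Tabnine monitors is to port-forward to it and open the targets page. The file name below is an assumption; prometheus-operated is the default service the operator creates for managed servers:

# Apply the Prometheus resource defined above (assuming it was saved as prometheus.yaml)
kubectl apply -f prometheus.yaml

# Port-forward to the managed server and open http://localhost:9090/targets in a browser
kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090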

Prometheus Operator Install

The Prometheus Operator allows setting up and configuring Prometheus servers running in your cluster. As part of the Helm installation, we also set up ServiceMonitor and PodMonitor objects by default, which define how to scrape our services.

If your cluster doesn’t already have the Prometheus Operator installed, you can install it from our repository or from the official Helm chart, depending on your setup. Note that, unlike the official chart, Tabnine’s version doesn’t install a Prometheus server by default (.prometheus.enabled is set to false). If you opt in to installing the Prometheus server as part of kube-prometheus-stack - either from the official chart or by setting .prometheus.enabled=true in our chart’s values - you will need to fine-tune the server configuration in the Helm chart rather than in the Prometheus server object shown in the example above, as it will be created by the Helm chart.

helm upgrade --install --create-namespace -n monitoring monitoring oci://registry.tabnine.com/self-hosted/kube-prometheus-stack

Check if there is an installed operator in your cluster

If you are unsure whether the operator is installed in your cluster, you can run the following commands:

# Make sure you have the relevant CRDs installed
$ kubectl get crd | grep monitoring.coreos.com
alertmanagerconfigs.monitoring.coreos.com        2022-12-08T13:15:57Z
alertmanagers.monitoring.coreos.com              2022-12-08T13:15:57Z
podmonitors.monitoring.coreos.com                2022-12-08T13:15:58Z
probes.monitoring.coreos.com                     2022-12-08T13:15:58Z
prometheuses.monitoring.coreos.com               2022-12-08T13:15:59Z
prometheusrules.monitoring.coreos.com            2022-12-08T13:15:59Z
servicemonitors.monitoring.coreos.com            2022-12-08T13:16:00Z
thanosrulers.monitoring.coreos.com               2022-12-08T13:16:00Z

# Make sure there is a Prometheus operator running. Note that depending on the helm installation, the name might be slightly different.
$ kubectl get pods -A | grep operator
monitoring                     kube-prometheus-stack-operator-XX

Check if you already have a Prometheus server in your cluster

$ kubectl get prometheus -A
NAMESPACE    NAME
monitoring   kube-prometheus-stack-prometheus

Note that if you have enabled telemetry as part of the Tabnine installation, you will see a Prometheus server created by Tabnine. That server is used for remote-writing metrics to Tabnine and doesn’t persist data locally. If that is the only server you see in the list, or there are none, you can create a server based on the example above.

If you have a Prometheus server, it might be configured to collect data only from PodMonitors/ServiceMonitors with specific labels and/or namespaces. Describe the server (a sample command is shown after this list; based on the example output above, the exact command might differ in your environment) and check the values of the following fields:

  • Pod Monitor Namespace Selector

  • Pod Monitor Selector

  • Service Monitor Namespace Selector

  • Service Monitor Selector
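One way to inspect those fields is to describe the Prometheus object; the name and namespace below are taken from the example output above and will likely differ in your environment:

# Show only the monitor selector fields from the describe output
kubectl describe prometheus kube-prometheus-stack-prometheus -n monitoring | grep -i -A 2 "monitor.*selector"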

In its default setup, kube-prometheus-stack requires some labels to be present on the monitor objects for them to be propagated to the Prometheus server. If that is the case, write down the required labels - they go into global.monitoring.labels in values.yaml, as shown in the example above.
