Telemetry - Application Logs
Application logs are the historical go-to signals for debugging an application and deriving the internal state of an application. They can be very useful if developers emit the logs wisely (using the right severity level and context) and are essential for observing an application. However, they usually lack contextual information, such as where they were called from.
The Telemetry component provides the Fluent Bit log collector for the collection and shipment of application logs of any container running in the Kyma runtime. You can connect the log collector to external systems at runtime using a dedicated Kubernetes API (CRD) named LogPipeline. With the LogPipeline's HTTP output, you can natively integrate with vendors that support this output, or with any vendor using a Fluentd integration. Support for the vendor-neutral OTLP protocol, which is the intended long-term solution, will be added soon. To overcome the limited flexibility of the current proprietary protocol, you can run the collector in the unsupported mode, leveraging the full vendor-specific output options of Fluent Bit. You can also bring your own log collector if you need advanced configuration options.
Prerequisites
Your application must log to stdout or stderr, which is the way Kubernetes recommends to emit logs. It ensures that the logs can be processed by Kubernetes primitives like kubectl logs. Any other way of instrumentation is not supported yet. In the future, an OTLP push-based endpoint might be provided to send logs from the application to the collector/agent.
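As a minimal sketch of a workload that meets this prerequisite, the following Pod writes one JSON log line to stdout every few seconds. The Pod name, container name, image tag, and log fields are illustrative assumptions and not part of the Telemetry component:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: stdout-logger          # hypothetical name
spec:
  containers:
    - name: logger             # hypothetical container name
      image: busybox:1.36      # assumption: any image with a POSIX shell works
      command: ["/bin/sh", "-c"]
      args:
        - |
          # Write one JSON document per line to stdout.
          while true; do
            echo '{"level":"info","message":"heartbeat"}'
            sleep 5
          done
```

Once the Pod is running, kubectl logs stdout-logger should show the same lines that the log collector picks up.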
Architecture
Fluent Bit
The Telemetry component provides Fluent Bit as a log collector. Fluent Bit collects all application logs of the cluster workload and ships them to a backend.
- Container logs are stored by the Kubernetes container runtime under the /var/log directory and its subdirectories.
- Fluent Bit runs as a DaemonSet (one instance per node), detects any new log files in the folder, and tails them using a filesystem buffer for reliability.
- Fluent Bit queries the Kubernetes API Server for additional Pod metadata, such as Pod annotations and labels.
- The Telemetry component configures Fluent Bit with your custom output configuration.
- If Kyma's deprecated logging component is installed, the operator configures the shipment to the in-cluster Loki instance automatically.
- As specified in your LogPipeline configuration, Fluent Bit sends the log data to observability systems outside or inside the Kyma cluster. Here, you can use the integration with HTTP to integrate a system directly or with an additional Fluentd installation.
- The user accesses the internal and external observability system to analyze and visualize the logs.
Pipelines
Fluent Bit comes with a pipeline concept, which supports a flexible combination of inputs with outputs and filtering in between; for details, see Fluent Bit: Output.
Kyma's Telemetry component brings a predefined setup of the Fluent Bit DaemonSet and a base configuration, which ensures that the application logs of the workloads in the cluster are processed reliably and efficiently. Additionally, the Telemetry component provides a Kubernetes API called LogPipeline to configure outputs with some filtering capabilities.
A central tail input plugin reads the application logs. The application logs are enriched by the kubernetes filter. Then, for every LogPipeline definition, a rewrite_tag filter is generated, which uses a dedicated tag with the name <logpipeline>.*, followed by the custom configuration defined in the LogPipeline resource. You can add your own filters to the default filters. Based on the default and custom filters, you get the desired output for each LogPipeline.
This approach ensures reliable buffer management and isolation of pipelines, while keeping customizations flexible.
Telemetry Operator
The LogPipeline resource is managed by the Telemetry Operator, a typical Kubernetes operator responsible for managing the custom parts of the Fluent Bit configuration.
The Telemetry Operator watches all LogPipeline resources and related Secrets. Whenever the configuration changes, it validates the configuration (with a validating webhook) and generates a new configuration for the Fluent Bit DaemonSet, where several ConfigMaps for the different aspects of the configuration are generated. Furthermore, referenced Secrets are copied into one Secret that is also mounted to the DaemonSet.
Setting up a LogPipeline
In the following steps, you can see how to set up a typical LogPipeline. For an overview of all available attributes, see the reference document.
Step 1: Create a LogPipeline and output
To ship application logs to a new output, create a resource file of the LogPipeline kind:
    kind: LogPipeline
    apiVersion: telemetry.kyma-project.io/v1alpha1
    metadata:
      name: http-backend
    spec:
      output:
        http:
          dedot: false
          port: "80"
          uri: "/"
          host:
            value: https://myhost/logs
          user:
            value: "user"
          password:
            value: "not-required"

An output is a data destination configured by a Fluent Bit output of the relevant type. The LogPipeline supports the following output types:
- http, which sends the data to the specified HTTP destination. The output is designed to integrate with a Fluentd HTTP Input, which opens up a huge ecosystem of integration possibilities.
- grafana-loki, which sends the data to the Kyma-internal Loki instance.
  Note: This output is considered legacy and is only provided for backward compatibility with the deprecated in-cluster Loki instance. It might not be compatible with the latest Loki versions. For integration with a custom Loki installation, use the custom output with the name loki instead. See also this tutorial.
- custom, which supports the configuration of any destination in the Fluent Bit configuration syntax.
  Note: If you use a custom output, you put the LogPipeline in the unsupported mode.
  See the following example of the custom output:

      spec:
        output:
          custom: |
            Name          http
            Host          https://myhost/logs
            Http_User     user
            Http_Passwd   not-required
            Format        json
            Port          80
            Uri           /
            Tls           on
            tls.verify    on

To create the instance, apply the resource file in your cluster:

    kubectl apply -f path/to/my-log-pipeline.yaml

Check that the status of the LogPipeline in your cluster is Ready:

    kubectl get logpipeline
    NAME           STATUS   AGE
    http-backend   Ready    44s

Step 2: Create an input
If you need selection mechanisms for application logs on the Namespace or container level, you can use an input spec to restrict or specify from which resources logs are included.
If you don't define any input, logs are collected from all Namespaces except the system Namespaces kube-system, istio-system, kyma-system, and kyma-integration, which are excluded by default. For example, you can define the Namespaces to include in the input collection, exclude Namespaces from the input collection, or choose to also include the system Namespaces. Learn more about the available parameters and attributes.
The following example collects input from all Namespaces excluding kyma-system and only from the istio-proxy containers:

    kind: LogPipeline
    apiVersion: telemetry.kyma-project.io/v1alpha1
    metadata:
      name: http-backend
    spec:
      input:
        application:
          namespaces:
            exclude:
              - kyma-system
          containers:
            include:
              - istio-proxy
      output:
        ...

It might happen that Fluent Bit prints an error per processed log line, which is then collected and re-processed. To avoid problems with such recursive logs, it is recommended that you exclude the logs of the Fluent Bit container. The following example collects input from all Namespaces including system Namespaces, but excludes the Fluent Bit container:

    spec:
      input:
        application:
          namespaces:
            system: true
          containers:
            exclude:
              - fluent-bit

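To restrict collection to selected Namespaces, the input can also list them explicitly. The following sketch assumes that the namespaces section accepts an include list analogous to the exclude list used earlier in this step; check the parameter reference before relying on it. The Namespace names are hypothetical:

```yaml
spec:
  input:
    application:
      namespaces:
        include:          # collect logs only from these Namespaces
          - my-team-a     # hypothetical Namespace
          - my-team-b     # hypothetical Namespace
  # output: as defined in Step 1
```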
Step 3: Add filters
To enrich logs with attributes or drop whole lines, add filters to the existing pipeline. The following example contains three filters, which are executed in sequence.

    kind: LogPipeline
    apiVersion: telemetry.kyma-project.io/v1alpha1
    metadata:
      name: http-backend
    spec:
      filters:
        - custom: |
            Name    grep
            Regex   $kubernetes['labels']['app'] my-deployment
        - custom: |
            Name    grep
            Exclude $kubernetes['namespace_name'] kyma-system|kube-system|kyma-integration|istio-system
        - custom: |
            Name    record_modifier
            Record  cluster_identifier ${KUBERNETES_SERVICE_HOST}
      input:
        ...
      output:
        ...

NOTE: If you use a custom filter, you put the LogPipeline in the unsupported mode.
The Telemetry Operator supports different types of Fluent Bit filters. The example uses the grep and the record_modifier filters.
- The first filter keeps all log records that have the kubernetes.labels.app attribute set with the value my-deployment; all other logs are discarded. The kubernetes attribute is available for every log record. See Kubernetes filter (metadata) for more details.
- The second filter drops all log records fulfilling the given rule. Here, logs from the typical system Namespaces are dropped based on the kubernetes attribute.
- The third filter modifies each log record by adding a new attribute. Here, the constant attribute cluster_identifier is added to every log record so that you can filter by cluster later in the backend system. The value is a placeholder referring to the Kubernetes-specific environment variable KUBERNETES_SERVICE_HOST.
Step 4: Add authentication details from Secrets
Integrations into external systems usually need authentication details dealing with sensitive data. To handle that data properly in Secrets, the LogPipeline supports the reference of Secrets.
Using the http output definition and the valueFrom attribute, you can map Secret keys as in the following http output example:

    kind: LogPipeline
    apiVersion: telemetry.kyma-project.io/v1alpha1
    metadata:
      name: http-backend
    spec:
      output:
        http:
          dedot: false
          port: "80"
          uri: "/"
          host:
            valueFrom:
              secretKeyRef:
                name: http-backend-credentials
                namespace: default
                key: HTTP_ENDPOINT
          user:
            valueFrom:
              secretKeyRef:
                name: http-backend-credentials
                namespace: default
                key: HTTP_USER
          password:
            valueFrom:
              secretKeyRef:
                name: http-backend-credentials
                namespace: default
                key: HTTP_PASSWORD
      input:
        ...
      filters:
        ...

The related Secret must have the referenced name and Namespace, and contain the mapped keys, as in the following example:

    kind: Secret
    apiVersion: v1
    metadata:
      name: http-backend-credentials
      namespace: default
    stringData:
      HTTP_ENDPOINT: https://myhost/logs
      HTTP_USER: myUser
      HTTP_PASSWORD: XXX

To leverage data provided by Kubernetes Secrets in a custom output definition, use placeholder expressions for the data provided by the Secret, then specify the actual mapping to the Secret keys in the variables section, as in the following example:

    kind: LogPipeline
    apiVersion: telemetry.kyma-project.io/v1alpha1
    metadata:
      name: http-backend
    spec:
      output:
        custom: |
          Name          http
          # ENDPOINT, USER, and PASSWORD are resolved from the Secret
          Host          ${ENDPOINT}
          HTTP_User     ${USER}
          HTTP_Passwd   ${PASSWORD}
          Tls           On
      variables:
        - name: ENDPOINT
          valueFrom:
            secretKeyRef:
              name: http-backend-credentials
              namespace: default
              key: HTTP_ENDPOINT
        - name: USER
          valueFrom:
            secretKeyRef:
              name: http-backend-credentials
              namespace: default
              key: HTTP_USER
        - name: PASSWORD
          valueFrom:
            secretKeyRef:
              name: http-backend-credentials
              namespace: default
              key: HTTP_PASSWORD
      input:
        ...
      filters:
        ...

NOTE: If you use a custom output, you put the LogPipeline in the unsupported mode.
Step 5: Rotate the Secret
A Secret referenced with the secretKeyRef construct, as used in the previous step, can be rotated manually or automatically. For automatic rotation, update the Secret's actual values and keep the Secret's keys stable. The LogPipeline watches the referenced Secrets and detects changes, so the Secret rotation takes immediate effect. When using a Secret owned by the SAP BTP Operator, you can configure a credentialsRotationPolicy with a specific rotationFrequency to achieve an automated rotation.
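As a sketch of such an automated rotation, a ServiceBinding managed by the SAP BTP Operator might look like the following. The apiVersion and the fields enabled and rotatedBindingTTL are assumptions about the SAP BTP Operator API and should be checked against its documentation; the binding, instance, and Secret names as well as the durations are hypothetical:

```yaml
apiVersion: services.cloud.sap.com/v1          # assumption: SAP BTP Operator API group
kind: ServiceBinding
metadata:
  name: http-backend-binding                   # hypothetical name
  namespace: default
spec:
  serviceInstanceName: http-backend-instance   # hypothetical service instance
  secretName: http-backend-credentials         # Secret referenced by the LogPipeline
  credentialsRotationPolicy:
    enabled: true                              # assumed field name
    rotationFrequency: 168h                    # example value: about once a week
    rotatedBindingTTL: 24h                     # assumed field: how long old credentials stay valid
```

The keys of the generated Secret depend on the service offering, so the key entries in the secretKeyRef sections of the LogPipeline have to match them.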
Step 6: Add a parser
Typically, you want your logs shipped in a structured format so that a backend like OpenSearch can immediately index the content according to the log attributes. By default, a LogPipeline tries to parse all logs as a JSON document and enrich the record with the parsed attributes on the root record. Thus, logging in JSON format in the application results in structured log records. Sometimes, logging in JSON is not an option (the log configuration is not under your control), and the logs are in an unstructured or plain format. To adjust this, you can define your custom parser and activate it with a filter or a Pod annotation.
The following example defines a parser named dummy_test using a dedicated LogParser resource type:

    kind: LogParser
    apiVersion: telemetry.kyma-project.io/v1alpha1
    metadata:
      name: dummy_test
    spec:
      parser:
        content: |
          Format regex
          Regex  ^(?<INT>[^ ]+) (?<FLOAT>[^ ]+) (?<BOOL>[^ ]+) (?<STRING>.+)$

The parser is referenced by its name in a filter of the pipeline and is activated for all logs of the pipeline.

    kind: LogPipeline
    apiVersion: telemetry.kyma-project.io/v1alpha1
    metadata:
      name: http-backend
    spec:
      filters:
        - custom: |
            Name parser
            Parser dummy_test
      input:
        ...
      output:
        ...

NOTE: If you use a custom filter, you put the LogPipeline in the unsupported mode.
Instead of defining a filter, you can annotate your workload in the following way (here, the parser is activated only for the annotated workload):

    apiVersion: v1
    kind: Pod
    metadata:
      name: dummy
      annotations:
        fluentbit.io/parser: dummy_test
    spec:
      ...

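Workloads are often created through a Deployment rather than as a bare Pod. Because the annotation has to end up on the Pod itself, it belongs in the Pod template of the Deployment, not in the Deployment's own metadata. A minimal sketch, with a hypothetical label and image:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dummy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dummy                              # hypothetical label
  template:
    metadata:
      labels:
        app: dummy
      annotations:
        fluentbit.io/parser: dummy_test       # applied to every Pod of this Deployment
    spec:
      containers:
        - name: dummy
          image: my-registry/dummy:latest     # hypothetical image
```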
Log record processing
After a log record has been read, it is preprocessed by centrally configured plugins, like the kubernetes filter. Thus, when a record is ready to be processed by the sections defined in the LogPipeline definition, it has several attributes available for processing and shipment.
Learn more about these attributes in the following sections.
Container log message
In the example, we assume there's a container myContainer of Pod myPod, running in Namespace myNamespace, logging to stdout with the following log message in the JSON format:

    {
      "level": "warn",
      "message": "This is the actual message",
      "tenant": "myTenant",
      "traceID": "123"
    }

Tail input
The central pipeline tails the log message from a log file managed by the container runtime. The file name contains the Namespace, Pod, and container information that will be available later as part of the tag. The resulting log record available in an internal Fluent Bit representation looks similar to the following example:

    {
      "time": "2022-05-23T15:04:52.193317532Z",
      "stream": "stdout",
      "_p": "F",
      "log": "{\"level\": \"warn\",\"message\": \"This is the actual message\",\"tenant\": \"myTenant\",\"traceID\": \"123\"}"
    }

The attributes in the example have the following meaning:
Attribute | Description |
---|---|
time | The timestamp generated by the container runtime at the moment the log was written to the log file. |
stream | The stream to which the application wrote the log, either stdout or stderr. |
_p | Indicates if the log message is partial (P) or final (F). Optional, dependent on the container runtime. Because a CRI multiline parser is applied for the tailing phase, all multilines on the container runtime level are already aggregated and no partial entries should remain. |
log | The raw and unparsed log message. |
Kubernetes filter (metadata)
In the next stage, the Kubernetes filter is applied. The container information from the log file name (available in the tag) is interpreted and used for a Kubernetes API Server request to resolve more metadata of the container. All the resolved metadata enriches the existing record as a new attribute kubernetes:

    {
      "kubernetes": {
        "pod_name": "myPod-74db47d99-ppnsw",
        "namespace_name": "myNamespace",
        "pod_id": "88dbd1ef-d977-4636-804d-ef220454be1c",
        "host": "myHost1",
        "container_name": "myContainer",
        "docker_id": "5649c36fcc1e956fc95e3145441f427d05d6e514fa439f4e4f1ccee80fb2c037",
        "container_hash": "myImage@sha256:1f8d852989c16345d0e81a7bb49da231ade6b99d51b95c56702d04c417549b26",
        "container_image": "myImage:myImageTag",
        "labels": {
          "app": "myApp",
          ...
        },
        "annotations": {
          "sidecar.istio.io/inject": "true",
          ...
        }
      }
    }

Kubernetes filter (JSON parser)
After the enrichment of the log record with the Kubernetes-relevant metadata, the Kubernetes filter also tries to parse the record as a JSON document. If that is successful, all the parsed root attributes of the parsed document are added as new individual root attributes of the log.
The record before applying the JSON parser:

    {
      "time": "2022-05-23T15:04:52.193317532Z",
      "stream": "stdout",
      "_p": "F",
      "log": "{\"level\": \"warn\",\"message\": \"This is the actual message\",\"tenant\": \"myTenant\",\"traceID\": \"123\"}",
      "kubernetes": {...}
    }

The record after applying the JSON parser:

    {
      "time": "2022-05-23T15:04:52.193317532Z",
      "stream": "stdout",
      "_p": "F",
      "log": "{\"level\": \"warn\",\"message\": \"This is the actual message\",\"tenant\": \"myTenant\",\"traceID\": \"123\"}",
      "kubernetes": {...},
      "level": "warn",
      "message": "This is the actual message",
      "tenant": "myTenant",
      "traceID": "123"
    }

Rewrite tag
For each LogPipeline definition, a dedicated rewrite_tag filter is introduced. The filter brings a dedicated filesystem buffer for the outputs defined in the related pipeline and thereby ensures that the logs are shipped in isolation from the outputs of other pipelines. As a consequence, each pipeline runs on its own tag.
Limitations
Currently there are the following limitations for LogPipelines that are served by Fluent Bit:
Unsupported Mode
The unsupportedMode attribute of a LogPipeline indicates that you are using a custom filter and/or custom output. The Kyma team does not provide support for a custom configuration.
Fluent Bit plugins
You cannot enable the following plugins, because they potentially harm the stability:
- Multiline Filter
- Kubernetes Filter
- Rewrite_Tag Filter
Reserved log attributes
The log attribute named kubernetes is a special attribute that's enriched by the kubernetes filter. When you use that attribute as part of your structured log payload, the metadata enriched by the filter is overwritten by the payload data. Filters that rely on the original metadata might no longer work as expected.
Furthermore, the __kyma__ prefix is used internally by the Telemetry Operator. When you use an attribute with this prefix in your log data, the data might be overwritten.
Buffer limits
Fluent Bit buffers up to 1 GB of logs if a configured output cannot receive logs. The oldest logs are dropped when the limit is reached.
Throughput
Each Fluent Bit Pod can process up to 10 MB/s of logs for a single LogPipeline. With multiple pipelines, the throughput per pipeline is reduced. The logging backend in use or the performance characteristics of the output plugin might limit the throughput earlier.
Max amount of pipelines - CPU/Mem constraints
In the production profile, you can use no more than 5 LogPipelines. In the evaluation profile, you can use no more than 3 LogPipelines.