Fluentd logs are full of backslashes and Kibana doesn't show k8s pod logs #2545

Open
avarf opened this issue Aug 6, 2019 · 17 comments
Labels
bug Something isn't working

Comments

@avarf commented Aug 6, 2019

Describe the bug
I set up an EFK stack for gathering logs from my different k8s pods based on this tutorial: https://mherman.org/blog/logging-in-kubernetes-with-elasticsearch-Kibana-fluentd/ on a MicroK8s single-node cluster. Everything is up and working and I can connect Kibana to Elasticsearch and see the indexes, but in the Discover section of Kibana there are no logs related to my pods, only kubelet logs.

When I checked the fluentd logs I saw that they are full of backslashes:

2019-08-05 15:23:17 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: "2019-08-05T17:23:10.167379794+02:00 stdout P 2019-08-05 15:23:10 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-08-05T17:23:07.09726655+02:00 stdout P 2019-08-05 15:23:07 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \\\"2019-08-05T17:23:04.433817307+02:00 stdout P 2019-08-05 15:23:04 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \\\\\\\"2019-08-05T17:22:52.546188522+02:00 stdout P 2019-08-05 15:22:52 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \\\\\\\\\\\\\\\"2019-08-05T17:22:46.694679863+02:00 stdout F \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

There are many more backslashes; I copied only this much to show what the log looks like.

Your Environment

  • Fluentd or td-agent version: I tested this with two images, fluent/fluentd-kubernetes-daemonset:v1.4-debian-elasticsearch and also v1.3, but the results were the same
  • Operating system: Ubuntu 18.04, but fluentd runs in a container on a single-node Kubernetes cluster (MicroK8s)

Your Configuration
Based on the tutorial mentioned above, I am using two config files to set up fluentd:

  1. fluentd-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: fluentd
  namespace: kube-system
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-system
  2. fluentd-daemonset.yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  # namespace: default
  labels:
    k8s-app: fluentd-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.4-debian-elasticsearch
        env:
          - name:  FLUENT_ELASTICSEARCH_HOST
            value: "elasticsearch.logging"
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          - name: FLUENT_UID
            value: "0"
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
avarf added the bug label on Aug 6, 2019
@repeatedly (Member)

When I checked the fluentd logs I saw that they are full of backslashes:

Is this log a single line? If so, it seems several logs are merged into one.
Can you show a configuration/application example to reproduce the problem?

@avarf (Author) commented Aug 8, 2019

No, the log is full of backslashes: there are single lines of actual log output and then pages of backslashes. I didn't want to copy all the meaningless backslashes, and when I searched the log for "error" there wasn't any.
Regarding the configuration, you already have all of it: I followed that tutorial, used the image and the environment variables that you can see in the yaml files, and I ran it on MicroK8s on Ubuntu 18.04.

@codesqueak

Any progress on this issue? I seem to have just hit exactly the same problem.

I use a slightly different setup, with

image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch

but otherwise substantially the same.

Looking at the logs, fluentd appears to be repeatedly reprocessing the same information: it objects to the format, which generates a new, longer log entry that is then reprocessed... and around we go. Each pass quotes the previous warning inside a new one and escapes its backslashes again, which is why the number of backslashes roughly doubles on every cycle.

@ebastos commented Dec 9, 2019

I have the same problem after following this tutorial, but using k3s as my kubernetes deployment.

If I strip the backslashes I can see something like:

# kubectl logs --tail=5 fluentd-48jkv -n kube-logging |tr -s "\\"
tr: warning: an unescaped backslash at end of string is not portable
\"
2019-12-09 20:23:29 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: "2019-12-09T20:23:24.66350503Z stdout F \"\"\""
2019-12-09 20:23:29 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: "2019-12-09T20:23:24.664147887Z stdout P 2019-12-09 20:23:24 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-12-09T20:23:21.243596958Z stdout P 2019-12-09 20:23:21 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-12-09T20:23:07.807619666Z stdout P 2019-12-09 20:23:07 +0000 [warn]: #0 [in_tail_container_logs] pattern not match: \"2019-12-09T20:23:01.152628152Z stdout F \"

But otherwise it's not even possible to see what is going on:

# kubectl logs --tail=5 fluentd-48jkv -n kube-logging |egrep -o '\\'|wc -l
32650

My fluentd.yaml is as follows:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  labels:
    app: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-logging
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
        env:
          - name:  FLUENT_ELASTICSEARCH_HOST
            value: "elasticsearch.kube-logging.svc.cluster.local"
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          - name: FLUENTD_SYSTEMD_CONF
            value: disable
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

@etegan commented Jan 13, 2020

Same issue. Does anyone have a solution for this?

@neoxack commented Jan 30, 2020

Same issue \\\\

@johnfedoruk

If your fluentd logs are growing in backslashes, then your fluentd container is parsing its own logs and recursively generating new logs.

Consider creating a fluentd-config.yaml file that is set up to ignore /var/log/containers/fluentd* logs. My example here will help you parse Apache logs... RTFM for more information on configuring sources.

Here is my fluentd-config.yaml file:

kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-config
  namespace: kube-logging
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  containers.input.conf: |-
    <source>
      @type tail
      @id in_tail_container_logs
      path /var/log/containers/*.log
      exclude_path ["/var/log/containers/fluentd*"]
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head true
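      # CRI-style prefix (timestamp, stream, F flag) followed by an Apache combined-log style request line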
      format /^.* (?<source>(stderr|stdout))\ F\ (?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
      time_format %d/%b/%Y:%H:%M:%S %z
    </source>
  output.conf: |-
    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      type kubernetes_metadata
    </filter>
    <match **>
       type elasticsearch
       log_level info
       include_tag_key true
       host elasticsearch.kube-logging.svc.cluster.local
       port 9200
       logstash_format true
       # Set the chunk limits.
       buffer_chunk_limit 2M
       buffer_queue_limit 8
       flush_interval 5s
       # Never wait longer than 5 minutes between retries.
       max_retry_wait 30
       # Disable the limit on the number of retries (retry forever).
       disable_retry_limit
       # Use multiple threads for processing.
       num_threads 2
    </match>

Then you will want to update your fluentd DaemonSet. I have had success with the gcr.io/google-containers/fluentd-elasticsearch:v2.0.1 image. Attach your fluentd-config to your fluentd DaemonSet.

Here's what that looks like:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: gcr.io/google-containers/fluentd-elasticsearch:v2.0.1
        env:
          - name: FLUENTD_SYSTEMD_CONF
            value: "disable"
          - name: FLUENTD_ARGS
            value: "--no-supervisor -q"
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlogcontainers
          mountPath: /var/log/containers
          readOnly: true
        - name: config
          mountPath: /etc/fluent/config.d
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlogcontainers
        hostPath:
          path: /var/log/containers/
      - name: config
        configMap:
          name: fluentd-config

Best of luck!
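
If you go this route, apply both manifests and then recreate the fluentd pods so they pick up the mounted config; assuming the file and resource names above, something like:

kubectl apply -f fluentd-config.yaml -f fluentd-daemonset.yaml
kubectl -n kube-logging rollout restart daemonset/fluentd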

@lecaros commented Oct 20, 2020

I see 2 possible concurrent causes:

  1. You're not excluding fluentd's own logs (hence the numerous '\' and the circular log messages).
  2. k3s prefixes each log line with a datetime, the stream (stdout, stderr) and a log tag. So, if your message is "hello Dolly", k3s will save it to the file as the line below (a field-by-field breakdown follows):
    2020-10-20T18:05:39.163671864-05:00 stdout F "hello Dolly"
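
Broken down field by field (annotation added for clarity; the daemonset image parses each line as Docker-style JSON by default, which is why a line like this triggers pattern not match):

2020-10-20T18:05:39.163671864-05:00  -> time (RFC3339 timestamp added by the runtime)
stdout                               -> stream
F                                    -> log tag (F = full line, P = partial line)
"hello Dolly"                        -> log (the original message)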

The pattern not match warnings explain why Kibana doesn't show these messages: since they fail to parse, they are never sent to your Elasticsearch service.

A proper filter/parser for this format would help here.
Can you post your fluentd conf?

@mariusgrigoriu

Is there a good way to still ship fluentd's own logs?

@mickdewald

I hit this issue as well because I was using containerd instead of Docker. I solved it by adding the following configuration:

- name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
  value: /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
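
For context, that variable is picked up by the daemonset image's tail source for container logs (in image versions that support it). Its effect is roughly equivalent to configuring the source yourself; here is a sketch of what that would look like in plain Fluentd configuration, combined with the exclude_path idea from earlier in this thread (illustrative, not the image's actual config file):

<source>
  @type tail
  @id in_tail_container_logs
  path /var/log/containers/*.log
  exclude_path ["/var/log/containers/fluentd*"]
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type regexp
    # containerd/CRI line: <time> <stream> <P|F tag> <message>; [^ ]* skips the tag without capturing it
    expression /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
  </parse>
</source>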

@108356037

@micktg
Your solution fixed my problem! Much appreciation!

@repeatedly (Member)

For the latest images, using the cri parser is better than a regexp: https://github.com/fluent/fluentd-kubernetes-daemonset#use-cri-parser-for-containerdcri-o-logs
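
If I read that README section right, on recent images that bundle the cri parser this boils down to a single env entry instead of a hand-written regexp (assuming your image version is new enough to ship the parser):

env:
  - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
    value: "cri"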

@varungupta19

I followed a DigitalOcean tutorial, https://www.digitalocean.com/community/tutorials/how-to-set-up-an-elasticsearch-fluentd-and-kibana-efk-logging-stack-on-kubernetes, to set up EFK for Kubernetes and faced the same issue. The answer above by @micktg resolved it. I added the two variables below to the environment variables in my fluentd yaml file, so now they look like this:

    env:
      - name:  FLUENT_ELASTICSEARCH_HOST
        value: "elasticsearch.kube-logging.svc.cluster.local"
      - name:  FLUENT_ELASTICSEARCH_PORT
        value: "9200"
      - name: FLUENT_ELASTICSEARCH_SCHEME
        value: "http"
      - name: FLUENTD_SYSTEMD_CONF
        value: disable
      - name: FLUENT_CONTAINER_TAIL_EXCLUDE_PATH
        value: /var/log/containers/fluent*
      - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
        value: /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/ 

@rawinng commented Aug 13, 2021

I found that the answers from @micktg and @varungupta19 solve the problem.

@jsvasquez

Thanks, @micktg and @varungupta19. Problem solved.

@Athuliva commented Apr 6, 2024

(quoting @varungupta19's comment and env block from above)

Adding value: /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/ didn't help me. I am trying to implement this on MicroK8s.

openstack-mirroring pushed a commit to openstack/openstack-helm-infra that referenced this issue Jul 9, 2024
+ prevent Fluentd from parsing its own logs and fix an issue with
  endless backslashes (fluent/fluentd#2545)
+ increase chunk limit size
+ add storage for systemd plugin configuration
+ add pos_file parameter for the tail sources

Change-Id: I7d6e54d2324e437c92e5e8197636bd6c54419167
openstack-mirroring pushed a commit to openstack/openstack that referenced this issue Jul 9, 2024
* Update openstack-helm-infra from branch 'master' to 01e66933b3c2b93c6677c04a00361ceeb70a9634
  - [fluentd] Adjust configuration for v1.15 (same change as above, Change-Id: I7d6e54d2324e437c92e5e8197636bd6c54419167)