Fluent Bit is deployed as a DaemonSet in Kubernetes, which runs one instance on every node in the cluster. New nodes added to the cluster automatically receive a Fluent Bit pod, so there is nothing to reconfigure as the cluster grows. Fluent Bit supports a variety of input, output, and filter plugins, depending on the sources, destinations, and parsers involved in log processing. For example, the Tail input plugin reads log events from one or more log files or containers, much like the UNIX tail -f command. On the output side, the GELF (Graylog Extended Log Format) output plugin sends logs in GELF format to the Graylog server. The Kubernetes filter plugin enriches logs with metadata such as the container name, container ID, pod name, namespace, labels, and annotations retrieved from the Kubernetes API server.
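
To illustrate how the Kubernetes filter decorates a record, a single log line picked up by the Tail input ends up looking roughly like the following before it is shipped to Graylog. The field names follow the filter's defaults and may vary slightly between Fluent Bit versions, and the values shown here are made-up examples:

{
  "log": "GET /healthz 200",
  "kubernetes": {
    "pod_name": "my-app-5d9c7b6c8-xk2lp",
    "namespace_name": "default",
    "container_name": "my-app",
    "labels": {
      "app": "my-app"
    }
  }
}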

Fluent Bit Installation

To install Fluent Bit, a Kubernetes cluster must be running, and kubectl must be installed and configured to access that cluster. Finally, you need a Graylog server set up with a GELF TCP input, with connectivity allowed from all Kubernetes nodes to the GELF TCP input port.
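
Before proceeding, it can help to confirm both prerequisites. The first command below checks kubectl access; the second, run from one of the Kubernetes nodes (if netcat is available), checks connectivity to the GELF TCP input. The port shown is only an example; 12201 is the conventional GELF port, but use whatever port your Graylog GELF TCP input actually listens on:

sh-4.2$ kubectl cluster-info
sh-4.2$ nc -vz <graylog-host> 12201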

Here are the steps to create and configure Fluent Bit as a Kubernetes DaemonSet.

1. Use kubectl to create the logging namespace, service account, and role-based access control configuration.

sh-4.2$ kubectl create namespace logging
sh-4.2$ kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-service-account.yaml
sh-4.2$ kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-role.yaml
sh-4.2$ kubectl create -f https://raw.githubusercontent.com/fluent/fluent-bit-kubernetes-logging/master/fluent-bit-role-binding.yaml
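
If you want to confirm that the RBAC objects were created, the following commands list them (the exact resource names come from the upstream manifests and may change over time):

sh-4.2$ kubectl get serviceaccount fluent-bit -n logging
sh-4.2$ kubectl get clusterrole,clusterrolebinding | grep fluent-bit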

2. Update the sample fluent-bit-cm.yaml file with the target Graylog server hostname and GELF TCP input port (a filled-in example of the output section appears after the ConfigMap below). Customize the Tail input plugin to fit your environment; note that in some scenarios the default Fluent Bit database name, flb_kube.db, can conflict with New Relic APM, which also relies on Fluent Bit for its monitoring, so the sample configuration uses flb_graylog.db instead. The filter plugin attributes can also be adjusted depending on which Kubernetes metadata should be included with the logs.

fluent-bit-cm.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
  labels:
    k8s-app: fluent-bit
data:
  # Configuration files: server, input, filters and output
  # ======================================================
  fluent-bit.conf: |
    [SERVICE]
        Flush                     1
        Log_Level                 info
        Daemon                    off
        Parsers_File              parsers.conf
        HTTP_Server               On
        HTTP_Listen               0.0.0.0
        HTTP_Port                 2020

    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-kubernetes.conf
    @INCLUDE output-graylog.conf

  input-kubernetes.conf: |
    [INPUT]
        Name               tail
        Tag                kube.*
        Path               /var/log/containers/*.log
        Parser             docker
        DB                 /var/log/flb_graylog.db
        DB.Sync            Normal
        Docker_Mode        On
        Buffer_Chunk_Size  512KB
        Buffer_Max_Size    5M
        Rotate_Wait        30
        Mem_Buf_Limit      30MB
        Skip_Long_Lines    On
        Refresh_Interval   10

  filter-kubernetes.conf: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Merge_Log           On
        Merge_Log_Key       log
        Keep_Log            Off
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off
        Annotations         Off
        Labels              On

  output-graylog.conf: |
    [OUTPUT]
        Name                    gelf
        Match                   *
        Host                    
        Port                    
        Mode                    tcp
        Gelf_Short_Message_Key  log 

  parsers.conf: |
    [PARSER]
        Name   apache
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache2
        Format regex
        Regex  ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   apache_error
        Format regex
        Regex  ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$

    [PARSER]
        Name   nginx
        Format regex
        Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name   json
        Format json
        Time_Key time
        Time_Format %d/%b/%Y:%H:%M:%S %z

    [PARSER]
        Name        docker
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L
        Time_Keep   On

    [PARSER]
        Name        syslog
        Format      regex
        Regex       ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key    time
        Time_Format %b %d %H:%M:%S

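For reference only, a filled-in GELF output section might look like the following once step 2 is complete. The hostname and port here are placeholders, not values from this guide; substitute your own Graylog server and the port of its GELF TCP input:

    [OUTPUT]
        Name                    gelf
        Match                   *
        Host                    graylog.example.com
        Port                    12201
        Mode                    tcp
        Gelf_Short_Message_Key  log
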
3. Create a Kubernetes ConfigMap named fluent-bit-config using the following command:

sh-4.2$ kubectl create -f fluent-bit-cm.yaml
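
You can optionally confirm that the ConfigMap exists and contains the expected configuration keys before moving on:

sh-4.2$ kubectl get configmap fluent-bit-config -n logging
sh-4.2$ kubectl describe configmap fluent-bit-config -n logging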

4. Update the fluent/fluent-bit image tag in the following YAML to the most recent version available on Docker Hub.

fluent-bit-graylog-ds.yaml

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
  labels:
    k8s-app: fluent-bit-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  selector:
    matchLabels:
      k8s-app: fluent-bit-logging
  template:
    metadata:
      labels:
        k8s-app: fluent-bit-logging
        version: v1
        kubernetes.io/cluster-service: "true"
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "2020"
        prometheus.io/path: /api/v1/metrics/prometheus
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:1.3.10
        imagePullPolicy: Always
        ports:
          - containerPort: 2020
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: fluent-bit-config
          mountPath: /fluent-bit/etc/
      terminationGracePeriodSeconds: 10
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: fluent-bit-config
        configMap:
          name: fluent-bit-config
      serviceAccountName: fluent-bit
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      - operator: "Exists"
        effect: "NoExecute"
      - operator: "Exists"
        effect: "NoSchedule"

5. Create a DaemonSet from fluent-bit-graylog-ds.yaml to deploy a Fluent Bit pod on every node in the Kubernetes cluster.

sh-4.2$ kubectl create -f fluent-bit-graylog-ds.yaml
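
Optionally, wait for the rollout to finish before checking individual pods:

sh-4.2$ kubectl rollout status daemonset/fluent-bit -n logging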

6. Verify that the fluent-bit pods are running in the logging namespace.

sh-4.2$ kubectl get po -o wide -n logging
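
There should be one fluent-bit pod per node, all in the Running state. If a pod is not healthy, its container logs usually show configuration or connectivity errors, for example:

sh-4.2$ kubectl logs -n logging -l k8s-app=fluent-bit-logging --tail=20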

7. Verify that Kubernetes logs and their metadata are being reported to the Graylog server instance.
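
One simple way to test the pipeline end to end is to start a throwaway pod that writes a recognizable message to stdout, then search for that message in Graylog. The pod name and message below are arbitrary examples:

sh-4.2$ kubectl run gelf-test --image=busybox --restart=Never -- echo "fluent-bit to graylog test message"
sh-4.2$ kubectl delete pod gelf-test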

Conclusion

Fluent Bit's main advantage is that it runs efficiently with very modest resource usage and high throughput compared to the heavier Fluentd log collector. It also provides plugins for most commonly used systems, including output plugins for Elasticsearch, Apache Kafka, and Splunk.
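
For example, pointing the same DaemonSet at Elasticsearch instead of Graylog is largely a matter of swapping the output section of the ConfigMap for something like the sketch below (the hostname is a placeholder, and the es plugin accepts additional options such as credentials and TLS settings that are omitted here):

    [OUTPUT]
        Name   es
        Match  *
        Host   elasticsearch.example.com
        Port   9200
        Index  fluent-bit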

If you have questions on how you can best leverage Kubernetes, or are looking for help with your Kubernetes-based implementation, please engage with us via comments on this blog post, or reach out to us here.

Additional Reading

Additional questions regarding Kubernetes? Check out Kubernetes – Container Health Checks or for AWS-specific details, take a look at Setting up EKS Cluster AutoScaler or The Definitive Guide to Setting Up Prometheus with Grafana Integration for EKS. If you have questions about cloud provisioning in general, run through our summary on the Top Ten Best Practices for Terraform Implementations.