Andrew Elwell
10/31/2025, 12:11 AM
[ 89%] Built target flb-plugin-custom_calyptia
[ 98%] Built target fluent-bit-shared
make[2]: *** No rule to make target 'backtrace-prefix/lib/libbacktrace.a', needed by 'bin/fluent-bit'. Stop.
make[1]: *** [CMakeFiles/Makefile2:10067: src/CMakeFiles/fluent-bit-bin.dir/all] Error 2
make: *** [Makefile:156: all] Error 2
aelwell@joey-02:~/compile/fluent-bit-4.1.1/build>
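A possible workaround, as a sketch: the missing 'backtrace-prefix/lib/libbacktrace.a' comes from Fluent Bit's FLB_BACKTRACE CMake option, so reconfiguring with it off may unblock the build (build directory taken from the prompt above):
cd ~/compile/fluent-bit-4.1.1/build
cmake -DFLB_BACKTRACE=Off ..
make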
dujas
11/01/2025, 11:17 AM
Joao Costa
11/03/2025, 9:51 AM
David
11/05/2025, 1:46 PM
Sujay
11/05/2025, 3:07 PM
storage.max_chunks_up: 1000
storage.backlog.mem_limit: 500M
storage.total_limit_size: "10G"
inputTail:
  Buffer_Chunk_Size: 300k
  Buffer_Max_Size: 500MB
  Ignore_Older: 10m
  Mem_Buf_Limit: 500MB
  storage.type: filesystem
  Rotate_Wait: "30"
  Refresh_Interval: "1"
But we started getting timeouts after this; it was unable to connect to fluentd, and we saw an increase in fluentbit retry metrics as well as fluentbit input metrics.
Below is the error, can someone help me out?
[2025/11/05 13:24:02] [error] [upstream] connection #1771 to tcp://172.20.193.46:24240 timed out after 10 seconds (connection timeout)
[2025/11/05 13:24:02] [error] [upstream] connection #1773 to tcp://172.20.193.46:24240 timed out after 10 seconds (connection timeout)
[2025/11/05 13:24:02] [error] [upstream]
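The 10-second limit in these errors matches Fluent Bit's default net.connect_timeout, which can be raised per output. A minimal sketch, assuming a classic-format forward output pointed at the host from the log:
[OUTPUT]
    Name                forward
    Match               *
    Host                172.20.193.46
    Port                24240
    net.connect_timeout 30s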
Stephan Wirth
11/06/2025, 12:16 PM
prometheus_remote_write and loki.
I have limited the memory usage of all inputs via mem_buf_limit: 56KB. When the output fails to flush, I can see that inputs are paused:
[2025/11/06 06:32:19] [ warn] [input] tail.1 paused (mem buf overlimit)
[2025/11/06 06:32:19] [ info] [input] pausing tail.1
[2025/11/06 06:33:01] [ info] [input] resume tail.1
[2025/11/06 06:33:01] [ info] [input] tail.1 resume (mem buf overlimit)
[2025/11/06 06:33:01] [ warn] [input] tail.1 paused (mem buf overlimit)
[2025/11/06 06:33:01] [ info] [input] pausing tail.1
[2025/11/06 06:33:09] [ info] [input] resume tail.1
However, the total memory usage of the fluent-bit process continues to rise. Checking storage metrics I can see that storage_layer.chunks.mem_chunks is continuously increasing. Is there a way to limit that?
In the Buffering and storage docs I find that storage.total_limit_size can be set for storage.type == filesystem but I can't find information on how to limit chunks in memory.
Thanks!
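One hedged option: storage.max_chunks_up is a service-level setting that caps how many chunks are held "up" in memory, but it only takes effect for inputs using storage.type filesystem. A minimal sketch, with a hypothetical buffer path:
service:
  storage.path: /var/lib/fluent-bit/buffer
  storage.max_chunks_up: 64
  storage.backlog.mem_limit: 16M
pipeline:
  inputs:
    - name: tail
      storage.type: filesystem
      mem_buf_limit: 56KB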
devsecops
11/06/2025, 1:19 PM
Andrew Elwell
11/07/2025, 5:10 AM- name: content_modifier
action: extract
key: 'message'
# print one line per reply, with time, IP, name, type, class, rcode, timetoresolve, fromcache and responsesize.
pattern: '^(?<source_ip>[^ ]+) (?<query_request>[^ ]+) (?<query_record_type>[^ ]+) (?<query_class>[^ ]+) (?<query_result>[^ ]+) (?<unbound_time_to_resolve>[^ ]+) (?<unbound_from_cache>[^ ]+) (?<query_response_length>.+)$'
condition:
op: and
rules:
- field: '$message_type'
op: eq
value: 'reply'
- name: content_modifier
action: convert
key: unbound_time_to_resolve
converted_type: int
- name: content_modifier
action: convert
key: query_response_length
converted_type: int
- name: content_modifier
action: convert
key: unbound_from_cache
converted_type: boolean
but I'm still getting
[3] log_unbound: [[1762491366.000000000, {}], {"process"=>"unbound", "pid"=>934, "tid"=>"1", "message_type"=>"reply", "message"=>"127.0.0.1 vmetrics1.pawsey.org.au. AAAA IN NOERROR 0.000000 1 109", "gim_event_type_code"=>"140200", "source_ip"=>"127.0.0.1", "query_request"=>"vmetrics1.pawsey.org.au.", "query_record_type"=>"AAAA", "query_class"=>"IN", "query_result"=>"NOERROR", "unbound_time_to_resolve"=>"0.000000", "unbound_from_cache"=>"1", "query_response_length"=>"109", "cluster"=>"DNS", "event_reporter"=>"ns0"}]
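Worth noting: unbound_time_to_resolve arrives as "0.000000", and converting a decimal string with converted_type: int may be what fails silently; a sketch of the same processor targeting double instead:
- name: content_modifier
  action: convert
  key: unbound_time_to_resolve
  converted_type: double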
gcol
11/07/2025, 7:49 AM
Alex
11/07/2025, 8:52 AM
[error] [output:opentelemetry:opentelemetry.3] could not flush records (http_do=-1)
With Log_Level debug (pos=5 or pos=0):
[debug] [yyjson->msgpack] read error code=6 msg=unexpected character, expected a JSON value pos=0
I don't see the error from the info log_level in the debug output for some reason (JFYI).
Questions:
1. Is forwarding OTLP traces from OpenTelemetry input to OpenTelemetry output supposed to work in Fluent Bit?
2. Should I use the Raw_Traces On parameter? (tried both with and without - same errors)
[INPUT]
    Name         opentelemetry
    Listen       0.0.0.0
    Port         4318
    Tag          otel
    Tag_From_Uri Off

[OUTPUT]
    Name         opentelemetry
    Match        otel
    Host         ${OSIS_PIPELINE_HOST}
    Port         443
    Traces_uri   /v1/traces
    Tls          On
    Tls.verify   On
    aws_auth     On
    aws_service  osis
    aws_region   ${AWS_REGION}
Thanks in advance! 🙏
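To isolate input-side decoding from the output path, one sketch is to post a hand-built OTLP/HTTP JSON trace straight to the local input (the trace/span IDs and timestamps below are made up):
curl -s -X POST http://127.0.0.1:4318/v1/traces \
  -H 'Content-Type: application/json' \
  -d '{"resourceSpans":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"curl-test"}}]},"scopeSpans":[{"spans":[{"traceId":"5b8efff798038103d269b633813fc60c","spanId":"eee19b7ec3c1b174","name":"test-span","kind":1,"startTimeUnixNano":"1700000000000000000","endTimeUnixNano":"1700000001000000000"}]}]}]}'
If that flows through cleanly, the yyjson error likely points at the sender's body encoding (gzip or protobuf) rather than at the output plugin.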
Dean Meehan
11/07/2025, 2:14 PM
fields.service_name to OTEL Resource: service.name
Fluentbit Tail: {"event": "my log message", "fields": {"service_name": "my_service", "datacenter": "eu-west"}}
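If the Fluent Bit version in use supports OTel contexts in processors (YAML config format required), a minimal sketch of setting a resource attribute via content_modifier; note the value here is static, so copying it dynamically from $fields['service_name'] may need a Lua processor instead, and the path is hypothetical:
pipeline:
  inputs:
    - name: tail
      path: /var/log/app.log
      processors:
        logs:
          - name: content_modifier
            context: otel_resource_attributes
            action: upsert
            key: service.name
            value: my_service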
Gmo1492
11/07/2025, 4:39 PM
Reading state information... Done
E: Unable to locate package fluent-bit
[fluent-bit][error] APT install failed (vendor repo unreachable and Ubuntu archive install failed).
Gmo1492
11/07/2025, 4:42 PM
apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)). (My setup was outdated.) Once I changed it to follow the latest install docs, it fails.
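For reference, the keyring-based setup from the current install docs looks roughly like this (a sketch; it cannot help while packages.fluentbit.io itself returns 522, as in the log below):
curl -fsSL https://packages.fluentbit.io/fluentbit.key | gpg --dearmor -o /usr/share/keyrings/fluentbit-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/fluentbit-keyring.gpg] https://packages.fluentbit.io/ubuntu/noble noble main" > /etc/apt/sources.list.d/fluent-bit.list
apt-get update && apt-get install -y fluent-bit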
Scott Bisker
11/07/2025, 4:46 PM
Gmo1492
11/07/2025, 4:48 PM
Jason A
11/07/2025, 5:38 PM
amazon-ebs: E: Failed to fetch https://packages.fluentbit.io/ubuntu/noble/dists/noble/InRelease 522
amazon-ebs: E: The repository 'https://packages.fluentbit.io/ubuntu/noble noble InRelease' is no longer signed.
amazon-ebs: N: Updating from such a repository can't be done securely, and is therefore disabled by default.
amazon-ebs: N: See apt-secure(8) manpage for repository creation and user configuration details.
==> amazon-ebs: Provisioning step had errors: Running the cleanup provis
Scott Bisker
11/07/2025, 6:19 PM
Jason A
11/07/2025, 6:21 PM
Josh
11/07/2025, 6:52 PM
Gmo1492
11/07/2025, 7:06 PM
Celalettin
11/07/2025, 7:23 PM
Saksham
11/10/2025, 8:23 AM
Bryson Edwards
11/10/2025, 10:49 PM{
"namespace": "test"
}
i would want:
{
"some_new_field": "some_new_value",
"spec": {
"namespace": "test"
}
}Michael Marshall
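A minimal sketch of one way to get that shape with the stock nest and modify filters (field names taken from the example above):
[FILTER]
    Name       nest
    Match      *
    Operation  nest
    Wildcard   namespace
    Nest_under spec

[FILTER]
    Name  modify
    Match *
    Add   some_new_field some_new_value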
Michael Marshall
11/11/2025, 9:51 PM
Post "http://192.168.141.95:9880/services/collector": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
I have tried lots of options and configurations, but my current one is cli based:
root@ip-192-168-141-95:~# /opt/fluent-bit/bin/fluent-bit -i splunk -p port=9880 -p buffer_chunk_size=1024 -p buffer_max_size=32M -p tag=splunk.logs -p net.io_timeout=300s -o stdout -p match=splunk.logs -vv
which is producing:
[2025/11/11 21:44:09.381347930] [trace] [io] connection OK
[2025/11/11 21:44:09.381397730] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:09.381863699] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.381894442] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382594157] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382625300] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382642132] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382657844] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382674014] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382684183] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382700140] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382710296] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382724162] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382734559] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382748716] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382759216] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382772032] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382782780] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382796156] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382805907] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382818906] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382828814] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382843934] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:09.382853034] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382863254] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382878776] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382888383] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382908014] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382918664] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382933485] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382943435] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382961527] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.382972431] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.382990641] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.383000808] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.383026942] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.383042965] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.383060552] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.383070761] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.383085467] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.383097179] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.383111593] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.383120958] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.383137668] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:09.383146008] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.383157180] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.383170359] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.383179843] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:09.383193275] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.383203629] [trace] [io coro=(nil)] [net_read] ret=706
[2025/11/11 21:44:09.383216611] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:09.431509514] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:09.681537238] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:09.681554644] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:09.879452869] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:09.879531281] [trace] [io coro=(nil)] [net_read] ret=0
[2025/11/11 21:44:09.879549725] [trace] [downstream] destroy connection #48 to <tcp://192.168.141.95:46304>
[2025/11/11 21:44:09.879621135] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:09.931509675] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:10.95119333] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:10.95162342] [trace] [io coro=(nil)] [net_read] ret=0
[2025/11/11 21:44:10.95179536] [trace] [downstream] destroy connection #49 to <tcp://192.168.141.95:46314>
[2025/11/11 21:44:10.95247475] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:10.181511800] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:10.431508128] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:10.681546565] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:10.681585263] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:10.931508179] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:11.181514100] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:11.431510732] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:11.681539544] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:11.931508704] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:12.173087049] [trace] [io] connection OK
[2025/11/11 21:44:12.173199150] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:12.173810559] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:12.173841862] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:12.173872772] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:12.173883888] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:12.173898853] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:12.173909280] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:12.173923156] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:12.173933024] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:12.173946124] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:12.173955163] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:12.173967800] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:12.173977479] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:12.173989628] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:12.173999144] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:12.174096070] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:12.174110901] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:12.174203854] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:12.174395146] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:12.174415522] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:12.174426114] [trace] [io coro=(nil)] [net_read] ret=1024
[2025/11/11 21:44:12.174435379] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:12.174441221] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:12.174447781] [trace] [io coro=(nil)] [net_read] ret=314
[2025/11/11 21:44:12.174457878] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:12.181508649] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:12.181507560] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:12.430735078] [trace] [io coro=(nil)] [net_read] try up to 1024 bytes
[2025/11/11 21:44:12.430779926] [trace] [io coro=(nil)] [net_read] ret=0
[2025/11/11 21:44:12.430796710] [trace] [downstream] destroy connection #52 to <tcp://192.168.141.95:46322>
[2025/11/11 21:44:12.430866695] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:12.431506047] [trace] [sched] 0 timer coroutines destroyed
[2025/11/11 21:44:12.681535932] [trace] [sched] 0 timer coroutines destroyed
Any ideas?
When I switched it to tcp, I get:
______ _ _ ______ _ _ ___ __
| ___| | | | | ___ (_) | / | / |
| |_ | |_ _ ___ _ __ | |_ | |_/ /_| |_ __ __/ /| | `| |
| _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| | | |
| | | | |_| | __/ | | | |_ | |_/ / | |_ \ V /\___ |__| |_
\_| |_|\__,_|\___|_| |_|\__| \____/|_|\__| \_/ |_(_)___/
[2025/11/11 21:49:12.454217350] [ info] [fluent bit] version=4.1.1, commit=, pid=7654
[2025/11/11 21:49:12.454345650] [ info] [storage] ver=1.5.3, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/11/11 21:49:12.454355937] [ info] [simd ] SSE2
[2025/11/11 21:49:12.454363428] [ info] [cmetrics] version=1.0.5
[2025/11/11 21:49:12.454371187] [ info] [ctraces ] version=0.6.6
[2025/11/11 21:49:12.454441883] [ info] [input:tcp:tcp.0] initializing
[2025/11/11 21:49:12.454450891] [ info] [input:tcp:tcp.0] storage_strategy='memory' (memory only)
[2025/11/11 21:49:12.455168829] [ info] [sp] stream processor started
[2025/11/11 21:49:12.455347140] [ info] [output:stdout:stdout.0] worker #0 started
[2025/11/11 21:49:12.455396357] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2
"}] tcp.0: [[1762897752.520261984, {}], {"log"=>"POST /services/collector HTTP/1.1
"}] tcp.0: [[1762897752.520272277, {}], {"log"=>"Host: 192.168.141.95:9880
"}] tcp.0: [[1762897752.520273812, {}], {"log"=>"User-Agent: OpenTelemetry Collector Contrib/11f9362e
"}] tcp.0: [[1762897752.520275124, {}], {"log"=>"Content-Length: 44970
"}] tcp.0: [[1762897752.520276343, {}], {"log"=>"Authorization: Splunk my_token
"}] tcp.0: [[1762897752.520277527, {}], {"log"=>"Connection: keep-alive
"}] tcp.0: [[1762897752.520278816, {}], {"log"=>"Content-Encoding: gzip
"}] tcp.0: [[1762897752.520280153, {}], {"log"=>"Content-Type: application/json
"}] tcp.0: [[1762897752.520281350, {}], {"log"=>"__splunk_app_name: OpenTelemetry Collector Contrib
"}] tcp.0: [[1762897752.520282527, {}], {"log"=>"__splunk_app_version:
"}]] tcp.0: [[1762897752.520283955, {}], {"log"=>"Accept-Encoding: gzip
"}]] tcp.0: [[1762897752.520285037, {}], {"log"=>"Connection: close
"}]] tcp.0: [[1762897752.520286276, {}], {"log"=>"Michael Marshall
Michael Marshall
11/11/2025, 9:52 PM
Victor Nilsson
11/12/2025, 2:02 PM
---
pipeline:
  inputs:
    - name: systemd
      tag: systemd.*
      read_from_tail: on
      threaded: true
      lowercase: on
      db: /fluent-bit/db/systemd.db
      storage.type: memory  # Filesystem buffering is not needed for tail input since the files are stored locally.
      mem_buf_limit: 250M
      alias: in_systemd
We have set db as well as read_from_tail: on, so our thinking was that the fluent-bit container should not resend already-processed logs. Is this true?
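A caveat worth checking: the cursor stored in systemd.db only prevents resending if /fluent-bit/db survives container restarts. A sketch of a hypothetical Kubernetes mount keeping it on the node (names illustrative):
volumeMounts:
  - name: fluent-bit-db
    mountPath: /fluent-bit/db
volumes:
  - name: fluent-bit-db
    hostPath:
      path: /var/lib/fluent-bit-db
      type: DirectoryOrCreate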