Logstash lost data during log rotate #214

Closed
Tsukiand opened this issue Sep 17, 2018 · 4 comments

Comments

Tsukiand commented Sep 17, 2018

I am using logstash-input-file (4.1.4) to ingest data from a file, and I found data loss during log rotation.
logrotate is set to keep 3 rotated files, and rotation happens when the file grows over 1k.

My configuration of log rotate:
{
    missingok
    size 1k
    notifempty
    sharedscripts
    rotate 3
}

My script to generate log and rotate:
for (( i=1 ; i <= 100000; i++ ))
do
    echo "$i this is a bunch of test data blah blah" >> /tmp/log/test

    if ! ((i % 1000)); then
        sleep 1
    fi

    if ! ((i % 30000 || i == 100000)); then
        /usr/sbin/logrotate -f /etc/logrotate.d/test &
    fi
done

My configuration of logstash:
input {
  file {
    path => "/tmp/log/test*"
  }
}

output {
  file {
    path => "/tmp/output.txt"
    codec => line { format => "custom format: %{message}" }
  }
}

Data loss happened as follows:
I found that the loss occurs when the new "log" file is created. I checked the source code and found that the beginning of the new "log" file is skipped (the seek operation in create_initial.rb causes this). This means that logs written to the "log" file during rotation are lost.
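One way to quantify the loss is to look for gaps in the sequence numbers that the generator script writes at the start of each line. The following is a minimal sketch, assuming the output lines keep the "custom format: N ..." shape produced by the configuration above; the helper name missing_sequence_numbers is hypothetical and not part of Logstash.

```ruby
# Hypothetical helper (not part of Logstash): given the lines written by the
# file output, return the sequence numbers in 1..expected_max that never appear.
def missing_sequence_numbers(lines, expected_max)
  # Extract the leading number after the "custom format: " prefix, if present.
  seen = lines.map { |line| line[/\Acustom format: (\d+) /, 1] }.compact.map(&:to_i)
  (1..expected_max).to_a - seen
end

# Small in-memory sample standing in for /tmp/output.txt.
sample = [
  "custom format: 1 this is a bunch of test data blah blah",
  "custom format: 2 this is a bunch of test data blah blah",
  "custom format: 5 this is a bunch of test data blah blah",
]

missing = missing_sequence_numbers(sample, 5)
puts missing.inspect
```

Against the real run, the same helper could be fed File.readlines("/tmp/output.txt") with an expected maximum of 100000.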

Please give me some advice on this issue.

Thanks,
Tsukiand

@Tsukiand Tsukiand reopened this Sep 17, 2018
@lrbsunday
Copy link

I got the same issue, any ideas?

Tsukiand commented Sep 26, 2018

@lrbsunday I have tested with the input path set to both "/tmp/log/test" and "/tmp/log/test*". Data is lost in both cases.
With "/tmp/log/test", filewatch only monitors the "test" file, and Logstash loses data during file rotation.
With "/tmp/log/test*", filewatch monitors "test", "test.1" and "test.2". Logstash does not lose data from the rotated files, but data is still lost when the new "test" file is created. I will explain the data loss:

  1. We have test, test.1, test.2 and test.3.
  2. File rotation happens.
    2.1 File rotation 1: test.2 is renamed to test.3 (the new test.3 rotates from the old test.2, no data loss)
    2.2 File rotation 2: test.1 is renamed to test.2 (the new test.2 rotates from the old test.1, no data loss)
    2.3 File rotation 3: test is renamed to test.1 (the new test.1 rotates from the old test, no data loss)
    2.4 File rotation 4: a new test is generated (because the old test was renamed to test.1, the watched_file changed, so the new test is rotated as an initial file and the read position seeks to its current size. This seek operation results in data loss)
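The effect of step 2.4 can be reproduced outside Logstash. The sketch below is an assumption-laden illustration, not the plugin's code: it uses a temporary file and a hypothetical read_from helper to compare reading from the size the file had when it was first noticed against reading from byte 0.

```ruby
require "tempfile"

# Hypothetical helper: return the lines of `path` starting at byte `offset`.
def read_from(path, offset)
  File.open(path, "r") do |io|
    io.seek(offset)
    io.read.lines.map(&:chomp)
  end
end

new_log = Tempfile.new("test")

# Lines appended after rotation but before the watcher notices the new file.
new_log.write("1 written before discovery\n2 written before discovery\n")
new_log.flush

# Treating the new file as an "initial" file means seeking to its current size.
size_at_discovery = File.size(new_log.path)

# A line appended after the watcher has noticed the file.
new_log.write("3 written after discovery\n")
new_log.flush

seek_to_size = read_from(new_log.path, size_at_discovery)
from_start   = read_from(new_log.path, 0)

puts "seek to size: #{seek_to_size.inspect}"
puts "from start:   #{from_start.inspect}"

new_log.close
new_log.unlink
```

Only the line written after discovery survives when the read position starts at the discovered size; the two earlier lines are the ones that go missing.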

I have added a flag (:rotate_flag) in "watched_file.rb" to avoid the data loss, but I am not sure whether my change will cause other issues. Maybe you can give me some advice.

attr_reader :bytes_read, :state, :file, :buffer, :recent_states, :bytes_unread, :rotate_flag
attr_reader :path, :accessed_at, :modified_at, :pathname, :filename
attr_reader :listener, :read_loop_count, :read_chunk_size, :stat
attr_reader :loop_count_type, :loop_count_mode
attr_accessor :last_open_warning_at

def initialize(pathname, stat, settings)
  @settings = settings
  @pathname = Pathname.new(pathname)
  @path = @pathname.to_path
  @filename = @pathname.basename.to_s
  full_state_reset(stat)
  watch
  set_standard_read_loop
  set_accessed_at
  # added: false until another watched file rotates away from this path
  @rotate_flag = false
end

# added: true once this path has been the source of a rotation
def flag?
  @rotate_flag
end

# added: mark this path as the source of a rotation
def set_flag
  @rotate_flag = true
end

def rotate_from(other)
  # move all state from other to this one
  set_standard_read_loop
  file_close
  @bytes_read = other.bytes_read
  @bytes_unread = other.bytes_unread
  @listener = nil
  @initial = false
  @recent_states = other.recent_states
  @accessed_at = other.accessed_at
  if !other.delayed_delete?
    # we don't know if a file exists at the other.path yet
    # so no reset
    other.full_state_reset
    # added: remember that a new file at other.path follows a rotation
    other.set_flag
  end
  set_stat PathStatClass.new(pathname)
  ignore
end

def rotate_as_initial_file
  # rotation, when no sincedb record exists for new inode - we have never seen this inode before.
  rotate_as_file
  # changed: only treat the file as initial (seek to current size) when it does not follow a rotation
  if !flag?
    @initial = true
  end
  #@initial = true
end

@guyboertje
Contributor

The temporary workaround is to use start_position => "beginning", as this forces processing to start at the beginning of the latest file.
However, it is a bug that this should be necessary. The docs say "If you have old data you want to import, set this to 'beginning'", but clearly this is not old data.
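Applied to the input from the original report, the workaround would look roughly like the fragment below. This is a sketch that reuses the reporter's path; only the start_position line is new, and it only affects files that Logstash has no sincedb record for yet.

```
input {
  file {
    path => "/tmp/log/test*"
    # Workaround: read newly discovered files from byte 0 instead of tailing from their current size.
    start_position => "beginning"
  }
}
```

Files that already have a sincedb entry should continue from their recorded position, so this setting mainly changes how newly created files are picked up.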

Tsukiand commented Oct 8, 2018

@guyboertje

Thanks for your reply. I have tested with start_position => "beginning" and it works. But as you said, it is a temporary workaround. Maybe we need a fix.

guyboertje pushed a commit to guyboertje/logstash-input-file that referenced this issue Oct 25, 2018
guyboertje pushed a commit that referenced this issue Oct 29, 2018
#217)

* Force all files under rotation to start at 0 or at the sincedb record.
* Update travis.yml to update versions.

Fixes #214
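The rule stated in the commit message, "start at 0 or at the sincedb record", can be pictured with a small model. This is not the code from the commit: the sincedb is reduced here to a plain Hash from inode to bytes already read, and start_offset_for_rotated_file is a hypothetical name.

```ruby
# Hypothetical model: the sincedb as a Hash mapping inode => bytes already read.
def start_offset_for_rotated_file(sincedb, inode)
  # Resume from the recorded position if this inode was seen before,
  # otherwise read the file from byte 0 instead of seeking to its end.
  sincedb.fetch(inode, 0)
end

sincedb = { 1001 => 4096 }

known = start_offset_for_rotated_file(sincedb, 1001) # inode with a record
fresh = start_offset_for_rotated_file(sincedb, 2002) # inode never seen before

puts "known inode starts at #{known}, fresh inode starts at #{fresh}"
```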