Skip to content

WIP Use java DatagramSocket directly#38

Open
andrewvc wants to merge 3 commits into
logstash-plugins:mainfrom
andrewvc:use-java-udp-api
Open

WIP Use java DatagramSocket directly#38
andrewvc wants to merge 3 commits into
logstash-plugins:mainfrom
andrewvc:use-java-udp-api

Conversation

@andrewvc

Copy link
Copy Markdown

Attempts to address elastic/logstash#9346 by just using the Java UDP API directly. The issue there is that JRuby's implementation uses too much memory, the underlying ruby strings returned from the UDP socket use up as much memory as the UDP buffer size, regardless of how big the actual packet is.

I would like to benchmark this and make one more pass before this is ready for review.

@praseodym

Copy link
Copy Markdown
Contributor

I haven’t tested or benchmarked this PR, but it might be worthwhile to use Netty instead of the Java socket API. Other plugins like logstash-input-tcp already do.

(Also, this would be so much simpler once the Logstash Java API is available!)

@andrewvc

andrewvc commented Apr 20, 2018 via email

Copy link
Copy Markdown
Author

@IrlJidel

Copy link
Copy Markdown
Contributor

You need to assign to rcvbuf

@andrewvc

andrewvc commented Apr 20, 2018 via email

Copy link
Copy Markdown
Author

@andrewvc

Copy link
Copy Markdown
Author

@IrlJidel good catch on rcvbuf, that should be fixed now. I think this needs another pass from me this week to double check java exception handling and also to add more tests. That error, for instance didn't fail the test suite.

@marioplumbarius

Copy link
Copy Markdown

Let me know when you think it's OK, so I can run perf tests using this patch and see the results before releasing it.

@logger.warn("Unable to set receive_buffer_bytes to desired size. Requested #{@receive_buffer_bytes} but obtained #{rcvbuf} bytes.")
@udp.setReceiveBufferSize(@receive_buffer_bytes)

actual_receive_buffer_bytes = @udp.getReceiveBufferBytes

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be @udp.getReceiveBufferSize


@udp.bind(@host, @port)
@logger.info("UDP listener started", :address => "#{@host}:#{@port}", :receive_buffer_bytes => "#{rcvbuf}", :queue_size => "#{@queue_size}")
@logger.info("UDP listener started", :address => "#{@host}:#{@port}", :receive_buffer_bytes => "#{@receive_buffer_bytes}", :queue_size => "#{@queue_size}")

@IrlJidel IrlJidel Apr 23, 2018

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep to previous behavior we should be logging using actual_receive_buffer_bytes rather than @receive_buffer_bytes

@jsvd

jsvd commented Apr 23, 2018

Copy link
Copy Markdown
Member

I'd like to attempt a fix of the issue with JRuby instead of going directly to a rewrite of the plugin. For reference, a workaround was proposed by @original-brownbear here.

That said I'm +1 on exploring a more native implementation, just not for the reason of fixing the memory usage issue.

@IrlJidel

Copy link
Copy Markdown
Contributor

jruby RubyString doesn't have a .b method

@andrewvc

andrewvc commented Apr 23, 2018 via email

Copy link
Copy Markdown
Author

@IrlJidel

Copy link
Copy Markdown
Contributor

I put the perf test results in elastic/logstash#9346 (comment) but might make more sense to have them here:

tested on LS 5.6.3.

Tests were run for 15minutes with 300 byte message.

buffer_size Patch 3.3.2
1K 52k eps 51k eps
64K 51k eps 40k eps *
  • 3.3.2 with 64K buffer_size udp->main cpu usage was 92%, for all other cases was 46%

Couldn't test using @original-brownbear suggestion to use .b as this method doesn't appear to be available in jruby

@jsvd

jsvd commented Apr 23, 2018

Copy link
Copy Markdown
Member

If you think it's not worthwhile however I'm glad to abandon it.

I seem to have explained myself badly for you arrive at this statement. I think it's worth exploring a more native
and Java based approach, the same way we've been doing for all other plugins like tcp, http, date, etc.

What I want to alert to is that it's different to review and accept a PR that rewrites the critical path of the plugin vs one that fixes an issue by changing a small part (how the data read from the buffer is handled).
The test suite for this plugin isn't great as you mentioned, which doesn't give me any confidence that a change like this won't be worse (or better) for the many use cases (the list is left as an exercise for the reader) of this plugin. The only guarantee we have is that the things that are tested work and that many many people that use the plugin as-is, and are hitting a known list of bugs seen in this repo's issues list.

To sum up, for this PR to be the solution for the fix, it needs to go to extra lengths to provide enough confidence that current use cases are not worse overall (if it improves some or even all, great!).

@andrewvc

Copy link
Copy Markdown
Author

@jsvd those are all valid concerns. I have a lot of irons in the fire and likely won't be able to address things in that detail given time concerns.

I'm going to leave this PR open, but I won't do further work on it.

@jsvd

jsvd commented Apr 23, 2018

Copy link
Copy Markdown
Member

Just for reference, I tried running this on logstash 6.2.4 and got an exception when setting the host field in the event:

[2018-04-23T17:16:36,493][ERROR][logstash.inputs.udp      ] Exception in inputworker {"exception"=>org.logstash.MissingConverterException: Missing Converter handling for full class name=java.net.Inet4Address, simple name=Inet4Address, "backtrace"=>["org.logstash.Valuefier.fallbackConvert(Valuefier.java:97)", "org.logstash.Valuefier.convert(Valuefier.java:75)", "org.logstash.Valuefier.lambda$static$3(Valuefier.java:51)", "org.logstash.Valuefier.convert(Valuefier.java:73)", "org.logstash.ext.JrubyEventExtLibrary$RubyEvent.ruby_set_field(JrubyEventExtLibrary.java:95)", "org.logstash.ext.JrubyEventExtLibrary$RubyEvent$INVOKER$i$2$0$ruby_set_field.call(JrubyEventExtLibrary$RubyEvent$INVOKER$i$2$0$ruby_set_field.gen)", "org.jruby.internal.runtime.methods.JavaMethod$JavaMethodN.call(JavaMethod.java:741)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:145)", "Users.joaoduarte.projects.logstash_plugins.logstash_minus_input_minus_udp.lib.logstash.inputs.udp.RUBY$method$push_decoded_event$0(/Users/joaoduarte/projects/logstash_plugins/logstash-input-udp/lib/logstash/inputs/udp.rb:155)", "Users.joaoduarte.projects.logstash_plugins.logstash_minus_input_minus_udp.lib.logstash.inputs.udp.RUBY$method$push_decoded_event$0$__VARARGS__(/Users/joaoduarte/projects/logstash_plugins/logstash-input-udp/lib/logstash/inputs/udp.rb)", "org.jruby.internal.runtime.methods.CompiledIRMethod.call(CompiledIRMethod.java:77)", "org.jruby.internal.runtime.methods.MixedModeIRMethod.call(MixedModeIRMethod.java:93)", "org.jruby.ir.targets.InvokeSite.invoke(InvokeSite.java:145)", 

@andrewvc

Copy link
Copy Markdown
Author

We should, by the way, aim for a minimal refactor, re-initializing the string somehow to free that memory.

@jsvd

jsvd commented Apr 23, 2018

Copy link
Copy Markdown
Member

Also, for reference, I created an issue in JRuby about this: jruby/jruby#5148

@praseodym

Copy link
Copy Markdown
Contributor

We could also simply print a warning log message stating that using the default buffer size is bad for peformance. Or, if we'd want to get better out-of-the-box performance, decrease the default buffer size to 1472 and add a warning message that the buffer size should be set explicitly to the maximum message length that the user expects.

@jsvd

jsvd commented May 2, 2018

Copy link
Copy Markdown
Member

I created a PR that works around the memory issue #39

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants