# dev
c
will have a look, any idea on the right way to fix this?
x
not sure, I didn't look in detail at what is causing the massive allocations
I have a feeling that some of the json parsing is getting evaluated lazily
c
yeah flattener stuff is all lazy, which is why it wasn’t working with the wrapper
i guess i did this before i did the new type-aware schema discovery stuff, so maybe i have a better way to do it now
hmm, though i sort of wonder if the json input format (and all of the other input formats that support nested data such as avro, orc, parquet) would look something like this if used directly since most of the time actually seems to be in the flattener stuff
hmm, maybe finally need to do the thing where we actually resolve all of the columns that are required from the input reader for dims/transforms/aggs and just eagerly convert them all to a plain map, will do some experiments
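something like this rough sketch is what i mean (plain Jackson, hypothetical names, not the actual code):
```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EagerFlattenSketch
{
  private static final ObjectMapper MAPPER = new ObjectMapper();

  // resolve each column required by dims/transforms/aggs exactly once,
  // copying into a plain map so nothing downstream touches the parsed tree
  static Map<String, Object> flattenEagerly(byte[] payload, List<String> requiredColumns) throws Exception
  {
    final JsonNode root = MAPPER.readTree(payload);
    final Map<String, Object> row = new HashMap<>();
    for (String column : requiredColumns) {
      // a real flattener would evaluate a path expression here instead of a top-level get
      final JsonNode node = root.get(column);
      row.put(column, node == null || node.isNull() ? null : MAPPER.convertValue(node, Object.class));
    }
    return row; // root is no longer referenced and can be collected
  }
}
```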
thanks for sharing 👍
x
I can share the full profiles if that helps
I can send inputSpecs privately if you need to
c
i think im ok, but will ping you if needed
do you define a schema or use schema discovery?
x
we define a schema
c
cool
x
useFieldDiscovery is true on the flattenspec, but we also specify fields directly for nested ones
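roughly this shape, with made-up field names (not our actual spec):
```json
{
  "useFieldDiscovery": true,
  "fields": [
    { "type": "path", "name": "userId", "expr": "$.payload.user.id" },
    { "type": "path", "name": "country", "expr": "$.payload.geo.country" }
  ]
}
```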
c
i wonder how many events i need to see a similar profile, how big was this stream in the profile?
i suppose it probably doesn’t take that much to make the flattener stand out though, it’s so expensive
x
not that big, 20k events/s
c
cool, i think that should be enough info, thanks, will reach out if i need anything else
so, it does look like the ‘regression’ profiles the same way as using the input format directly
using json input format directly
using kafka input format in master
reverted commit in question
going to repeat the reverted measurement again soon just to make sure
repeat looks same-ish as last run
so like… the interesting thing here, wall-time-wise the reverted commit runs took like 30-40 seconds longer to process the same number of rows
so it is perhaps only a regression for some schema shapes
my flattenSpec on my generated data
and dims
it’s fairly flatten-heavy i guess, idk how that compares to your schema
my screenshots were cpu time, which makes sense that they would be different given that the reverted commit makes the flattening eager
looking at memory allocations, it doesn’t seem to be that much extra data, but it is split between parse and add methods
well i guess it’s a decent chunk extra
maybe 10-20gb or so (slight variation between runs)
reverted allocations
plain json input format:
another view of plain json format
here is what i mean about wall-time
kafka input format in master:
kafka input format with revert
i’m not sure why that is the case, could be related to my test data schema
i repeated tests to try to get same-ish results and they were pretty consistent
my test data was that above schema with 10m generated rows
and i just let them run until processed all the rows and then shut off profiler without forcing a persist
i should probably repeat the measurements but suspend the supervisor at the end of the run, and see if the lazy nature of using the flattener-backed map instead of eager copying is related to why it seemed faster without the revert
will do that experiment in a bit
anyway, so far have at least confirmed my suspicions that my changes made the kafka reader perform the same as the underlying reader
so the regression perhaps is more of a case of uncovering a performance issue/difference in the underlying stuff, so the “fix” will likely be changes to how flattening works rather than a fix specific to the kafka reader
and i think it means my change is basically “correct” since it’s delegating to the underlying reader, just the underlying reader could be better
if that makes sense
btw, suspending the task at the end of the captures results in them having approximately the same total time, just spent in different places
reverted
master
allocated size was actually smaller for some reason on master than reverted on this run
x
We saw massive GC pressure with the change compared to before. It might be an artifact of our spec but the difference was large
The flame graphs I posted in GitHub were allocation profiles, not cpu profiles.
c
is that different from the ‘memory allocations’ view when profiling with intellij?
x
No, probably the same, assuming it uses async-profiler.
c
could you share your spec so i can compare to mine and adjust my test one to try to reproduce? i didn’t really have any aggs or transforms on mine, so having those might blow up the cost of the flattener due to repeated reads of the same value
though i’m certain there is a difference between eager and lazy flattening, maybe it’s enough evidence to just always eagerly flatten
just being cautious since the same pattern is used by avro/json/orc/parquet/protobuf so the right way to do this will probably impact all of them
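to illustrate the repeated-read thing (made-up shapes, just the pattern, not the real code):
```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// illustrative only: with a lazy row, every get() re-runs path evaluation,
// so a value read by a dim, a transform, and an agg is flattened three times;
// an eager copy pays the flattening cost exactly once per field
class LazyVsEagerSketch
{
  static Object lazyRead(Function<String, Object> evaluatePath, String field)
  {
    return evaluatePath.apply(field); // cost paid on every read
  }

  static Map<String, Object> eagerCopy(Function<String, Object> evaluatePath, Iterable<String> fields)
  {
    final Map<String, Object> row = new HashMap<>();
    for (String field : fields) {
      row.put(field, evaluatePath.apply(field)); // cost paid once per field
    }
    return row;
  }
}
```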
x
We have fairly large nested payloads of which we only keep a relatively small number of nested fields.
c
ah yeah deeper nesting would also probably exaggerate things compared to my run
since it has to recreate the stuff all along the path for each flattened value
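e.g. something like this, where every field evaluation starts over from the root (hypothetical helper):
```java
import com.fasterxml.jackson.databind.JsonNode;

// illustrative: each flattened field walks root-to-leaf independently, so
// N fields under a depth-D common prefix cost O(N * D) node hops instead
// of sharing the prefix traversal once
class PathWalkSketch
{
  static JsonNode walk(JsonNode root, String... pathParts)
  {
    JsonNode node = root;
    for (String part : pathParts) {
      if (node == null || node.isNull()) {
        return null;
      }
      node = node.get(part);
    }
    return node;
  }
  // walk(root, "payload", "user", "id") and walk(root, "payload", "user", "name")
  // both re-walk payload -> user
}
```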
x
Doing it lazily might also keep the full json payload longer in memory, causing the GC to have to work harder
But I’m just guessing here
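something like this is the retention pattern I’m imagining (made-up names, just a sketch):
```java
import com.fasterxml.jackson.databind.JsonNode;

import java.util.function.Supplier;

// illustrative: a lazy field captures the whole parsed tree in its closure,
// so the entire payload stays reachable until the row is fully processed;
// an eager copy would only retain the handful of extracted values
class RetentionSketch
{
  static Supplier<Object> lazyField(JsonNode root, String field)
  {
    return () -> root.get(field); // pins `root`, i.e. the full payload
  }
}
```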