# troubleshoot
g
Hello everyone, do you know what can cause GMS to process MAEs so slowly? I cleaned my Elasticsearch and ran the reindex job. There are 64994 MAE rows. More than 1 hour later I still can't see all metadata in the front end. Previous versions of GMS could process this volume very fast. Deployment details:
- Environment: Kubernetes
- DataHub version: 0.10.2
- GMS replicas: 1
- Standalone consumers: False
a
Hi, @brainy-tent-14503 might be able to help you out with this!
g
In addition to GMS taking a long time to consume all the MAEs from the reindexing job, it is also slow to consume events coming from ingestion. What could be causing this?
b
If it is not making any forward progress, it might be due to some `assert` statements which are Errors and not Exceptions; the former is not caught, and under some circumstances can fail to acknowledge the Kafka messages. I have a PR to fix this here. I do not know whether this is the cause, so I would also take a look at the bulk processing batch size; if it's <1000, you might need to increase the `ES_BULK_FLUSH_PERIOD` interval [doc] (this is slow progress vs the other condition, which might be very slow to no progress).
Additionally, are there any seemingly unrelated exceptions in the GMS logs?
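For reference, one way to adjust those bulk-processor settings on a Kubernetes deployment is through the GMS environment variables. This is a minimal sketch: the `datahub-gms` deployment name is an assumption from the standard Helm chart, and if you deploy via Helm you would normally put these in your values file instead so they are not overwritten on upgrade.

```
# Inspect the current bulk-processor settings on the GMS deployment
kubectl set env deployment/datahub-gms --list | grep ES_BULK

# Raise the flush period (seconds) so partial batches have time to fill up;
# variable names follow the DataHub GMS config, deployment name is an assumption
kubectl set env deployment/datahub-gms ES_BULK_FLUSH_PERIOD=5 ES_BULK_REQUESTS_LIMIT=1000
```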
g
I only found exceptions caused by DataFetcher. It seems to have a bug when I edit the view to filter entities of Term Group (using the rule 'is one of'). This causes 500 errors to appear in the UI, but the data is returned.
I'm using the following configurations for the bulk processor:
a
The DataFetcher exceptions due to a view filter shouldn't be affecting performance, for sure. Are you seeing messages like
`c.l.m.s.e.update.BulkListener:47 - Successfully fed bulk request. Number of events: 18`
where the number is maybe in the 100-700 range? If so, then increasing `flushPeriod` to something like 5 can make sure those become full batches of 1000. That is delaying the write to accumulate more documents. This is, however, only typical for large ingestions. I'd also check the CPU limit and how close you're coming to that cap.
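If it helps, here is a quick sketch for checking both of those signals on Kubernetes; the deployment name and pod label are assumptions, so adapt them to your release, and `kubectl top` requires metrics-server to be installed.

```
# Look at recent bulk flushes and their batch sizes (log line format as above)
kubectl logs deployment/datahub-gms --since=10m | grep "Successfully fed bulk request"

# Compare current CPU usage against the container's configured limit
kubectl top pod -l app.kubernetes.io/name=datahub-gms
kubectl describe pod -l app.kubernetes.io/name=datahub-gms | grep -A2 "Limits"
```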