# ingestion
w
I’m noticing some odd behavior. I ingested some data via the hive ingestion recipe using acryl-datahub[hive, datahub-kafka]==0.8.1.1, which created a DatasetSnapshot through the MCE consumer. I can retrieve it from the GMS with a request to the /datasets?action=getSnapshot endpoint using the urn I see in the Kafka message. However, when I look in the DataHub frontend, I can’t find the dataset anywhere, and it doesn’t come back when I search for the dataset’s name. Kind of confused as to what the problem could be, any ideas?
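For reference, the getSnapshot check described above can be reproduced with a short script. This is a sketch, assuming GMS is reachable on localhost:8080; the urn below is a placeholder, substitute the one from your Kafka message:

```python
import json
import urllib.request

# Hypothetical GMS address and dataset urn -- substitute your own values.
GMS = "http://localhost:8080"
URN = "urn:li:dataset:(urn:li:dataPlatform:hive,my_db.my_table,PROD)"

def build_snapshot_request(gms: str, urn: str):
    """Build the URL, JSON body, and headers for the getSnapshot action."""
    url = f"{gms}/datasets?action=getSnapshot"
    body = json.dumps({"urn": urn}).encode()
    headers = {
        "Content-Type": "application/json",
        "X-RestLi-Protocol-Version": "2.0.0",
    }
    return url, body, headers

def get_snapshot(gms: str, urn: str) -> dict:
    """POST to GMS and return the parsed snapshot JSON."""
    url, body, headers = build_snapshot_request(gms, urn)
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

If this returns the snapshot but the frontend shows nothing, the gap is downstream of GMS (indexing), which is where the thread goes next.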
FYI: @early-lamp-41924
I’m running the 0.7.1 version of the helm charts
I don’t see the urn in ES, but I also don’t see any messages on the FailedMetadataChangeEvent_v4 topic. I also don’t see any errors in the logs of the MCE consumer or the GMS.
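The "don’t see the urn in ES" check can be scripted as well. A minimal sketch, assuming ES on localhost:9200 and a dataset search index named datasetindex_v2 (index names vary across DataHub versions, so adjust for your deployment; the urn is a placeholder):

```python
import json
import urllib.request

# Hypothetical values -- adjust the ES endpoint, index name, and urn
# for your deployment (index names differ between DataHub versions).
ES = "http://localhost:9200"
INDEX = "datasetindex_v2"
URN = "urn:li:dataset:(urn:li:dataPlatform:hive,my_db.my_table,PROD)"

def build_urn_query(urn: str) -> dict:
    """Exact-match query on the urn field of a search document."""
    return {"query": {"term": {"urn": urn}}}

def count_urn_docs(es: str, index: str, urn: str) -> int:
    """Return how many documents in the index match the urn."""
    req = urllib.request.Request(
        f"{es}/{index}/_search",
        data=json.dumps(build_urn_query(urn)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["hits"]["total"]["value"]
```

A count of zero with no FailedMetadataChangeEvent messages points at the indexing path (MAE consumer → ES) silently dropping writes, which is what turns up below.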
e
If get is working, it seems like the event did reach GMS. Is the MAE consumer job running correctly? Are there any suspicious logs printed there?
w
let me take a look
huh, maybe Elasticsearch is rate limiting the requests
e
interesting. are you using a managed elastic?
w
17:50:36.060 [I/O dispatcher 1] INFO  c.l.m.k.e.ElasticsearchConnector - Error feeding bulk request. No retries left
org.elasticsearch.ElasticsearchStatusException: Unable to parse response body
	at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:1872)
	at org.elasticsearch.client.RestHighLevelClient$1.onFailure(RestHighLevelClient.java:1785)
	at org.elasticsearch.client.RestClient$FailureTrackingResponseListener.onDefinitiveFailure(RestClient.java:617)
	at org.elasticsearch.client.RestClient$1.completed(RestClient.java:362)
	at org.elasticsearch.client.RestClient$1.completed(RestClient.java:346)
	at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:122)
	at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:181)
	at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:448)
	at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:338)
	at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
	at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
	at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
	at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
	at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
	at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
	at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
	at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
	at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
	at java.lang.Thread.run(Thread.java:748)
	Suppressed: java.lang.IllegalStateException: Unsupported Content-Type: text/plain;charset=ISO-8859-1
		at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1889)
		at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:1869)
		... 19 common frames omitted
Caused by: org.elasticsearch.client.ResponseException: method [POST], host [http://vpc-datahub-es-nifd6wpqydydzm5oclqzikptge.us-west-2.es.amazonaws.com:80], URI [/_bulk?timeout=1m], status line [HTTP/1.1 429 Too Many Requests]
429 Too Many Requests /_bulk
	at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:302)
	at org.elasticsearch.client.RestClient.access$1700(RestClient.java:100)
	at org.elasticsearch.client.RestClient$1.completed(RestClient.java:350)
	... 16 common frames omitted
I see that “too many requests” in there
yeah, we’re using AWS Elasticsearch
Is there a way to add in longer retry delays to the MCE consumer or GMS?
I don’t want to spin up a huge ES cluster just to handle small bursts of large throughput and the latency wouldn’t bother me
I can’t really find anything in the GMS source code that would allow for that, but given that the ingestion framework is built to be bursty, it would be nice to figure out a way to mitigate it. The Java project is a bit too spread out, and relies on frameworks I don’t use, for me to understand it in a couple of hours, so if anyone has some time, I’d appreciate help with it. I can create a GitHub issue if we want to discuss, though I’m surprised others haven’t run into this unless they were already running huge ES clusters.
This is where the number of retries and the retry interval are set prior to v0.8.0.
On v0.8.0 we set these values using env variables, but unfortunately on v0.7.1 you need to make a code change.
Actually, these are reading from env variables, so it might be easier to play around with those first!
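The longer retry delays being asked about boil down to retry-with-backoff around the bulk write. A generic sketch of the idea (this is not DataHub’s actual code; the function and parameter names are illustrative):

```python
import time

def with_backoff(fn, retries=5, base_delay=1.0, factor=2.0, sleep=time.sleep):
    """Call fn, retrying with exponentially growing delays between attempts.

    Re-raises the last exception once retries are exhausted, mirroring the
    'No retries left' behavior seen in the ES connector log above.
    """
    delay = base_delay
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise
            sleep(delay)
            delay *= factor
```

With a small base delay and a factor of 2, a burst that triggers a few 429s gets absorbed by waiting 1s, 2s, 4s, … instead of failing outright, which trades latency for not having to size the ES cluster for peak ingestion throughput.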
w
Ah ok cool! I'll take a look. I'm talking with @big-carpet-38439 to figure out that upgrade today so maybe we'll make some progress
b
I've also invited @early-lamp-41924 🙂
e
Awesome!!!
b
Will be 5 min late!
@white-beach-27328 @early-lamp-41924 2 minutes more, sorry, previous meeting running over
@white-beach-27328 let us know when migration gets done. Glad to see it working
w
for sure, I actually killed it since the neo4j cluster failing kind of makes it moot
but it had gotten to 56000 rows by then
so it seems reasonable
b
awesome
w
yeah for sure, really appreciate the help!