few-sunset-43876 (11/21/2022, 3:27 AM):
The logs from datahub-gms:
03:16:05.440 [I/O dispatcher 1] INFO c.l.m.s.e.update.BulkListener:47 - Successfully fed bulk request. Number of events: 2 Took time ms: -1
03:16:05.663 [ThreadPoolTaskExecutor-1] INFO c.l.m.k.t.DataHubUsageEventTransformer:74 - Invalid event type: SearchAcrossLineageResultsViewEvent
03:16:05.663 [ThreadPoolTaskExecutor-1] WARN c.l.m.k.DataHubUsageEventsProcessor:56 - Failed to apply usage events transform to record: {"type":"SearchAcrossLineageResultsViewEvent","query":"","total":10,"actorUrn":"urn:li:corpuser:datahub","timestamp":1669000565516,"date":"Mon Nov 21 2022 10:16:05 GMT+0700 (Indochina Time)","userAgent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36","browserId":"57f357cc-cdf7-4104-a7fa-30d8eda4f486"}
03:16:06.447 [I/O dispatcher 1] INFO c.l.m.s.e.update.BulkListener:47 - Successfully fed bulk request. Number of events: 1 Took time ms: -1
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "I/O dispatcher 1"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "ThreadPoolTaskScheduler-1"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "kafka-producer-network-thread | producer-1"
and the logs from datahub-frontend-react:
2022-11-21 03:17:04,148 [application-akka.actor.default-dispatcher-13] ERROR application -
! @7pkjpoecp - Internal server error, for (POST) [/api/v2/graphql] ->
play.api.UnexpectedException: Unexpected exception[CompletionException: java.util.concurrent.TimeoutException: Read timeout to datahub-gms/172.18.0.3:8080 after 60000 ms]
at play.api.http.HttpErrorHandlerExceptions$.throwableToUsefulException(HttpErrorHandler.scala:340)
at play.api.http.DefaultHttpErrorHandler.onServerError(HttpErrorHandler.scala:263)
at play.core.server.AkkaHttpServer$$anonfun$1.applyOrElse(AkkaHttpServer.scala:443)
at play.core.server.AkkaHttpServer$$anonfun$1.applyOrElse(AkkaHttpServer.scala:441)
at scala.concurrent.Future.$anonfun$recoverWith$1(Future.scala:417)
at scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:41)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:92)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:85)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:92)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException: Read timeout to datahub-gms/172.18.0.3:8080 after 60000 ms
at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331)
at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346)
at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632)
at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088)
at scala.concurrent.java8.FuturesConvertersImpl$CF.apply(FutureConvertersImpl.scala:21)
at scala.concurrent.java8.FuturesConvertersImpl$CF.apply(FutureConvertersImpl.scala:18)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at scala.concurrent.BatchingExecutor$Batch.processBatch$1(BatchingExecutor.scala:67)
at scala.concurrent.BatchingExecutor$Batch.$anonfun$run$1(BatchingExecutor.scala:82)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:85)
at scala.concurrent.BatchingExecutor$Batch.run(BatchingExecutor.scala:59)
at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:875)
at scala.concurrent.BatchingExecutor.execute(BatchingExecutor.scala:110)
at scala.concurrent.BatchingExecutor.execute$(BatchingExecutor.scala:107)
at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:873)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:72)
at scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:288)
at scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:288)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:288)
at scala.concurrent.Promise.complete(Promise.scala:53)
at scala.concurrent.Promise.complete$(Promise.scala:52)
at scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:187)
at scala.concurrent.Promise.failure(Promise.scala:104)
at scala.concurrent.Promise.failure$(Promise.scala:104)
at scala.concurrent.impl.Promise$DefaultPromise.failure(Promise.scala:187)
at play.libs.ws.ahc.StandaloneAhcWSClient$ResponseAsyncCompletionHandler.onThrowable(StandaloneAhcWSClient.java:227)
at play.shaded.ahc.org.asynchttpclient.netty.NettyResponseFuture.abort(NettyResponseFuture.java:278)
at play.shaded.ahc.org.asynchttpclient.netty.request.NettyRequestSender.abort(NettyRequestSender.java:473)
at play.shaded.ahc.org.asynchttpclient.netty.timeout.TimeoutTimerTask.expire(TimeoutTimerTask.java:43)
at play.shaded.ahc.org.asynchttpclient.netty.timeout.ReadTimeoutTimerTask.run(ReadTimeoutTimerTask.java:56)
at play.shaded.ahc.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:670)
at play.shaded.ahc.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:745)
at play.shaded.ahc.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:473)
at play.shaded.ahc.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.util.concurrent.TimeoutException: Read timeout to datahub-gms/172.18.0.3:8080 after 60000 ms
... 7 common frames omitted
The stats of the containers:
CONTAINER ID   NAME                        CPU %     MEM USAGE / LIMIT     MEM %   NET I/O           BLOCK I/O         PIDS
52616fa99479   datahub-frontend-react      0.59%     523.7MiB / 31.26GiB   1.64%   598kB / 619kB     0B / 0B           52
d72c1d91089c   datahub_datahub-actions_1   0.06%     50.66MiB / 31.26GiB   0.16%   295MB / 181MB     5.46MB / 0B       24
805489e4533c   datahub-gms                 748.15%   1.754GiB / 31.26GiB   5.61%   316MB / 3.25MB    0B / 0B           127
69761ab51fcc   schema-registry             0.21%     520.5MiB / 31.26GiB   1.63%   104MB / 99.2MB    6.14MB / 12.3kB   49
34814372e50d   broker                      0.88%     508.4MiB / 31.26GiB   1.59%   957MB / 977MB     13.3MB / 801MB    89
30a6648fdbd5   elasticsearch               0.98%     932.2MiB / 31.26GiB   2.91%   26.5MB / 27.6MB   34.1MB / 178MB    134
bbef225eadba   zookeeper                   0.22%     358MiB / 31.26GiB     1.12%   20MB / 12MB       451kB / 188kB     67
9a83d87163a1   mysql                       0.06%     348MiB / 31.26GiB     1.09%   63.7MB / 301MB    14.9MB / 26.1MB   33
e0d367b11df2   neo4j                       0.59%     1.609GiB / 31.26GiB   5.15%   17.3MB / 926MB    1.47GB / 26.1MB   78
The Java heap size of datahub-gms:
bash-5.1$ java -XX:+PrintFlagsFinal -version | grep HeapSize
size_t ErgoHeapSizeLimit = 0 {product} {default}
size_t HeapSizePerGCThread = 43620760 {product} {default}
size_t InitialHeapSize = 526385152 {product} {ergonomic}
size_t LargePageHeapSizeThreshold = 134217728 {product} {default}
size_t MaxHeapSize = 8392802304 {product} {ergonomic}
uintx NonNMethodCodeHeapSize = 5836300 {pd product} {ergonomic}
uintx NonProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}
uintx ProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}
openjdk version "11.0.17" 2022-10-18
OpenJDK Runtime Environment (build 11.0.17+8-alpine-r3)
OpenJDK 64-Bit Server VM (build 11.0.17+8-alpine-r3, mixed mode)
The datahub-gms container with the free command:
docker exec -it datahub-gms bash
bash-5.1$ free
              total        used        free      shared  buff/cache   available
Mem:       32776400     8052724      417880           0    24305796    24294940
Swap:       4194300        3584     4190716
The application is deployed on GCP; the stats of the VM:
cat /proc/meminfo
MemTotal: 32776400 kB
MemFree: 306556 kB
MemAvailable: 24412316 kB
Buffers: 2212 kB
Cached: 23913504 kB
SwapCached: 124 kB
Active: 15746384 kB
Inactive: 15120120 kB
Active(anon): 5049788 kB
Inactive(anon): 1926800 kB
Active(file): 10696596 kB
Inactive(file): 13193320 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 4194300 kB
SwapFree: 4191228 kB
Dirty: 84 kB
Writeback: 0 kB
AnonPages: 6950912 kB
Mapped: 309100 kB
Shmem: 25800 kB
Slab: 885396 kB
SReclaimable: 618596 kB
SUnreclaim: 266800 kB
KernelStack: 18816 kB
PageTables: 30292 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 20582500 kB
Committed_AS: 13568028 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 63820 kB
VmallocChunk: 34359661428 kB
Percpu: 5760 kB
HardwareCorrupted: 0 kB
AnonHugePages: 2617344 kB
CmaTotal: 0 kB
CmaFree: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 103232 kB
DirectMap2M: 5136384 kB
DirectMap1G: 30408704 kB
Production on the older version, v0.8.24, didn't have this OOM issue; it started after upgrading to v0.9.2.
I upgraded using the docker-compose.yml of v0.9.2 and the command:
docker-compose down --remove-orphans && docker-compose pull && docker-compose -p datahub up --force-recreate
Is there anything I need to check or adjust (reindexing or something...)? Any help would be appreciated.

few-sunset-43876 (11/21/2022, 4:32 AM):
docker exec -it datahub-gms bash
bash-5.1$ free
              total        used        free      shared  buff/cache   available
Mem:       32775184    12033140     9027096           0    11714948    19968312
Swap:       4194300     1229480     2964820
my VM:
docker exec -it datahub-gms bash
bash-5.1$ free
              total        used        free      shared  buff/cache   available
Mem:       32775184    12033140     9027096           0    11714948    19968312
Swap:       4194300     1229480     2964820
Is it relevant? Do I need to clear the cache memory?
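(On the cache question: the buff/cache column in free is the Linux page cache, which the kernel reclaims automatically under memory pressure, so it doesn't need manual clearing and isn't what triggers a JVM OutOfMemoryError. If you want to drop it anyway for a clean measurement, the generic Linux command, run as root on the VM, is:
sync && echo 3 > /proc/sys/vm/drop_caches
This is a kernel knob, not a DataHub setting.)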

few-sunset-43876 (11/21/2022, 12:37 PM):
CONTAINER ID   NAME          CPU %     MEM USAGE / LIMIT     MEM %   NET I/O          BLOCK I/O   PIDS
805489e4533c   datahub-gms   748.15%   1.754GiB / 31.26GiB   5.61%   316MB / 3.25MB   0B / 0B     127
Thank you so much for your support!

brainy-tent-14503 (11/22/2022, 2:48 AM):
Try the JAVA_OPTS environment variable and the -Xmx / -Xms options. These can be set in the helm chart values by setting this env. I can't guarantee that it'll work, but at least it should be able to use more of the memory you've allocated for the pod.
datahub-gms:
  extraEnvs:
    - name: JAVA_OPTS
      value: -Xmx28g -Xms16g
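If it still OOMs even with the bigger heap, a sketch of a follow-up (these are generic JVM flags, not DataHub-specific, and the dump path is just an example) would be to capture a heap dump at the moment of failure by appending to the same variable:
datahub-gms:
  extraEnvs:
    - name: JAVA_OPTS
      value: -Xmx28g -Xms16g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/gms.hprof
The resulting .hprof file can then be opened in Eclipse MAT or VisualVM to see what is actually filling the heap.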

brainy-tent-14503 (11/22/2022, 2:55 AM):
The default is JAVA_OPTS=-Xms1g -Xmx1g

brainy-tent-14503 (11/22/2022, 2:55 AM):
That would explain the 1.754GiB / 31.26GiB, but maybe with a little off-heap usage
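A quick way to check what the running gms JVM actually got (assuming the default container name) is to read the container's environment instead of starting a new java process:
docker exec datahub-gms env | grep JAVA_OPTS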

few-sunset-43876 (11/22/2022, 3:01 AM):
bash-5.1$ java -XX:+PrintFlagsFinal -version | grep HeapSize
size_t ErgoHeapSizeLimit = 0 {product} {default}
size_t HeapSizePerGCThread = 43620760 {product} {default}
size_t InitialHeapSize = 526385152 {product} {ergonomic}
size_t LargePageHeapSizeThreshold = 134217728 {product} {default}
size_t MaxHeapSize = 8392802304 {product} {ergonomic}
uintx NonNMethodCodeHeapSize = 5836300 {pd product} {ergonomic}
uintx NonProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}
uintx ProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}
openjdk version "11.0.17" 2022-10-18
OpenJDK Runtime Environment (build 11.0.17+8-alpine-r3)
OpenJDK 64-Bit Server VM (build 11.0.17+8-alpine-r3, mixed mode)

few-sunset-43876 (11/22/2022, 4:26 AM):
I changed the ENV in the datahub-gms Dockerfile:
ENV JMX_OPTS=""
ENV JAVA_OPTS="-Xms20g -Xmx20g"
and re-created the containers (in the docker folder, run):
docker-compose down --remove-orphans && docker-compose pull && docker-compose -p datahub up --force-recreate
but the size didn't change; it's still 8g:
[root@datahub-preprod-v2 datahub-gms]# docker exec -it datahub-gms bash
bash-5.1$ java -XX:+PrintFlagsFinal -version | grep HeapSize
size_t ErgoHeapSizeLimit = 0 {product} {default}
size_t HeapSizePerGCThread = 43620760 {product} {default}
size_t InitialHeapSize = 526385152 {product} {ergonomic}
size_t LargePageHeapSizeThreshold = 134217728 {product} {default}
size_t MaxHeapSize = 8392802304 {product} {ergonomic}
uintx NonNMethodCodeHeapSize = 5836300 {pd product} {ergonomic}
uintx NonProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}
uintx ProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}
openjdk version "11.0.17" 2022-10-18
OpenJDK Runtime Environment (build 11.0.17+8-alpine-r3)
OpenJDK 64-Bit Server VM (build 11.0.17+8-alpine-r3, mixed mode)
Is there anywhere else that needs to be modified?
PS: the docker-compose.yml is the default one from v0.9.2:
https://github.com/datahub-project/datahub/blob/master/docker/docker-compose.yml

brainy-tent-14503 (11/22/2022, 4:39 AM):
In docker-compose.yml, under the datahub-gms service, see the environment: key and the following line.
datahub-gms:
  build:
    context: ../
    dockerfile: docker/datahub-gms/Dockerfile
  image: ${DATAHUB_GMS_IMAGE:-linkedin/datahub-gms}:${DATAHUB_VERSION:-head}
  hostname: datahub-gms
  container_name: datahub-gms
  environment:
    - JAVA_OPTS=-Xms20g -Xmx20g
  ports:
    - ${DATAHUB_MAPPED_GMS_PORT:-8080}:8080
  depends_on:
    - elasticsearch-setup
    - kafka-setup
    - mysql
    - neo4j
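Note that compose only applies environment changes when the container is recreated, so after editing the file, something like this (service name taken from the default compose file) should pick it up without restarting the whole stack:
docker-compose -p datahub up -d --force-recreate datahub-gms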

few-sunset-43876 (11/22/2022, 4:52 AM):
I added it, but the heap is still the same:
datahub-gms:
  build:
    context: ../
    dockerfile: docker/datahub-gms/Dockerfile
  image: ${DATAHUB_GMS_IMAGE:-linkedin/datahub-gms}:${DATAHUB_VERSION:-head}
  hostname: datahub-gms
  container_name: datahub-gms
  ports:
    - ${DATAHUB_MAPPED_GMS_PORT:-8080}:8080
  environment:
    - JAVA_OPTS=-Xms20g -Xmx20g
  depends_on:
    - elasticsearch-setup
    - kafka-setup
    - mysql
    - neo4j
docker exec -it datahub-gms bash
bash-5.1$ java -XX:+PrintFlagsFinal -version | grep HeapSize
size_t ErgoHeapSizeLimit = 0 {product} {default}
size_t HeapSizePerGCThread = 43620760 {product} {default}
size_t InitialHeapSize = 526385152 {product} {ergonomic}
size_t LargePageHeapSizeThreshold = 134217728 {product} {default}
size_t MaxHeapSize = 8392802304 {product} {ergonomic}
uintx NonNMethodCodeHeapSize = 5836300 {pd product} {ergonomic}
uintx NonProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}
uintx ProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}

few-sunset-43876 (11/22/2022, 4:57 AM):
04:55:44.028 [pool-12-thread-1] INFO c.l.m.filter.RestliLoggingFilter:55 - GET /entitiesV2?ids=List(urn%3Ali%3Acorpuser%3Adatahub) - batchGet - 200 - 6ms
04:55:44.592 [pool-12-thread-1] INFO c.l.m.filter.RestliLoggingFilter:55 - GET /entitiesV2?ids=List(urn%3Ali%3Acorpuser%3Adatahub) - batchGet - 200 - 27ms
04:55:45.367 [I/O dispatcher 1] INFO c.l.m.s.e.update.BulkListener:47 - Successfully fed bulk request. Number of events: 2 Took time ms: -1
04:55:46.192 [ThreadPoolTaskExecutor-1] INFO c.l.m.k.t.DataHubUsageEventTransformer:74 - Invalid event type: SearchAcrossLineageResultsViewEvent
04:55:46.193 [ThreadPoolTaskExecutor-1] WARN c.l.m.k.DataHubUsageEventsProcessor:56 - Failed to apply usage events transform to record: {"type":"SearchAcrossLineageResultsViewEvent","query":"","total":10,"actorUrn":"urn:li:corpuser:datahub","timestamp":1669092946065,"date":"Tue Nov 22 2022 11:55:46 GMT+0700 (Indochina Time)","userAgent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36","browserId":"57f357cc-cdf7-4104-a7fa-30d8eda4f486"}
04:55:46.398 [I/O dispatcher 1] INFO c.l.m.s.e.update.BulkListener:47 - Successfully fed bulk request. Number of events: 1 Took time ms: -1

few-sunset-43876 (11/25/2022, 3:18 AM):
bash-5.1$ java -XX:+PrintFlagsFinal -version | grep HeapSize
size_t ErgoHeapSizeLimit = 0 {product} {default}
size_t HeapSizePerGCThread = 43620760 {product} {default}
size_t InitialHeapSize = 526385152 {product} {ergonomic}
size_t LargePageHeapSizeThreshold = 134217728 {product} {default}
size_t MaxHeapSize = 8392802304 {product} {ergonomic}
uintx NonNMethodCodeHeapSize = 5836300 {pd product} {ergonomic}
uintx NonProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}
uintx ProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}

brainy-tent-14503 (11/25/2022, 3:07 PM):
$ docker container ls
Find the container id running the datahub-gms image.
Next, run this command with the container id:
docker exec <container id> ps -ef
You should see something like the following, and I would expect to see your settings with 20g. This would at least verify that the java instance running gms has the higher memory that we are intending to set.
13 datahub 37:47 java -Xms1g -Xmx1g -jar /jetty-runner.jar --jar jetty-util.jar --jar jetty-jmx.jar --config /datahub/datahub-gms/scripts/jetty.xml /datahub/datahub-gms/bin/war.war
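One caveat on the earlier checks: running java -XX:+PrintFlagsFinal -version in the container starts a brand-new JVM, and the java launcher itself never reads JAVA_OPTS (that variable is only expanded by the container's start script). So that command will always report the ergonomic default of roughly 1/4 of host RAM, which is exactly the 8392802304 bytes (~7.8g) it keeps printing on a 31.26GiB machine, regardless of what gms itself is running with; ps -ef shows the real flags. You can also verify what docker injected into the container with:
docker inspect --format '{{.Config.Env}}' datahub-gms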

few-sunset-43876 (11/28/2022, 9:06 AM):
docker exec <container id> ps -ef
Thank you again!