# troubleshoot
g
I have one more interesting issue that I’m trying to resolve now. When I added great_expectations action to ingest validation data to DataHub, my task failed with memory leak.
```yaml
- name: datahub_action
  action:
    module_name: datahub.integrations.great_expectations.action
    class_name: DataHubValidationAction
    server_url: http://datahub-gms:8080
```
Do you have any idea what can be the problem?
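(For context when reproducing: this action block normally sits under a checkpoint's `action_list`, next to the default Great Expectations actions. A minimal sketch of the surrounding config — the `store_validation_result` entry is the standard GE default, and the rest matches the snippet above:)

```yaml
action_list:
  # default Great Expectations action: persist the validation result
  - name: store_validation_result
    action:
      class_name: StoreValidationResultAction
  # DataHub action that pushes validation results to datahub-gms
  - name: datahub_action
    action:
      module_name: datahub.integrations.great_expectations.action
      class_name: DataHubValidationAction
      server_url: http://datahub-gms:8080
```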
i
Could you share the logs?
g
```
[2022-03-30, 11:20:37 UTC] {great_expectations.py:80} INFO - Running validation with Great Expectations...
[2022-03-30, 11:20:37 UTC] {great_expectations.py:83} INFO - Ensuring data context is valid...
[2022-03-30, 11:20:37 UTC] {data_context.py:620} INFO - Usage statistics is disabled; skipping initialization.
[2022-03-30, 11:20:58 UTC] {local_task_job.py:154} INFO - Task exited with return code Negsignal.SIGKILL
[2022-03-30, 11:20:58 UTC] {taskinstance.py:1280} INFO - Marking task as FAILED. dag_id=dwh_process_dim_tables, task_id=ge_dim_truck, execution_date=20220329T000000, start_date=20220330T112036, end_date=20220330T112058
[2022-03-30, 11:20:58 UTC] {local_task_job.py:264} INFO - 0 downstream tasks scheduled from follow-on schedule check
```
Nothing special in the logs. We are running this task in a Docker container.
i
How much memory is the Docker container configured to have? I’m trying to understand how you identified the issue as a memory leak.
g
As I understand from the documentation, this error means insufficient resources:
```
Task exited with return code Negsignal.SIGKILL
```
Then I monitored how much memory the task uses and saw it quickly increase to the limit. When I removed the DataHub action, the task completed with a peak of about 500 MB of memory.
i
How much memory was the container using when it still failed? 10 GB?
g
When I removed the memory limits from my docker-compose file, the task grew to about 10 GB and still failed.
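(For anyone reproducing this: a memory cap in Compose is set per service, so the "limits removed" run simply omits it. A minimal sketch — the service name here is hypothetical:)

```yaml
# docker-compose fragment: cap the Airflow worker at 10 GB
# (remove mem_limit to let the container use all host memory)
services:
  airflow-worker:      # hypothetical service name
    mem_limit: 10g
```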
i
I see. Could you please open an issue in https://github.com/datahub-project/datahub/issues so that we can track this issue? Please add as much information as possible. If you can define a reproducible test case for this it would be perfect!
If you could also include the action config, the great_expectations code you’re using, and the characteristics of the data being processed, that would be awesome.
g