# ingestion
p
I’m facing the following error when running Hive ingestion with the latest linkedin/datahub-ingestion (97bed71). It works with older versions like c9c1ba4. I guess some libraries have changed and this needs to be fixed.
l
@miniature-tiger-96062 can you please take a look?
p
Here are the details:
9db7a34 - 4 days ago - failing - 731 MB
05a7939 - 7 days ago - failing - 738 MB
9887ca9 - 10 days ago - failing - 726 MB
13d6280 - 17 days ago - failing - 731 MB
b19addb - 21 days ago - failing - 730 MB
dd7bead - a month ago - failing - 728 MB
c9c1ba4 - a month ago - working - 719 MB
f40bf1c - a month ago - working - 719 MB
Somehow, once the image size increased from 719 MB to 728 MB, it started to fail.
This is my hive configuration:
source:
  type: hive
  config:
    host_port: <redacted-url>
    env: "DEV"
    database: event_tracking
    schema_pattern:
      allow:
        - "event_tracking"
    table_pattern:
      allow:
        - "event_tracking.page_view_event"
sink:
  type: "datahub-kafka"
  config:
    connection:
      bootstrap: <redacted-url>
      schema_registry_url: <redacted-url>
m
Hi @polite-flower-25924, let me try to reproduce and take a look at what's going on.
The error you pasted looks like a connection error to the Hive metastore. I assume you mean that the same Hive configuration works with the older versions?
Also, I assume you supplied the username and password, if your metastore requires them?
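For reference, a minimal sketch of where credentials would go, assuming the store does require authentication. The `username` and `password` keys sit next to `host_port` in the hive source config; the values below are hypothetical placeholders, not taken from this thread.

```yaml
source:
  type: hive
  config:
    host_port: <redacted-url>
    # Placeholder credentials (assumption): only needed if the store enforces authentication.
    username: <hive-username>
    password: <hive-password>
    database: event_tracking
```

If the store does not require authentication, both keys can be left out, as in the config shared above.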
@polite-flower-25924, are you still facing the issue?
p
I haven’t needed to try the latest version, if there is a newer one.
I’ve shared the working and non-working tags in this thread.
g
@polite-flower-25924 how is that config working? It looks like a push from the Hive metastore directly to datahub-kafka. Is that supported in the repository now, or did you write custom Hive hooks?
p
@gorgeous-optician-32034 it’s a pull-based approach. The ingestion job pulls the metadata from the Hive metastore and pushes it to datahub-kafka.