# troubleshoot
w
Hey Everyone 👋, I am not able to ingest any glossary terms with the new DataHub versions
0.8.40
and
0.8.41
, is this a known issue with these versions? Local Docker testing works alright for
0.8.42
but since the Helm chart is not yet available, I cannot use it.
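(For context, glossary ingestion with this source is driven by a recipe roughly like the sketch below; the file path and server URL are placeholders, not the exact config from this thread.)
```yaml
# Hypothetical sketch of a business-glossary ingestion recipe.
source:
  type: datahub-business-glossary
  config:
    file: ./business_glossary.yml   # glossary definition file (placeholder path)
sink:
  type: datahub-rest
  config:
    server: http://localhost:8080   # GMS endpoint (placeholder)
```
A recipe like this would then be run with something like `datahub ingest -c recipe.yml`.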
b
Hey Kamal! We've heard about this just recently, actually. Could you share any error logs you may be seeing with us?
w
Hey @bulky-soccer-26729, there is no error; the logs show the ingest ran alright, but the glossary terms were not updated in the UI. I tried adding glossary terms manually in the UI, but new ones were not visible using that method either. Here are the output logs (I shortened them by removing most of the glossary terms):
Source (datahub-business-glossary) report:
{'workunits_produced': 49,
 'workunit_ids': [
         'urn:li:glossaryTerm:PII.Country',
         'urn:li:glossaryTerm:PII.Race',
         'urn:li:glossaryTerm:PII.Place of birth',
         'urn:li:glossaryTerm:PII.Religion',
         'urn:li:glossaryTerm:PII.Age Range'],
 'warnings': {},
 'failures': {},
 'cli_version': '0.8.41',
 'cli_entry_location': '/datahub/__init__.py',
 'py_version': '3.9.0 (default, Nov 25 2021, 20:20:38) \n[Clang 12.0.5 (clang-1205.0.22.11)]',
 'py_exec_path': '.venv/bin/python',
 'os_details': 'macOS-12.4-x86_64-i386-64bit'}
Sink (datahub-rest) report:
{'records_written': 49,
 'warnings': [],
 'failures': [],
 'downstream_start_time': datetime.datetime(2022, 8, 4, 19, 23, 12, 939647),
 'downstream_end_time': datetime.datetime(2022, 8, 4, 19, 23, 40, 526068),
 'downstream_total_latency_in_seconds': 27.586421,
 'gms_version': 'v0.8.41'}
If I test using Docker locally, the
gms_version
always points to ‘v0.8.42’ irrespective of what version I provide for
acryl-datahub
in requirements.txt, so that's not letting me test version
'v0.8.41'
locally. Another question: is there a way I could pin this version locally?
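(A sketch of what pinning could look like locally; the `--version` flag on the quickstart command is an assumption about this CLI generation, so check `datahub docker quickstart --help` first.)
```bash
# Pin the ingestion CLI itself (requirements.txt line: acryl-datahub==0.8.41)
pip install 'acryl-datahub==0.8.41'
datahub version                      # CLI version, independent of the server

# gms_version in the sink report comes from the server, so the local quickstart
# containers need pinning too (flag assumed available; verify with --help):
datahub docker quickstart --version v0.8.41
```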
b
Very weird, so you can't even create them through the UI either? Are you able to go directly to a term page to see if it has data there? Try going to
<http://localhost:3000/glossaryTerm/urn:li:glossaryTerm:PII.Country>
and see if you can see documentation or owners or any other data associated with it, to check whether it exists.
I do see an issue above however in the urns you're creating - there shouldn't be any spaces in urns like that. You can add spaces to names in the UI but when ingesting via a yaml file you can't have spaces in the urn name
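(Illustrative sketch only: with file-based ingestion the term name feeds into the generated urn, so a name containing a space ends up as a urn containing a space. The exact glossary file from this thread isn't shown, and the renamed term below is hypothetical.)
```yaml
# Sketch: term names in the glossary YAML flow into urn:li:glossaryTerm:<node>.<term>
nodes:
  - name: PII
    terms:
      - name: Place of birth   # -> urn:li:glossaryTerm:PII.Place of birth (space in urn)
      - name: PlaceOfBirth     # -> urn:li:glossaryTerm:PII.PlaceOfBirth (no space)
```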
w
Yes, I see the terms when I use the path with the correct urn; it's just that these terms are not shown in the UI.
b
When you create a term via the UI, are you specifying a Parent?
w
I deleted all the terms hoping a fresh ingest might fix it; now the existing glossary terms are also not visible. So to answer your question, no, I did not test assigning a parent in the UI.
I do see an issue above however in the urns you’re creating - there shouldn’t be any spaces in urns like that. You can add spaces to names in the UI but when ingesting via a yaml file you can’t have spaces in the urn name
I see, I will give it a go and get back to you on this.
One observation on the statement below:
I do see an issue above however in the urns you’re creating - there shouldn’t be any spaces in urns like that. You can add spaces to names in the UI but when ingesting via a yaml file you can’t have spaces in the urn name
I just checked: the existing ingest (before I added the additional glossary terms) had spaces in the urns as well, and it was working fine before the upgrade.
b
Okay, gotcha, that's good to know. I'd still try it without the spaces when ingesting. Still thinking about this / looking into it for you.
w
Feedback received, thanks for highlighting this, and thank you so much for your response and for looking into it. 🙂 🚀
b
After ingesting, can you check your document store (MySQL or whatever you use) to ensure that the terms you ingested show up there? We can dynamically render a page when you pass the urn into the URL, so checking the page is technically not a guarantee that the data exists (unless you saw descriptions, owners, or metadata like that).
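(A sketch of that check against DataHub's stock MySQL schema, where versioned aspects live in `metadata_aspect_v2`; the table and column names are the defaults and may differ in a customized deployment.)
```sql
-- Does the ingested term exist in the primary store at all?
SELECT urn, aspect, version, createdon
FROM metadata_aspect_v2
WHERE urn = 'urn:li:glossaryTerm:PII.Country'
ORDER BY createdon DESC;
```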
Ooh, okay, this is very helpful. So you're ingesting and getting the data into your store properly; it's just not showing up in the UI. Would you mind checking one of the glossary terms specifically (such as
urn:li:glossaryTerm:Discoverd.Country
) and look at its
glossaryTermInfo
aspect, checking out its metadata. Does it have a
parentNode
set? Also, when you go to your glossary, do you see nothing at all, or are you able to see the
Discoverd
Glossary Node (Term Group)?
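(One way to inspect that aspect, assuming the CLI in use is recent enough to ship `datahub get` and is pointed at the right GMS; the command shown is a sketch, not the exact one used in this thread.)
```bash
# Fetch the stored aspect for the term directly from GMS.
datahub get --urn 'urn:li:glossaryTerm:Discoverd.Country' --aspect glossaryTermInfo
```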
c
Hey @bulky-soccer-26729, I work with @wooden-pencil-40912 and have taken a look into this. I can confirm in the
glossaryTermInfo
it is showing
parentNode
. I’ve upgraded our staging instance to
v0.8.43
, and I think there are a few issues at play with our data:
1. All groups and terms ingested via recipe are not showing in the UI, but they are in the metadata, and if I rebuild indices they still do not show.
2. I can add new groups and terms via the UI, but they do not immediately show; I have to rebuild indices each time.
Looking at the metadata, the difference I’m seeing in
glossaryNodeInfo
and
glossaryTermInfo
is that there’s a
name
element on the ones I created in the UI, whereas these don’t exist in the ingested ones, maybe because we’re setting the id and name the same, the ones created in the UI have a UUID for the ID, as there’s no advanced property to update the ID. I’m seeing some fails in our actions pod, might be nothing but I’m wondering if this is what’s used to trigger the indices builds when we create/ingest them?
I’ve found our issue: GMS was erroring on a missing platform event topic (FYI, we’re jumping up from
v0.8.28
). We’re using custom topic names, so I’ve set
PLATFORM_EVENT_TOPIC_NAME
and created the topic, and now groups and terms immediately appear when added via the UI. For the ingest, I’ve found that I had to hard delete and re-ingest for them to appear; rebuilding the indices isn’t working for us. I’ll look and see whether the underlying metadata differs before and after.
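(A sketch of that fix, assuming a self-managed Kafka and a custom topic prefix; the topic name, broker address, and partition settings below are placeholders.)
```bash
# Create the custom-named platform event topic (placeholders throughout).
kafka-topics.sh --bootstrap-server broker:9092 --create \
  --topic my_prefix.PlatformEvent_v1 --partitions 1 --replication-factor 1

# Then point GMS at it via the env var named above, e.g. in the container env:
#   PLATFORM_EVENT_TOPIC_NAME=my_prefix.PlatformEvent_v1
```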
I’ve compared the aspect metadata for both the nodes and the terms post delete and re-ingest, and it’s identical. Also, I’ve tested running the restore-indices job and it doesn’t fix it. I also tried removing the indices first and then running restore indices, but then GMS was erroring and referring to the existing one; I had to restart it and manually delete and re-ingest the nodes and terms again.
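(For the restore-indices step, one common way to run it is the datahub-upgrade image’s RestoreIndices task; the image tag and env file below are placeholders, and a Helm deployment typically wraps the same task in its own restore-indices job.)
```bash
# Sketch: re-run index restoration with the upgrade image (connection details
# for MySQL/Kafka/Elasticsearch go in the env file; contents are deployment-specific).
docker run --rm --env-file ./datahub-upgrade.env \
  acryldata/datahub-upgrade:v0.8.43 -u RestoreIndices
```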
b
Hey Peter! So sorry for not getting back to you until now, as I was on vacation. This is all very interesting and great information, though; thanks so much for digging into this. So, can you confirm that you're able to get your glossary into the state you want now that you've set PLATFORM_EVENT_TOPIC_NAME?