# contribute-code
a
Hello all, I'm currently working on a Pulsar source and used the Kafka source as a starting point. Everything is working fine except for one thing: when I use the Pulsar topic naming convention in the dataset_urn, search breaks in the UI. A Pulsar topic is constructed like "persistent://tenant/namespace/topic", so I assume the slashes are the cause. Is it allowed to have forward slashes in the urn? To my understanding, the slash character is allowed in the NSS part. Preferably I would like to keep the naming as is; any pointers are highly appreciated.
b
Hmmm it should be fine. I’m actually thinking this is related to the : character, since it’s a reserved urn character. What you can do is simply encode this name for the urn; as long as it’s consistently generated from the source you should be fine. Then just place the real name that you want displayed in the ‘name’ field of the ‘DatasetProperties’ aspect!
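A minimal sketch of that suggestion, using only the Python standard library. The helper `make_pulsar_dataset_urn` and the plain `properties` dict are hypothetical stand-ins, not DataHub APIs, and percent-encoding is just one consistent, reversible encoding; the thread does not say which scheme to use.

```python
from urllib.parse import quote, unquote


def make_pulsar_dataset_urn(topic: str, env: str = "TEST") -> str:
    """Hypothetical helper: build a dataset urn with an encoded topic name."""
    # safe="" makes quote() encode "/" as well; ":" is always encoded.
    encoded = quote(topic, safe="")
    return f"urn:li:dataset:(urn:li:dataPlatform:pulsar,{encoded},{env})"


topic = "persistent://rvm/test/twitter-tweets2"
urn = make_pulsar_dataset_urn(topic)

# Stand-in for the 'name' field of the DatasetProperties aspect:
# the original topic name is kept for display.
properties = {"name": topic}

print(urn)
```

Because the encoding is deterministic, the same topic always yields the same urn, and `unquote` recovers the original topic name if it is needed later.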
a
Did some more testing; after encoding the urn it works (as expected). I ingested the following urns:
# Encoded urn
urn:li:dataset:(urn:li:dataPlatform:pulsar,local.persistent-rvm-test-twitter-tweets2,TEST)
# Pulsar topic name urn
urn:li:dataset:(urn:li:dataPlatform:pulsar,local.persistent://rvm/test/twitter-tweets2,TEST)
The main screen shows both datasets under "Try searching for" (see attached screenshot). Clicking the suggestions gives "No results found for .." for one and the expected results for the other. It looks like having slashes in the urn is okay apart from the "Try searching for" suggestions. Can you point me to the code where the search is handled?
Narrowed it down a bit more: the search does not find a urn or name containing two consecutive forward slashes. Searching for "persistent:/rvm/test/twitter-tweets" (1 slash) returns the dataset "local.persistent://rvm/test/twitter-tweets" (the urn has 2 slashes). Searching for "persistent://rvm/test/twitter-tweets" (2 slashes) returns no result.
b
Got it… I think this must be because our search index interprets the slashes as escape characters, or something similar.
We need to dig deeper
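To illustrate the hypothesis above: in Lucene's classic query syntax, a number of characters (including `/` and `:`) are treated as query operators unless they are escaped with a backslash. The sketch below escapes the same character set as Lucene's `QueryParser.escape`. Whether the search path in question actually parses the input this way is an assumption that the thread does not confirm, and `escape_query` is a hypothetical helper.

```python
# Characters that Lucene's classic query parser treats as syntax.
LUCENE_SPECIAL = set('\\+-!():^[]"{}~*?|&/')


def escape_query(term: str) -> str:
    """Hypothetical helper: backslash-escape Lucene query syntax characters."""
    return "".join("\\" + ch if ch in LUCENE_SPECIAL else ch for ch in term)


print(escape_query("persistent://rvm/test/twitter-tweets"))
```

If escaping the search input this way makes the two-slash query return results, that would point at query parsing rather than at how the urn is stored.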