# troubleshoot
r
Hi all! So with the new version v0.8.16, Azure AD JIT group provisioning is working swimmingly, but by default each group's name is set to its group ID. This is not really user friendly, and I need to change it to the group's display name. I can do that in the Azure AD ingestion recipe by setting `azure_ad_response_to_groupname_attr` (I think - not tested), but how can I set this for JIT provisioning, i.e. when someone logs into the front end? (see screenshot). I think I want it set to `displayName`.
b
Hi hi! It all depends on the claims that Azure returns at login time! You'll need to configure Azure to return the group's proper display name along with everything else. I haven't personally tried that because we don't use Azure internally.
Then if we have a claim that contains display names for groups, it's just a matter of configuration.
But if Azure can't for some reason be configured to return the group names at login time, we may just have to stick with batch ingestion.
r
Good stuff, thanks! We've changed it to return the sAMAccountName (whatever that is) so hopefully that resolves it. We cannot currently use the ingestion task, because it fails when it encounters a group or user where the name field is `null` (and it is nullable).
b
Oh boy - thanks for reporting this! Sounds like we need to update the Azure source to handle the case where the name field is null.
Do you recall what error you're seeing?
Because the behavior is supposed to be that we simply skip records when we cannot do the mapping:
datahub_corp_group_urn = self._map_azure_ad_group_to_urn(azure_ad_group)
if not datahub_corp_group_urn:
    error_str = "Failed to extract DataHub Group Name from Azure AD Group named {}. Skipping...".format(
        azure_ad_group.get("displayName")
    )
    self.report.report_failure("azure_ad_group_mapping", error_str)
    continue
r
There's a long stack trace, but the Python exception is basically this:
File "/home/mamor/.local/lib/python3.8/site-packages/datahub/entrypoints.py", line 93, in main
    sys.exit(datahub(standalone_mode=False, **kwargs))
File "/home/mamor/.local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
File "/home/mamor/.local/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
File "/home/mamor/.local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/mamor/.local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/mamor/.local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
File "/home/mamor/.local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
File "/home/mamor/.local/lib/python3.8/site-packages/datahub/cli/ingest_cli.py", line 58, in run
    pipeline.run()
File "/home/mamor/.local/lib/python3.8/site-packages/datahub/ingestion/run/pipeline.py", line 141, in run
    for wu in self.source.get_workunits():
File "/home/mamor/.local/lib/python3.8/site-packages/datahub/ingestion/source/identity/azure_ad.py", line 158, in get_workunits
    datahub_corp_user_urn = self._map_azure_ad_user_to_urn(
File "/home/mamor/.local/lib/python3.8/site-packages/datahub/ingestion/source/identity/azure_ad.py", line 358, in _map_azure_ad_user_to_urn
    user_name = self._map_azure_ad_user_to_user_name(azure_ad_user)
File "/home/mamor/.local/lib/python3.8/site-packages/datahub/ingestion/source/identity/azure_ad.py", line 350, in _map_azure_ad_user_to_user_name
    return self._extract_regex_match_from_dict_value(
File "/home/mamor/.local/lib/python3.8/site-packages/datahub/ingestion/source/identity/azure_ad.py", line 391, in _extract_regex_match_from_dict_value
    raise ValueError(f"Unable to find the key {key} in Group. Is it wrong?")

ValueError: Unable to find the key mail in Group. Is it wrong?
Which is caused by this fake user having no email:
self = AzureADSource(ctx=<datahub.ingestion.api.common.PipelineContext object at 0x7f8b59c5d580>)
     str_dict = {'@odata.type': '#microsoft.graph.user',
                 'id': '226bd5d9-932e-4b22-9982-df86876e13a3',
                 'businessPhones': [],
                 'displayName': 'testmembershipdepartment',
                 'givenName': 'Test',
                 'jobTitle': 'System and Process Analyst',
                 'mail': None,
                 'mobilePhone': None,
                 'officeLocation': None,
                 'preferredLanguage': None,
                 'surname': 'Membership',
                 'userPrincipalName': 'testmembershipdepartment@dfds.com'}
     Dict = typing.Dict
     key = 'mail'
     pattern = '([^@]+)'
     raw_value = None
     re.search = <function 'search' re.py:198>
This happens because our legacy on-premises ADFS setup syncs to Azure AD (it has a slightly different entity model, and allows weird stuff like groups with no name). My opinion is that the `ValueError` that is raised should be caught somewhere, so that the ingestion simply skips the entries with `null` in the name attribute. Same for groups. What do you think?
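(Editor's sketch of the "catch and skip" idea proposed above. This is not the DataHub source code: `extract_regex_match`, `map_users_skipping_nulls`, and the `failures` list are hypothetical stand-ins, and only the `mail` key and the `([^@]+)` pattern are taken from the dump in this thread. It shows one way a null attribute could be reported and skipped rather than aborting the whole run.)

```python
import re
from typing import Dict, Iterable, Iterator, List, Optional


def extract_regex_match(record: Dict[str, Optional[str]], key: str, pattern: str) -> str:
    """Hypothetical stand-in for the extraction step: raise on a missing or null value."""
    raw_value = record.get(key)
    if raw_value is None:
        raise ValueError(f"Unable to find the key {key} in record")
    match = re.search(pattern, raw_value)
    if match is None:
        raise ValueError(f"Value of {key} does not match {pattern}")
    return match.group()


def map_users_skipping_nulls(
    users: Iterable[Dict[str, Optional[str]]],
    failures: List[str],
    key: str = "mail",
    pattern: str = "([^@]+)",
) -> Iterator[str]:
    """Yield a user name per record; record a failure and move on when mapping fails."""
    for user in users:
        try:
            yield extract_regex_match(user, key, pattern)
        except ValueError as exc:
            # Assumption: a skipped record is worth reporting, not silently dropping.
            failures.append(f"{user.get('displayName')}: {exc}")
            continue
```

With one user that has a `mail` value and one where `mail` is `None`, the generator yields a name for the first and appends a single entry to `failures` for the second, so the rest of the batch still gets processed.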