Hi guys! I'm working in ingest CorpUser data to da...
# ingestion
e
Hi guys! I'm working on ingesting CorpUser data into DataHub from a JSON file. Everything works without problems, but none of the data in CorpUserEditableInfo shows up in DataHub. Here is my JSON file and how it shows in DataHub. Can you tell me what I'm doing wrong?
{
        "auditHeader": null,
        "proposedSnapshot": {
            "com.linkedin.pegasus2avro.metadata.snapshot.CorpUserSnapshot": {
                "urn": "urn:li:corpuser:carlos.guevara",
                "aspects": [
                    {
                        "com.linkedin.pegasus2avro.identity.CorpUserInfo": {
                            "active": true,
                            "countryCode": "MX",
                            "departmentId": null,
                            "departmentName": null,
                            "displayName": {
                                "string": "Carlos Guevara"
                            },
                            "email": "cg@kavak.com",
                            "firstName": "Carlos",
                            "fullName": "Carlos Guevara",
                            "lastName": "Guevara",
                            "managerUrn": "urn:li:corpuser:milan.sahu",
                            "title": {
                                "string": "Data Engineer"
                            }
                        },
                        "com.linkedin.pegasus2avro.identity.CorpUserEditableInfo": {
                            "pictureLink": "https://github.com/gabe-lyons.png",
                            "skills": ["superset"]
                        }
                    }
                ]
            }
        },
        "proposedDelta": null
    },
Here is how the profile shows up after the ingest.
m
@elegant-toddler-36093 this might be caused by the auto-ingest at OIDC login overwriting what your custom user metadata ingestion writes. @big-carpet-38439 do you think that might be the case here?
b
I don't believe so. The auto ingest at OIDC login only occurs if the user explicitly does not yet exist
But we do overwrite the pictureLink if the user does not exist.. let me look at it again. Are you on the latest version?
@green-football-43791 Do you think this could be something on the frontend? I thought skills and pictures have been displaying correctly otherwise, though
Carlos can you help us confirm that the data is ingested into MySQL?
e
Sure, give me a second!
+--------------------------------+----------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------+----------------------------+--------------------------+------------+
| urn                            | aspect               | version | metadata                                                                                                                                                                                                                                         | systemmetadata                                                                | createdon                  | createdby                | createdfor |
+--------------------------------+----------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------+----------------------------+--------------------------+------------+
| urn:li:corpuser:carlos.guevara | corpUserInfo         |       0 | {"lastName":"Guevara","firstName":"Carlos","countryCode":"VE","displayName":"Carlos Guevara","fullName":"Carlos Guevara","active":true,"managerUrn":"urn:li:corpuser:milan.sahu","title":"Data Engineer","email":"carlos.guevara.ext@kavak.com"} | {"lastObserved":1630513706503,"runId":"64c1f46e-0b41-11ec-aa79-fcb3bcaa3039"} | 2021-09-01 16:26:43.304000 | urn:li:principal:UNKNOWN | NULL       |
| urn:li:corpuser:carlos.guevara | corpUserKey          |       0 | {"username":"carlos.guevara"}                                                                                                                                                                                                                    | {"lastObserved":1630513603292,"runId":"64c1f46e-0b41-11ec-aa79-fcb3bcaa3039"} | 2021-09-01 16:26:43.304000 | urn:li:principal:UNKNOWN | NULL       |
| urn:li:corpuser:datahub        | corpUserEditableInfo |       0 | {"skills":[],"teams":[],"pictureLink":"https://raw.githubusercontent.com/linkedin/datahub/master/datahub-web-react/src/images/default_avatar.png"}                                                                                               | NULL                                                                          | 2021-09-01 16:23:22.000000 | urn:li:principal:datahub | NULL       |
| urn:li:corpuser:datahub        | corpUserInfo         |       0 | {"displayName":"Data Hub","active":true,"fullName":"Data Hub","email":"datahub@linkedin.com"}                                                                                                                                                    | NULL                                                                          | 2021-09-01 16:23:22.000000 | urn:li:principal:datahub | NULL       |
| urn:li:corpuser:milan.sahu     | corpUserInfo         |       0 | {"fullName":"Milan Sahu","active":true,"title":"Data Platform Architect","countryCode":"MX","email":"milan.sahu@kavak.com","displayName":"Milan Sahu"}                                                                                           | {"lastObserved":1630513706535,"runId":"64c1f46e-0b41-11ec-aa79-fcb3bcaa3039"} | 2021-09-01 16:26:44.186000 | urn:li:principal:UNKNOWN | NULL       |
| urn:li:corpuser:milan.sahu     | corpUserKey          |       0 | {"username":"milan.sahu"}                                                                                                                                                                                                                        | {"lastObserved":1630513604175,"runId":"64c1f46e-0b41-11ec-aa79-fcb3bcaa3039"} | 2021-09-01 16:26:44.186000 | urn:li:principal:UNKNOWN | NULL       |
+--------------------------------+----------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------+----------------------------+--------------------------+------------+
Well, that extra information is not in MySQL.
b
Interesting! It's not getting in
Well, I think we've found it. Just need to understand why this is occurring
It appears that the aspect is being created, but defaults are being taken as opposed to it being populated
So you're using datahub ingest -c with a file source and a DataHub sink?
@mammoth-bear-12532 I wonder if something is happening in the mapping from the file to the generated Python models
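One way to probe the file-to-model mapping idea without touching DataHub: in the JSON above, CorpUserInfo and CorpUserEditableInfo sit as two keys inside a single object of the aspects array. Avro's JSON encoding writes a union value as an object with exactly one key naming the type, so a reader may honor only one key per entry. Whether that is what drops CorpUserEditableInfo here is an assumption; nothing in this thread confirms it. The sketch below is plain Python, and the helper name find_multi_key_aspects is made up for illustration. It only lists entries of that shape.

```python
import json


def find_multi_key_aspects(mce: dict) -> list:
    """Return (urn, sorted keys) for each aspects entry holding more than one key."""
    problems = []
    # proposedSnapshot is itself a one-key object: {snapshot type name: snapshot body}.
    for snapshot in (mce.get("proposedSnapshot") or {}).values():
        for entry in snapshot.get("aspects", []):
            if isinstance(entry, dict) and len(entry) > 1:
                problems.append((snapshot.get("urn"), sorted(entry)))
    return problems


# Trimmed-down copy of the event posted in this thread.
sample = json.loads("""
{
  "auditHeader": null,
  "proposedSnapshot": {
    "com.linkedin.pegasus2avro.metadata.snapshot.CorpUserSnapshot": {
      "urn": "urn:li:corpuser:carlos.guevara",
      "aspects": [
        {
          "com.linkedin.pegasus2avro.identity.CorpUserInfo": {"active": true},
          "com.linkedin.pegasus2avro.identity.CorpUserEditableInfo": {"skills": ["superset"]}
        }
      ]
    }
  },
  "proposedDelta": null
}
""")

for urn, keys in find_multi_key_aspects(sample):
    print(urn, "has one aspects entry with several keys:", keys)
```

An empty result would point away from this explanation and back toward the server or frontend.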
e
Yes, datahub ingest -c ./users.yml
and the yml is:
source:
  type: file
  config:
    # Coordinates
    filename: ./metadata-ingestion/json/result.json
sink:
  type: "datahub-rest"
  config:
    server: ${SERVER}
I don't see any messages in the ingest logs.
b
Do you mind also trying via the datahub-kafka sink?
I want to try to isolate the issue here.. trying to repro on my end
e
Hi @big-carpet-38439, Kafka doesn't work for me; we already have infrastructure on the server that we cannot change. And yes, the auto-ingest of users at login is working, and those users have an image and other metadata, but some external developers are not going to log in here, so we want to add those users this way.
b
Got it - will you be using OIDC authentication @elegant-toddler-36093?
e
Yes, but this issue currently occurs both on the server and in my local version.
In my local version I don't have OIDC authentication.
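If the cause turns out to be the two aspects sharing one object in the aspects array (an assumption, not something confirmed in this thread), a possible workaround that keeps the file source and the datahub-rest sink is to rewrite the file so that every aspect gets its own array entry before running datahub ingest -c. A minimal sketch follows; split_aspects is a hypothetical helper, not part of the datahub package.

```python
import copy
import json


def split_aspects(mce: dict) -> dict:
    """Return a copy of the event in which each aspects entry holds exactly one aspect."""
    fixed = copy.deepcopy(mce)  # leave the caller's event untouched
    for snapshot in (fixed.get("proposedSnapshot") or {}).values():
        split = []
        for entry in snapshot.get("aspects", []):
            if isinstance(entry, dict):
                # One single-key object per aspect, original order preserved.
                split.extend({name: body} for name, body in entry.items())
            else:
                split.append(entry)
        snapshot["aspects"] = split
    return fixed


# Trimmed-down copy of the event posted in this thread.
event = {
    "auditHeader": None,
    "proposedSnapshot": {
        "com.linkedin.pegasus2avro.metadata.snapshot.CorpUserSnapshot": {
            "urn": "urn:li:corpuser:carlos.guevara",
            "aspects": [
                {
                    "com.linkedin.pegasus2avro.identity.CorpUserInfo": {
                        "active": True,
                        "email": "cg@kavak.com",
                    },
                    "com.linkedin.pegasus2avro.identity.CorpUserEditableInfo": {
                        "pictureLink": "https://github.com/gabe-lyons.png",
                        "skills": ["superset"],
                    },
                }
            ],
        }
    },
    "proposedDelta": None,
}

fixed = split_aspects(event)
print(json.dumps(fixed["proposedSnapshot"], indent=2))
```

For a file that holds a list of events, as result.json appears to, the same function could be applied to each element returned by json.load and the result written back with json.dump.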