
    bulky-lunch-72217

    1 year ago
    Hello, I get an error when I execute /opt/datahub/datahub-master/metadata-ingestion/sql-etl/mysql_etl.py. The error is: avro.io.AvroTypeException: The datum is not an example of the schema

    mammoth-bear-12532

    1 year ago
    @bulky-lunch-72217: I would suggest moving to the new and improved python ingestion scripts if you can.

    bulky-lunch-72217

    1 year ago
    Thanks, I'll try it.
    I successfully executed
    datahub ingest -c ./examples/recipes/mysql_to_datahub.yml
    but the data in DataHub's PROD has not changed.

    gray-shoe-75895

    1 year ago
    Are you using the datahub-kafka sink or the datahub-rest sink?
    Also, it seems curious that "workunits_produced" is empty - maybe all the tables are being filtered out by the allow/deny rules?
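    To illustrate how allow/deny rules can filter out every table, here is a minimal sketch of regex-based filtering. The helper is_allowed and the pattern prod\..* are hypothetical and only illustrate the idea; this is not DataHub's own implementation. The table names are the ones that appear later in this thread.

```python
import re


def is_allowed(name, allow=(".*",), deny=()):
    """Hypothetical helper: keep a name if no deny regex matches it
    and at least one allow regex does."""
    if any(re.match(pattern, name) for pattern in deny):
        return False
    return any(re.match(pattern, name) for pattern in allow)


tables = ["my.test.user", "my.test.metatable", "my.test.test"]

# An allow pattern that matches none of the tables leaves nothing to ingest,
# so no work units would be produced.
too_strict = [t for t in tables if is_allowed(t, allow=[r"prod\..*"])]

# An allow pattern that matches the "my.test." prefix keeps all three tables.
kept = [t for t in tables if is_allowed(t, allow=[r"my\.test\..*"])]

print(too_strict)
print(kept)
```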

    bulky-lunch-72217

    1 year ago
    Yes, I was wrong; I had set the wrong allow/deny rules.
    It works now, but I can't find the lineage of the MySQL table, just a single point. How do I create these relationships?
    I make these relationships via curl and the API, but I don't know what "($params:(),name:barUp,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Afoo)" is.
    http://localhost:8080/datasets/($params:(),name:barUp,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Afoo)/downstreamLineage
    GitHub: https://github.com/linkedin/datahub/blob/928444928a1618d0a861cfff371d12951d39a3ab/gms/README.md#create-group

    gray-shoe-75895

    1 year ago
    Where are you getting the lineage information from?

    bulky-lunch-72217

    1 year ago
    I created this lineage via
    curl 'http://localhost:8080/dashboards?action=ingest' -X POST -H 'X-RestLi-Protocol-Version:2.0.0' --data '{"snapshot": {"aspects":[{"com.linkedin.dataset.UpstreamLineage":{"upstreams":[{"auditStamp":{"time":1612576061011,"actor":"urn:li:corpuser:fbar"},"dataset":"urn:li:dataset:(urn:li:dataPlatform:mysql,my.test.user,PROD)","type":"TRANSFORMED"},{"auditStamp":{"time":1612576061011,"actor":"urn:li:corpuser:fbar"},"dataset":"urn:li:dataset:(urn:li:dataPlatform:mysql,my.test.metatable,PROD)","type":"TRANSFORMED"}]}}],"urn":"urn:li:dataset:(urn:li:dataPlatform:mysql,my.test.test,PROD)"}}'
    Now I want to get the downstream datasets of user. How do I do that?
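    The JSON body of the curl command above is hard to read on one line. Here is a minimal sketch of building the same structure in Python. The helper names dataset_urn and build_lineage_snapshot are hypothetical; the field names and values are copied from the curl command, and the sketch only builds the body without sending it.

```python
import json


def dataset_urn(platform, name, origin="PROD"):
    """Hypothetical helper: format a dataset URN the way the curl body does."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{origin})"


def build_lineage_snapshot(downstream_urn, upstream_urns, actor_urn, time_ms):
    """Hypothetical helper: build a snapshot with one UpstreamLineage aspect."""
    upstreams = [
        {
            "auditStamp": {"time": time_ms, "actor": actor_urn},
            "dataset": urn,
            "type": "TRANSFORMED",
        }
        for urn in upstream_urns
    ]
    return {
        "snapshot": {
            "aspects": [
                {"com.linkedin.dataset.UpstreamLineage": {"upstreams": upstreams}}
            ],
            "urn": downstream_urn,
        }
    }


payload = build_lineage_snapshot(
    downstream_urn=dataset_urn("mysql", "my.test.test"),
    upstream_urns=[
        dataset_urn("mysql", "my.test.user"),
        dataset_urn("mysql", "my.test.metatable"),
    ],
    actor_urn="urn:li:corpuser:fbar",
    time_ms=1612576061011,
)

# Serialized body, suitable for passing to curl --data or an HTTP client.
body = json.dumps(payload)
print(body)
```

    Here my.test.test is the downstream dataset, and my.test.user and my.test.metatable are its upstreams.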

    gray-shoe-75895

    1 year ago
    the upstream and downstream information should appear in the UI under the lineage tab of the dataset

    bulky-lunch-72217

    1 year ago
    Yes, I can see the lineage in the UI. Can I get the lineage via the API? Like this:
    curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get' 'http://localhost:8080/datasets/($params:(),name:barUp,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Afoo)/downstreamLineage' | jq

    big-carpet-38439

    1 year ago
    That should be the correct endpoint to fetch downstream lineage for your dataset named barUp.
    btw I don't see the "barUp" table name in the curl ^^

    bulky-lunch-72217

    1 year ago
    I saw this example; actually, I don't know how to format
    ($params:(),name:barUp,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Afoo)

    big-carpet-38439

    1 year ago
    Oh I see -- so that is basically a stringified DatasetKey.pdl struct... the name = the name of the dataset you are querying for, the origin = the environment (PROD, STAGING), the platform = the dataset platform type (kafka, hdfs, hive, etc!)
    @bulky-lunch-72217 can you try the following:
    curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get' 'http://localhost:8080/datasets/($params:(),name:my.test.user,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Amysql)/downstreamLineage' | jq
    By the way, is the name "my.test.test" or "my.test.user"? If it is "my.test.test", you can try
    curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get' 'http://localhost:8080/datasets/($params:(),name:my.test.test,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Amysql)/downstreamLineage' | jq
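    For reference, a minimal sketch of assembling that stringified key in Python. The helper downstream_lineage_url is hypothetical. It assumes the dataset name contains no characters that need escaping, and it percent-encodes only the platform URN, as in the curl examples above.

```python
from urllib.parse import quote


def downstream_lineage_url(name, platform, origin="PROD",
                           base="http://localhost:8080"):
    """Hypothetical helper: build the downstreamLineage URL for a dataset.

    Assumes `name` needs no escaping (e.g. "my.test.user").
    """
    # Percent-encode the colons in the platform URN (":" becomes "%3A").
    platform_urn = quote(f"urn:li:dataPlatform:{platform}", safe="")
    key = f"($params:(),name:{name},origin:{origin},platform:{platform_urn})"
    return f"{base}/datasets/{key}/downstreamLineage"


url = downstream_lineage_url("my.test.user", "mysql")
print(url)
# http://localhost:8080/datasets/($params:(),name:my.test.user,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Amysql)/downstreamLineage
```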

    bulky-lunch-72217

    1 year ago
    Yes, it is my.test.user.
    Thanks!!! It works.
    I used this:
    curl -H 'X-RestLi-Protocol-Version:2.0.0' -H 'X-RestLi-Method: get' 'http://localhost:8080/datasets/($params:(),name:my.test.user,origin:PROD,platform:urn%3Ali%3AdataPlatform%3Amysql)/downstreamLineage'

    big-carpet-38439

    1 year ago
    wooo!!