You need to set `pipeline_name` -> https://dat...
# ingestion
d
f
oh, i see, thx Tamas, you've helped me many times :d
d
I’m happy to help 🙂
f
i ran the ingestion (after i dropped a view) but the view is still in DataHub. recipe:
```yaml
source:
    type: bigquery
    config:
        project_id: project1
        credential:
        include_table_lineage: true
        stateful_ingestion:
          enabled: true

sink:
    type: datahub-rest
    config:
        server: 'http://localhost:8080'

pipeline_name: "test_bq"
```
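For context on why `pipeline_name` matters here: stateful ingestion keys its checkpoints on it, and stale-entity removal is governed by a flag under `stateful_ingestion`. A sketch of the relevant fragment, assuming the option is named `remove_stale_metadata` as in the DataHub stateful-ingestion docs of that era (names may differ between versions):

```yaml
# Illustrative fragment, not taken from this thread.
# remove_stale_metadata (assumed name, check your version) controls whether
# entities absent from the current run are soft-deleted. pipeline_name must
# stay stable across runs so checkpoints can be matched.
source:
    type: bigquery
    config:
        project_id: project1
        stateful_ingestion:
            enabled: true
            remove_stale_metadata: true

pipeline_name: "test_bq"
```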
result:
```
'tables_scanned': 384,
 'views_scanned': 197,
 'entities_profiled': 0,
 'filtered': [],
 'soft_deleted_stale_entities': [],
 'query_combiner': None}
Sink (datahub-rest) report:
{'records_written': 581,
 'warnings': [],
 'failures': [],
 'downstream_start_time': datetime.datetime(2022, 2, 25, 13, 20, 35, 953359),
 'downstream_end_time': datetime.datetime(2022, 2, 25, 13, 26, 11, 524126),
 'downstream_total_latency_in_seconds': 335.570767}
```
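As a sanity check on the sink report above, `downstream_total_latency_in_seconds` is simply the difference between the two reported timestamps:

```python
from datetime import datetime

# Timestamps copied from the sink report above.
start = datetime(2022, 2, 25, 13, 20, 35, 953359)
end = datetime(2022, 2, 25, 13, 26, 11, 524126)

latency = (end - start).total_seconds()
print(latency)  # 335.570767
```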
rights:
```
BigQuery Admin
BigQuery Metadata Viewer
Logging Admin
```
I also don't see the lineage (I created the view today)
d
In the logs, near the beginning, there should be a logline about how many audit log entries were fetched from BigQuery and how many were used for lineage building. Can you check it, please? We've seen some users who failed to get lineage; I'm chasing this and have a possible fix which will be available soon. I do hope it fixes it, but I couldn't reproduce this issue on our side.
f
oh, this warning
```
{datahub.ingestion.source.sql.bigquery:224} - Built lineage map containing 0 entries.
/Users/dragos.c-ami/work/myvirtualenv/lib/python3.8/site-packages/google/cloud/bigquery/client.py:513 UserWarning: Cannot create BigQuery Storage client, the dependency google-cloud-bigquery-storage is not installed.
```
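The second message is just a missing optional dependency; installing the package named in the warning itself should silence it (though it is unrelated to the empty lineage map):

```shell
pip install google-cloud-bigquery-storage
```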
hmm, i think i need some extra rights. When I ran it for another project I got `Built lineage map containing 3856 entries.`
2 projects, same rights: for one I got `Built lineage map containing 3856 entries.` and for the second one I got `Built lineage map containing 0 entries.`
hi @dazzling-judge-80093, i think the view is not deleted because i got this error
```
Querying for the latest ingestion checkpoint for pipelineName:'test_bq', platformInstanceId:'bigquery_no_host_port_no_database', job_name:'common_ingest_from_sql_source'
[2022-02-28 11:08:30,356] INFO     {datahub.ingestion.run.pipeline:79} - sink wrote workunit am-dwh-t1.tv.P_VD_TV
[2022-02-28 11:08:30,650] INFO     {datahub.ingestion.source.state_provider.datahub_ingestion_state_provider:89} - The last committed ingestion checkpoint for pipelineName:'test_bq', platformInstanceId:'bigquery_no_host_port_no_database', job_name:'common_ingest_from_sql_source' found with start_time: 2022-02-28 07:01:07.713000+00:00 and a bucket duration of None.
[2022-02-28 11:08:30,653] INFO     {datahub.ingestion.source.state.checkpoint:118} - Successfully constructed last checkpoint state for job common_ingest_from_sql_source
[2022-02-28 11:08:30,656] INFO     {datahub.ingestion.source.state_provider.datahub_ingestion_state_provider:109} - Committing ingestion checkpoint for pipeline:'test_bq',instance:'bigquery_no_host_port_no_database', job:'common_ingest_from_sql_source'
[2022-02-28 11:08:30,821] INFO     {datahub.ingestion.source.state_provider.datahub_ingestion_state_provider:130} - Committed ingestion checkpoint for pipeline:'test_bq',instance:'bigquery_no_host_port_no_database', job:'common_ingest_from_sql_source'
[2022-02-28 11:08:30,821] INFO     {datahub.cli.ingest_cli:83} - Finished metadata ingestion
```
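Conceptually, what the checkpoint machinery in the log above enables: the last committed checkpoint holds the entity URNs from the previous run, and anything not re-emitted in the current run becomes a soft-delete candidate. A minimal sketch of that diffing idea (illustrative only, not DataHub's actual code; the URNs are made up):

```python
# Illustrative sketch of stale-entity detection via checkpoint diffing.
# These URNs are made-up examples, not from the thread.
previous_checkpoint = {
    "urn:li:dataset:(urn:li:dataPlatform:bigquery,project1.ds.table_a,PROD)",
    "urn:li:dataset:(urn:li:dataPlatform:bigquery,project1.ds.old_view,PROD)",
}
current_run = {
    "urn:li:dataset:(urn:li:dataPlatform:bigquery,project1.ds.table_a,PROD)",
}

# Entities present last time but missing now -> soft-delete candidates.
stale = previous_checkpoint - current_run
for urn in sorted(stale):
    print("soft-delete:", urn)
```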
d
Interestingly, here I can't even see whether it tried to get the audit logs. Is this log partial, or does it contain everything?
f
it's everything
d
Do you use two different ingestion recipes, or one for both projects?
f
i added
```yaml
        start_time: 2022-02-28 00:00:00.065028Z
        end_time: 2022-02-28 23:59:00.065028Z
```
and now i have 1 built lineage entry, so the problem is that i still see the view in DataHub even though it's deleted in BQ
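A `start_time`/`end_time` pair like the one above can be generated rather than hand-edited. A small sketch (my own helper, not part of DataHub) that prints a full-day UTC window in a format matching the recipe's:

```python
from datetime import datetime, timezone

def day_window(year: int, month: int, day: int) -> tuple[str, str]:
    """Return (start, end) timestamps covering one UTC day."""
    start = datetime(year, month, day, 0, 0, 0, tzinfo=timezone.utc)
    end = datetime(year, month, day, 23, 59, 0, tzinfo=timezone.utc)
    fmt = "%Y-%m-%d %H:%M:%S.%fZ"
    return start.strftime(fmt), end.strftime(fmt)

start, end = day_window(2022, 2, 28)
print(start)  # 2022-02-28 00:00:00.000000Z
print(end)    # 2022-02-28 23:59:00.000000Z
```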
d
Does the lineage work if you split it into two parts?