# ingestion
s
Hello, I'm trying to ingest data from BigQuery into my local DataHub instance. The script runs successfully, but no table metadata is loaded; in DataHub I only see a BigQuery folder with my project name. The recipe is straightforward (sensitive data removed):
source:
  type: "bigquery"
  config:
    project_id: ""
    credential:
      project_id: ""
      private_key_id: ""
      private_key: ""
      client_email: ""
      client_id: ""
    profiling:
      enabled: false
    include_table_lineage: false
    start_time: "2020-03-01T00:00:00Z"
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
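(For reference, the recipe can be run either with the datahub CLI or programmatically; below is a minimal Python sketch of the programmatic route, assuming the acryl-datahub Pipeline API, with placeholder config values:)

from datahub.ingestion.run.pipeline import Pipeline

# Placeholder config; the real values mirror the recipe above.
pipeline = Pipeline.create(
    {
        "source": {
            "type": "bigquery",
            "config": {
                "project_id": "my-project",
                # credential / profiling / lineage options go here, as in the YAML recipe
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "http://localhost:8080"},
        },
    }
)

pipeline.run()
pipeline.raise_from_status()      # raise if the run reported failures
pipeline.pretty_print_summary()   # prints source/sink reports like the ones below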
Script output:
Source (bigquery) report:
{'entities_profiled': '0',
 'event_ids': ['container-info...',
               'container-platforminstance...',
               'container-subtypes-...'],
 'events_produced': '3',
 'events_produced_per_sec': '2',
 'failures': {},
 'filtered': [],
 'include_table_lineage': 'False',
 'invalid_partition_ids': {},
 'log_page_size': '1000',
 'partition_info': {},
 'profile_table_selection_criteria': {},
 'running_time': '1.26 seconds',
 'selected_profile_tables': {},
 'soft_deleted_stale_entities': [],
 'start_time': '2022-09-14 14:47:16.617896 (1.26 seconds ago).',
 'table_metadata': {},
 'tables_scanned': '0',
 'upstream_lineage': {},
 'use_date_sharded_audit_log_tables': 'False',
 'use_exported_bigquery_audit_metadata': 'False',
 'use_v2_audit_metadata': 'False',
 'views_scanned': '0',
 'warnings': {},
 'window_end_time': '2022-09-14 12:47:16.343255+00:00 (1.53 seconds ago).',
 'window_start_time': '2020-03-01 00:00:00+00:00 (2 years, 28 weeks and 3 days ago).'}
Sink (datahub-rest) report:
{'current_time': '2022-09-14 14:47:17.875747 (now).',
 'failures': [],
 'gms_version': 'v0.8.44',
 'pending_requests': '0',
 'records_written_per_second': '0',
 'start_time': '2022-09-14 14:46:55.391435 (22.48 seconds ago).',
 'total_duration_in_seconds': '22.48',
 'total_records_written': '3',
 'warnings': []}
DataHub version: v0.8.44
d
Hmm, it seems like it only ingested a container (probably for the project itself) but couldn't find any datasets. Can you check the logs for any errors, and whether you can actually list tables with those credentials?
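A quick way to check is to load the same service-account key the recipe uses and try to list datasets/tables directly with the google-cloud-bigquery client (rough sketch; the key path is a placeholder):

from google.cloud import bigquery
from google.oauth2 import service_account

# Placeholder: point this at the same key used in the recipe's `credential` block.
KEY_PATH = "/path/to/service-account-key.json"

credentials = service_account.Credentials.from_service_account_file(
    KEY_PATH, scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
client = bigquery.Client(project=credentials.project_id, credentials=credentials)

# If nothing is printed, the account cannot see any datasets in the project,
# which would explain why ingestion only produced the project container.
for dataset in client.list_datasets():
    print("dataset:", dataset.dataset_id)
    for table in client.list_tables(dataset.dataset_id):
        print("  table:", table.table_id)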
s
Thanks for the help, Tamas. You were right: I tried the same account with the Python BigQuery client and it returned no results. I got it working by creating a brand-new service account with the correct permissions.