# ingestion
s
Hello, we are running Redshift ingestion with the recipe below:
----------------
source:
  type: "redshift"
  config:
    platform_instance: "dataland"
    env: "DEV"
    username: "${REDSHIFT_USER}"
    password: "${REDSHIFT_PASSWORD}"
    host_port: "dataland-internal.big.dev.scmspain.io:5439"
    database: "dwh_sch_sp_db"
    schema_pattern:
      deny:
        - .*_mgmt$
    table_pattern:
      deny:
        - .*dim_ad_normalization$
        - .*dim_api_client$
    include_table_lineage: False
    include_views: False
    stateful_ingestion:
      enabled: True
      remove_stale_metadata: True
    profiling:
      enabled: true
      limit: 1000
      turn_off_expensive_profiling_metrics: True
    profile_pattern:
      allow:
        - ^dwh_sch_sp_db.motos
      deny:
        - .tmp.
        - .temp.
    options:
      connect_args:
        sslmode: prefer
-------------
After the ingestion ran successfully, the logs show that the tables matched by table_pattern.deny were soft deleted:
--------------LOGS-----------------------
'soft_deleted_stale_entities': ['urn:li:dataset:(urn:li:dataPlatform:redshift,dataland.dwh_sch_sp_db.pro_infojobs_es.dim_api_client,DEV)',
                                'urn:li:dataset:(urn:li:dataPlatform:redshift,dataland.dwh_sch_sp_db.infojobs_es.dim_api_client,DEV)'],
'query_combiner': {'total_queries': 676, 'uncombined_queries_issued': 379, 'combined_queries_issued': 82, 'queries_combined': 297, 'query_exceptions': 0},
'saas_version': 'PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.38698',
'upstream_lineage': {}}
Sink (datahub-kafka) report:
{'records_written': 8252, 'warnings': [], 'failures': [], 'downstream_start_time': None, 'downstream_end_time': None, 'downstream_total_latency_in_seconds': None}
----------------------------------------
But the tables are still visible in the UI, and the MySQL table has two entries for them, first with {"removed":true} and then with {"removed":false}. Can you please explain what is going wrong here?
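To see why both soft-deleted tables fall under the deny list, here is a minimal sketch that applies the two `table_pattern.deny` regexes to the table names from the log. It assumes the patterns are matched from the start of the fully qualified `database.schema.table` name and case-insensitively; that matching behavior, the helper `is_denied`, and the contrast table `fact_orders` are assumptions for illustration, not something confirmed in this thread.

```python
import re

# Deny patterns copied from the recipe's table_pattern.deny.
TABLE_DENY = [r".*dim_ad_normalization$", r".*dim_api_client$"]


def is_denied(qualified_name, deny_patterns=TABLE_DENY):
    """Return True if any deny regex matches from the start of the name.

    Assumption: patterns are applied with re.match (anchored at the start)
    and ignore case; verify against your DataHub version.
    """
    return any(re.match(p, qualified_name, re.IGNORECASE) for p in deny_patterns)


names = [
    "dwh_sch_sp_db.pro_infojobs_es.dim_api_client",
    "dwh_sch_sp_db.infojobs_es.dim_api_client",
    "dwh_sch_sp_db.infojobs_es.fact_orders",  # hypothetical table, for contrast
]
for name in names:
    print(name, "->", "denied" if is_denied(name) else "kept")
```

Running a check like this before an ingestion run can confirm which tables a pattern will exclude, and therefore which existing entities stateful ingestion may treat as stale.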
b
hey there! looking into this issue right now
Would you mind getting the Status aspect for the two entities that you do not expect to see in the UI after the soft delete? You can do it via the CLI: https://datahubproject.io/docs/cli/#get
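As a rough sketch of that request, the snippet below builds the dataset URN for one of the entities listed under `soft_deleted_stale_entities` and prints a `datahub get` command for its status aspect. The helper `dataset_urn` is hypothetical, and the `--urn` / `--aspect` flags are assumed from the linked CLI docs, so check them against your installed CLI version.

```python
import shlex


def dataset_urn(platform, name, env, platform_instance=None):
    """Build a dataset URN string by hand (hypothetical helper, illustration only)."""
    full_name = f"{platform_instance}.{name}" if platform_instance else name
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{full_name},{env})"


urn = dataset_urn(
    "redshift",
    "dwh_sch_sp_db.infojobs_es.dim_api_client",
    "DEV",
    platform_instance="dataland",
)

# Assumed flags: --urn and --aspect, per the CLI docs linked above.
cmd = ["datahub", "get", "--urn", urn, "--aspect", "status"]
print(shlex.join(cmd))  # quoted so the parentheses survive a shell
```

The printed command can be pasted into a shell; the response should contain the current value of the `removed` flag for that entity.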
b
"but the tables are still visible from UI"
This is very strange. If you restart GMS, does the issue persist? Can you confirm that your Kafka consumer is up and running?
s
@big-carpet-38439 kafka consumer is up and running
Also, any idea why it entered two records in the database, first with "removed=true" and then again with "removed=false"? (screenshot attached)
b
This can happen if the table exists in Redshift and is then removed
Or if you run the datahub delete CLI
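When two status rows exist for the same entity, what matters for the UI is which one is the current value. The sketch below picks the current row from a list of `(version, metadata)` pairs. It assumes that the aspect table stores the latest value under version 0 and older values under higher version numbers; that convention, the helper `latest_status`, and the sample rows are assumptions for illustration, not data taken from this thread.

```python
import json


def latest_status(rows):
    """Return the parsed status aspect stored under version 0, or None.

    rows: iterable of (version, metadata_json) for ONE entity's status aspect.
    Assumption: version 0 is the latest value; higher versions are history.
    """
    for version, metadata in rows:
        if version == 0:
            return json.loads(metadata)
    return None


# Hypothetical rows, shaped like the two entries described in the question.
rows = [
    (1, '{"removed":true}'),
    (0, '{"removed":false}'),
]

status = latest_status(rows)
if status is not None and not status.get("removed", False):
    print("latest status: not removed, entity is shown")
else:
    print("latest status: removed (or missing)")
```

Under that assumption, an entity stays visible whenever the version 0 row says `removed: false`, regardless of what the older row says, so comparing the two rows' versions and timestamps shows which write came last.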