better-insurance-34701
09/29/2022, 4:19 AM
datahub --debug ingest -c /git/dwh_dev/datahub.yml
[2022-09-29 11:13:51,202] DEBUG {datahub.telemetry.telemetry:210} - Sending init Telemetry
[2022-09-29 11:13:52,261] DEBUG {datahub.telemetry.telemetry:243} - Sending Telemetry
[2022-09-29 11:13:52,726] INFO {datahub.cli.ingest_cli:182} - DataHub CLI version: 0.8.45
[2022-09-29 11:13:52,746] DEBUG {datahub.cli.ingest_cli:196} - Using config: {'source': {'type': 'dbt', 'config': {'manifest_path': '/git/dwh_dev/target/manifest.json', 'catalog_path': '/git/dwh_dev/target/catalog.json', 'test_results_path': '/git/dwh_dev/target/run_results.json', 'target_platform': 'postgres', 'load_schemas': False, 'meta_mapping': {'business_owner': {'match': '.*', 'operation': 'add_owner', 'config': {'owner_type': 'user', 'owner_category': 'BUSINESS_OWNER'}}, 'data_steward': {'match': '.*', 'operation': 'add_owner', 'config': {'owner_type': 'user', 'owner_category': 'DATA_STEWARD'}}, 'technical_owner': {'match': '.*', 'operation': 'add_owner', 'config': {'owner_type': 'user', 'owner_category': 'TECHNICAL_OWNER'}}, 'has_pii': {'match': True, 'operation': 'add_tag', 'config': {'tag': 'has_pii'}}, 'data_governance.team_owner': {'match': 'Finance', 'operation': 'add_term', 'config': {'term': 'Finance_test'}}, 'source': {'match': '.*', 'operation': 'add_tag', 'config': {'tag': '{{ $match }}'}}}, 'query_tag_mapping': {'tag': {'match': '.*', 'operation': 'add_tag', 'config': {'tag': '{{ $match }}'}}}}}}
[2022-09-29 11:13:52,814] DEBUG {datahub.ingestion.sink.datahub_rest:116} - Setting env variables to override config
[2022-09-29 11:13:52,814] DEBUG {datahub.ingestion.sink.datahub_rest:118} - Setting gms config
[2022-09-29 11:13:52,814] DEBUG {datahub.ingestion.run.pipeline:174} - Sink type:datahub-rest,<class 'datahub.ingestion.sink.datahub_rest.DatahubRestSink'> configured
[2022-09-29 11:13:52,814] INFO {datahub.ingestion.run.pipeline:175} - Sink configured successfully. DataHubRestEmitter: configured to talk to http://localhost:8080
[2022-09-29 11:13:52,818] DEBUG {datahub.ingestion.sink.datahub_rest:116} - Setting env variables to override config
[2022-09-29 11:13:52,818] DEBUG {datahub.ingestion.sink.datahub_rest:118} - Setting gms config
[2022-09-29 11:13:52,818] DEBUG {datahub.ingestion.reporting.datahub_ingestion_run_summary_provider:120} - Ingestion source urn = urn:li:dataHubIngestionSource:cli-151c2b7711eb626e440af8c75a9082e9
[2022-09-29 11:13:52,819] DEBUG {datahub.emitter.rest_emitter:247} - Attempting to emit to DataHub GMS; using curl equivalent to:
curl -X POST -H 'User-Agent: python-requests/2.28.1' -H 'Accept-Encoding: gzip, deflate' -H 'Accept: */*' -H 'Connection: keep-alive' -H 'X-RestLi-Protocol-Version: 2.0.0' -H 'Content-Type: application/json' --data '{"proposal": {"entityType": "dataHubIngestionSource", "entityUrn": "urn:li:dataHubIngestionSource:cli-151c2b7711eb626e440af8c75a9082e9", "changeType": "UPSERT", "aspectName": "dataHubIngestionSourceInfo", "aspect": {"value": "{\"name\": \"[CLI] dbt\", \"type\": \"dbt\", \"platform\": \"urn:li:dataPlatform:unknown\", \"config\": {\"recipe\": \"{\\\"source\\\": {\\\"type\\\": \\\"dbt\\\", \\\"config\\\": {\\\"manifest_path\\\": \\\"${DBT_PROJECT_ROOT}/target/manifest.json\\\", \\\"catalog_path\\\": \\\"${DBT_PROJECT_ROOT}/target/catalog.json\\\", \\\"test_results_path\\\": \\\"${DBT_PROJECT_ROOT}/target/run_results.json\\\", \\\"target_platform\\\": \\\"postgres\\\", \\\"load_schemas\\\": false, \\\"meta_mapping\\\": {\\\"business_owner\\\": {\\\"match\\\": \\\".*\\\", \\\"operation\\\": \\\"add_owner\\\", \\\"config\\\": {\\\"owner_type\\\": \\\"user\\\", \\\"owner_category\\\": \\\"BUSINESS_OWNER\\\"}}, \\\"data_steward\\\": {\\\"match\\\": \\\".*\\\", \\\"operation\\\": \\\"add_owner\\\", \\\"config\\\": {\\\"owner_type\\\": \\\"user\\\", \\\"owner_category\\\": \\\"DATA_STEWARD\\\"}}, \\\"technical_owner\\\": {\\\"match\\\": \\\".*\\\", \\\"operation\\\": \\\"add_owner\\\", \\\"config\\\": {\\\"owner_type\\\": \\\"user\\\", \\\"owner_category\\\": \\\"TECHNICAL_OWNER\\\"}}, \\\"has_pii\\\": {\\\"match\\\": true, \\\"operation\\\": \\\"add_tag\\\", \\\"config\\\": {\\\"tag\\\": \\\"has_pii\\\"}}, \\\"data_governance.team_owner\\\": {\\\"match\\\": \\\"Finance\\\", \\\"operation\\\": \\\"add_term\\\", \\\"config\\\": {\\\"term\\\": \\\"Finance_test\\\"}}, \\\"source\\\": {\\\"match\\\": \\\".*\\\", \\\"operation\\\": \\\"add_tag\\\", \\\"config\\\": {\\\"tag\\\": \\\"{{ $match }}\\\"}}}, \\\"query_tag_mapping\\\": {\\\"tag\\\": {\\\"match\\\": \\\".*\\\", \\\"operation\\\": \\\"add_tag\\\", \\\"config\\\": {\\\"tag\\\": \\\"{{ $match }}\\\"}}}}}}\", \"version\": \"0.8.45\", \"executorId\": \"__datahub_cli_\"}}", "contentType": "application/json"}}}' 'http://localhost:8080/aspects?action=ingestProposal'
[2022-09-29 11:13:52,849] DEBUG {datahub.ingestion.run.pipeline:269} - Reporter type:datahub,<class 'datahub.ingestion.reporting.datahub_ingestion_run_summary_provider.DatahubIngestionRunSummaryProvider'> configured.
[2022-09-29 11:13:52,982] DEBUG {datahub.telemetry.telemetry:243} - Sending Telemetry
[2022-09-29 11:13:53,555] DEBUG {datahub.entrypoints:168} - File "/home/thinh/datahub_venv/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 196, in __init__
131 def __init__(
132 self,
133 config: PipelineConfig,
134 dry_run: bool = False,
135 preview_mode: bool = False,
136 preview_workunits: int = 10,
137 report_to: Optional[str] = None,
138 no_default_report: bool = False,
139 ):
(...)
192 self._record_initialization_failure(e, "Failed to create source")
193 return
194
195 try:
--> 196 self.source: Source = source_class.create(
197 self.config.source.dict().get("config", {}), self.ctx
File "/home/thinh/datahub_venv/lib/python3.10/site-packages/datahub/ingestion/source/dbt.py", line 1001, in create
999 @classmethod
1000 def create(cls, config_dict, ctx):
--> 1001 config = DBTConfig.parse_obj(config_dict)
1002 return cls(config, ctx, "dbt")
File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj
File "pydantic/main.py", line 342, in pydantic.main.BaseModel.__init__
ValidationError: 1 validation error for DBTConfig
load_schemas
extra fields not permitted (type=value_error.extra)
The above exception was the direct cause of the following exception:
File "/home/thinh/datahub_venv/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 197, in run
111 def run(
112 ctx: click.Context,
113 config: str,
114 dry_run: bool,
115 preview: bool,
116 strict_warnings: bool,
117 preview_workunits: int,
118 suppress_error_logs: bool,
119 test_source_connection: bool,
120 report_to: str,
121 no_default_report: bool,
122 no_spinner: bool,
123 ) -> None:
(...)
193 _test_source_connection(report_to, pipeline_config)
194
195 try:
196 logger.debug(f"Using config: {pipeline_config}")
--> 197 pipeline = Pipeline.create(
198 pipeline_config,
File "/home/thinh/datahub_venv/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 317, in create
306 def create(
307 cls,
308 config_dict: dict,
309 dry_run: bool = False,
310 preview_mode: bool = False,
311 preview_workunits: int = 10,
312 report_to: Optional[str] = None,
313 no_default_report: bool = False,
314 raw_config: Optional[dict] = None,
315 ) -> "Pipeline":
316 config = PipelineConfig.from_dict(config_dict, raw_config)
--> 317 return cls(
318 config,
File "/home/thinh/datahub_venv/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 202, in __init__
131 def __init__(
132 self,
133 config: PipelineConfig,
134 dry_run: bool = False,
135 preview_mode: bool = False,
136 preview_workunits: int = 10,
137 report_to: Optional[str] = None,
138 no_default_report: bool = False,
139 ):
(...)
198 )
199 logger.debug(f"Source type:{source_type},{source_class} configured")
200 logger.info("Source configured successfully.")
201 except Exception as e:
--> 202 self._record_initialization_failure(
203 e, f"Failed to configure source ({source_type})"
File "/home/thinh/datahub_venv/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 129, in _record_initialization_failure
128 def _record_initialization_failure(self, e: Exception, msg: str) -> None:
--> 129 raise PipelineInitError(msg) from e
---- (full traceback above) ----
File "/home/thinh/datahub_venv/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 197, in run
pipeline = Pipeline.create(
File "/home/thinh/datahub_venv/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 317, in create
return cls(
File "/home/thinh/datahub_venv/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 202, in __init__
self._record_initialization_failure(
File "/home/thinh/datahub_venv/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 129, in _record_initialization_failure
raise PipelineInitError(msg) from e
PipelineInitError: Failed to configure source (dbt)
[2022-09-29 11:13:53,555] DEBUG {datahub.entrypoints:198} - DataHub CLI version: 0.8.45 at /home/thinh/datahub_venv/lib/python3.10/site-packages/datahub/__init__.py
[2022-09-29 11:13:53,556] DEBUG {datahub.entrypoints:201} - Python version: 3.10.6 (main, Aug 10 2022, 11:40:04) [GCC 11.3.0] at /home/thinh/datahub_venv/bin/python3 on Linux-5.15.0-48-generic-x86_64-with-glibc2.35
[2022-09-29 11:13:53,556] DEBUG {datahub.entrypoints:204} - GMS config {'models': {}, 'patchCapable': True, 'versions': {'linkedin/datahub': {'version': 'v0.8.45', 'commit': '21a8718b1093352bc1e3a566d2ce0297d2167434'}}, 'managedIngestion': {'defaultCliVersion': '0.8.42', 'enabled': True}, 'statefulIngestionCapable': True, 'supportsImpactAnalysis': True, 'telemetry': {'enabledCli': True, 'enabledIngestion': False}, 'datasetUrnNameCasing': False, 'retention': 'true', 'datahub': {'serverType': 'quickstart'}, 'noCode': 'true'}
famous-florist-7218
09/29/2022, 4:22 AM
ValidationError: 1 validation error for DBTConfig
load_schemas
extra fields not permitted (type=value_error.extra)
better-insurance-34701
09/29/2022, 4:44 AM
famous-florist-7218
09/29/2022, 5:15 AM
mammoth-bear-12532
The dbt ingestion source's disable_dbt_node_creation and load_schemas options have been removed. They were no longer necessary due to the recently added sibling entities functionality.
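In practice that means deleting the option from the recipe before running the ingest again. A minimal sketch, assuming the recipe has been loaded into a plain dict shaped like the "Using config" line in the log, and that the removed option is spelled `load_schemas` as it appears in the recipe and in the validation error. The helper name `strip_removed_options` and the constant `REMOVED_DBT_OPTIONS` are made up for illustration and are not part of the DataHub CLI.

```python
# Hypothetical helper, not part of DataHub: drops the dbt source options
# that the thread says were removed, so DBTConfig no longer rejects them.
REMOVED_DBT_OPTIONS = {"load_schemas", "disable_dbt_node_creation"}


def strip_removed_options(recipe: dict) -> dict:
    """Return a copy of the recipe without the removed dbt source options."""
    source = recipe.get("source", {})
    if source.get("type") != "dbt":
        # Leave recipes for other source types untouched.
        return recipe
    cleaned_config = {
        key: value
        for key, value in source.get("config", {}).items()
        if key not in REMOVED_DBT_OPTIONS
    }
    return {**recipe, "source": {**source, "config": cleaned_config}}


# Trimmed-down version of the recipe from the log above.
recipe = {
    "source": {
        "type": "dbt",
        "config": {
            "manifest_path": "/git/dwh_dev/target/manifest.json",
            "target_platform": "postgres",
            "load_schemas": False,
        },
    }
}

cleaned = strip_removed_options(recipe)
print(sorted(cleaned["source"]["config"]))
```

For a one-off fix it is simpler to delete the `load_schemas` line from datahub.yml by hand; a helper like this only pays off if several recipes still carry the old option.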
better-insurance-34701
09/29/2022, 6:35 AM
little-megabyte-1074
bland-orange-13353
09/29/2022, 10:34 PM