Is Datahub Delta lake connector tested with Databr...
# troubleshoot
a
Is the DataHub Delta Lake connector tested with Databricks Delta Lake? If so, I am getting the error below while trying to load Delta tables from an S3 base path. Can someone help if they have seen this?
[2022-09-28 07:36:12,222] ERROR    {datahub.entrypoints:192} -
Traceback (most recent call last):
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/entrypoints.py", line 149, in main
    sys.exit(datahub(standalone_mode=False, **kwargs))
  File "/home/ec2-user/.local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ec2-user/.local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ec2-user/.local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/telemetry/telemetry.py", line 347, in wrapper
    raise e
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/telemetry/telemetry.py", line 299, in wrapper
    res = func(*args, **kwargs)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/utilities/memory_leak_detector.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/cli/ingest_cli.py", line 212, in run
    loop.run_until_complete(run_func_check_upgrade(pipeline))
  File "/usr/lib64/python3.7/asyncio/base_events.py", line 587, in run_until_complete
    return future.result()
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/cli/ingest_cli.py", line 166, in run_func_check_upgrade
    ret = await the_one_future
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/cli/ingest_cli.py", line 158, in run_pipeline_async
    None, functools.partial(run_pipeline_to_completion, pipeline)
  File "/usr/lib64/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/cli/ingest_cli.py", line 148, in run_pipeline_to_completion
    raise e
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/cli/ingest_cli.py", line 134, in run_pipeline_to_completion
    pipeline.run()
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/ingestion/run/pipeline.py", line 350, in run
    self.preview_workunits if self.preview_mode else None,
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/ingestion/source/delta_lake/source.py", line 329, in get_workunits
    for wu in self.process_folder(self.source_config.complete_path, get_folders):
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/ingestion/source/delta_lake/source.py", line 296, in process_folder
    delta_table = read_delta_table(path, self.source_config)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/ingestion/source/delta_lake/delta_lake_utils.py", line 32, in read_delta_table
    raise e
  File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/ingestion/source/delta_lake/delta_lake_utils.py", line 28, in read_delta_table
    delta_table = DeltaTable(path, storage_options=opts)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/deltalake/table.py", line 92, in __init__
    table_uri, version=version, storage_options=storage_options
deltalake.PyDeltaTableError: Failed to load checkpoint: Invalid JSON in checkpoint: expected value at line 1 column 1
[2022-09-28 07:36:12,223] ERROR    {datahub.entrypoints:196} - Command failed:
	Failed to load checkpoint: Invalid JSON in checkpoint: expected value at line 1 column 1.
@square-activity-64562 have you seen this somewhere before?
s
No
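One thing that may be worth checking, as a guess rather than a confirmed diagnosis: "expected value at line 1 column 1" is the kind of message a JSON parser gives for empty or non-JSON input, and the Delta log normally keeps a small JSON pointer file at `_delta_log/_last_checkpoint` with `version` and `size` fields. A rough sketch for inspecting a local copy of that file (the function names here are made up for illustration, and the table has to be downloaded from S3 first):

```python
import json
from pathlib import Path
from typing import Optional


def checkpoint_pointer_problem(raw: bytes) -> Optional[str]:
    """Describe what looks wrong with _last_checkpoint bytes, or return None."""
    if not raw.strip():
        return "file is empty"
    try:
        pointer = json.loads(raw.decode("utf-8"))
    except ValueError as exc:  # covers JSONDecodeError and UnicodeDecodeError
        return f"not valid JSON: {exc}"
    # The Delta protocol expects at least a 'version' field in this file.
    if not isinstance(pointer, dict) or "version" not in pointer:
        return "JSON parsed, but no 'version' field found"
    return None


def check_table(table_root: str) -> Optional[str]:
    """Check the checkpoint pointer under a local copy of a Delta table."""
    path = Path(table_root) / "_delta_log" / "_last_checkpoint"
    if not path.exists():
        # No pointer file at all; readers then fall back to the JSON commits.
        return None
    return checkpoint_pointer_problem(path.read_bytes())
```

If `check_table("/path/to/local/copy")` returns a message instead of `None`, the pointer file itself is the likely culprit; if it returns `None`, the problem is probably somewhere else and this sketch does not help.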
a