# ingestion
b
Hi team, I encountered the error below when ingesting from Hive. It seems to be an issue with ingesting views. Do you have any idea how to troubleshoot it?
Traceback (most recent call last):
  File "/home/hadoop/.pyenv/versions/3.7.2/bin/datahub", line 8, in <module>
    sys.exit(main())
  File "/home/hadoop/.pyenv/versions/3.7.2/lib/python3.7/site-packages/datahub/entrypoints.py", line 93, in main
    sys.exit(datahub(standalone_mode=False, **kwargs))
  File "/home/hadoop/.pyenv/versions/3.7.2/lib/python3.7/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/home/hadoop/.pyenv/versions/3.7.2/lib/python3.7/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/home/hadoop/.pyenv/versions/3.7.2/lib/python3.7/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/hadoop/.pyenv/versions/3.7.2/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/hadoop/.pyenv/versions/3.7.2/lib/python3.7/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/home/hadoop/.pyenv/versions/3.7.2/lib/python3.7/site-packages/datahub/entrypoints.py", line 81, in ingest
    pipeline.run()
  File "/home/hadoop/.pyenv/versions/3.7.2/lib/python3.7/site-packages/datahub/ingestion/run/pipeline.py", line 108, in run
    for wu in self.source.get_workunits():
  File "/home/hadoop/.pyenv/versions/3.7.2/lib/python3.7/site-packages/datahub/ingestion/source/sql_common.py", line 239, in get_workunits
    yield from self.loop_views(inspector, schema, sql_config)
  File "/home/hadoop/.pyenv/versions/3.7.2/lib/python3.7/site-packages/datahub/ingestion/source/sql_common.py", line 319, in loop_views
    view_definition = inspector.get_view_definition(view)
  File "/home/hadoop/.pyenv/versions/3.7.2/lib/python3.7/site-packages/sqlalchemy/engine/reflection.py", line 338, in get_view_definition
    self.bind, view_name, schema, info_cache=self.info_cache
  File "/home/hadoop/.pyenv/versions/3.7.2/lib/python3.7/site-packages/sqlalchemy/engine/interfaces.py", line 363, in get_view_definition
    raise NotImplementedError()
NotImplementedError
g
Which source are you using?
Based on the stack trace, I believe this should fix it: https://github.com/linkedin/datahub/pull/2796
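For anyone reading the trace later: the last frames show `loop_views` calling `inspector.get_view_definition(view)` and SQLAlchemy's base dialect interface raising `NotImplementedError`, which suggests the dialect in use does not implement view definitions. A minimal sketch of one way to guard such a call is below. I have not checked that this is what the linked PR does; `StubInspector` and `safe_view_definition` are hypothetical names used only for illustration.

```python
from typing import Optional


class StubInspector:
    """Hypothetical stand-in for an inspector whose dialect lacks view-definition support."""

    def get_view_definition(self, view_name: str, schema: Optional[str] = None) -> str:
        # Mirrors the behaviour seen at the bottom of the traceback.
        raise NotImplementedError()


def safe_view_definition(inspector, view_name: str, schema: Optional[str] = None) -> str:
    """Return the view definition, or an empty string if the dialect cannot provide one."""
    try:
        definition = inspector.get_view_definition(view_name, schema)
    except NotImplementedError:
        # Assumption: a missing definition is acceptable and ingestion should continue.
        return ""
    return definition or ""


print(repr(safe_view_definition(StubInspector(), "my_view")))
```

With a guard like this, a view whose definition cannot be fetched is still processed, just without its SQL text.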
b
Thanks @gray-shoe-75895, the PR looks good. By the way, I am now using the DataHub CLI for quickstart. Is it possible to run the latest code from the master branch using the DataHub CLI, or can we only test it after it is released?
g
I’ll cut another release right now and will ping you in a few minutes. For future reference, you can install directly from GitHub using
pip install 'git+https://github.com/linkedin/datahub.git#egg=acryl_datahub[datahub-kafka]&subdirectory=metadata-ingestion'
(changing datahub-kafka to whatever plugins you’d like)
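To make the plugin substitution concrete, here is a small sketch that builds the install spec for a list of plugins. The helper name `github_install_spec` is hypothetical, and `hive` is assumed to be the plugin name for the Hive source; check the plugin list for your version before relying on it.

```python
from typing import List


def github_install_spec(plugins: List[str]) -> str:
    """Build the pip requirement string for installing acryl_datahub from GitHub master."""
    extras = ",".join(plugins)  # pip accepts comma-separated extras inside the brackets
    return (
        "git+https://github.com/linkedin/datahub.git"
        f"#egg=acryl_datahub[{extras}]&subdirectory=metadata-ingestion"
    )


# Assumption: 'hive' is the extra for the Hive source used in this thread.
print(f"pip install '{github_install_spec(['hive', 'datahub-kafka'])}'")
```

The quotes around the spec matter in most shells, since the square brackets and `&` would otherwise be interpreted by the shell.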
@boundless-student-48844 I just cut v0.8.4.0 https://pypi.org/project/acryl-datahub/0.8.4.0/, which should have fixes for both of the issues you were running into
b
Wow, that's really impressive! Thank you, Harshal! Will try again 🙏