Good Morning! Anyone seen thrift errors like this ...
# ingestion
g
Good Morning! Anyone seen thrift errors like this before?
thrift.transport.TTransport.TTransportException: Bad status: 78 (b'5.7.22-log')
g
where'd you see that?
e.g. which source/sinks were you using?
g
recipe is
source:
  type: hive
  config:
    username: user
    password: pw
    host_port: localhost:3306

sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
stack trace
Traceback (most recent call last):
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/bin/datahub", line 33, in <module>
    sys.exit(load_entry_point('datahub', 'console_scripts', 'datahub')())
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/src/datahub/entrypoints.py", line 70, in ingest
    pipeline.run()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/src/datahub/ingestion/run/pipeline.py", line 81, in run
    for wu in self.source.get_workunits():
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/src/datahub/ingestion/source/sql_common.py", line 163, in get_workunits
    inspector = reflection.Inspector.from_engine(engine)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/reflection.py", line 135, in from_engine
    return Inspector(bind)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/reflection.py", line 108, in __init__
    bind.connect().close()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2263, in connect
    return self._connection_cls(self, **kwargs)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 104, in __init__
    else engine.raw_connection()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2370, in raw_connection
    self.pool.unique_connection, _connection
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2336, in _wrap_pool_connect
    return fn()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 304, in unique_connection
    return _ConnectionFairy._checkout(self)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 778, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 495, in checkout
    rec = pool._do_get()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/impl.py", line 140, in _do_get
    self._dec_overflow()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    with_traceback=exc_tb,
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/impl.py", line 137, in _do_get
    return self._create_connection()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 309, in _create_connection
    return _ConnectionRecord(self)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 440, in __init__
    self.__connect(first_connect_check=True)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 661, in __connect
    pool.logger.debug("Error on connect(): %s", e)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    with_traceback=exc_tb,
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 656, in __connect
    connection = pool._invoke_creator(self)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
    return dialect.connect(*cargs, **cparams)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 508, in connect
    return self.dbapi.connect(*cargs, **cparams)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/pyhive/hive.py", line 94, in connect
    return Connection(*args, **kwargs)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/pyhive/hive.py", line 192, in __init__
    self._transport.open()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/thrift_sasl/__init__.py", line 96, in open
    message=("Bad status: %d (%s)" % (status, payload)))
thrift.transport.TTransport.TTransportException: Bad status: 78 (b'5.7.22-log')
i should note that i was getting
sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:hive
before i
pip install sasl thrift_sasl pyhive
g
huh interesting
I suspect it's using the wrong authentication mechanism
looking into it
g
ok, apologies, i also commented out the following lines in
hive.py
if (password is not None) != (auth in ('LDAP', 'CUSTOM')):
    raise ValueError("Password should be set if and only if in LDAP or CUSTOM mode; "
                     "Remove password or use one of those modes")
because without that, when i run
datahub ingest -c hive_recipe.yml
i get
ValueError: Password should be set if and only if in LDAP or CUSTOM mode; Remove password or use one of those modes
the full stack trace of that is
Traceback (most recent call last):
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/bin/datahub", line 33, in <module>
    sys.exit(load_entry_point('datahub', 'console_scripts', 'datahub')())
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/src/datahub/entrypoints.py", line 70, in ingest
    pipeline.run()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/src/datahub/ingestion/run/pipeline.py", line 81, in run
    for wu in self.source.get_workunits():
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/src/datahub/ingestion/source/sql_common.py", line 163, in get_workunits
    inspector = reflection.Inspector.from_engine(engine)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/reflection.py", line 135, in from_engine
    return Inspector(bind)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/reflection.py", line 108, in __init__
    bind.connect().close()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2263, in connect
    return self._connection_cls(self, **kwargs)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 104, in __init__
    else engine.raw_connection()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2370, in raw_connection
    self.pool.unique_connection, _connection
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 2336, in _wrap_pool_connect
    return fn()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 304, in unique_connection
    return _ConnectionFairy._checkout(self)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 778, in _checkout
    fairy = _ConnectionRecord.checkout(pool)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 495, in checkout
    rec = pool._do_get()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/impl.py", line 140, in _do_get
    self._dec_overflow()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    with_traceback=exc_tb,
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/impl.py", line 137, in _do_get
    return self._create_connection()
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 309, in _create_connection
    return _ConnectionRecord(self)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 440, in __init__
    self.__connect(first_connect_check=True)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 661, in __connect
    pool.logger.debug("Error on connect(): %s", e)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    with_traceback=exc_tb,
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/pool/base.py", line 656, in __connect
    connection = pool._invoke_creator(self)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
    return dialect.connect(*cargs, **cparams)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 508, in connect
    return self.dbapi.connect(*cargs, **cparams)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/pyhive/hive.py", line 94, in connect
    return Connection(*args, **kwargs)
  File "/Users/cohenm/autodesk/src/datahub/metadata-ingestion/venv/lib/python3.7/site-packages/pyhive/hive.py", line 123, in __init__
    raise ValueError("Password should be set if and only if in LDAP or CUSTOM mode; "
ValueError: Password should be set if and only if in LDAP or CUSTOM mode; Remove password or use one of those modes
g
Can you try adding an auth option to your recipe?
source:
  type: hive
  config:
    username: user
    password: pw
    host_port: localhost:3306
    options:
      auth: LDAP
also, not sure if setting host_port to port 3306 was intentional, since that's the standard mysql port
g
TypeError: Invalid argument(s) 'auth' sent to create_engine(), using configuration HiveDialect/QueuePool/Engine.  Please check that the keyword arguments are appropriate for this combination of components.
g
ah I missed a layer:
source:
  type: hive
  config:
    username: user
    password: pw
    host_port: localhost:3306
    options:
      connect_args:
        auth: LDAP
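rough sketch of why the extra layer matters, assuming the recipe's `options` get passed as keyword arguments to SQLAlchemy's `create_engine()` (the TypeError above suggests so). `fake_create_engine` and `fake_dbapi_connect` are hypothetical stand-ins written for illustration, not real SQLAlchemy or PyHive functions, and the `hive://localhost:10000` URL is made up:

```python
def fake_dbapi_connect(host, port, auth=None, password=None):
    """Hypothetical stand-in for a driver-level connect() call."""
    return {"host": host, "port": port, "auth": auth}


def fake_create_engine(url, connect_args=None, **kwargs):
    """Hypothetical stand-in for an engine factory.

    Unknown keywords are rejected; only the contents of connect_args
    are forwarded to the driver's connect().
    """
    if kwargs:
        names = ",".join(repr(k) for k in kwargs)
        raise TypeError("Invalid argument(s) %s sent to create_engine()" % names)
    host, port = url.split("://", 1)[1].split(":")
    return fake_dbapi_connect(host, int(port), **(connect_args or {}))


# `auth` directly under options: the engine factory rejects it.
flat_options = {"auth": "LDAP"}
try:
    fake_create_engine("hive://localhost:10000", **flat_options)
    flat_error = None
except TypeError as exc:
    flat_error = str(exc)

# `auth` under options.connect_args: it reaches the driver.
nested_options = {"connect_args": {"auth": "LDAP"}}
nested = fake_create_engine("hive://localhost:10000", **nested_options)

print(flat_error)
print(nested)
```

so anything the driver itself needs (like `auth`) goes under `connect_args`, and only engine-level settings sit directly under `options`.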
g
that solved the problem of having to comment out that check in hive.py
still getting the
thrift.transport.TTransport.TTransportException: Bad status: 78 (b'5.7.22-log')
g
is hive running on localhost:3306? - that's normally the port where mysql runs
g
no, it's in RDS but it is on 3306
this is the Hive Metastore
in my jdbc client i use
jdbc:mysql://rds-metastore-us-east-1-prd.wewet.us-east-1.rds.amazonaws.com:3306
wondering if i should just use the Mysql source?
g
I think so - not fully sure what you mean by "this is Hive Metastore"
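for anyone hitting the same error later, here is a sketch of why the message contains a MySQL version string. As far as I understand it, the SASL transport reads a 1-byte status and a 4-byte big-endian length from the server's first bytes, while a MySQL server opens with a handshake packet (3-byte little-endian length, 1-byte sequence id, protocol version 10, then the server version). The packet bytes below are hand-built for illustration, not captured traffic, and `read_sasl_frame` is a hypothetical helper, not the thrift_sasl API:

```python
import struct

# Start of a MySQL-style handshake packet (hand-built, an assumption):
# 3-byte little-endian payload length, 1-byte sequence id,
# then protocol version 10 and the null-terminated server version.
packet = (78).to_bytes(3, "little") + b"\x00" + b"\x0a" + b"5.7.22-log\x00"


def read_sasl_frame(data):
    """Parse bytes as a SASL-style frame: 1-byte status, 4-byte big-endian length."""
    status, length = struct.unpack(">BI", data[:5])
    payload = data[5:5 + length]
    return status, payload


status, payload = read_sasl_frame(packet)

# The low byte of the MySQL packet length is read as the status, and the
# protocol version byte ends up as the payload length (10 bytes).
print("Bad status: %d (%s)" % (status, payload))
```

so a "Bad status" with a database version string in it is a good hint that the Hive source is pointed at a MySQL endpoint rather than a HiveServer2 Thrift port.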
g
Wow, that did it. many lines of
DEBUG    {datahub.ingestion.run.pipeline:38} - sink called success callback
sorry to waste your time with this!
thank you for the help though! 🙂
g
no worries - glad I could help!