# ingestion

r

```
File "datahub_v_0_8_6/metadata-ingestion/dhubv086/lib64/python3.6/site-packages/pyhive/hive.py", line 479, in execute
    _check_status(response)
File "datahub_v_0_8_6/metadata-ingestion/dhubv086/lib64/python3.6/site-packages/pyhive/hive.py", line 609, in _check_status
    raise OperationalError(response)
OperationalError: (pyhive.exc.OperationalError) TExecuteStatementResp(status=TStatus(statusCode=3, infoMessages=['*org.apache.hive.service.cli.HiveSQLException:Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.ClassNotFoundException Class com.LeapSerde not found:17:16', 'org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:400', 'org.apache.hive.service.cli.operation.SQLOperation:runQuery:SQLOperation.java:238', 'org.apache.hive.service.cli.operation.SQLOperation:runInternal:SQLOperation.java:274', 'org.apache.hive.service.cli.operation.Operation:run:Operation.java:337', 'org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:439', 'org.apache.hive.service.cli.session.HiveSessionImpl:executeStatement:HiveSessionImpl.java:405', 'org.apache.hive.service.cli.CLIService:executeStatement:CLIService.java:257', 'org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:503', 'org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1313', 'org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1298', 'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39', 'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39', 'org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor:process:HadoopThriftAuthBridge.java:747', 'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286', 'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149', 'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624', 'java.lang.Thread:run:Thread.java:748', '*org.apache.hadoop.hive.metastore.api.MetaException:java.lang.ClassNotFoundException Class com.LeapSerde not found:28:12', 'org.apache.hadoop.hive.metastore.MetaStoreUtils:getDeserializer:MetaStoreUtils.java:406', 'org.apache.hadoop.hive.ql.metadata.Table:getDeserializerFromMetaStore:Table.java:274', 'org.apache.hadoop.hive.ql.metadata.Table:getDeserializer:Table.java:267', 'org.apache.hadoop.hive.ql.exec.DDLTask:describeTable:DDLTask.java:3184', 'org.apache.hadoop.hive.ql.exec.DDLTask:execute:DDLTask.java:380', 'org.apache.hadoop.hive.ql.exec.Task:executeTask:Task.java:214', 'org.apache.hadoop.hive.ql.exec.TaskRunner:runSequential:TaskRunner.java:99', 'org.apache.hadoop.hive.ql.Driver:launchTask:Driver.java:2054', 'org.apache.hadoop.hive.ql.Driver:execute:Driver.java:1750', 'org.apache.hadoop.hive.ql.Driver:runInternal:Driver.java:1503', 'org.apache.hadoop.hive.ql.Driver:run:Driver.java:1287', 'org.apache.hadoop.hive.ql.Driver:run:Driver.java:1282', 'org.apache.hive.service.cli.operation.SQLOperation:runQuery:SQLOperation.java:236'], sqlState='08S01', errorCode=1, errorMessage='Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.ClassNotFoundException Class com.LeapSerde not found'), operationHandle=None)
[SQL: DESCRIBE `default.leap_flume_prod_new`]
```
Hi guys, can you please help me with this error while ingesting Hive metadata? DataHub version: v0.8.6
g
Could you try opening up a Hive shell and executing ``DESCRIBE `default.leap_flume_prod_new` ``? This appears to be a failure on the Hive side, or possibly in the driver.
r
Yes, I got the same error in the Hive CLI as well.
Is there a way to exclude this table while ingesting Hive metadata?
g
Yep! Use the `table_pattern` option:
```yml
table_pattern:
  deny:
    - "default.leap_flume_prod_new"
```
(See the docs for details: https://datahubproject.io/docs/metadata-ingestion — the deny pattern is actually a regex.)
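Since the deny entries are regexes, an unescaped `.` matches any character, and escaping it pins the pattern to the literal table name. A rough sketch of the matching semantics using Python's `re` module (DataHub's actual pattern class may differ in details, e.g. anchoring):

```python
import re

# Deny patterns are regular expressions; the escaped dot matches only a
# literal "." between the schema and table name.
deny_patterns = [r"default\.leap_flume_prod_new"]

def is_denied(table_name, patterns):
    """Return True if the table name matches any deny pattern
    (re.match anchors at the start of the string only)."""
    return any(re.match(p, table_name) for p in patterns)

print(is_denied("default.leap_flume_prod_new", deny_patterns))  # True
print(is_denied("default.other_table", deny_patterns))          # False
```

Note that `re.match` only anchors at the start, so a pattern like the one above would also match a table named `default.leap_flume_prod_new_v2`; use a trailing `$` if that matters.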
r
@gray-shoe-75895 The deny pattern option works fine. Now I am stuck with a weird problem where the ingestion job suddenly fails with the error `user XXX does not have privilege to describe formatted schema.table`. Using the Hive shell, the same user is able to execute the command that DataHub fails to execute.
Also, is there a way to speed up the metadata ingestion process for Hive? Every time the Hive ingestion job fails, it starts afresh, and the job has to scan more than 14K tables.
g
Can you try running with `datahub --debug ingest ...`?
Unfortunately there’s no way to speed it up — with Hive, we’re forced to call `describe formatted` for each table, which is where much of the time goes.
I’d recommend limiting the amount of data you ingest while debugging/testing, using the `schema_pattern` and `table_pattern` options.
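To make that concrete, a hypothetical recipe sketch that restricts ingestion to a single schema and a table-name prefix while testing — the `source`/`config` layout follows the DataHub Hive ingestion docs, but the host, schema, and table names here are placeholders:

```yml
source:
  type: hive
  config:
    host_port: localhost:10000   # placeholder HiveServer2 address
    schema_pattern:
      allow:
        - "sales_db"             # only ingest this schema while testing
    table_pattern:
      allow:
        - "sales_db\\.orders_.*" # only tables matching this regex
      deny:
        - "default.leap_flume_prod_new"
```

With a narrow allow list like this, each test run only issues `describe formatted` against a handful of tables instead of all 14K.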