strong-pharmacist-65336
10/01/2020, 5:56 PM
(venv) (base) USA-MAC-NIS1908:sql-etl nlangaliya$ python3 mysql_etl.py
Traceback (most recent call last):
File "mysql_etl.py", line 13, in <module>
run(URL, OPTIONS, PLATFORM)
File "/Users/nlangaliya/Documents/GitHub/datahub/metadata-ingestion/sql-etl/common.py", line 111, in run
produce_dataset_mce(mce, kafka_config)
File "/Users/nlangaliya/Documents/GitHub/datahub/metadata-ingestion/sql-etl/common.py", line 97, in produce_dataset_mce
record_schema = avro.load(kafka_config.avsc_path)
File "/Users/nlangaliya/Documents/GitHub/datahub/venv/lib/python3.7/site-packages/confluent_kafka/avro/load.py", line 36, in load
with open(fp) as f:
FileNotFoundError: [Errno 2] No such file or directory: '../../metadata-events/mxe-schemas/src/renamed/avro/com/linkedin/mxe/MetadataChangeEvent.avsc'
(venv) (base) USA-MAC-NIS1908:sql-etl nlangaliya$ cat mysql_etl.py
from common import run
# See https://github.com/PyMySQL/PyMySQL for more detail
hostname = '127.0.0.1'
DATABASE = 'datahub'
USER = 'datahub'
PASSWORD = 'datahub'
URL = ''  # e.g. mysql+pymysql://username:password@hostname:port
URL = 'mysql+pymysql://datahub:datahub@127.0.0.1'
OPTIONS = {} # e.g. {"encoding": "latin1"}
PLATFORM = 'mysql'
run(URL, OPTIONS, PLATFORM)
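The FileNotFoundError above is a relative-path problem: ../../metadata-events/... only resolves when the script runs from metadata-ingestion/sql-etl, and the renamed MetadataChangeEvent.avsc is generated rather than checked in. A sketch of the likely remedy (the exact Gradle task name is an assumption based on the module layout):

cd ~/Documents/GitHub/datahub
./gradlew :metadata-events:mxe-schemas:build  # assumed task; should emit the renamed MetadataChangeEvent.avsc
cd metadata-ingestion/sql-etl
python3 mysql_etl.py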
average-city-12965
11/30/2020, 2:28 PM
"created" is another example. The field contains the creation timestamp and the author, but it is also required in the schema and can't be left empty for the GMS service to decide. Is the intention that I'll look that up from the GMS API when updating existing data via MCEs?
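For reference, the created field here is an AuditStamp (com.linkedin.common.AuditStamp): an epoch-millis time plus an actor URN. A minimal sketch of what a producer might fill in when it has nothing better, with an example actor value that is only a placeholder:

import time

# Minimal AuditStamp payload as used inside MCE aspects:
# "time" is epoch milliseconds, "actor" is a corpuser URN.
created = {
    "time": int(time.time() * 1000),     # creation timestamp in ms
    "actor": "urn:li:corpuser:datahub",  # example actor; replace with the real author
}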
clever-journalist-89046
12/15/2020, 11:06 AM
Invalid project ID 'bq_demo'. Project IDs must contain 6-63 lowercase letters, digits, or dashes. Some project IDs also include a domain name separated by a colon. IDs must start with a letter and may not end with a dash.
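Those rules translate directly into a regular expression; a quick sketch of a client-side check for the plain (non-domain-scoped) form, derived only from the wording of the error:

import re

# 6-63 chars, lowercase letters/digits/dashes, starts with a letter,
# does not end with a dash (per the error message above).
PROJECT_ID_RE = re.compile(r"^[a-z][a-z0-9-]{4,61}[a-z0-9]$")

assert PROJECT_ID_RE.match("bq-demo")      # dashes are fine
assert not PROJECT_ID_RE.match("bq_demo")  # underscores are not, hence the error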
incalculable-ocean-74010
02/16/2021, 12:51 PM
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'restliRequestHandler' defined in ServletContext resource [/WEB-INF/beans.xml]: Bean instantiation via constructor failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [com.linkedin.restli.server.spring.ParallelRestliHttpRequestHandler]: Constructor threw exception; nested exception is org.springframework.beans.factory.NoSuchBeanDefinitionException: No bean named 'streamSearchDao' available
at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:314)
at org.springframework.beans.factory.support.ConstructorResolver.autowireConstructor(ConstructorResolver.java:295)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.autowireConstructor(AbstractAutowireCapableBeanFactory.java:1358)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1204)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:557)
at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:517)
at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:323)
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:222)
at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:321)
at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:207)
at org.springframework.context.support.AbstractApplicationContext.getBean(AbstractApplicationContext.java:1114)
at org.springframework.web.context.support.HttpRequestHandlerServlet.init(HttpRequestHandlerServlet.java:61)
incalculable-ocean-74010
02/16/2021, 12:52 PM
I have defined streamSearchDao as such:
package com.linkedin.gms.factory.stream;

import com.linkedin.metadata.configs.StreamSearchConfig;
import com.linkedin.metadata.dao.search.ESSearchDAO;
import com.linkedin.metadata.search.StreamDocument;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.DependsOn;

import javax.annotation.Nonnull;

@Configuration
public class StreamSearchDaoFactory {

  @Autowired
  ApplicationContext applicationContext;

  @Nonnull
  @DependsOn({"elasticSearchRestHighLevelClient"})
  @Bean(name = "streamSearchDAO")
  protected ESSearchDAO createInstance() {
    return new ESSearchDAO(applicationContext.getBean(RestHighLevelClient.class), StreamDocument.class,
        new StreamSearchConfig());
  }
}
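Note the case mismatch: the container is asked for streamSearchDao, but the factory registers streamSearchDAO, and Spring bean names are case-sensitive. A minimal sketch of the fix, replacing the @Bean method in the factory above and assuming the consumer really resolves the bean by the name in the error message:

// Rename the bean so it matches the name Spring is looking for
// ('streamSearchDao', per the NoSuchBeanDefinitionException above).
@Nonnull
@DependsOn({"elasticSearchRestHighLevelClient"})
@Bean(name = "streamSearchDao")
protected ESSearchDAO createInstance() {
  return new ESSearchDAO(applicationContext.getBean(RestHighLevelClient.class), StreamDocument.class,
      new StreamSearchConfig());
}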
incalculable-ocean-74010
03/01/2021, 10:24 AM
---
source:
  type: hive
  config:
    username: hive
    password: hive
    database: default
    host_port: localhost:10000
    # table_pattern:
    #   allow:
    #     - "schema1.table1"
    #     - "schema1.table2"
    #   deny:
    #     - "^.*\.sys_.*"  # deny all tables that start with sys_
sink:
  type: console
However I get an unfamiliar stacktrace:
incalculable-ocean-74010
03/01/2021, 10:24 AM
▶ datahub ingest -c hive_to_console.yml
[2021-03-01 10:21:25,708] DEBUG {datahub.entrypoints:64} - Using config: {'source': {'type': 'hive', 'config': {'username': 'hive', 'password': 'hive', 'database': 'default', 'host_port': 'localhost:10000'}}, 'sink': {'type': 'console'}}
[2021-03-01 10:21:25,708] DEBUG {datahub.ingestion.run.pipeline:63} - Source type:hive,<class 'datahub.ingestion.source.hive.HiveSource'> configured
[2021-03-01 10:21:25,709] DEBUG {datahub.ingestion.run.pipeline:69} - Sink type:console,<class 'datahub.ingestion.sink.console.ConsoleSink'> configured
[2021-03-01 10:21:25,709] DEBUG {datahub.ingestion.source.sql_common:152} - sql_alchemy_url=hive://hive:hive@localhost:10000/default
Traceback (most recent call last):
File "/home/pedro/dev/datahub/metadata-ingestion/venv/bin/datahub", line 11, in <module>
load_entry_point('datahub', 'console_scripts', 'datahub')()
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/home/pedro/dev/datahub/metadata-ingestion/src/datahub/entrypoints.py", line 70, in ingest
pipeline.run()
File "/home/pedro/dev/datahub/metadata-ingestion/src/datahub/ingestion/run/pipeline.py", line 81, in run
for wu in self.source.get_workunits():
File "/home/pedro/dev/datahub/metadata-ingestion/src/datahub/ingestion/source/sql_common.py", line 154, in get_workunits
inspector = reflection.Inspector.from_engine(engine)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 135, in from_engine
return Inspector(bind)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 108, in __init__
bind.connect().close()
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2263, in connect
return self._connection_cls(self, **kwargs)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 104, in __init__
else engine.raw_connection()
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2369, in raw_connection
return self._wrap_pool_connect(
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2336, in _wrap_pool_connect
return fn()
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 304, in unique_connection
return _ConnectionFairy._checkout(self)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 778, in _checkout
fairy = _ConnectionRecord.checkout(pool)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 495, in checkout
rec = pool._do_get()
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 140, in _do_get
self._dec_overflow()
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.raise_(
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 137, in _do_get
return self._create_connection()
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 309, in _create_connection
return _ConnectionRecord(self)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 440, in __init__
self.__connect(first_connect_check=True)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 661, in __connect
pool.logger.debug("Error on connect(): %s", e)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.raise_(
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 656, in __connect
connection = pool._invoke_creator(self)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
return dialect.connect(*cargs, **cparams)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 508, in connect
return self.dbapi.connect(*cargs, **cparams)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/pyhive/hive.py", line 94, in connect
return Connection(*args, **kwargs)
File "/home/pedro/dev/datahub/metadata-ingestion/venv/lib/python3.8/site-packages/pyhive/hive.py", line 123, in __init__
raise ValueError("Password should be set if and only if in LDAP or CUSTOM mode; "
ValueError: Password should be set if and only if in LDAP or CUSTOM mode; Remove password or use one of those modes
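pyhive raises this when a password is supplied outside LDAP or CUSTOM auth, so the recipe has to either drop the password or switch auth modes. A sketch of both options, assuming the config's options block is passed through to SQLAlchemy's create_engine (auth is a pyhive connect argument):

---
source:
  type: hive
  config:
    username: hive
    # Either omit the password entirely (default auth mode)...
    # password: hive
    # ...or keep it and switch to LDAP, e.g.:
    # options:
    #   connect_args:
    #     auth: LDAP
    database: default
    host_port: localhost:10000
sink:
  type: console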
loud-island-88694
pip3 install --upgrade setuptools
curved-crayon-1929
03/05/2021, 3:26 PM