calm-addition-66352
04/19/2021, 11:17 PM
datahub ingest -c ./mssql-recipe.yml. And my recipe file looks as below:
source:
  type: mssql
  config:
    username: <user_name>
    password: <password>
    host_port: <ip>:1433
    database: <db_name>
    table_pattern:
      #deny:
      #  - "^.*\\.sys_.*" # deny all tables that start with sys_
      allow:
        - "schema_name_1.*"
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
But when I try, I get the below error,
sqlalchemy.exc.OperationalError: (pytds.tds_base.OperationalError) Database '<organization_name>\<dba_name>' does not exist. Make sure that the name is entered correctly.
[SQL: use [<organization_name>\<dba_name>]]
• Is there a way to print or get the underlying queries submitted to the database (so I can figure out why it tries to query an object with the DBA's name, probably a previously deleted item / user account, etc.)?
• Is there a list of permissions that need to be assigned to the DB user that we use for the crawler / ingestion? The current user I am trying has some additional server-level access, and I was wondering whether it is exposing additional server metadata that is not expected (e.g. DBA user names).
gray-shoe-75895
04/19/2021, 11:28 PM
datahub --debug ingest ...
source:
type: mssql
# normal stuff
options:
echo: True
The list of permissions varies by DB system, so I'm not super sure what the precise list is for mssql. In general, the user should be able to list databases, schemas, tables, and column-level metadata.
calm-addition-66352
04/19/2021, 11:30 PM
calm-addition-66352
04/19/2021, 11:34 PM
echo option
1 validation error for PipelineConfig
source -> options
extra fields not permitted (type=value_error.extra)
This is how my yml file looks now
source:
  type: mssql
  options:
    echo: true
  config:
    username: <user_name>
    password: <password>
    host_port: <ip>:1433
    database: <db_name>
    table_pattern:
      #deny:
      #  - "^.*\\.sys_.*" # deny all tables that start with sys_
      allow:
        - "schema_name.*"
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
gray-shoe-75895
04/19/2021, 11:36 PM
config
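Read together with the validation error above, this reply seems to point at nesting: `options` belongs inside `config`, not next to it. Below is a minimal sketch of that layout as a Python dict mirroring the recipe YAML. The dict is only an illustration, the placeholder values are copied from the recipe, and it assumes `options` is handed through to the SQLAlchemy engine, where `echo: True` logs the emitted SQL.

```python
import json

# Illustrative recipe layout only; placeholder values are kept from the thread.
recipe = {
    "source": {
        "type": "mssql",
        "config": {
            "username": "<user_name>",
            "password": "<password>",
            "host_port": "<ip>:1433",
            "database": "<db_name>",
            # Assumption: `options` nests under `config`, not directly under `source`.
            "options": {"echo": True},
        },
    },
    "sink": {
        "type": "datahub-rest",
        "config": {"server": "http://localhost:8080"},
    },
}

# Print the structure so the nesting is easy to compare with the YAML file.
print(json.dumps(recipe, indent=2))
```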
calm-addition-66352
04/20/2021, 12:16 AM
Traceback (most recent call last):
  File "/usr/local/bin/datahub", line 8, in <module>
    sys.exit(datahub())
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/datahub/entrypoints.py", line 74, in ingest
    pipeline.run()
  File "/usr/local/lib/python3.6/site-packages/datahub/ingestion/run/pipeline.py", line 87, in run
    for wu in self.source.get_workunits():
  File "/usr/local/lib/python3.6/site-packages/datahub/ingestion/source/sql_common.py", line 209, in get_workunits
    if not sql_config.table_pattern.allowed(dataset_name):
  File "/usr/local/lib/python3.6/site-packages/datahub/configuration/common.py", line 69, in allowed
    if re.match(deny_pattern, string):
  File "/usr/lib64/python3.6/re.py", line 172, in match
    return _compile(pattern, flags).match(string)
  File "/usr/lib64/python3.6/re.py", line 301, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib64/python3.6/sre_compile.py", line 562, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/lib64/python3.6/sre_parse.py", line 855, in parse
    p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
  File "/usr/lib64/python3.6/sre_parse.py", line 416, in _parse_sub
    not nested and not items))
  File "/usr/lib64/python3.6/sre_parse.py", line 616, in _parse
    source.tell() - here + len(this))
sre_constants.error: nothing to repeat at position 0
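The deny pattern that produced this traceback is not shown in the thread. One common way to hit `nothing to repeat at position 0` is a glob-style pattern that starts with `*`, since the traceback shows the pattern going through `re.match`. Here is a small sketch for checking a pattern before putting it in the recipe; `check_pattern` is a hypothetical helper and the sample patterns are made up for illustration.

```python
import re


def check_pattern(pattern: str) -> str:
    """Return "ok" if the pattern compiles as a regex, else the error text."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return f"invalid: {exc}"
    return "ok"


# A leading "*" has nothing before it to repeat, so compilation fails.
print(check_pattern("*sys_.*"))

# Prefixing with "." turns the glob-style "*" into the regex ".*".
print(check_pattern(".*sys_.*"))
```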
gray-shoe-75895
04/20/2021, 12:17 AM
calm-addition-66352
04/20/2021, 12:17 AM
gray-shoe-75895
04/20/2021, 12:17 AM
gray-shoe-75895
04/20/2021, 12:18 AM
gray-shoe-75895
04/20/2021, 12:19 AM
schema_pattern
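The entries under `allow` and `deny` are regular expressions, and the traceback above shows `allowed` running `re.match` over the deny patterns. Below is a rough sketch of that kind of filter, assuming deny is checked before allow. It is inferred from the traceback rather than copied from DataHub, and the dataset names are invented for the example.

```python
import re
from typing import List


def allowed(name: str, allow: List[str], deny: List[str]) -> bool:
    """Sketch of allow/deny filtering: any deny match rejects the name,
    otherwise at least one allow pattern has to match."""
    if any(re.match(pattern, name) for pattern in deny):
        return False
    return any(re.match(pattern, name) for pattern in allow)


allow_patterns = ["schema_name_1.*"]
deny_patterns = [r"^.*\.sys_.*"]  # same regex as the commented-out recipe line

for candidate in ["schema_name_1.orders", "schema_name_1.sys_log", "other.orders"]:
    print(candidate, allowed(candidate, allow_patterns, deny_patterns))
```

Because `re.match` anchors only at the start of the string, `schema_name_1.*` accepts anything that begins with `schema_name_1` followed by zero or more characters.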
calm-addition-66352
04/20/2021, 12:22 AM
schema_pattern:
  deny:
    - "schema_name"
Do you know where I might be able to find all supported configs / options (e.g. schema_pattern, table_pattern, echo, etc.) for a given source (if available)?
calm-addition-66352
04/20/2021, 12:37 AM
schema_pattern. Thanks for the help, mate.
Just one small question: when we are using the datahub ingest command, can we change the path of the metadata dataset that we create? I assume by default the path is like Dataset/prod/<data_source_type>/<data_source_name>/<table_name>; can I change it to something like Dataset/nonprod/...?
gray-shoe-75895
04/20/2021, 1:09 AM
gray-shoe-75895
04/20/2021, 1:11 AM
prod thing, there's an option to set env under config in the recipe
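As a rough illustration of where `env` ends up: DataHub dataset URNs carry the environment as their last component. The sketch below builds such a string by hand. `dataset_urn` is a hypothetical helper written for this example, the dataset name is made up, and which `env` values a given DataHub version accepts is not covered in the thread.

```python
def dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    """Build a dataset URN string by hand (illustrative, not a DataHub API)."""
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


# Default environment.
print(dataset_urn("mssql", "db_name.schema_name.table_name"))

# Same dataset with a different env; "DEV" is only an example value.
print(dataset_urn("mssql", "db_name.schema_name.table_name", env="DEV"))
```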