able-rain-74449
04/08/2022, 7:29 AM
source:
  type: mysql
  config:
    host_port: 'datahub-mysql.oiasdaihsdoiahdoh.eu-west-1.rds.amazonaws.com::3306'
    database: datahub
    username: admin
    password: mypwd
    include_tables: true
    include_views: true
    profiling:
      enabled: false
sink:
  type: datahub-rest
  config:
    server: 'http://myserver.eu-west-1.elb.amazonaws.com:9002/api/gms'
many-guitar-67205
04/08/2022, 8:23 AM
billions-twilight-48559
04/08/2022, 12:40 PM
plain-farmer-27314
04/08/2022, 2:25 PM
most-waiter-95820
04/08/2022, 5:03 PM
icy-piano-35127
04/08/2022, 7:16 PM
mammoth-fountain-32989
04/11/2022, 9:34 AM
brave-forest-5974
04/11/2022, 3:39 PM
nutritious-bird-77396
04/11/2022, 8:30 PM
urn:li:corpGroup:Data%20Platform
Is there a way I can use okta_profile_to_group_name_regex to remove these special characters? See https://datahubproject.io/docs/metadata-ingestion/source_docs/okta
Adding a pattern like "[^\\s]+" returns Data, whereas I am expecting DataPlatform, without any spaces.
Is there a way to get urn:li:corpGroup:DataPlatform with the regex alone?
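A note on why the pattern above returns Data. This is a minimal sketch, assuming the Okta source applies okta_profile_to_group_name_regex with a single regex search and keeps the matched text; that assumption fits the result reported above but is not confirmed from the source code. A single match is always one contiguous slice of the input, so it cannot skip the space in the middle of "Data Platform". Dropping the space takes a substitution, shown here as a hypothetical post-processing helper (strip_whitespace is not a DataHub option).

```python
import re
from typing import Optional
from urllib.parse import quote


def extract_with_search(pattern: str, raw: str) -> Optional[str]:
    """Assumed behaviour: one search, keep the matched text."""
    match = re.search(pattern, raw)
    return match.group() if match else None


def strip_whitespace(raw: str) -> str:
    """Hypothetical post-processing step: remove all whitespace by substitution."""
    return re.sub(r"\s+", "", raw)


group_name = "Data Platform"

# The match stops at the first whitespace character.
print(extract_with_search(r"[^\s]+", group_name))   # Data

# Substitution can join the two words; a single match cannot.
print(strip_whitespace(group_name))                 # DataPlatform

# Percent-encoding the unmodified name gives the URN seen in the question.
print("urn:li:corpGroup:" + quote(group_name))      # urn:li:corpGroup:Data%20Platform
```

If the group name has to be whitespace-free, the substitution would need to happen outside the regex option, for example on the Okta side or in a custom transformer.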
bitter-toddler-42943
04/12/2022, 1:35 AM
~~~~ Execution Summary ~~~~
RUN_INGEST - {'errors': [],
'exec_id': 'b41cc395-054d-47a6-a683-377b91965383',
'infos': ['2022-04-11 06:34:26.234370 [exec_id=b41cc395-054d-47a6-a683-377b91965383] INFO: Starting execution for task with name=RUN_INGEST',
'2022-04-11 06:34:54.884051 [exec_id=b41cc395-054d-47a6-a683-377b91965383] INFO: stdout=Requirement already satisfied: pip in '
'/tmp/datahub/ingest/venv-b41cc395-054d-47a6-a683-377b91965383/lib/python3.9/site-packages (21.2.4)\n'
'WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/pip/\n"
'WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/pip/\n"
'WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/pip/\n"
'WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/pip/\n"
'WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/pip/\n"
'WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/wheel/\n"
'WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/wheel/\n"
'WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/wheel/\n"
'WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/wheel/\n"
'WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/wheel/\n"
'ERROR: Could not find a version that satisfies the requirement wheel (from versions: none)\n'
'ERROR: No matching distribution found for wheel\n'
'WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/acryl-datahub/\n"
'WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/acryl-datahub/\n"
'WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/acryl-datahub/\n"
'WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/acryl-datahub/\n"
'WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by '
"'ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))': /simple/acryl-datahub/\n"
'ERROR: Could not find a version that satisfies the requirement acryl-datahub[datahub-rest,mssql]==0.8.26.6 (from versions: none)\n'
'ERROR: No matching distribution found for acryl-datahub[datahub-rest,mssql]==0.8.26.6\n'
'/tmp/datahub/ingest/venv-b41cc395-054d-47a6-a683-377b91965383/bin/python3: No module named datahub\n',
"2022-04-11 06:34:54.884314 [exec_id=b41cc395-054d-47a6-a683-377b91965383] INFO: Failed to execute 'datahub ingest'",
'2022-04-11 06:34:54.886440 [exec_id=b41cc395-054d-47a6-a683-377b91965383] INFO: Caught exception EXECUTING '
'task_id=b41cc395-054d-47a6-a683-377b91965383, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
' File "/usr/local/lib/python3.9/site-packages/acryl/executor/execution/default_executor.py", line 119, in execute_task\n'
' self.event_loop.run_until_complete(task_future)\n'
' File "/usr/local/lib/python3.9/site-packages/nest_asyncio.py", line 81, in run_until_complete\n'
' return f.result()\n'
' File "/usr/local/lib/python3.9/asyncio/futures.py", line 201, in result\n'
' raise self._exception\n'
' File "/usr/local/lib/python3.9/asyncio/tasks.py", line 256, in __step\n'
' result = coro.send(None)\n'
' File "/usr/local/lib/python3.9/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 115, in execute\n'
' raise TaskError("Failed to execute \'datahub ingest\'")\n'
"acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"]}
Execution finished with errors.
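Reading the summary above: every pip request was reset before an index page could be fetched, so the "No matching distribution found" errors may point at network access from the executor rather than at a missing release. A rough sketch of that triage in Python follows; triage_pip_log and the marker strings are illustrative helpers, not part of DataHub or pip.

```python
import re
from typing import Dict, List, Union

# Substrings that suggest pip never reached the package index (taken from the logs in this thread).
NETWORK_MARKERS = ("Connection reset by peer", "Network is unreachable")


def triage_pip_log(log_text: str) -> Dict[str, Union[List[str], bool]]:
    """Collect unresolved requirements and note whether network errors appear in the same log."""
    unresolved = re.findall(r"No matching distribution found for ([^\s\\']+)", log_text)
    network_errors = any(marker in log_text for marker in NETWORK_MARKERS)
    return {"unresolved": unresolved, "network_errors": network_errors}


sample = (
    "ConnectionResetError(104, 'Connection reset by peer'))': /simple/wheel/\n"
    "ERROR: No matching distribution found for wheel\n"
    "ERROR: No matching distribution found for acryl-datahub[datahub-rest,mssql]==0.8.26.6\n"
)

result = triage_pip_log(sample)
print(result["unresolved"])      # ['wheel', 'acryl-datahub[datahub-rest,mssql]==0.8.26.6']
print(result["network_errors"])  # True
```

When network_errors is true, checking that the executor container can reach the package index (directly or through a proxy or mirror) is probably a better first step than changing the pinned version.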
bitter-toddler-42943
04/12/2022, 3:13 AM
icy-ram-1893
04/12/2022, 5:02 AM
Failed to establish a new connection: [Errno 101] Network is unreachable')': /simple/pip/
ERROR: Could not find a version that satisfies the requirement acryl-datahub[datahub-rest,mssql]==0.8.32 (from versions: none)
ERROR: No matching distribution found for acryl-datahub[datahub-rest,mssql]==0.8.32
And this is the log picture.
Note that we deploy DataHub with containerd on a different server, but I could not find any documentation about running DataHub on containerd.
Is it mandatory to install the mssql plugin to ingest data from it?
bitter-toddler-42943
04/12/2022, 6:24 AM
Please check your configuration and make sure you are talking to the DataHub GMS (usually <datahub-gms-host>:8080) or Frontend GMS API (usually <frontend>:9002/api/gms)
sticky-dawn-95000
04/12/2022, 11:21 AM
source:
  type: "hive"
  config:
    host_port: test.myhiveserver.com:10000
    database: default
    username: datahub
    password: 1234
    options:
      connect_args: {'ssl_cert':'cacert.pem'}
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
But it did not work. 😞
ValueError: Password should be set if and only if in LDAP or CUSTOM mode; Remove password or use one of those modes
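The rule stated in that error can be sketched as follows, assuming it means a password is accepted only together with an auth mode of LDAP or CUSTOM. password_rule_ok is a hypothetical helper that paraphrases the message; it is not the Hive driver's code. Passing auth through connect_args is likewise an assumption to check against the Hive source documentation.

```python
from typing import Optional

# Auth modes that the error message says may carry a password.
PASSWORD_AUTH_MODES = {"LDAP", "CUSTOM"}


def password_rule_ok(auth: Optional[str], password: Optional[str]) -> bool:
    """True when a password is present exactly when auth is LDAP or CUSTOM."""
    return (password is not None) == (auth in PASSWORD_AUTH_MODES)


# The recipe as posted: a password but no auth mode, which breaks the rule.
print(password_rule_ok(None, "1234"))     # False

# Two ways to satisfy the rule: declare an auth mode, or drop the password.
print(password_rule_ok("LDAP", "1234"))   # True
print(password_rule_ok(None, None))       # True

# Hypothetical connect_args that would declare the auth mode (assumption, not verified).
connect_args = {"auth": "LDAP", "ssl_cert": "cacert.pem"}
print(password_rule_ok(connect_args.get("auth"), "1234"))  # True
```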
Please help me.
square-solstice-69079
04/12/2022, 12:58 PM
brave-forest-5974
04/12/2022, 5:28 PM
plain-farmer-27314
04/12/2022, 6:59 PM
curved-crayon-1929
04/12/2022, 7:53 PM
source:
  type: glue
  config:
    aws_region: us-east-2
    aws_access_key_id: AKIA**************V
    aws_secret_access_key: j4EzEH12YEQLYP************0p4+K
    aws_session_token: null
    database_pattern:
      allow:
        - sampledb
    table_pattern:
      allow:
        - elb_logs
sink:
  type: datahub-rest
  config:
    server: 'http://localhost:8080'
mysterious-lamp-91034
04/12/2022, 9:20 PM
salmon-rose-54694
04/13/2022, 1:33 AM
Is acryl-datahub[airflow] open sourced? I found a bug and need to investigate.
creamy-van-28626
04/13/2022, 5:43 AM
dazzling-queen-76396
04/13/2022, 6:51 AM
eager-animal-48107
04/13/2022, 10:02 AM
dazzling-alarm-64985
04/13/2022, 10:50 AM
delightful-barista-90363
04/13/2022, 7:22 PM
mysterious-nail-70388
04/14/2022, 2:50 AM
incalculable-forest-10734
04/14/2022, 6:40 AM
tb_20220414, tb_20220413). I want to ingest the latest table, tb_20220414, but it ingested the oldest table, tb_20220413. How can I ingest the latest suffixed table?
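One way to approach this outside the source itself is to pick the newest suffix first and then restrict ingestion to that table. The sketch below assumes table names end in an eight-digit YYYYMMDD suffix, as in the two names above; latest_suffixed_table is a hypothetical helper, and whether the source in question accepts an allow pattern is an assumption to verify.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Assumed naming scheme: <base>_<YYYYMMDD>
SUFFIXED = re.compile(r"^(?P<base>.+)_(?P<date>\d{8})$")


def latest_suffixed_table(names: Iterable[str], base: str) -> Optional[str]:
    """Return the name with the most recent valid date suffix for the given base, if any."""
    dated = []
    for name in names:
        match = SUFFIXED.match(name)
        if not match or match.group("base") != base:
            continue
        try:
            day = datetime.strptime(match.group("date"), "%Y%m%d")
        except ValueError:
            continue  # eight digits that do not form a real date
        dated.append((day, name))
    return max(dated)[1] if dated else None


tables = ["tb_20220413", "tb_20220414", "other_20220415"]
latest = latest_suffixed_table(tables, "tb")
print(latest)  # tb_20220414

# An anchored pattern that matches only that table, for use in an allow list.
allow_pattern = "^" + re.escape(latest) + "$"
print(allow_pattern)
```

The anchored pattern could then go into an allow list such as the table_pattern used in the Glue recipe earlier in this log, regenerated before each run.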
bland-orange-13353
04/14/2022, 7:35 AM
creamy-van-28626
04/14/2022, 7:56 AM