DataHub #troubleshoot

nutritious-salesclerk-57675

02/22/2023, 4:06 PM

Good day. When I try to create a local datahub cluster on minikube or docker desktop k8s cluster, my GMS service keeps crashing with this error. Is there a way to get this fixed?

👀 1

✅ 1

purple-oil-61897

02/22/2023, 4:24 PM

Hey, im having issues running the

datahub docker quickstart

, a few of the containers keep restarting and exiting... I dont see and obvious issues, no erros in mysql-setup, gms has error with:

Copy code

2023-02-22 17:21:43 2023/02/22 16:21:43 Problem with dial: dial tcp 172.18.0.8:29092: connect: connection refused. Sleeping 1s
2023-02-22 17:21:44 2023/02/22 16:21:44 Timeout after 4m0s waiting on dependencies to become available: [<http://elasticsearch:9200> <tcp://mysql:3306> <tcp://broker:29092>]

Zoopkeer looks like is not running:

Copy code

2023-02-22 17:18:48 [2023-02-22 16:18:48,129] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)
2023-02-22 17:18:48 [2023-02-22 16:18:48,369] WARN Close of session 0x0 (org.apache.zookeeper.server.NIOServerCnxn)
2023-02-22 17:18:48 java.io.IOException: ZooKeeperServer not running
2023-02-22 17:18:48     at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:544)
2023-02-22 17:18:48     at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:332)
2023-02-22 17:18:48     at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
2023-02-22 17:18:48     at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
2023-02-22 17:18:48     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
2023-02-22 17:18:48     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
2023-02-22 17:18:48     at java.base/java.lang.Thread.run(Thread.java:829)
2023-02-22 17:18:48 [2023-02-22 16:18:48,434] WARN Unexpected exception (org.apache.zookeeper.server.WorkerService)
2023-02-22 17:18:48 java.lang.NullPointerException

I dont really know what to do with this...

✅ 1

witty-motorcycle-52108

02/22/2023, 6:58 PM

hey all, we're seeing a weird issue in the recent update where given a column

ip_address

in a table, searching

address

shows results but searching

ip

has an empty result set. we have a large number of columns (and tables) that should match, so unsure what's going on

bland-barista-59197

02/22/2023, 11:57 PM

Hi Team, Please advise on following issue when deployed using datahub-helm 1. When I set

datahub-gms.replicaCount : 2

in value.yaml the workload is shows

containers with unready status: [datahub-gms]

and datahub-gms container restart multiple times. NOTE: datahub-gms works well when replicaCont is 1. 2. Same message for when disable system update

global.datahub.systemUpdate.enabled: false

and

datahub-gms.replicaCount : 1

in value.yaml

✅ 1

cuddly-butcher-39945

02/23/2023, 4:44 AM

Hi all, I am running the following command on my m1 datahub docker quickstart --version v0.8.33 I am ultimately trying to modify UI and see the changes locally. Here are some more details of my environment: openjdk version “1.8.0_352” OpenJDK Runtime Environment (Zulu 8.66.0.15-CA-macosx) (build 1.8.0_352-b08) OpenJDK 64-Bit Server VM (Zulu 8.66.0.15-CA-macosx) (build 25.352-b08, mixed mode) Thanks!

quickstart_errors.txt

polite-actor-701

02/23/2023, 9:08 AM

Hi all. I created a 'test.py' file to request a graphql query. Running this file sometimes results in errors and sometimes returns results normally. print error result

Copy code

{'errors': [{'message': 'An unknown error occurred.', 'locations': [{'line': 2, 'column': 3}], 'path': ['search'], 'extensions': {'code': 500, 'type': 'SERVER_ERROR', 'classification': 'DataFetchingException'}}], 'data': {'search': None}}

print correct result

Copy code

{'data': {'search': {'searchResults': [{'entity': {'urn': 'urn:li:tag:AUM추출', 'properties': {'name': 'AUM추출'}}}]}}}

Could you please advise which part I should correct to prevent this error? I attached the 'test.py' file and the 'datahub-gms-log.txt' file when there was an error.

test_py.py datahub-gms-log.txt

👀 1

fresh-postman-88589

02/23/2023, 4:01 PM

Hi all, I am trying Datahub and it works great, but I cannot make column lineage to show. Is there something obvious i am missing? see screenshot, thanks!

gentle-lifeguard-88494

02/24/2023, 1:15 AM

Hello again, quick follow-up question to metadata-models-custom. I can't seem to get my array (the distinctValues column) to display in the UI. Here are how my pdl files are configured: File: distinctColValues.pdl

Copy code

namespace com.mycompany.dq

record distinctColValues {
  column: string
  distinctValues: array[string]
}

File: distinctValues.pdl

Copy code

namespace com.mycompany.dq

@Aspect = {
  "name": "distinctValues",
  "autoRender": true,
  "renderSpec": {
    "displayType": "tabular", // or properties
    "key": "distinct_key",
    "displayName": "Distinct Values"
  }
}
record distinctValues {
  distinct_key: array[distinctColValues]
}
}

I put some screenshots below. I see the data in GraphQL, just not in the UI Any help would be appreciated, thanks!

rich-policeman-92383

02/24/2023, 5:50 AM

Datahub: v0.9.5 ## SSO enabled user On setting SESSION_TOKEN_DURATION_MS=1200000 in GMS & Frontend the user is not logged out.

bored-dentist-25467

02/24/2023, 6:17 AM

Hi all, I have upgraded my datahub to 0.10.0. I tried feeding datahub configurations using datahub init command. Then I tried to ingest metadata from mysql. I wonder why it doesn't take the configuration that I pre-feeded. It only takes the configurations that I have given in the ingestion recipe. I would like to know why this happens. Thank you in advance for your kind replies.

glamorous-elephant-17130

02/24/2023, 9:06 AM

Copy code

datahub-frontend:
  enabled: true
  image:
    repository: linkedin/datahub-frontend-react
    tag: "v0.10.0"
  # Set up ingress to expose react front-end
  extraVolumes:
      - name: user-props
        secret:
          secretName: datahub-pass-secret
  extraVolumeMounts:
      - name: user-props
        mountPath: /datahub-frontend/conf/user.props
        subPath: token
        readOnly: true
  ingress:
    enabled: false
  resources:
    limits:
      memory: 1400Mi
    requests:
      cpu: 100m
      memory: 512Mi

Tried to upgrade the default password using this. Had created the secret before hand. Now my frontend is stuck in ContainerCreating. Any clue guys?

✅ 1

agreeable-belgium-70840

02/24/2023, 11:25 AM

hello, I am trying to upgrade to v0.10.0. I ran the datahub-upgrade-job, however datahub-gms keeps giving this error:

Copy code

2023-02-24 11:24:01,525 [ThreadPoolTaskExecutor-1] WARN o.apache.kafka.clients.NetworkClient - [Consumer clientId=consumer-generic-duhe-consumer-job-client-1, groupId=generic-duhe-consumer-job-client] Error while fetching metadata with correlation id 1434 : {DataHubUpgradeHistory_v1=UNKNOWN_TOPIC_OR_PARTITION}

Any idea why?

✅ 1

lively-jackal-83760

02/24/2023, 2:23 PM

Hi guys Question - is it possible to use some regex in advanced filters? Or via GraphQL for instance, I want to get all datasets that have descriptions like 'some text%', or even all datasets which have a non-empty description

agreeable-belgium-70840

02/24/2023, 3:02 PM

Hello guys, I am trying to upgrade to v0.10.0 and I have the feeling that this is my last issue... datahub-gms hangs there:

Copy code

2023-02-24 14:43:11,561 [ThreadPoolTaskExecutor-1] INFO  o.s.k.l.KafkaMessageListenerContainer:292 - mce-consumer-job-client: partitions revoked: []
2023-02-24 14:43:11,561 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator:552 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] (Re-)joining group
2023-02-24 14:43:11,561 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator:552 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] (Re-)joining group
2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator:503 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Successfully joined group with generation 1837
2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.AbstractCoordinator:503 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Successfully joined group with generation 1837
2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.ConsumerCoordinator:273 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Adding newly assigned partitions: 
2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.a.k.c.c.i.ConsumerCoordinator:273 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Adding newly assigned partitions: 
2023-02-24 14:43:11,592 [ThreadPoolTaskExecutor-1] INFO  o.s.k.l.KafkaMessageListenerContainer:292 - mce-consumer-job-client: partitions assigned: []

Any idea why?

glamorous-elephant-17130

02/24/2023, 9:08 PM

Copy code

Development-Technology-Developer:~/environment/deploy-datahub-using-aws-managed-services-ingest-metadata (main) $ datahub actions -c actions/slack_integration.yaml 
[2023-02-24 20:55:05,426] INFO     {datahub_actions.cli.actions:77} - DataHub Actions version: 0.0.11
[2023-02-24 20:55:05,542] INFO     {datahub_actions.plugin.action.slack.slack:96} - Slack notification action configured with bot_token=SecretStr('**********') signing_secret=SecretStr('**********') default_channel='C04QT4JEYSK' base_url='<http://datahub.dev.creditsaison.xyz:9002>' suppress_system_activity=True
/home/ec2-user/.local/lib/python3.7/site-packages/slack_sdk/web/internal_utils.py:290: UserWarning: The top-level `text` argument is missing in the request payload for a chat.postMessage call - It's a best practice to always provide a `text` argument when posting a message. The `text` argument is used in places where content cannot be rendered such as: system push notifications, assistive technology such as screen readers, etc.
  warnings.warn(missing_text_message, UserWarning)
[2023-02-24 20:55:06,130] INFO     {datahub_actions.cli.actions:119} - Action Pipeline with name 'datahub_slack_action' is now running.
%4|1677272135.501|FAIL|rdkafka#consumer-1| [thrd:<http://b-2.mskdatahub.0c8aba.c9.kafka.us-east-1.amazonaws.com:9092/boo|b-2.mskdatahub.0c8aba.c9.kafka.us-east-1.amazonaws.com:9092/boo>]: <http://b-2.mskdatahub.0c8aba.c9.kafka.us-east-1.amazonaws.com:9092/bootstrap|b-2.mskdatahub.0c8aba.c9.kafka.us-east-1.amazonaws.com:9092/bootstrap>: Connection setup timed out in state CONNECT (after 30033ms in state CONNECT)
%4|1677272136.167|FAIL|rdkafka#consumer-1| [thrd:<http://b-1.mskdatahub.0c8aba.c9.kafka.us-east-1.amazonaws.com:9092/boo|b-1.mskdatahub.0c8aba.c9.kafka.us-east-1.amazonaws.com:9092/boo>]: <http://b-1.mskdatahub.0c8aba.c9.kafka.us-east-1.amazonaws.com:9092/bootstrap|b-1.mskdatahub.0c8aba.c9.kafka.us-east-1.amazonaws.com:9092/bootstrap>: Connection setup timed out in state CONNECT (after 30037ms in state CONNECT)

glamorous-elephant-17130

02/24/2023, 9:09 PM

Guys was trying to setup slack notifications. Got the first message and now only errors. Please help!

glamorous-elephant-17130

02/24/2023, 9:09 PM

Screenshot 2023-02-25 at 02.39.09.png

gentle-lifeguard-88494

02/25/2023, 2:32 PM

Hey everyone, just wanted to follow-up on

datahub ingest list-runs

from this thread here: https://datahubspace.slack.com/archives/C029A3M079U/p1675534582989809 I figured it out, but I was wondering why I had to manually add my datahub token to the list runs function to get it to work. Am I doing something wrong with my setup possibly or is there something in the code that needs to be updated? Thanks!

✅ 1

powerful-cat-68806

02/26/2023, 4:11 PM

Hi team, I’m facing issue with my sso (OIDC) configuration I can’t access with my Okta user, only when I’m cleaning the cookies If not - I’m getting

502 bad gateway

nginx error Checking the

ingress-nginx

pods logs, I found the following errors:

Copy code

2023/02/26 12:47:47 [error] 2331#2331: *16878988 upstream sent too big header while reading response header from upstream, client: <http://xx.xx.xxx.xxx|xx.xx.xxx.xxx>, server: <http://datahub-xxx-xx.xxx.xxx|datahub-xxx-xx.xxx.xxx>, request: "GET /sso HTTP/1.1", upstream: "<http://datahub-xxx-xx.xxx.xxx:9002/sso>", host: "<http://datahub-xxx-xx.xxx.xxx|datahub-xxx-xx.xxx.xxx>", referrer: "<https://datahub-xxx-xx.xxx.xxx/login>"


I0226 14:43:16.410437  66544 request.go:665] Waited for 1.000965907s due to client-side throttling, not priority and fairness, request: GET:<https://xxxxxxxxxxxxx.xxx.us-east-1.eks.amazonaws.com/api/v1/namespaces/ingress-nginx/pods/ingress-nginx-controller-xxxxxxx-xxxx/log?container=controller&follow=true&tailLines=10>

I’ve validate: • DH pods are running • Ingress is configured correct • Services are configured with the right dns & ports •

env | grep AUTH

is configured with the correct values, for

datahub-gms

pod When trying to login with usr+pwd, all works fine I’ve checked with our IT team as well, if the blocker is from our VPN Pls. advise

best-umbrella-88325

02/27/2023, 10:18 AM

Hello community.. Facing this issue while trying to get datahub web react up and running on localhost:3000. I have done yarn install, and then yarn start to get this error.

Copy code

Failed to compile.

./node_modules/react-syntax-highlighter/dist/esm/async-languages/prism.js
Module not found: Can't resolve 'refractor/lang/asmatmel.js' in '/mnt/c/XXX/XXX/datahub/datahub-web-react/node_modules/react-syntax-highlighter/dist/esm/async-languages'

Can someone help me out with this please?

✅ 1

breezy-boots-97651

02/27/2023, 11:45 AM

hello datahub-team, when I tried to ingest data to datahub, I got below error suddenly. I just share the part of the log message start the error. I dont know what the problem with my ingestion • this ingestion is from Aws Athena I guess the python package “acryl” is the problem, but I dont know exactly, all of my ingestion got this same error 🥲

Copy code

"2023-02-27 08:59:21.828147 [exec_id=dd1bff69-e3ef-432f-af66-5ad29328d529] INFO: Failed to execute 'datahub ingest'",
           '2023-02-27 08:59:21.828429 [exec_id=dd1bff69-e3ef-432f-af66-5ad29328d529] INFO: Caught exception EXECUTING '
           'task_id=dd1bff69-e3ef-432f-af66-5ad29328d529, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
           '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/default_executor.py", line 123, in execute_task\n'
           '    task_event_loop.run_until_complete(task_future)\n'
           '  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete\n'
           '    return future.result()\n'
           '  File "/usr/local/lib/python3.10/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 168, in execute\n'
           '    raise TaskError("Failed to execute \'datahub ingest\'")\n'
           "acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"]}
Execution finished with errors.

lively-jackal-83760

02/27/2023, 2:16 PM

Hi guys I'm trying to integrate MS teams channel with Datahub. First question - according to your doc if everything is ok I should see logs in datahub-actions pod like

{datahub_actions.plugin.action.teams.teams:60} - Teams notification action configured with webhook_url

but I see these

Copy code

2023/02/27 13:25:30 Received 200 from <http://datahub-datahub-gms:8080/health>
[2023-02-27 13:25:32,685] DEBUG    {datahub.telemetry.telemetry:210} - Sending init Telemetry
[2023-02-27 13:25:33,292] DEBUG    {datahub.telemetry.telemetry:243} - Sending Telemetry
[2023-02-27 13:25:33,545] INFO     {datahub.cli.ingest_cli:182} - DataHub CLI version: 0.9.0.5rc2
[2023-02-27 13:25:33,551] DEBUG    {datahub.cli.ingest_cli:196} - Using config: ...
[2023-02-27 13:25:34,190] DEBUG    {datahub.ingestion.run.pipeline:174} - Sink type:console,<class 'datahub.ingestion.sink.console.ConsoleSink'> configured
[2023-02-27 13:25:34,190] INFO     {datahub.ingestion.run.pipeline:175} - Sink configured successfully. 
[2023-02-27 13:25:34,190] WARNING  {datahub.ingestion.run.pipeline:276} - Failed to configure reporter: datahub
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 264, in _configure_reporting
    reporter_class.create(
  File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/reporting/datahub_ingestion_run_summary_provider.py", line 92, in create
    raise ValueError(
ValueError: Datahub ingestion reporter will be disabled because sink type console is not supported
[2023-02-27 13:25:34,486] INFO     {acryl_action_fwk.source.datahub_streaming:176} - Action executor:ExecutionRequestAction: configured
[2023-02-27 13:25:34,486] DEBUG    {datahub.ingestion.run.pipeline:199} - Source type:datahub-stream,<class 'acryl_action_fwk.source.datahub_streaming.DataHubStreamSource'> configured
[2023-02-27 13:25:34,486] INFO     {datahub.ingestion.run.pipeline:200} - Source configured successfully.
[2023-02-27 13:25:34,488] INFO     {datahub.cli.ingest_cli:129} - Starting metadata ingestion
[2023-02-27 13:25:34,488] INFO     {acryl_action_fwk.source.datahub_streaming:196} - Will subscribe to MetadataAuditEvent_v4, MetadataChangeLog_Versioned_v1
[2023-02-27 13:25:34,489] INFO     {acryl_action_fwk.source.datahub_streaming:199} - Action framework started
[2023-02-27 13:26:06,968] INFO     {acryl_action_fwk.source.datahub_streaming:206} - Msg received: MetadataChangeLog_Versioned_v1, 1, 888957
[2023-02-27 13:26:06,968] INFO     {acryl_action_fwk.source.datahub_streaming:89} - Calling act of ExecutionRequestAction

looks like totally different. We updated our helm charts to the latest version Or it doesn't look like correct version of datahub-actions pod?

quiet-television-68466

02/27/2023, 2:20 PM

Hello, we are currently trying to implement the following transformer https://datahubproject.io/docs/metadata-ingestion/docs/transformer/dataset_transformer/#add-dataset-datasetproperties. The transformer part of my yaml looks as follows:

Copy code

transformers:
    -
        type: add_dataset_properties
        config:
            add_properties_resolver_class: 'cdp-datahub-actions.snowflake_properties:SnowflakePropertiesResolver'

and I can see that cdp-datahub-actions is installed in the datahub-actions pod using pip freeze. Despite this I’m getting the error

Failed to configure transformers: No module named 'cdp-datahub-actions'

. Have I installed the module in the wrong place? Any advice would be super appreciated!

handsome-flag-16272

02/27/2023, 5:33 PM

Hi Team, I found an error “ERROR c.datahub.telemetry.TrackingService:105 - Failed to send event to Mixpanel” in datahub-gms pod logs. I found it points to “https://track.datahubproject.io/mp/track” and “https://track.datahubproject.io/mp/engage” in class MixpanelApiFactory.java. Here are my questions: 1. Is this error caused by connecting failed to https://track.datahubproject.io? 2. Is there anyway to point Mixpanel on a self-hosted Mixpanel? 3. What’s benefits to enable this function and is there anyway to turn it off?

✅ 1

blue-agency-87812

02/27/2023, 5:39 PM

Hi all. Anyone know if is there a way of get the last synchronize time of each entity in datahub? I tried using a sql query in mysql database, but didn't found a place with this information.

✅ 1

gray-airplane-39227

02/27/2023, 6:23 PM

Hello, I’m wondering if BigQuery would support

platform_instance

, from document it shows it’s supported by default but in code,

metadata-ingestion/src/datahub/ingestion/source_config/sql/bigquery.py

has validator that says

bigquery_doesnt_need_platform_instance

, wondering how these two would align.

narrow-queen-90189

02/27/2023, 10:56 PM

Hi, I’m trying to use Datahub to pull in PowerBi reports but have been having issues. The ingestion is successful but doesn’t produce any reports.

narrow-queen-90189

02/27/2023, 10:58 PM

A log of one of the recent runs.

Copy code

~~~~ Ingestion Logs ~~~~
Obtaining venv creation lock...
Acquired venv creation lock
venv setup time = 0
This version of datahub supports report-to functionality
datahub  ingest run -c /tmp/datahub/ingest/c3d726e3-088f-4574-af3c-89d4831fb7f9/recipe.yml --report-to /tmp/datahub/ingest/c3d726e3-088f-4574-af3c-89d4831fb7f9/ingestion_report.json
[2023-02-27 22:25:55,940] INFO     {datahub.cli.ingest_cli:165} - DataHub CLI version: 0.10.0
[2023-02-27 22:25:55,965] INFO     {datahub.ingestion.run.pipeline:179} - Sink configured successfully. DataHubRestEmitter: configured to talk to <http://datahub-gms:8080>
/tmp/datahub/ingest/venv-powerbi-0.10.0/lib/python3.10/site-packages/datahub/ingestion/source/powerbi/powerbi.py:867: ConfigurationWarning: env is deprecated and will be removed in a future release. Please use platform_instance instead.
  config = PowerBiDashboardSourceConfig.parse_obj(config_dict)
[2023-02-27 22:25:56,235] INFO     {datahub.ingestion.source.powerbi.proxy:231} - Trying to connect to <https://login.microsoftonline.com/{tenant-id}>
[2023-02-27 22:25:56,235] INFO     {datahub.ingestion.source.powerbi.proxy:349} - Generating PowerBi access token
[2023-02-27 22:25:56,375] INFO     {datahub.ingestion.source.powerbi.proxy:363} - Generated PowerBi access token
[2023-02-27 22:25:56,375] INFO     {datahub.ingestion.source.powerbi.proxy:233} - Able to connect to <https://login.microsoftonline.com/{tenant-id}>
[2023-02-27 22:25:56,587] INFO     {datahub.ingestion.source.powerbi.proxy:231} - Trying to connect to <https://login.microsoftonline.com/{tenant-id}>
[2023-02-27 22:25:56,588] INFO     {datahub.ingestion.source.powerbi.proxy:349} - Generating PowerBi access token
[2023-02-27 22:25:56,679] INFO     {datahub.ingestion.source.powerbi.proxy:363} - Generated PowerBi access token
[2023-02-27 22:25:56,679] INFO     {datahub.ingestion.source.powerbi.proxy:233} - Able to connect to <https://login.microsoftonline.com/{tenant-id}>
[2023-02-27 22:25:56,679] INFO     {datahub.ingestion.run.pipeline:196} - Source configured successfully.
[2023-02-27 22:25:56,681] INFO     {datahub.cli.ingest_cli:120} - Starting metadata ingestion
[2023-02-27 22:25:56,682] INFO     {datahub.ingestion.source.powerbi.powerbi:892} - PowerBi plugin execution is started
[2023-02-27 22:25:56,682] INFO     {datahub.ingestion.source.powerbi.proxy:757} - Request to get groups endpoint URL=<https://api.powerbi.com/v1.0/myorg/groups>
[2023-02-27 22:25:56,852] INFO     {datahub.ingestion.reporting.file_reporter:52} - Wrote SUCCESS report successfully to <_io.TextIOWrapper name='/tmp/datahub/ingest/c3d726e3-088f-4574-af3c-89d4831fb7f9/ingestion_report.json' mode='w' encoding='UTF-8'>
[2023-02-27 22:25:56,852] INFO     {datahub.cli.ingest_cli:133} - Finished metadata ingestion

Cli report:
{'cli_version': '0.10.0',
 'cli_entry_location': '/tmp/datahub/ingest/venv-powerbi-0.10.0/lib/python3.10/site-packages/datahub/__init__.py',
 'py_version': '3.10.9 (main, Jan 23 2023, 22:32:48) [GCC 10.2.1 20210110]',
 'py_exec_path': '/tmp/datahub/ingest/venv-powerbi-0.10.0/bin/python3',
 'os_details': 'Linux-5.15.0-1026-aws-x86_64-with-glibc2.31',
 'mem_info': '70.89 MB'}
Source (powerbi) report:
{'events_produced': 0,
 'events_produced_per_sec': 0,
 'entities': {},
 'aspects': {},
 'warnings': {},
 'failures': {},
 'dashboards_scanned': 0,
 'charts_scanned': 0,
 'filtered_dashboards': [],
 'filtered_charts': [],
 'start_time': '2023-02-27 22:25:56.071050 (now)',
 'running_time': '0.94 seconds'}
Sink (datahub-rest) report:
{'total_records_written': 0,
 'records_written_per_second': 0,
 'warnings': [],
 'failures': [],
 'start_time': '2023-02-27 22:25:55.962587 (1.04 seconds ago)',
 'current_time': '2023-02-27 22:25:57.006657 (now)',
 'total_duration_in_seconds': 1.04,
 'gms_version': 'v0.10.0',
 'pending_requests': 0}

 Pipeline finished successfully; produced 0 events in 0.94 seconds.

bland-appointment-45659

02/28/2023, 4:41 AM

Team, Is there a way to update/add values that are coming from application.yaml ? We are trying to change the max-poll-records for kafka consumer.

cool-yacht-59889

02/28/2023, 7:42 AM

Hey i am trying to connect a remote debugger and i noticed that the example picture for IntelliJ Remote Debug Configuration is broken, can i find the example somewhere?