# troubleshoot
w
Hi all, has anyone experienced this when sending a very long MCP? Any solution for this?
[{'error': 'Unable to emit metadata to DataHub GMS',
  'info': {'message': "HTTPSConnectionPool(host='datahub-gms.net', port=443): Max retries exceeded "
                      "with url: /aspects?action=ingestProposal (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of "
                      "protocol (_ssl.c:2384)')))"}}],
i
Hello Aezo, how are you posting this MCP to DataHub? Is this from an ingestion recipe?
The HTTP client has a limit on the size of the payload being sent; a workaround for this is to use the Kafka sink: https://datahubproject.io/docs/metadata-ingestion/sink_docs/datahub#datahub-kafka
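For reference, a minimal sink stanza for the recipe might look like the sketch below. The broker and schema registry addresses are placeholders, not values from this thread, so swap in your own.

sink:
  type: datahub-kafka
  config:
    connection:
      bootstrap: "broker:9092"                             # placeholder Kafka bootstrap server
      schema_registry_url: "http://schema-registry:8081"   # placeholder schema registry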
w
Hi @incalculable-ocean-74010, I am using the recipe. I am able to send to my localhost but not my staging server, which is HTTPS.
i
Are you able to ping the staging cluster from where you are running the ingestion? Is it possible that you are lacking certs to validate the connection?
w
Ingestion of other Power BI workspaces using the connector works. The payload is just too large, about 1.5 MB… maybe increasing the ingress nginx body-size limit will help?
I can confirm that increasing the nginx proxy body size works for me. Posting the helm values config here; it might help others in the future.
ingress:
  enabled: true
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: 10m
Thanks Pedro! Now I also know that I can use the Kafka sink, but I will need to increase the topic message size.
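A hedged sketch of the client side of that tuning, assuming the datahub-kafka sink passes producer_config through to the underlying Kafka producer. The 10 MB figure is only an example, and the MCP topic's broker-side max.message.bytes would also need to allow messages of that size.

sink:
  type: datahub-kafka
  config:
    connection:
      bootstrap: "broker:9092"                             # placeholder
      schema_registry_url: "http://schema-registry:8081"   # placeholder
      producer_config:
        max.request.size: 10485760                         # ~10 MB client-side cap (example value)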
g
MySQL instances also frequently limit payload sizes to 4 MB, so you might run into issues there too.
I think the better solution would be to improve the Power BI connector so it does not emit 1.5 MB payloads. @worried-branch-76677, any idea what sorts of Power BI reports are causing the issue?
w
I have this schemaMetadata which is very, very long, about 1.5 MB in size. (Basically a whole Power BI dataset, many tables and many columns, collected into one DataHub dataset.) We model it this way similar to other platforms, but it looks like not a great idea now.
g
So 1.5 MB should still work with our backend, so you don't have to change anything in the short term. However, it's likely that 1.5 MB isn't the biggest payload you're going to come across, so in the long run it's worth tuning the connector to emit smaller schemaMetadata aspects.
w
Yeah, I agree with you. If only we could combine many schemaMetadata aspects into one entity. 😆