# integrate-tableau-datahub
w
Has anyone run into this error and found a solution? I've checked that the proper credentials are flowing through but still running into the error.
h
Hey @worried-solstice-95319, which DataHub version are you using? Can you please set the page_size config to a lower value in your recipe and confirm whether it works in that case? page_size is 10 by default.
w
Hey Mayuri - thanks for the response! We're currently using 0.10.5 and had already set the page_size to 1, I believe.
h
It would be great if you could confirm the page_size once more. If it's already minimal, i.e. 1, then this issue might be due to a very complex data source with a large number of fields/columns in a single datasource. Is that true in your case?
w
Yes, the page_size is already 1. And yes, we do have some Tableau datasets that blend together other datasets and end up becoming quite large.
h
Oh, interesting. Would you be willing to debug this with the Tableau Metadata API using GraphiQL (https://help.tableau.com/current/api/metadata_api/en-us/docs/meta_api_start.html#explore-the-metadata-api-schema-using-graphiql) to help understand the scale of your datasource(s) better? If yes, I can share a few GraphQL queries to run. Also, could you confirm the value of cli_version in your ingestion report once more? Some work was already done in the 0.10.5 ingestion CLI version to alleviate this problem via this. If you are still facing the problem, the scale here is apparently much larger: more than 20k fields + upstream columns in a single datasource. You can also consider increasing the Tableau metadata query node limit configuration to a higher value, as mentioned in the docs.
w
Sure, I can try debugging using GraphiQL. This error persisted even with 0.10.5, and even with 0.11.0 now.
h
Here is the GraphQL query you can run:
{
  embeddedDatasourcesConnection {
    nodes {
      id
      downstreamSheetsConnection {
        totalCount
      }
      upstreamTables {
        columnsConnection {
          totalCount
        }
      }
      fields {
        upstreamFieldsConnection {
          totalCount
        }
        upstreamColumnsConnection {
          totalCount
        }
      }
      upstreamDatasourcesConnection {
        totalCount
      }
    }
  }
}
The query will return count results for all embedded datasources in your case. It would be great if you could share representative response nodes (the most complex ones, with the highest counts, on the order of thousands) to get a better idea of which fields are hitting the limit.
Based on the image you shared earlier, it looks like you have at least 6 such datasources that are beyond the 20k node limit.
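To pick out the most complex datasources from the query result, something like the following might help. It is only a rough sketch: it assumes the GraphiQL response was saved as JSON with the usual top-level "data" key, and the function name rank_datasources is made up for illustration. The summed count is a crude proxy for complexity, not the way Tableau itself counts nodes against the limit.

```python
import json


def rank_datasources(response, top=5):
    """Return (datasource id, rough count) pairs, largest first.

    Assumes `response` has the shape returned by the query above:
    {"data": {"embeddedDatasourcesConnection": {"nodes": [...]}}}
    """
    nodes = response["data"]["embeddedDatasourcesConnection"]["nodes"]
    ranked = []
    for node in nodes:
        total = node["downstreamSheetsConnection"]["totalCount"]
        total += node["upstreamDatasourcesConnection"]["totalCount"]
        # Columns of every upstream table.
        total += sum(
            table["columnsConnection"]["totalCount"]
            for table in node["upstreamTables"]
        )
        # Each field counts once, plus its upstream fields and columns.
        for field in node["fields"]:
            total += 1
            total += field["upstreamFieldsConnection"]["totalCount"]
            total += field["upstreamColumnsConnection"]["totalCount"]
        ranked.append((node["id"], total))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked[:top]


def rank_from_file(path, top=5):
    # `path` is a hypothetical file holding the saved GraphiQL response.
    with open(path, encoding="utf-8") as handle:
        return rank_datasources(json.load(handle), top)
```

The nodes at the top of that list are the ones worth sharing.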
w
Hi Mayuri - running the query returns an error; should I be populating a field somewhere?
h
Just to confirm - where are you running these queries? They should be run in the Tableau GraphiQL interface (see https://help.tableau.com/current/api/metadata_api/en-us/docs/meta_api_start.html#explore-the-metadata-api-schema-using-graphiql), not in the DataHub GraphiQL.
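If GraphiQL is awkward to use, the same query can also be sent to the Tableau Metadata API endpoint directly. A minimal sketch, assuming you already have a credentials token from a Tableau REST API sign-in; the helper name build_metadata_request, the server URL and the token are placeholders, not anything from DataHub:

```python
import json
import urllib.request


def build_metadata_request(server, token, query):
    """Build a POST request for the Tableau Metadata API.

    `server` is the Tableau server base URL (placeholder) and `token`
    is the credentials token returned by a REST API sign-in.
    """
    url = server.rstrip("/") + "/api/metadata/graphql"
    body = json.dumps({"query": query}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-Tableau-Auth": token,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Usage (not executed here; needs a reachable Tableau server):
# request = build_metadata_request("https://tableau.example.com", "<token>", query)
# with urllib.request.urlopen(request) as reply:
#     result = json.load(reply)
```

Note that this targets the Tableau server, not the DataHub GraphQL endpoint.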