# github-activities
  • p

    prehistoric-lamp-76173

    03/14/2024, 7:56 AM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 9:13 AM
    1 new commit pushed to <https://github.com/datahub-project/datahub/tree/master|master> by anshbansal
    <https://github.com/datahub-project/datahub/commit/0f2b15c93e3243ca37ca66b70465a43ab73bc71c|0f2b15c9> - fix(ui/lineage): show data is too large error when limitation exceeds (#10038) datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 9:29 AM
    Deployment to Production by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 10:29 AM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 11:45 AM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 12:50 PM
    #10048 MSSQL view lineage not shown Issue created by fowlerjk-tps Describe the bug In version 0.13.0, MSSQL view lineage does not show; views stand alone. I've posted this on Slack and have also seen one mention of it from almost a year ago, but that explanation includes a dead link. To Reproduce Steps to reproduce the behavior: 1. Ingest an MSSQL source 2. Open a view 3. See the view standing by itself without any connected tables Expected behavior The view should have upstream dependencies - specifically, the tables that drive the view. YAML:
    Copy code
    source:
        type: mssql
        config:
            host_port: *
            database: *
            username: *
            include_views: true
            include_tables: true
            profiling:
                enabled: true
                profile_table_level_only: false
            stateful_ingestion:
                enabled: true
            password: *
            include_jobs: false
            include_stored_procedures: false
            include_stored_procedures_code: false
            include_view_lineage: true
            include_view_column_lineage: true
            convert_urns_to_lowercase: true
    datahub-project/datahub
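    For anyone trying to narrow this down, one low-effort check is to run an equivalent recipe programmatically against a file sink and grep the output for upstreamLineage aspects on the views. The sketch below is illustrative only: it assumes the Pipeline helper from the acryl-datahub package (with the mssql extra installed) and uses placeholder connection details.
    Copy code
    # Sketch: run an MSSQL recipe with view lineage enabled and write the events
    # to a local file, so you can grep for upstreamLineage aspects on the views.
    # Connection details are placeholders; requires acryl-datahub[mssql].
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(
        {
            "source": {
                "type": "mssql",
                "config": {
                    "host_port": "localhost:1433",   # placeholder
                    "database": "mydb",              # placeholder
                    "username": "user",              # placeholder
                    "password": "pass",              # placeholder
                    "include_views": True,
                    "include_view_lineage": True,
                    "include_view_column_lineage": True,
                },
            },
            # File sink makes it easy to inspect whether any lineage was produced.
            "sink": {"type": "file", "config": {"filename": "./mssql_mces.json"}},
        }
    )
    pipeline.run()
    pipeline.raise_from_status()
    # grep upstreamLineage mssql_mces.json  -> should list the tables behind each view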
  • p

    prehistoric-lamp-76173

    03/14/2024, 1:36 PM
    #10049 fix(ingest/databricks): support hive metastore schemas with special char Pull request opened by mayurinehate Also adds • exception handling in some scopes to limit the impact on the rest of ingestion • more details on the unity-catalog source Checklist ☐ The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format) ☐ Links to related issues (if applicable) ☐ Tests for the changes have been added/updated (if applicable) ☐ Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same. ☐ For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub datahub-project/datahub GitHub Actions: Run Smoke Tests (no_cypress_suite1) GitHub Actions: Run Smoke Tests (no_cypress_suite0) GitHub Actions: [Monitoring] Scan MCE consumer images for vulnerabilities GitHub Actions: [Monitoring] Scan GMS images for vulnerabilities GitHub Actions: [Monitoring] Scan MAE consumer images for vulnerabilities GitHub Actions: [Monitoring] Scan Frontend images for vulnerabilities GitHub Actions: [Monitoring] Scan DataHub Upgrade images for vulnerabilities GitHub Actions: Build and Push DataHub MySQL Setup Docker Image GitHub Actions: Build and Push DataHub MCE Consumer Docker Image GitHub Actions: Build and Push DataHub MAE Consumer Docker Image GitHub Actions: Build and Push DataHub Kafka Setup Docker Image GitHub Actions: Build and Push DataHub GMS Docker Image GitHub Actions: Build and Push DataHub Frontend Docker Image GitHub Actions: Build and Push DataHub Elasticsearch Setup Docker Image GitHub Actions: Build and Push DataHub Upgrade Docker Image GitHub Actions: build (frontend, America/New_York) GitHub Actions: build (frontend, UTC) GitHub Actions: quickstart-compose-validation ✅ 12 other checks have passed 12/30 successful checks
  • p

    prehistoric-lamp-76173

    03/14/2024, 1:52 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 3:04 PM
    #10050 Glossary term junk links in related entities for Tableau workbook Issue created by YuriyGavrilov Describe the bug We use Tableau with DataHub heavily and have started testing Glossary terms, and we ran into what looks like an architectural error: we cannot associate terms with Tableau objects of type workbook. The link is simply not shown in the UI here: image. But if I go to the workbook (not the dataset, not the chart, not the Embedded data source) I do see the linked term. This is the workbook: image This is the link: image It seems the Term link only works object to object, but not from a Workbook (18 Entities) to an object. To Reproduce Steps to reproduce the behavior: 1. Go to '...' 2. Click on '....' 3. Scroll down to '....' 4. See error Expected behavior Full and correct support for Tableau objects such as workbooks in Terms, filters, etc. Screenshots If applicable, add screenshots to help explain your problem. Desktop (please complete the following information): • OS: [e.g. iOS] • Browser [e.g. chrome, safari] • Version [e.g. 22] Additional context Add any other context about the problem here. datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 3:19 PM
    #10051 fix(ingestion): Handle Redshift string length limit in Serverless mode Pull request opened by skrydal According to https://stackoverflow.com/questions/72770890/redshift-result-size-exceeds-listagg-limit-on-svl-statementtext Redshift limits strings to 64k in length. Since the text field in the SYS_QUERY_TEXT table can hold at most 4k characters and longer query text is split into chunks ordered by sequence, we can include at most 16 chunks (16 chunks × 4k = 64k). datahub-project/datahub GitHub Actions: Run Smoke Tests (no_cypress_suite1) GitHub Actions: Run Smoke Tests (no_cypress_suite0) GitHub Actions: [Monitoring] Scan MCE consumer images for vulnerabilities GitHub Actions: [Monitoring] Scan MAE consumer images for vulnerabilities GitHub Actions: [Monitoring] Scan DataHub Upgrade images for vulnerabilities GitHub Actions: [Monitoring] Scan GMS images for vulnerabilities GitHub Actions: [Monitoring] Scan Frontend images for vulnerabilities GitHub Actions: Build and Push DataHub MySQL Setup Docker Image GitHub Actions: Build and Push DataHub MCE Consumer Docker Image GitHub Actions: Build and Push DataHub MAE Consumer Docker Image GitHub Actions: Build and Push DataHub Kafka Setup Docker Image GitHub Actions: Build and Push DataHub GMS Docker Image GitHub Actions: Build and Push DataHub Frontend Docker Image GitHub Actions: Build and Push DataHub Upgrade Docker Image GitHub Actions: Build and Push DataHub Elasticsearch Setup Docker Image GitHub Actions: build (frontend, America/New_York) GitHub Actions: build (frontend, UTC) GitHub Actions: quickstart-compose-validation ✅ 12 other checks have passed 12/30 successful checks
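    The chunk cap described above can be illustrated with a small, self-contained sketch. The row shape (query id, sequence, text) is assumed from the quoted description of SYS_QUERY_TEXT; this is not the connector's actual code.
    Copy code
    # Sketch of capping reconstructed Redshift query text at 16 chunks of ~4k chars
    # each, so the concatenated result stays under the ~64k string/LISTAGG limit.
    # Assumed row shape: (query_id, sequence, text).
    from collections import defaultdict

    MAX_CHUNKS = 16  # 16 chunks * 4k characters per chunk ~= 64k total

    def reconstruct_query_texts(rows):
        """Group SYS_QUERY_TEXT-style rows by query id and join at most 16 chunks."""
        chunks = defaultdict(list)
        for query_id, sequence, text in rows:
            chunks[query_id].append((sequence, text))

        result = {}
        for query_id, parts in chunks.items():
            parts.sort(key=lambda p: p[0])                    # order by sequence
            kept = [text for _, text in parts[:MAX_CHUNKS]]   # drop anything past ~64k
            result[query_id] = "".join(kept)
        return result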
  • p

    prehistoric-lamp-76173

    03/14/2024, 3:36 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 4:40 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 4:56 PM
    #10052 fix(metadata-ingestion)improve resilience and observability of glue-connector Pull request opened by siladitya2 We have recently been facing issues with the Glue connector due to its fail-fast behaviour when handling the optional field PartitionKey["Type"], so this change makes the connector fail safe when PartitionKey["Type"] is not found for a table. It also adds a debug logger to help diagnose issues when the connector fails to process data for a particular table. Checklist ☐ The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format) ☐ Links to related issues (if applicable) ☐ Tests for the changes have been added/updated (if applicable) ☐ Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same. ☐ For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub datahub-project/datahub
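    The fail-safe pattern described here can be shown with a short sketch, assuming a Glue table dictionary shaped like the one boto3's get_table returns (PartitionKeys entries with optional Name/Type). This is illustrative only, not the connector's actual code.
    Copy code
    # Fail-safe handling of the optional PartitionKeys[i]["Type"] field from a
    # Glue table definition, with a debug log instead of a hard failure.
    import logging

    logger = logging.getLogger(__name__)

    def partition_key_fields(glue_table: dict) -> list:
        """Build schema fields for partition keys, skipping keys without a Type."""
        fields = []
        for key in glue_table.get("PartitionKeys", []):
            key_type = key.get("Type")  # .get() instead of key["Type"] avoids KeyError
            if key_type is None:
                logger.debug(
                    "Skipping partition key %s on table %s: no Type present",
                    key.get("Name"), glue_table.get("Name"),
                )
                continue
            fields.append({"name": key.get("Name"), "type": key_type})
        return fields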
  • p

    prehistoric-lamp-76173

    03/14/2024, 5:01 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 5:16 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 5:36 PM
    #10054 fix(ui): show dataset display name in browse paths v2 Pull request opened by Masterchen09 With the (new) browse paths v2 it is in theory possible to have any entity as part of the browse path by referencing it via its urn. Unfortunately the UI currently does not request the display name of a dataset entity that appears in a browse path v2, so only the URN is shown. Most of the time it makes no sense to have datasets as part of the browse path (e.g., in SQL databases you usually have databases or schemas, which are ingested as containers, and there is no entity "below" a dataset), but in the case of SAP BW (we are still working on a source for it ;-)) it does make sense to have a dataset in the browse path because of the special relationship between InfoProviders and Queries (Queries in SAP BW cannot be compared to SQL queries in SQL databases): image 0BW (InfoArea -> ingested as a container with subtype InfoArea) 0BWTCT_STA (InfoArea -> ingested as a container with subtype InfoArea, 0BW as a parent container) 0BWTC_C10 (InfoProvider -> ingested as a dataset with subtypes InfoProvider and MultiProvider, 0BWTCT_STA as a container) 0BWTC_C10_Q001 (Query -> ingested as a dataset with subtype Query, 0BWTCT_STA as a container) Queries are defined on an InfoProvider and are their own persistent objects (mostly comparable to a SQL view in a SQL database), and therefore have an upstream lineage aspect pointing to the InfoProvider. Queries are always "shown" below an InfoProvider (the screenshot is taken from SAP's modelling tools for SAP BW), so it also makes sense to show them below the InfoProvider in the browse path. However, the container of the Query is not the InfoProvider; instead, the Query has the same InfoArea as its InfoProvider. Checklist ☑︎ The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format) ☑︎ Links to related issues (if applicable) ☑︎ Tests for the changes have been added/updated (if applicable) ☑︎ Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same. 
☑︎ For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub datahub-project/datahub GitHub Actions: Run Smoke Tests (cypress_rest) GitHub Actions: Run Smoke Tests (cypress_suite1) GitHub Actions: [Monitoring] Scan Datahub Ingestion Slim images for vulnerabilities GitHub Actions: [Monitoring] Scan Datahub Ingestion images for vulnerabilities GitHub Actions: Build and Push DataHub Ingestion Docker Images GitHub Actions: Build and Push DataHub Ingestion (Full) Docker Images GitHub Actions: Build and Push DataHub Ingestion (Base-Full) Docker Image GitHub Actions: Build and Push DataHub Ingestion (Base-Slim) Docker Image GitHub Actions: [Monitoring] Scan MCE consumer images for vulnerabilities GitHub Actions: [Monitoring] Scan DataHub Upgrade images for vulnerabilities GitHub Actions: [Monitoring] Scan GMS images for vulnerabilities GitHub Actions: [Monitoring] Scan MAE consumer images for vulnerabilities GitHub Actions: Build and Push DataHub MySQL Setup Docker Image GitHub Actions: Build and Push DataHub Kafka Setup Docker Image GitHub Actions: Build and Push DataHub Elasticsearch Setup Docker Image GitHub Actions: Build and Push DataHub Ingestion (Base) Docker Image GitHub Actions: Build and Push DataHub MCE Consumer Docker Image GitHub Actions: Build and Push DataHub GMS Docker Image GitHub Actions: Build and Push DataHub Upgrade Docker Image GitHub Actions: Build and Push DataHub MAE Consumer Docker Image GitHub Actions: quickstart-compose-validation ✅ 9 other checks have passed 9/30 successful checks
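    To make the scenario concrete, here is a hedged sketch (not part of the PR) of emitting a browsePathsV2 aspect whose last entry references another dataset by urn, using the DataHub Python SDK's aspect classes; the "sap-bw" platform id and the container guids are made up for illustration. Whether the UI then shows that dataset's display name instead of its raw urn is exactly what this PR addresses.
    Copy code
    # Hedged sketch: a browsePathsV2 aspect whose last entry points at another
    # dataset (an SAP BW InfoProvider) by urn. Platform id and guids are illustrative.
    from datahub.emitter.mce_builder import make_dataset_urn
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.metadata.schema_classes import BrowsePathEntryClass, BrowsePathsV2Class

    infoprovider_urn = make_dataset_urn("sap-bw", "0BWTC_C10", "PROD")    # hypothetical platform id
    query_urn = make_dataset_urn("sap-bw", "0BWTC_C10_Q001", "PROD")

    browse_path = BrowsePathsV2Class(
        path=[
            BrowsePathEntryClass(id="0BW", urn="urn:li:container:infoarea-0bw"),                 # placeholder guid
            BrowsePathEntryClass(id="0BWTCT_STA", urn="urn:li:container:infoarea-0bwtct-sta"),   # placeholder guid
            # A dataset (the InfoProvider) as a browse path entry -- the case this PR fixes.
            BrowsePathEntryClass(id="0BWTC_C10", urn=infoprovider_urn),
        ]
    )

    mcp = MetadataChangeProposalWrapper(entityUrn=query_urn, aspect=browse_path)
    # mcp can then be sent with any emitter, e.g. DatahubRestEmitter("http://localhost:8080").emit(mcp)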
  • p

    prehistoric-lamp-76173

    03/14/2024, 5:53 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 6:43 PM
    #10055 feat(ingest/dbt): point dbt assertions at dbt nodes Pull request opened by hsheth2 Checklist ☐ The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format) ☐ Links to related issues (if applicable) ☐ Tests for the changes have been added/updated (if applicable) ☐ Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same. ☐ For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub datahub-project/datahub GitHub Actions: Run Smoke Tests (no_cypress_suite1) GitHub Actions: Run Smoke Tests (no_cypress_suite0) ✅ 28 other checks have passed 28/30 successful checks
  • p

    prehistoric-lamp-76173

    03/14/2024, 6:59 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 7:16 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 7:26 PM
    #10056 fix(ui) Add min width to the usage stats component Pull request opened by chriscollins3456 Adds a min-width to the usage stats in the schema field sidebar so we always see something even when the percentage is super small. image Checklist ☐ The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format) ☐ Links to related issues (if applicable) ☐ Tests for the changes have been added/updated (if applicable) ☐ Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same. ☐ For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub datahub-project/datahub GitHub Actions: Run Smoke Tests (cypress_rest) GitHub Actions: Run Smoke Tests (cypress_suite1) GitHub Actions: [Monitoring] Scan Datahub Ingestion images for vulnerabilities GitHub Actions: [Monitoring] Scan Datahub Ingestion Slim images for vulnerabilities GitHub Actions: Build and Push DataHub Ingestion (Full) Docker Images GitHub Actions: Build and Push DataHub Ingestion Docker Images GitHub Actions: [Monitoring] Scan MCE consumer images for vulnerabilities GitHub Actions: Build and Push DataHub Ingestion (Base-Full) Docker Image GitHub Actions: Build and Push DataHub Ingestion (Base-Slim) Docker Image GitHub Actions: [Monitoring] Scan MAE consumer images for vulnerabilities GitHub Actions: [Monitoring] Scan DataHub Upgrade images for vulnerabilities GitHub Actions: [Monitoring] Scan GMS images for vulnerabilities GitHub Actions: Build and Push DataHub Elasticsearch Setup Docker Image GitHub Actions: Build and Push DataHub Ingestion (Base) Docker Image GitHub Actions: Build and Push DataHub MySQL Setup Docker Image GitHub Actions: Build and Push DataHub Kafka Setup Docker Image GitHub Actions: Build and Push DataHub Upgrade Docker Image GitHub Actions: Build and Push DataHub MCE Consumer Docker Image GitHub Actions: Build and Push DataHub MAE Consumer Docker Image GitHub Actions: Build and Push DataHub GMS Docker Image GitHub Actions: quickstart-compose-validation ✅ 9 other checks have passed 9/30 successful checks
  • p

    prehistoric-lamp-76173

    03/14/2024, 7:32 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 7:43 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 8:35 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 8:51 PM
    #10057 feat(ingest/s3): set default spark version Pull request opened by hsheth2 Checklist ☐ The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format) ☐ Links to related issues (if applicable) ☐ Tests for the changes have been added/updated (if applicable) ☐ Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same. ☐ For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub datahub-project/datahub GitHub Actions: Deploy to Datahub HEAD GitHub Actions: Run Smoke Tests GitHub Actions: [Monitoring] Scan MAE consumer images for vulnerabilities GitHub Actions: Build and Push DataHub MAE Consumer Docker Image GitHub Actions: quickstart-compose-validation GitHub Actions: build (frontend, America/New_York) GitHub Actions: build (frontend, UTC) ✅ 23 other checks have passed 23/30 successful checks
  • p

    prehistoric-lamp-76173

    03/14/2024, 9:07 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 9:59 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 10:43 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub
  • p

    prehistoric-lamp-76173

    03/14/2024, 10:51 PM
    #10058 Pip packages are not installed for Iceberg source plugin with Hive type Issue created by usmanovbf Describe the bug Pip packages are not installed for the Iceberg source plugin with the hive catalog type. To Reproduce Steps to reproduce the behavior: 1. Create an Iceberg ingestion source from the example below, but without type: hive 2. Click Save & Run 3. Get an error about the required type field 4. Update the recipe with type: hive 5. Click the Save & Run button 6. See the errors (logs are below): ModuleNotFoundError: No module named 'thrift' and pyiceberg.exceptions.NotInstalledError: Apache Hive support not installed: pip install 'pyiceberg[hive]' Expected behavior All PyPI packages ('pyiceberg[hive]', thrift) should be installed properly. Solution Execute pip install every time before executing the recipe. Screenshots image Desktop (please complete the following information): • OS: MacOS Sonoma arm64 • Browser Chrome • Version 122.0.6261.112 Additional context 1. Recipe:
    Copy code
    source:
        type: iceberg
        config:
            env: PROD
            catalog:
                name: iceberg-catalog
                type: hive
                config:
                    uri: '<https://hostname1:9083>'
                    s3.endpoint: '<https://hostname2>'
                    s3.access-key-id: '${secret1}'
                    s3.secret-access-key: '${secret2}'
            table_pattern:
                allow:
                    - 'test.*'
            profiling:
                enabled: false
    2. Error logs: exec-urn_li_dataHubExecutionRequest_1d2b870e-81e9-477a-8869-39505a9f2b3d.log 3. Even adding Extra Pip Libraries does not help: image 4. DataHub version 0.12.1 5. As far as I can see, this is not fixed in 0.13.0 either (unchanged from 0.12.1): https://github.com/datahub-project/datahub/commits/v0.12.1/metadata-ingestion/src/datahub/ingestion/source/iceberg datahub-project/datahub
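    Until the plugin pulls in these dependencies itself, one workaround along the lines of the suggested solution is to pre-install the missing packages into the environment that executes the recipe. A minimal sketch follows; the package names are taken from the error messages above, and nothing DataHub-specific is assumed.
    Copy code
    # Workaround sketch: install the Hive-catalog extras the Iceberg source needs
    # ('pyiceberg[hive]' and thrift, per the errors above) into the interpreter
    # that will execute the recipe, then start ingestion as usual.
    import subprocess
    import sys

    REQUIRED = ["pyiceberg[hive]", "thrift"]

    def ensure_packages(packages=REQUIRED):
        """pip-install the given packages into the current Python environment."""
        subprocess.check_call([sys.executable, "-m", "pip", "install", *packages])

    if __name__ == "__main__":
        ensure_packages()
        # ...then run the Iceberg recipe (UI ingestion or `datahub ingest -c recipe.yml`)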
  • p

    prehistoric-lamp-76173

    03/14/2024, 11:08 PM
    Deployment to Preview by vercel[bot] datahub-project/datahub