# troubleshoot
  • hallowed-machine-2603 (06/06/2022, 7:35 AM)
    It's a basic question, but I'm having trouble connecting to MSSQL. I built DataHub using Docker, and this is my recipe:
    Copy code
    source:
      type: mssql
      config:
        host_port: <IP for connecting MSSQL database>
        database: <database name>
        username: <username>
        password: <password>
    sink:
      type: datahub-rest
      config:
        server: <datahub-gms ip in docker, e.g. 0.0.0.0:9999/api/gms>
    First of all, I want to see the list of tables and the data schemas of the database on DataHub. How can I see the table list and data schemas on the DataHub page? e.g. dbo.Table_Fruit, dbo.Table_Animal, ...
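    (For reference, a minimal sketch of running such a recipe with the DataHub CLI, assuming the recipe above is saved as mssql_recipe.yml; the filename is illustrative. After a successful run, the tables and their schemas should show up under the MSSQL platform in the Datasets browser.)
    Copy code
    # Install the MSSQL ingestion plugin, then run the recipe (file name is hypothetical).
    pip install 'acryl-datahub[mssql]'
    datahub ingest -c mssql_recipe.yml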
  • rich-policeman-92383 (06/06/2022, 8:55 AM)
    Domain owners are not able to add descriptions or create/add tags on datasets that are part of a domain; they get a "You are not authorised" error. Any suggestions on this? Also, what is the privilege name for tag creation?
  • chilly-elephant-51826 (06/06/2022, 9:39 AM)
    @here Where can I find the source code for public.ecr.aws/datahub/acryl-datahub-actions? It seems the code differs from the GitHub master branch. Can anyone help? https://datahubspace.slack.com/archives/C029A3M079U/p1654344045411189
  • bumpy-activity-74405 (06/06/2022, 9:53 AM)
    Hi, previously I've been getting errors on GMS regarding header size, but this PR fixed it. Now I am getting something similar in the frontend:
    Copy code
    akka.actor.ActorSystemImpl - Illegal request, responding with status '431 Request Header Fields Too Large': HTTP header value exceeds the configured limit of 8192 characters
    Not sure how to test this, but I think this PR should enable users to solve the issue.
  • swift-breakfast-25077 (06/06/2022, 10:40 AM)
    How can I set the environment variable DATAHUB_DEBUG to true in order to enable debug logging for DataHubValidationAction?
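    (One straightforward way, assuming the action runs as part of a Great Expectations checkpoint launched from a shell; the checkpoint name below is hypothetical:)
    Copy code
    # Export the variable in the shell that launches the checkpoint run.
    export DATAHUB_DEBUG=true
    great_expectations checkpoint run my_checkpoint  # hypothetical checkpoint name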
  • high-hospital-85984 (06/06/2022, 10:58 AM)
    We stumbled on a strange issue with a single, very central dataset (so a lot of dependencies). Error log in thread.
  • thankful-magazine-50386 (06/06/2022, 11:30 AM)
    Hi team! I installed an ES analyzer in the Docker container and restarted the container. Then I updated a field's description and tried searching with part of the description I had just added, but the table whose field description I updated did not appear in the search results. I tried to locate the problem by searching the ES indices directly. I checked this link and tried to follow the steps, but where is the datasetdocument index? https://github.com/datahub-project/datahub/issues/1772 Here is the command I ran:
    curl -X GET --location "http://192.168.25.133:9200/_cat/indices?v=&pretty="
    And here is the result.
    Copy code
    health status index                                                    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    yellow open   dataset_datasetprofileaspect_v1                          hv4oWE6YSUSJdbcMBLU_DA   1   1          0            0       208b           208b
    yellow open   datajobindex_v2                                          1S-T1Y5jQziyqQJe61DYGA   1   1          0            0       208b           208b
    yellow open   datahubexecutionrequestindex_v2                          QIRguaTMRnCrR1FnjLt_pg   1   1          0            0       208b           208b
    yellow open   datahubsecretindex_v2                                    7ifKdzCVRfSY5fVgDyGlwA   1   1          0            0       208b           208b
    yellow open   mlmodelindex_v2                                          JQOJjEpzSUWka9Cz4HNdRA   1   1          0            0       208b           208b
    yellow open   dataflowindex_v2                                         TsEzAXk8RNK4i4EaPwpXsw   1   1          0            0       208b           208b
    yellow open   mlmodelgroupindex_v2                                     TsxHmhO6SDauKjuLAt5LkA   1   1          0            0       208b           208b
    yellow open   datahubpolicyindex_v2                                    V4vqL__2RTuYWpZUeyjVxg   1   1          5            0     10.9kb         10.9kb
    yellow open   assertionindex_v2                                        oe7-AhYMTZyYYCHAwzzLuQ   1   1          0            0       208b           208b
    yellow open   corpuserindex_v2                                         ERzx5nyjSRq9rStuV8v4JA   1   1          0            0       208b           208b
    yellow open   dataprocessindex_v2                                      jM3LMbXoTl-nCNeArvZyTA   1   1          0            0       208b           208b
    yellow open   chartindex_v2                                            1NC6UgJBSBWzRG-K7MdpOA   1   1          0            0       208b           208b
    yellow open   tagindex_v2                                              kCEw1rUhTN2tn-FQG4_7Ng   1   1          0            0       208b           208b
    yellow open   mlmodeldeploymentindex_v2                                KaMP2khtSRKk3wCt2gxp-Q   1   1          0            0       208b           208b
    yellow open   datajob_datahubingestioncheckpointaspect_v1              1majN12nTCqADYYm8L62wQ   1   1          0            0       208b           208b
    yellow open   dataplatforminstanceindex_v2                             jhg10SLCSUa6YUaDRJLBow   1   1          0            0       208b           208b
    yellow open   dashboardindex_v2                                        msE7SGasQgmmedPIb2ZBMw   1   1          0            0       208b           208b
    yellow open   assertion_assertionruneventaspect_v1                     0LUTkuT8Rc6Cwcn1cqWPrw   1   1          0            0       208b           208b
    yellow open   telemetryindex_v2                                        7ZPOU6smSvGNuj_Wk4c2NA   1   1          0            0       208b           208b
    yellow open   datasetindex_v2                                          c68bjkNARQGah2SPO8qfXQ   1   1        109            2    205.9kb        205.9kb
    yellow open   mlfeatureindex_v2                                        Bx-cGds6S--LP7iHv9lYRA   1   1          0            0       208b           208b
    yellow open   datajob_datahubingestionrunsummaryaspect_v1              8EQMHCrCQTmw7Ez3qW9w_w   1   1          0            0       208b           208b
    yellow open   dataplatformindex_v2                                     fvzlZxDATlK22B_1wQJBCw   1   1          0            0       208b           208b
    yellow open   dataprocessinstanceindex_v2                              GWILAn0GSCCsiYAeXsn27w   1   1          0            0       208b           208b
    yellow open   glossarynodeindex_v2                                     9ip6m8sjQtuZh38eFGZriQ   1   1          0            0       208b           208b
    yellow open   datahubingestionsourceindex_v2                           ys5TcFZNQIeCmcAX8V0HLQ   1   1          0            0       208b           208b
    yellow open   datahubretentionindex_v2                                 9g_c0oXaQ6CivDjVccoWdA   1   1          0            0       208b           208b
    yellow open   graph_service_v1                                         sVrfS2KkQ5yRxnafSMPMzQ   1   1        112            0       28kb           28kb
    yellow open   dataprocessinstance_dataprocessinstanceruneventaspect_v1 NuJ6YSr8Re22tq6NFF_dlQ   1   1          0            0       208b           208b
    yellow open   system_metadata_service_v1                               jRmFnMePTnCgWYk6hthcIQ   1   1        908            5    102.4kb        102.4kb
    yellow open   dataset_operationaspect_v1                               OVTWSBX6Q7eTwJhmZ-H1jA   1   1          0            0       208b           208b
    yellow open   datahubaccesstokenindex_v2                               Tq_HSa-3QnePbcEa0YKmvQ   1   1          0            0       208b           208b
    yellow open   containerindex_v2                                        gMq55jCbSmCsgxdcLfl8qQ   1   1          4            0     11.5kb         11.5kb
    yellow open   schemafieldindex_v2                                      QN2r3l3xROO-1Cvr_pIG-w   1   1          0            0       208b           208b
    yellow open   domainindex_v2                                           KMF7IFgeTvGZSAmymuOpmA   1   1          0            0       208b           208b
    yellow open   testindex_v2                                             oosgqWP4SI6xg3mhzkLJcQ   1   1          0            0       208b           208b
    yellow open   mlfeaturetableindex_v2                                   8CRhft62SyyqJlTHFEe0GA   1   1          0            0       208b           208b
    yellow open   notebookindex_v2                                         4ixEGqMUQrKL3D9P6-Bx6w   1   1          0            0       208b           208b
    yellow open   glossarytermindex_v2                                     0GdVW9OWTi2hujJ6yOI1Ow   1   1          0            0       208b           208b
    yellow open   mlprimarykeyindex_v2                                     EsDswNx_TriyDGB5qDQkCg   1   1          0            0       208b           208b
    yellow open   .ds-datahub_usage_event-2022.06.06-000001                9ESN2DuARruqm1ECipf3sQ   1   1        278            0    217.6kb        217.6kb
    yellow open   corpgroupindex_v2                                        JLS6yZtRQPmIxQvyOhy7TA   1   1          0            0       208b           208b
    yellow open   dataset_datasetusagestatisticsaspect_v1                  k1HmML_1SDiEivM5BiOHmA   1   1          0            0       208b           208b
    I'm completely at a loss for what to do next. Could you kindly provide any suggestions?
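    (Judging by the listing above, the dataset search index in this version is datasetindex_v2 rather than the older datasetdocument index mentioned in that issue. A minimal sketch for checking whether the new description was indexed; the host and the search text are illustrative:)
    Copy code
    # Full-text search the dataset index directly; host and query text are illustrative.
    curl -X GET "http://192.168.25.133:9200/datasetindex_v2/_search?pretty" \
      -H 'Content-Type: application/json' \
      -d '{"query": {"query_string": {"query": "<part of the new description>"}}}'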
  • astonishing-guitar-79208 (06/06/2022, 1:30 PM)
    Hello Team. Custom aspects do not auto-render in the UI for DataJob (or any entity other than Dataset). I used the example customDataQualityRules aspect and added it to DataJob. The metadata ingestion works fine, but the aspect does not auto-render in the UI. Also, trying to render an aspect as properties instead of tabular does not work. The docs mention that auto-rendering of custom aspects works for DataJob. Can someone help here?
  • creamy-van-28626 (06/06/2022, 6:05 PM)
    Hi team, I am getting this error while implementing LDAP.
  • bulky-controller-34643 (06/07/2022, 3:36 AM)
    Hi team, I'm setting up PostgreSQL as my SQL datasource in a k8s environment, but I got this error. I've set up the postgres username/password/host/port in values.yaml as in pic 1, but the postgresql-setup-job shows the log in pic 2. I'm just wondering why the hostname becomes "database", or am I misunderstanding how the variables are set? Thanks for any help.
  • astonishing-yak-92682 (06/07/2022, 8:11 AM)
    Hello Team. When I try to access Analytics, I get the below error.
  • few-air-56117 (06/07/2022, 1:11 PM)
    Hi guys, I am connected with the DataHub admin account but I don't have access to any page/settings. The DataHub version is 0.8.36.
  • silly-morning-41994 (06/07/2022, 2:00 PM)
    👋 Hi, team!
  • abundant-painter-6 (06/07/2022, 2:33 PM)
    Hello all, I'm trying to add an upstream dataset to the lineage using file-based lineage, but I get the following error message even though the yml file is valid. Any help?
  • few-air-56117 (06/07/2022, 6:53 PM)
    Hi folks, I tried to use the DataHub CLI in order to delete all the metadata.
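    (For reference, a sketch of the commands typically involved in a bulk delete; the platform value is illustrative, and --hard is irreversible:)
    Copy code
    # Soft-delete all metadata for one (illustrative) platform, then hard-delete for real.
    datahub delete --platform mssql
    datahub delete --platform mssql --hard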
  • modern-laptop-12942 (06/07/2022, 8:41 PM)
    Hi all! I enabled profiling but got an error about 'partial_unexpected_list' (from Snowflake):
    Copy code
    [2022-06-07 16:34:51,395] ERROR {datahub.ingestion.source.ge_data_profiler:852} - Encountered exception while profiling tm_internal_db.chuxuan.cg<...>table
    Traceback (most recent call last):
      File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 840, in _generate_single_profile
        ).generate_dataset_profile()
      File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 601, in generate_dataset_profile
        self.query_combiner.flush()
      File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/datahub/utilities/sqlalchemy_query_combiner.py", line 396, in flush
        let.switch()
      File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 235, in <lambda>
        return self.query_combiner.run(lambda: method(self, *args, **kwargs))
      File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.10/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 440, in _get_dataset_column_sample_values
        str(v) for v in res["partial_unexpected_list"]
    KeyError: 'partial_unexpected_list'
  • numerous-account-62719 (06/08/2022, 4:52 AM)
    Getting the following error in the elasticsearch-master-0 pod while deploying the prerequisites: chroot: cannot change root directory to '/': Operation not permitted. Please help me out ASAP.
  • few-air-56117 (06/08/2022, 7:55 AM)
    Hi folks, does anyone know what these container urns are and how I can delete them? Thx 😄
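    (A minimal sketch of removing a single container entity by urn; the urn below is a placeholder:)
    Copy code
    # Hard-delete one container; replace the placeholder with the actual urn.
    datahub delete --urn "urn:li:container:<container-id>" --hard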
  • great-beard-50720 (06/08/2022, 11:07 AM)
    Hi there! More or less following this: https://docs.greatexpectations.io/docs/guides/expectations/how_to_create_and_edit_expectations_with_instant_feedback_from_a_sample_batch_of_data/ ... I have copied and pasted some code in order to start doing batch validation against an Athena data source. I came up with:
    Copy code
    ...
    
    def get_suite(context: ge.DataContext, suite_name:str) -> ExpectationSuite:
        try:
            suite = context.get_expectation_suite(expectation_suite_name=suite_name)
            print(f'Loaded ExpectationSuite "{suite.expectation_suite_name}" containing {len(suite.expectations)} expectations.')
        except DataContextError:
            suite = context.create_expectation_suite(expectation_suite_name=suite_name)
            print(f'Created ExpectationSuite "{suite.expectation_suite_name}".')
        return suite
    
    context_name = 'athena_backoffice_dev'
    default_bucket = 'mosaic-backoffice'
    
    conn_string = m_athena.get_athena_conn_string(context_name)
    
    context_config = m_ctxt.get_context_config(context_name, conn_string, default_bucket)
    context = m_ctxt.get_context(context_config)
    
    expectation_suite_name = 'demo_suite'
    expectation_suite = get_suite(context, expectation_suite_name)
    
    batch_request = {
        'datasource_name': 'athena_backoffice_dev',
        'data_connector_name': 'default_inferred_data_connector_name',
        'data_asset_name': 'borrowing_base.accruals',
        'limit': 1000
    }
    
    validator = context.get_validator(
        batch_request=BatchRequest(**batch_request),
        expectation_suite_name=expectation_suite_name
    )
    
    column_names = [f'"{column_name}"' for column_name in validator.columns()]
    print(f"Columns: {', '.join(column_names)}.")
    h = validator.head(n_rows=5, fetch_all=False)
    print(h)
    When I run that, the following error is returned:
    Copy code
    (venv) C:\Dev\great-expectations>python main.py
    Loaded ExpectationSuite "demo_suite" containing 0 expectations.
    Calculating Metrics: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00,  8.67it/s]
    Columns: "trade_date", "trade_strategy", "invoice_id", "trade_id", "internal_legal_entity", "counter_party", "product", "instrument_type", "buy_sell", "native_mtm", "settle_currency", "mtm_usd", "notional_amount", "quantity", "uom", "asset_value_usd", "due_date", "days_past_due", "rating", "tier", "overrides", "enhancement_method", "is_past_due", "source_system", "is_excluded", "exclusion_reason", "hartree_id", "legal_name", "governing_law", "country", "counterparty_name", "lending_facility", "valuation_date", "run_id".
    Calculating Metrics:   0%|                                                                                                                                 | 0/1 [00:01<?, ?it/s]
    Exceptions
    {('table.head', 'batch_id=89c03fce198c1b7a63893b895140eb28', '04166707abe073177c1dd922d3584468'): {'metric_configuration': {
      "metric_name": "table.head",
      "metric_domain_kwargs": {
        "batch_id": "89c03fce198c1b7a63893b895140eb28"
      },
      "metric_domain_kwargs_id": "batch_id=89c03fce198c1b7a63893b895140eb28",
      "metric_value_kwargs": {
        "n_rows": 5,
        "fetch_all": false
      },
      "metric_value_kwargs_id": "04166707abe073177c1dd922d3584468",
      "id": [
        "table.head",
        "batch_id=89c03fce198c1b7a63893b895140eb28",
        "04166707abe073177c1dd922d3584468"
      ]
    }, 'num_failures': 3, 'exception_info': {{'exception_traceback': 'Traceback (most recent call last):\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\great_expectations\\execution_engine\\execution_engine.py", line 387, in resolve_metrics\n    **metric_provider_kwargs\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\great_expectations\\expectations\\metrics\\metric_provider.py", line 34, in inner_func\n    return metric_fn(*args, **kwargs)\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\great_expectations\\expectations\\metrics\\table_metrics\\table_head.py", line 132, in _sqlalchemy\n    compile_kwargs={"literal_binds": True},\n  File "<string>", line 1, in <lambda>\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\sqlalchemy\\sql\\elements.py", line 481, in compile\n    return self._compiler(dialect, bind=bind, **kw)\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\sqlalchemy\\sql\\elements.py", line 487, in _compiler\n    return dialect.statement_compiler(dialect, self, **kw)\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\sqlalchemy\\sql\\compiler.py", line 592, in __init__\n    Compiled.__init__(self, dialect, statement, **kwargs)\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\sqlalchemy\\sql\\compiler.py", line 322, in __init__\n    self.string = self.process(self.statement, **compile_kwargs)\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\sqlalchemy\\sql\\compiler.py", line 352, in process\n    return obj._compiler_dispatch(self, **kwargs)\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\sqlalchemy\\sql\\visitors.py", line 96, in _compiler_dispatch\n    return meth(self, **kw)\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\sqlalchemy\\sql\\compiler.py", line 2202, in visit_select\n    text, select, inner_columns, froms, byfrom, kwargs\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\sqlalchemy\\sql\\compiler.py", line 2320, in _compose_select_body\n    text += self.limit_clause(select, **kwargs)\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\pyathena\\sqlalchemy_athena.py", line 90, in limit_clause\n    if limit_clause is not None and select._simple_int_clause(limit_clause):\nAttributeError: \'Select\' object has no attribute \'_simple_int_clause\'\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\great_expectations\\validator\\validator.py", line 1291, in resolve_validation_graph\n    runtime_configuration=runtime_configuration,\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\great_expectations\\validator\\validator.py", line 2202, in _resolve_metrics\n    runtime_configuration=runtime_configuration,\n  File "C:\\Dev\\great-expectations\\venv\\lib\\site-packages\\great_expectations\\execution_engine\\execution_engine.py", line 391, in resolve_metrics\n    message=str(e), failed_metrics=(metric_to_resolve,)\ngreat_expectations.exceptions.exceptions.MetricResolutionError: \'Select\' object has no attribute \'_simple_int_clause\'\n', 'exception_message': "'Select' object has no attribute '_simple_int_clause'", 'raised_exception': True}}}}
    occurred while resolving metrics.
    Traceback (most recent call last):
      File "main.py", line 81, in <module>
        h = validator.head(n_rows=5, fetch_all=False)
      File "C:\Dev\great-expectations\venv\lib\site-packages\great_expectations\validator\validator.py", line 2145, in head
        "fetch_all": fetch_all,
      File "C:\Dev\great-expectations\venv\lib\site-packages\great_expectations\validator\validator.py", line 891, in get_metric
        return self.get_metrics(metrics={metric.metric_name: metric})[
      File "C:\Dev\great-expectations\venv\lib\site-packages\great_expectations\validator\validator.py", line 858, in get_metrics
        for metric_configuration in metrics.values()
      File "C:\Dev\great-expectations\venv\lib\site-packages\great_expectations\validator\validator.py", line 858, in <dictcomp>
        for metric_configuration in metrics.values()
    KeyError: ('table.head', 'batch_id=89c03fce198c1b7a63893b895140eb28', '04166707abe073177c1dd922d3584468')
    That is very similar to what I am seeing here: https://github.com/apache/superset/issues/20168, but I am not sure how that helps me. Has anyone else seen this before?
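    (The missing _simple_int_clause attribute exists in newer SQLAlchemy, so this looks like a version mismatch between pyathena and SQLAlchemy; that is an assumption, not a confirmed diagnosis. A quick way to check, and to align the versions if it turns out to be the cause:)
    Copy code
    # Inspect installed versions first (the mismatch theory is an assumption).
    pip show sqlalchemy pyathena
    # Upgrading both may align them; pin versions per your environment's constraints.
    pip install --upgrade sqlalchemy pyathena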
  • brave-businessperson-3969 (06/08/2022, 12:27 PM)
    Hello, I have a question regarding extending the metadata model. Most of the objects (Datasets, Glossary Terms, MLModels, etc.) have a Properties tab in the GUI. This generic Properties tab is very useful for storing various additional pieces of information as key-value pairs. We realized it would be helpful to have the same tab for Glossary Nodes, Domains, CorpUsers, and CorpGroups, but we are having some trouble extending the DataHub model accordingly. The main question: which aspect do we have to add to the model files (and where) to add the properties?
  • plain-napkin-77279 (06/08/2022, 2:38 PM)
    Hello team, I am using the latest version of DataHub, and I am having a problem with Analytics. I got this error after a fresh installation, and even after ingesting my metadata into DataHub.
  • ambitious-rose-93174 (06/08/2022, 3:02 PM)
    hi, I'm trying to profile a table in an Oracle database. This table has numeric, date, and varchar fields. I'm just enabling the profiling without any additional configuration. During the profiling run, I get some warnings at the start of the run of the type:
    Copy code
    [2022-06-08 16:19:17,123] WARNING  {great_expectations.dataset.sqlalchemy_dataset:1814} - No recognized sqlalchemy types in type_list for current dialect.
    Later, I get the following non-fatal exceptions (there are many of them; I'm assuming one for each 'problematic' field):
    Copy code
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/home/marco/tmp/datahub/.env/lib/python3.9/site-packages/datahub/utilities/sqlalchemy_query_combiner.py", line 246, in _sa_execute_fake
        handled, result = self._handle_execute(conn, query, args, kwargs)
      File "/home/marco/tmp/datahub/.env/lib/python3.9/site-packages/datahub/utilities/sqlalchemy_query_combiner.py", line 211, in _handle_execute
        if not self.is_single_row_query_method(query):
      File "/home/marco/tmp/datahub/.env/lib/python3.9/site-packages/datahub/ingestion/source/ge_data_profiler.py", line 218, in _is_single_row_query_method
        query_columns = get_query_columns(query)
      File "/home/marco/tmp/datahub/.env/lib/python3.9/site-packages/datahub/utilities/sqlalchemy_query_combiner.py", line 114, in get_query_columns
        return list(query.columns)
    AttributeError: 'str' object has no attribute 'columns'
    [2022-06-08 16:39:36,731] ERROR    {datahub.utilities.sqlalchemy_query_combiner:250} - Failed to execute query normally, using fallback: SELECT field  
    FROM XXXX.YYYY 
    WHERE 1 = 1 AND field IS NOT NULL
    AND ROWNUM <= 20
    Traceback (most recent call last):
      File "/home/marco/tmp/datahub/.env/lib/python3.9/site-packages/datahub/utilities/sqlalchemy_query_combiner.py", line 111, in get_query_columns
        inner_columns = list(query.inner_columns)
    AttributeError: 'str' object has no attribute 'inner_columns'
    Eventually the profiling ends and the Stats tab in the UI is partially populated. All the numeric fields have these stats set to null: min, max, mean, median, std dev. Any idea what the issue is here?
  • red-accountant-48681 (06/08/2022, 5:41 PM)
    Hi, I have a situation where I quickstarted DataHub on a single EC2 instance and have ingested metadata from various sources. I now need to close the EC2 instance and the AWS account, so I need a way to store the ingested metadata for future use: a way to copy the metadata to an offline or shared drive so it can be restored into a future instance of DataHub on EC2. I am thinking of using CloudFormation to recreate the EC2 instance with DataHub installed, and I would then transfer the metadata to that instance so that the backend database appears the same as before. Would I need to create copies of my existing GMS and MySQL containers and modify the quickstart compose file to use those containers instead? Not too sure what to do here.
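    (One possible approach, assuming the quickstart's default MySQL container; the container name, credentials, and database name below are the usual quickstart defaults but worth verifying in your compose file:)
    Copy code
    # Dump the metadata DB from the quickstart MySQL container on the old instance...
    docker exec mysql mysqldump -u datahub -pdatahub datahub > datahub_backup.sql
    # ...then load it into the MySQL container on the new instance.
    docker exec -i mysql mysql -u datahub -pdatahub datahub < datahub_backup.sql
    After restoring, the search and graph indices would still need to be rebuilt from the SQL store; the datahub-upgrade RestoreIndices job discussed further down this page does exactly that.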
  • numerous-account-62719 (06/09/2022, 11:32 AM)
    Hi Team, I have used Kubernetes to deploy DataHub and am triggering the ingestion through the CLI of the acryl-actions pod. The ingestion runs fine and a success message is displayed, but I cannot see the data populating in my user interface. Can someone please help me out?
  • fresh-napkin-5247 (06/09/2022, 11:55 AM)
    Hello! Not sure if this is the right channel to ask. I ingested information from Tableau and Athena into my DataHub instance. DataHub correctly picks up that a published datasource on Tableau is connected to an Athena table. The problem is that the Athena ingestor also picks up the same table, yet DataHub does not recognize that they are both the same object. So what I have is two entities that represent the same table but are treated as separate objects.
    • The table picked up by the Tableau ingestor is called awsdatacatalog.test-tableau-datasets.test_datasets and has zero schema information, but is upstream of various Tableau datasources and charts.
    • The table picked up by the Athena ingestor is named only test_datasets, has schema information, and is downstream of the correct Glue table.
    How can I tell DataHub that they are both the same table and that it should join the metadata from both entities? Thank you 🙂 DataHub version: acryl-datahub, version 0.8.36
  • most-plumber-32123 (06/09/2022, 2:11 PM)
    Hi Team, I am looking to ingest business glossary terms using the code below.
    Copy code
    source:
      type: datahub-business-glossary
      config:
        # Coordinates
        file: "C://Users//Mani//docker//datahub//business_glossary.yaml"

    sink:
      type: datahub-rest
      config:
        server: 'http://localhost:9002/api/gms'
        token: <token>
    Here, the file setting uses my local path, but I want to move the file out of the local path and into the cloud (e.g. S3) so that other team members can access and update it if needed. Is it possible to point to a file in S3? Can someone help me here?
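    (If the source only reads local paths, one workaround is to sync the shared file down from S3 right before ingesting; the bucket and recipe file names below are hypothetical:)
    Copy code
    # Fetch the shared glossary from S3 (hypothetical bucket), then run the recipe,
    # whose file: setting points at the local copy.
    aws s3 cp s3://my-team-bucket/business_glossary.yaml ./business_glossary.yaml
    datahub ingest -c business_glossary_recipe.yml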
  • salmon-football-11785 (06/09/2022, 2:16 PM)
    Hello there, I followed the instructions for deploying to AWS (https://datahubproject.io/docs/deploy/aws); after login I get a white screen and a lot of Java errors in the frontend pod.
  • lemon-hydrogen-83671 (06/09/2022, 2:35 PM)
    Hey, has anyone had issues using the datahub-upgrade Docker image for restoring indices with Postgres? When I set my driver to org.postgresql.Driver I get complaints 😞
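    (For reference, a sketch of how the upgrade container is typically parameterized through EBEAN_* environment variables; the values are illustrative and the exact variable set should be checked against the datahub-upgrade docs:)
    Copy code
    # Illustrative RestoreIndices run against a Postgres backend.
    # (Kafka and Elasticsearch variables are omitted here for brevity.)
    docker run --rm \
      -e EBEAN_DATASOURCE_DRIVER=org.postgresql.Driver \
      -e EBEAN_DATASOURCE_URL="jdbc:postgresql://postgres:5432/datahub" \
      -e EBEAN_DATASOURCE_USERNAME=datahub \
      -e EBEAN_DATASOURCE_PASSWORD=datahub \
      acryldata/datahub-upgrade:head -u RestoreIndices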
  • alert-teacher-6920 (06/09/2022, 2:41 PM)
    Not sure if there's anywhere to put in a ticket for this, but when using the datahub-client from io.acryl for Java, dependency scanners like OWASP's flag the client as depending on a vulnerable, fairly old version of Guava. The client doesn't seem to actually depend on it, though; just a nested pom file used to build the artifact mentions it. Normally I think those files aren't published in the jar. Is there any way to fix that, or somewhere I could put in a ticket for it?
  • broad-battery-31188 (06/09/2022, 2:53 PM)
    Where do I sign up to access the demo DataHub instance? https://demo.datahubproject.io/