strong-kite-83354
10/19/2022, 3:00 PM
shy-lion-56425
10/19/2022, 3:39 PM
green-lion-58215
10/19/2022, 5:34 PM
billowy-book-26360
10/19/2022, 8:58 PM
few-carpenter-93837
10/20/2022, 7:08 AM
alert-fall-82501
10/20/2022, 10:47 AM
alert-fall-82501
10/20/2022, 10:47 AM
billowy-alarm-46123
10/20/2022, 11:40 AM
brainy-crayon-53549
10/20/2022, 12:04 PM
billowy-book-26360
10/19/2022, 8:28 PM
green-tent-78669
10/20/2022, 2:52 PM
rapid-fall-7147
10/20/2022, 5:59 PM
'errmsg': 'not authorized on vv-db to execute command { aggregate: "system.views", pipeline: [ { $addFields: { temporary_doc_size_field: { $bsonSize: "$$ROOT" } } }, { $match: { temporary_doc_size_field: { $lt: 16793600 } } }, { $project: { temporary_doc_size_field: 0 } }, { $sample: { size: 1000 } } ], allowDiskUse: true, cursor: {},
mysterious-advantage-78411
10/21/2022, 1:38 PM
best-umbrella-88325
10/21/2022, 1:40 PM
sink:
  type: datahub-rest
  config:
    server: 'http://a35f8626d7XXXXXbeec24fdaa5720-XXX.us-west-1.elb.amazonaws.com:8080/'
source:
  type: s3
  config:
    path_spec:
      include: 's3://XX-bkt/*.*'
    platform: s3
    aws_config:
      aws_access_key_id: XXXXXXX
      aws_region: us-west-1
      aws_secret_access_key: XXXXXXXXX
pipeline_name: 'urn:li:dataHubIngestionSource:f751376f-ec1a-4dee-a71f-7f4f96c3cdda'
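For comparison, a recipe like the one above can also be built and run from Python. This is only a sketch under assumptions: the helper names `build_s3_recipe` and `run_recipe` are made up for illustration, the credentials are read from the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables rather than written inline, and `run_recipe` needs the `acryl-datahub` package with the S3 plugin installed.

```python
import os


def build_s3_recipe(server: str, include: str, region: str) -> dict:
    """Build an S3 ingestion recipe as a plain dict (hypothetical helper).

    Credentials come from the environment instead of being hard-coded.
    """
    return {
        "source": {
            "type": "s3",
            "config": {
                # Same keys as the YAML recipe in the message above.
                "path_spec": {"include": include},
                "platform": "s3",
                "aws_config": {
                    "aws_access_key_id": os.environ.get("AWS_ACCESS_KEY_ID", ""),
                    "aws_secret_access_key": os.environ.get("AWS_SECRET_ACCESS_KEY", ""),
                    "aws_region": region,
                },
            },
        },
        "sink": {"type": "datahub-rest", "config": {"server": server}},
    }


def run_recipe(recipe: dict) -> None:
    """Run the recipe with DataHub's Pipeline class."""
    # Imported lazily so this module loads even without acryl-datahub installed.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.raise_from_status()
```

Calling `run_recipe(build_s3_recipe(server, 's3://XX-bkt/*.*', 'us-west-1'))` with the sink URL from the recipe should behave like the YAML version, minus `pipeline_name`, which is left out of this sketch.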
high-gigabyte-86638
10/21/2022, 2:20 PM
gentle-camera-33498
10/21/2022, 2:44 PM
agreeable-park-13466
10/21/2022, 5:32 PM
brave-farmer-39785
10/21/2022, 9:50 PM
{
  "value": {
    "numEntities": 0,
    "pageSize": 0,
    "from": 0,
    "metadata": {},
    "entities": []
  }
}
Here is the vendorInfo aspect:
@Aspect = {
  "name": "vendorInfo"
}
record VendorInfo {
  @Searchable = {
    "fieldType": "KEYWORD",
    "enableAutocomplete": true,
    "queryByDefault": true,
    "boostScore": 10.0
  }
  name: string
  phone: string
  contact: optional string
  url: optional string
}
Here is the error message from Docker log:
Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://elasticsearch:9200], URI [/vendorindex_v2/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true], status line [HTTP/1.1 400 Bad Request]
{"error":{"root_cause":[{"type":"query_shard_exception","reason":"[query_string] analyzer [custom_keyword] not found","index_uuid":"5P1DKXUKRmCC5D1VRSlNtQ","index":"vendorindex_v2"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"vendorindex_v2","node":"Z4CXuUUEQuSblawDWxJSow","reason":{"type":"query_shard_exception","reason":"[query_string] analyzer [custom_keyword] not found","index_uuid":"5P1DKXUKRmCC5D1VRSlNtQ","index":"vendorindex_v2"}}]},"status":400}
Any idea what is missing?
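The error says the query refers to a `custom_keyword` analyzer that the `vendorindex_v2` index does not define. One way to confirm that is to read the index settings and list the analyzers that are actually there. Below is a minimal sketch, assuming an unauthenticated Elasticsearch at the host shown in the log; the function names are made up for illustration. Why the analyzer is missing is not established in this thread (one possibility is that the index was created without DataHub's analysis settings).

```python
import json
import urllib.request


def defined_analyzers(settings_response: dict, index: str) -> set:
    """Return the analyzer names defined in an index's settings.

    `settings_response` is the parsed JSON body of GET /<index>/_settings.
    """
    index_settings = (
        settings_response.get(index, {}).get("settings", {}).get("index", {})
    )
    return set(index_settings.get("analysis", {}).get("analyzer", {}))


def fetch_settings(base_url: str, index: str) -> dict:
    """Fetch index settings over HTTP (assumes no authentication)."""
    with urllib.request.urlopen(f"{base_url}/{index}/_settings") as resp:
        return json.load(resp)


# Example usage (host and index name taken from the error message):
# settings = fetch_settings("http://elasticsearch:9200", "vendorindex_v2")
# print("custom_keyword" in defined_analyzers(settings, "vendorindex_v2"))
```

If `custom_keyword` is absent from the result, the index settings and the search query are out of sync, and the index likely needs to be rebuilt with the expected settings.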
Thanks!
proud-table-38689
10/21/2022, 11:05 PM
microscopic-tailor-94417
10/24/2022, 8:55 AM
few-air-56117
10/24/2022, 10:19 AM
rhythmic-school-70923
10/24/2022, 1:22 PM
abundant-airport-72599
10/24/2022, 4:53 PM
steep-midnight-37232
10/24/2022, 4:59 PM
important-night-50346
10/24/2022, 7:14 PM
datahub.cluster set to something like MY_CLUSTER (instead of dev, prod, qa). Could you please advise whether cluster has to be something specific or whether any string is accepted? We do not capture tags_info, if it matters.
rough-activity-61346
10/25/2022, 12:01 AM
loud-journalist-47725
10/25/2022, 6:31 AM
lively-sugar-7233
10/25/2022, 7:23 AM
...If you know what you are doing, please sethive.strict.checks.cartesian.product to false and that hive.mapred.mode is not set to 'strict' to proceed. Note that if you may get errors or incorrect results if you make a mistake while using some of the unsafe features.
Try:
This kind of error is very common when querying, and I would usually put a “set” statement in front of the actual query. But since I don’t have control over DataHub’s queries, I edited the hive-site.xml on the client (where the ingestion runs), located in the directory set by the “HIVE_CONF_DIR” environment variable.
Consequence:
The same error occurs even after modifying hive-site.xml to set “hive.strict.checks.cartesian.product” to “false” and “hive.mapred.mode” to “nonstrict”.
What I want to know:
1. Doesn’t DataHub read the hive-site.xml specified by the “HIVE_CONF_DIR” environment variable? Is there a way to supply a Hive configuration file when running Hive ingestion?
2. Or is there a way to add “set” statements to the Hive ingestion queries without going into development mode?
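One option that may be worth trying, sketched here under assumptions: DataHub's SQLAlchemy-based sources accept an `options` map that is passed to `create_engine`, and PyHive's connection accepts a `configuration` dict of session settings, so the two Hive properties could be passed through `options.connect_args.configuration` instead of hive-site.xml. The helper name, the `hiveserver2:10000` address, and the sink URL are hypothetical, and whether this clears the strict-mode error has not been verified in this thread.

```python
def hive_recipe_with_session_conf(host_port: str, server: str, session_conf: dict) -> dict:
    """Build a Hive ingestion recipe that forwards session settings to PyHive.

    Hypothetical helper: `session_conf` values should be strings, because
    they are sent to HiveServer2 as string key/value pairs.
    """
    return {
        "source": {
            "type": "hive",
            "config": {
                "host_port": host_port,
                # `options` is handed to SQLAlchemy's create_engine;
                # `connect_args` is then passed on to the PyHive connection.
                "options": {
                    "connect_args": {"configuration": dict(session_conf)},
                },
            },
        },
        "sink": {"type": "datahub-rest", "config": {"server": server}},
    }


recipe = hive_recipe_with_session_conf(
    "hiveserver2:10000",      # assumed HiveServer2 address
    "http://localhost:8080",  # assumed DataHub GMS address
    {
        "hive.strict.checks.cartesian.product": "false",
        "hive.mapred.mode": "nonstrict",
    },
)
```

The same nesting can be written directly in a YAML recipe under `source.config.options`.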
Thank you for reading. I think DataHub is awesome and hope to find a way to integrate it into our workflow.
bland-orange-13353
10/25/2022, 8:48 AM
bland-orange-13353
10/25/2022, 1:27 PM