dandpz
11/15/2022, 10:34 AM
"segment": "query", but I cannot find it in the downloaded data. Maybe a new report stream with this kind of parameter should be added? Thanks in advance 🙂

komal azram
11/15/2022, 10:38 AM

navod perera
11/15/2022, 12:21 PM

Berzan Yildiz
11/15/2022, 12:28 PM
AssertionError: Mismatched number of tables 190 vs 6 being resolved
for my custom source connector to postgres. This occurs during normalization. I am sure my schema is fine. What does this error mean?

thomas trividic
11/15/2022, 1:15 PM

thomas trividic
11/15/2022, 1:15 PM

thomas trividic
11/15/2022, 1:15 PM

Dave Tomkinson
11/15/2022, 1:20 PM
A total of 13674 record(s) of data from stream AirbyteStreamNameNamespacePair{name='events_196', namespace='analytics_raw'} were invalid and were ignored.
My sync is a raw sync with no normalisation, postgres (RDS) to Redshift Serverless (using destination-redshift 0.3.51), going direct (not via S3).
How do I figure out why those rows are invalid, as all rows are required? (This was a test copy of 10M rows from a 1.7B row db.)
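One plausible cause worth checking (an assumption, not confirmed by these logs): destinations that store the raw record in a VARCHAR column, such as Redshift where VARCHAR tops out at 65535 bytes, cannot hold records whose serialized JSON exceeds that size, and such records get skipped. A minimal Python sketch for spotting oversized rows in source data, using hypothetical sample records:

```python
import json

# Hypothetical sample rows; in practice, stream rows exported from the source table.
rows = [
    {"id": 1, "payload": "short"},
    {"id": 2, "payload": "x" * 70000},  # serialized form exceeds 65535 bytes
]

REDSHIFT_VARCHAR_MAX = 65535  # maximum size in bytes of a Redshift VARCHAR column


def oversized(row: dict, limit: int = REDSHIFT_VARCHAR_MAX) -> bool:
    """Return True if the JSON-serialized row cannot fit in one VARCHAR cell."""
    return len(json.dumps(row, default=str).encode("utf-8")) > limit


too_big = [row["id"] for row in rows if oversized(row)]
print(too_big)  # → [2]
```

Running a scan like this over the 10M-row test copy would show whether the 13674 dropped records correlate with record size.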
Why does the UI say it's committed 10,000,000 records when it hasn't?

Savio Lucena
11/15/2022, 2:24 PM
JOB_MAIN_ container in a worker?

Rytis Zolubas
11/15/2022, 2:41 PM

Paulo Singaretti
11/15/2022, 4:02 PM
ERROR i.a.i.b.AirbyteExceptionHandler(uncaughtException):26 - Something went wrong in the connector. See the logs for more details.
2022-11-15 15:59:03 INFO i.a.w.i.DefaultAirbyteStreamFactory(parseJson):78 - java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
Kinesis:
ERROR i.a.i.b.AirbyteExceptionHandler(uncaughtException):26 - Something went wrong in the connector. See the logs for more details.
2022-11-15 15:59:41 INFO i.a.w.i.DefaultAirbyteStreamFactory(parseJson):78 - java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
Do you guys have any idea what I'm doing wrong? I guess it's something in Kafka, since it's the same error.

Gergely Lendvai
11/15/2022, 4:04 PM
Hubspot -> S3 connector with the following configs; we’d like to understand why it takes so long to run a sync and whether it can be sped up in any way.
For the deployment we are using the helm chart with the following resource settings for jobs (this is not reflected in the destination definition, which is weird; however, the source-* and destination-* pods are using these limits):
global:
  jobs:
    resources:
      requests:
        cpu: "200m"
        memory: "4Gi"
      limits:
        cpu: "200m"
        memory: "4Gi"
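One thing that stands out in the settings above (an observation, not an official recommendation): cpu is capped at 200m, a fifth of a core, for both request and limit, which can easily make CPU the bottleneck for the source and destination pods. A hedged sketch of the same helm values with more CPU headroom; the numbers are purely illustrative and should be tuned to the cluster:

```yaml
global:
  jobs:
    resources:
      requests:
        cpu: "500m"     # illustrative value, not a recommendation
        memory: "2Gi"
      limits:
        cpu: "2"        # allow job pods to burst to two cores
        memory: "4Gi"
```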
Airbyte version: 0.40.17
Source definition:
{
  "sourceDefinitionId": "36c891d9-4bd9-43ac-bad2-10e12756272c",
  "name": "HubSpot",
  "dockerRepository": "airbyte/source-hubspot",
  "dockerImageTag": "0.2.3",
  "documentationUrl": "https://docs.airbyte.io/integrations/sources/hubspot",
  "protocolVersion": "0.2.0",
  "releaseStage": "generally_available"
}
Destination definition:
{
  "destinationDefinitionId": "4816b78f-1489-44c1-9060-4b19d5fa9362",
  "name": "S3",
  "dockerRepository": "airbyte/destination-s3",
  "dockerImageTag": "0.3.17",
  "documentationUrl": "https://docs.airbyte.com/integrations/destinations/s3",
  "protocolVersion": "0.2.0",
  "releaseStage": "generally_available",
  "resourceRequirements": {
    "jobSpecific": [
      {
        "jobType": "sync",
        "resourceRequirements": {
          "memory_request": "1Gi",
          "memory_limit": "1Gi"
        }
      }
    ]
  }
}
Source:
{
  "sourceDefinitionId": "36c891d9-4bd9-43ac-bad2-10e12756272c",
  "sourceId": "457f3db8-6ce1-41be-9ecb-7ef9a724c88b",
  "workspaceId": "2b94a777-1e5e-4381-af9f-21582ecce5c7",
  "connectionConfiguration": {
    "start_date": "2022-11-15T12:00:00Z",
    "credentials": {
      "access_token": "**********",
      "credentials_title": "Private App Credentials"
    }
  },
  "name": "hubspot_test",
  "sourceName": "HubSpot"
}
Destination:
{
  "destinationDefinitionId": "4816b78f-1489-44c1-9060-4b19d5fa9362",
  "destinationId": "033a010b-ff8e-4eb3-9ee6-6505a6c42d00",
  "workspaceId": "2b94a777-1e5e-4381-af9f-21582ecce5c7",
  "connectionConfiguration": {
    "format": {
      "compression": {
        "compression_type": "No Compression"
      },
      "format_type": "JSONL"
    },
    "s3_endpoint": "",
    "access_key_id": "**********",
    "s3_bucket_name": "****",
    "s3_bucket_path": "****",
    "s3_bucket_region": "****",
    "secret_access_key": "**********"
  },
  "name": "hubspot_s3",
  "destinationName": "S3"
}
Do you know what can cause pulling only ~3 MB of data to take ~3 hours? Also, do you have any recommendations on how to handle this? Many thanks 🙏

Kaan Murzoğlu
11/15/2022, 4:43 PM
accounts
{
  "_id" : ObjectId("xxxx"),
  "clientId" : "xxxxx",
  "areaCode" : "xx",
  "gsm" : "xx",
  "status" : "approved",
  "createdAt" : ISODate("2022-05-26T15:35:44.113+0000"),
  "updatedAt" : ISODate("2022-06-27T10:22:07.959+0000"),
  "document" : {
    "drivingLicence" : "approved",
    "video" : "approved"
  }
}
Felipe Cosse
11/15/2022, 6:13 PM
MYSQL (AWS Aurora) to S3 (AWS).
When a table has a field with the TIME type, an error occurs when reading the PARQUET file.
Here’s the error:
Unable to create Parquet converter for data type "timestamp" whose Parquet type is optional int64 member0 (TIME(MICROS,true))
The field is mapped as a Struct, and a dictionary is created with the timestamp and the string that would be the timezone.
{
  "expire_timeofday": {
    "member0": "timestamp",
    "member1": "string"
  }
}
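If the column type can't be changed at the source, the raw value is still recoverable after the fact (a workaround sketch, not Airbyte functionality): Parquet's TIME(MICROS) logical type stores an int64 count of microseconds since midnight, so a member0 value read as a plain integer can be decoded into a time string by hand:

```python
import datetime


def time_micros_to_str(micros: int) -> str:
    """Decode a Parquet TIME(MICROS) value (microseconds since midnight)
    into an HH:MM:SS[.ffffff] string."""
    seconds, us = divmod(micros, 1_000_000)
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return str(datetime.time(hours, minutes, secs, us))


# 13:45:30 expressed as microseconds since midnight
value = (13 * 3600 + 45 * 60 + 30) * 1_000_000
print(time_micros_to_str(value))  # → 13:45:30
```

The same arithmetic can be applied in a post-processing step (e.g. a Spark UDF) when the reader refuses to build a converter for the column.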
I tried to convert the field to String, but there is an error in the conversion.
Wouldn’t it be possible to select the type of field to be saved in the Destination?

Alexander Govgel
11/15/2022, 6:32 PM

Jeff De Los Reyes
11/15/2022, 6:36 PM

Manish Tomar
11/15/2022, 7:36 PM

Jonathan Cachat PhD (JC)
11/15/2022, 8:55 PM

Jonathan Cachat PhD (JC)
11/15/2022, 9:18 PM

Rahul Borse
11/15/2022, 10:56 PM

Abdi Darmawan
11/16/2022, 2:06 AM
How can I make orchestrator-norm-job-xxx run on a specific nodepool? I already set JOB_KUBE_NODE_SELECTORS: pool-env=production-airbyte in the Kubernetes configmap, but only the orchestrator-norm-job-xxx pods are still running on a random nodepool.

Benen Cahill
11/16/2022, 3:16 AM

Mukul Gopinath
11/16/2022, 7:26 AM
Warning FailedScheduling 35s default-scheduler 0/1 nodes are available: 1 node(s) had volume node affinity conflict.
It gets fixed when I resize the airbyte-volume-configs persistent volume: initially from 500Mi to 2Gi, and later I pulled it up to 20Gi too, but I'm still facing this issue. Is there a suggested volume size that needs to be configured? Or is there a way to reclaim the volume if this is temporary?
https://discuss.airbyte.io/t/eks-pods-running-into-pending-state-due-to-pv/3211

Berzan Yildiz
11/16/2022, 7:36 AM

Gergely Imreh
11/16/2022, 8:55 AM

Rahul Borse
11/16/2022, 9:09 AM

Vikas Goswami
11/16/2022, 10:17 AM

Monika Bednarz
11/16/2022, 10:19 AM

Karan
11/16/2022, 1:20 PM

Leonardo de Almeida
11/16/2022, 2:09 PM