Zacharias Markakis  02/02/2022, 11:26 AM

Matus Pavliscak  02/02/2022, 3:02 PM

Raj C  02/03/2022, 2:57 AM

Arvi  02/03/2022, 3:28 AM

Kunal Chauhan  02/03/2022, 9:37 AM
{
"_id": {
"$oid": "61fb98c874f7580e76c626dc"
},
"_airbyte_data": <source_data_object>,
"_airbyte_data_hash": "fd0b96e8-61c7-36d4-a266-af2bf7e43988",
"_airbyte_emitted_at": "2022-02-03T08:56:40.120"
}
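The JSON above is the envelope Airbyte writes for each raw record, with the source payload nested under _airbyte_data. A minimal sketch of unpacking that envelope in Python (the payload fields "id" and "title" below are hypothetical stand-ins for <source_data_object>):

```python
import json

# Example raw record in the shape shown above; the payload fields
# ("id", "title") are hypothetical stand-ins for <source_data_object>.
raw_record = {
    "_id": {"$oid": "61fb98c874f7580e76c626dc"},
    "_airbyte_data": {"id": 1, "title": "example product"},
    "_airbyte_data_hash": "fd0b96e8-61c7-36d4-a266-af2bf7e43988",
    "_airbyte_emitted_at": "2022-02-03T08:56:40.120",
}

def unpack(record: dict) -> dict:
    """Return just the source payload, keeping the emit timestamp."""
    payload = dict(record["_airbyte_data"])
    payload["emitted_at"] = record["_airbyte_emitted_at"]
    return payload

print(json.dumps(unpack(raw_record)))
```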
Andrei Batomunkuev  02/03/2022, 9:13 PM
product_type)
3. Preprocess this data (extract additional information about the product from the tags field, and add it to the corresponding fields).
4. Store the final (preprocessed) data into separate tables in Postgres.
For example: I get the data about shoes, preprocess it using Pandas (extract information from tags), add the extracted data as additional fields, and save it as a separate table in the Postgres database (shoes_table).
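A minimal sketch of that tags-extraction step with Pandas, assuming a comma-separated key:value tag format (the real tag format and column names may differ; shoes_table is from the example above):

```python
import pandas as pd

# Hypothetical sample: in the real pipeline this frame would be read from
# the Airbyte-loaded Postgres table (e.g. pd.read_sql(..., engine)).
df = pd.DataFrame({
    "product_type": ["shoes", "shoes"],
    "title": ["Runner X", "Trail Y"],
    "tags": ["color:red,size:42", "color:blue,size:44"],
})

def tags_to_columns(tag_string: str) -> dict:
    # "color:red,size:42" -> {"color": "red", "size": "42"}
    return dict(pair.split(":", 1) for pair in tag_string.split(","))

# Expand the tags column into real columns and join them back.
extracted = df["tags"].apply(tags_to_columns).apply(pd.Series)
result = pd.concat([df.drop(columns=["tags"]), extracted], axis=1)

# Persisting back to Postgres would then be one call, e.g.:
# result.to_sql("shoes_table", engine, if_exists="replace", index=False)
print(result)
```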
So my question is: is there a way to preprocess the data using Pandas (Python code) in Airbyte? Or are there other approaches, for example using Airflow + Airbyte together?

Renzo B
02/03/2022, 9:18 PM
JOB_POD_SOCAT_IMAGE, JOB_POD_BUSYBOX_IMAGE, JOB_POD_CURL_IMAGE? (deployment: K8s / Helm chart -- 0.33.15-alpha)

Yiyang (Heap.io)
02/03/2022, 9:45 PM

Phoebe Yang  02/04/2022, 12:22 AM

Guilherme Calixto  02/04/2022, 3:54 AM

Arvi  02/04/2022, 5:40 AM

Shah Newaz Khan
02/04/2022, 6:22 AM
The _airbyte_tmp tables show up in the target dataset, and it looks like the connection from the source is running. However, I don't see any .avro files accumulating in GCS, and the _airbyte_tmp tables are empty. I have set the GCS staging to not delete the tmp files. How can I tell if data is actually being lifted and shifted?

Arvi
02/04/2022, 11:13 AM

Peem Warayut  02/04/2022, 11:34 AM

Vikram Kumar  02/04/2022, 11:42 AM

Justin Cole  02/04/2022, 1:38 PM

Олег Томарович  02/04/2022, 2:59 PM

Lukas Novotny
02/04/2022, 4:02 PM
scheduler, server, temporal, webapp, worker? Thanks

Alexander Uryumtsev
02/06/2022, 4:43 PM
"The server selected protocol version TLS10 is not accepted by client preferences [TLS13, TLS12]". I saw the answer from @Noah Kawasaki https://airbytehq.slack.com/archives/C01VDDEGL7M/p1641908509097000?thread_ts=1641875909.083000&cid=C01VDDEGL7M, and I'm wondering how to follow the second option that @Noah Kawasaki proposed:
"The other thing it could be is the JDK Airbyte is running with not allowing TLS 1.0 (it was turned off by default in JDK 11), and there is a JVM argument you can change to re-enable it."
Can anyone explain how to change the JVM argument mentioned in the quote?
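One way to apply that second option, sketched under the assumption that you can pass environment variables to the JVM running Airbyte (the file path and the algorithm list below are illustrative; copy the real list from $JAVA_HOME/conf/security/java.security and remove only the TLSv1 entry). Note that re-enabling TLS 1.0 weakens transport security, so it is best treated as a temporary workaround:

```shell
# 1. Create a security-properties override that re-lists the disabled
#    algorithms WITHOUT TLSv1 (the value below is illustrative -- copy
#    the current value from $JAVA_HOME/conf/security/java.security and
#    delete only the TLSv1 entry).
cat > /tmp/tls-override.properties <<'EOF'
jdk.tls.disabledAlgorithms=SSLv3, TLSv1.1, RC4, DES, MD5withRSA, 3DES_EDE_CBC
EOF

# 2. Point the JVM at the override. JAVA_TOOL_OPTIONS is picked up by the
#    JVM automatically, so in a container this avoids editing the start
#    command.
export JAVA_TOOL_OPTIONS="-Djava.security.properties=/tmp/tls-override.properties"
```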
Anand  02/06/2022, 7:01 PM

gunu
02/07/2022, 7:20 AM
2 connections:
- MySQL --> S3 (Incremental Append)
- S3 --> Snowflake (Incremental Dedupe)
Once the data is written to S3, it now contains the additional metadata:
{
"_airbyte_ab_id": "b00c41e6-8a2f-4ed7-a10f-123",
"_airbyte_emitted_at": 1644216617782,
"_ab_cdc_log_pos": 123,
"_ab_cdc_log_file": "mysql-bin-changelog.123",
"_ab_cdc_updated_at": "2022-02-07T06:04:14Z"
}
When configuring the S3 --> Snowflake connection, the cursor field is source-defined, but for the primary key: can I now use one of the metadata columns, e.g. _airbyte_ab_id, or do I still need to use the primary keys that were defined in the original MySQL table?

Tyler Buth
02/07/2022, 5:56 PM
"will not be able to represent deletions incrementally". Does that mean that on tables using incremental sync methods it won't process deletions? Also, what about updates?

Tyler Buth
02/07/2022, 6:02 PM

Arvi  02/08/2022, 5:38 AM

Oluwapelumi Adeosun  02/08/2022, 7:43 AM

Ram  02/08/2022, 8:50 AM

Daniel Eduardo Portugal Revilla  02/08/2022, 12:52 PM

Daniel Eduardo Portugal Revilla  02/08/2022, 2:15 PM

Elliot Trabac  02/08/2022, 9:30 PM

Pedro Machado  02/08/2022, 11:49 PM