Fabrice Simon
05/09/2022, 11:41 AM
Maëlise Castel
05/09/2022, 12:27 PM
Chasen Sherman
05/09/2022, 2:35 PM
docker run -e <secret_token>
to connect to some internal stuff, but I don't want to include these secrets in the Dockerfile for my connector, as keeping secrets in images is generally bad practice. I also have some specific mounts that I'm currently using with --mount.
Is there a way to provide these environment variables when starting up an Airbyte instance with docker-compose so that they're available to the connectors, or what is the suggested workflow for this use case?
I intentionally do not want these to be passed in by the user as part of the spec.json
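One pattern that avoids baking anything into the image, sketched under the assumption that the secret lives in the host environment (MY_TOKEN and the paths are illustrative names, not Airbyte-defined): pass it at container start with -e, alongside the mounts:

docker run --rm \
  -e MY_TOKEN="${MY_TOKEN}" \
  --mount type=bind,source=/host/config,target=/config \
  my-connector-image

With docker-compose the equivalent is an environment: entry on the service plus a value in an untracked .env file next to docker-compose.yaml. One caveat, hedged: the worker builds its own docker run command for each connector (visible in sync logs as "Preparing command: docker run ... -e WORKER_JOB_ATTEMPT=...") and forwards only a fixed set of variables, so whether a custom variable reaches the connector container depends on your Airbyte version.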
Dimitriy Ni
05/09/2022, 4:20 PM
Michael Nguyen
05/09/2022, 6:52 PM
David Mattern
05/09/2022, 7:43 PMdef update_state(self, state):
"sends an update of the state variable to stdout"
output_message = {"type":"STATE","state":{"data":state}}
print(json.dumps(output_message))
def parse_response(self,
response: requests.Response,
stream_state: Mapping[str, Any],
#stream_state,
**kwargs) -> Iterable[Mapping]:
if stream_state is not None:
#stream_state['a'] = 1
print (stream_state)
for key, value in stream_state.items():
print (key)
print (value)
else:
#stream_state['a'] = 1
print (stream_state)
#self.update_state(stream_state)
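For comparison, in the Python CDK incremental state is usually produced by declaring a cursor_field and overriding get_updated_state, rather than printing STATE messages by hand — the framework emits the STATE message itself. A minimal sketch (stream, URL, and field names are illustrative, not taken from the snippet above):

from airbyte_cdk.sources.streams.http import HttpStream

class Readings(HttpStream):
    url_base = "https://example.com/api/"  # illustrative
    primary_key = "id"
    cursor_field = "updated_at"  # the column the CDK tracks between syncs

    def path(self, **kwargs) -> str:
        return "readings"

    def next_page_token(self, response):
        return None  # single-page API for this sketch

    def parse_response(self, response, **kwargs):
        yield from response.json()

    def get_updated_state(self, current_stream_state, latest_record):
        # Called once per record; the dict returned here is what arrives
        # as stream_state on the next incremental run.
        latest = str(latest_record.get(self.cursor_field, ""))
        current = str((current_stream_state or {}).get(self.cursor_field, ""))
        return {self.cursor_field: max(latest, current)}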
Here is this portion of my configured_catalog:
"stream": {
"name": "readings",
"json_schema": {
"properties": {
"column_name": {
"type": "string"
}
},
"type": "object",
"additionalProperties": false
},
"supported_sync_modes": ["incremental"]
},
"sync_mode": "incremental",
"destination_sync_mode": "append"
}
When printing the state, I only get an empty dictionary, whereas I think it should be the simple key value of 'a' and 1.
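Possibly the missing piece here, hedged: state is never read from the configured catalog — the catalog only declares sync modes. When running the connector locally, stream_state stays an empty dict unless a previous sync checkpointed state or you pass a state file explicitly:

python main.py read --config secrets/config.json --catalog configured_catalog.json --state state.json

where state.json holds the per-stream mapping the CDK hands back, e.g. {"readings": {"a": 1}} would reproduce the expected 'a'/1 pair (file name and contents illustrative).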
Jeff Crooks
05/09/2022, 8:30 PM
Shazly Abozeid
05/10/2022, 4:19 AM
2022-05-10 04:10:35 INFO i.a.w.w.WorkerRun(call):49 - Executing worker wrapper. Airbyte version: 0.37.0-alpha
2022-05-10 04:10:45 WARN i.t.i.r.GrpcSyncRetryer(retry):56 - Retrying after failure
io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 9.999791100s. [closed=[], open=[[buffered_nanos=9999932500, waiting_for_connection]]]
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:262) ~[grpc-stub-1.44.1.jar:1.44.1]
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:243) ~[grpc-stub-1.44.1.jar:1.44.1]
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:156) ~[grpc-stub-1.44.1.jar:1.44.1]
at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.startWorkflowExecution(WorkflowServiceGrpc.java:2631) ~[temporal-serviceclient-1.8.1.jar:?]
at io.temporal.internal.client.external.GenericWorkflowClientExternalImpl.lambda$start$0(GenericWorkflowClientExternalImpl.java:88) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.retryer.GrpcSyncRetryer.retry(GrpcSyncRetryer.java:61) ~[temporal-serviceclient-1.8.1.jar:?]
at io.temporal.internal.retryer.GrpcRetryer.retryWithResult(GrpcRetryer.java:51) ~[temporal-serviceclient-1.8.1.jar:?]
at io.temporal.internal.client.external.GenericWorkflowClientExternalImpl.start(GenericWorkflowClientExternalImpl.java:81) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.client.RootWorkflowClientInvoker.start(RootWorkflowClientInvoker.java:55) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowStubImpl.startWithOptions(WorkflowStubImpl.java:113) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowStubImpl.start(WorkflowStubImpl.java:138) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowInvocationHandler.startWorkflow(WorkflowInvocationHandler.java:192) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowInvocationHandler.access$300(WorkflowInvocationHandler.java:48) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowInvocationHandler$SyncWorkflowInvocationHandler.startWorkflow(WorkflowInvocationHandler.java:314) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowInvocationHandler$SyncWorkflowInvocationHandler.invoke(WorkflowInvocationHandler.java:270) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowInvocationHandler.invoke(WorkflowInvocationHandler.java:178) ~[temporal-sdk-1.8.1.jar:?]
at jdk.proxy2.$Proxy40.run(Unknown Source) ~[?:?]
at io.airbyte.workers.temporal.TemporalClient.lambda$submitSync$3(TemporalClient.java:151) ~[io.airbyte-airbyte-workers-0.37.0-alpha.jar:?]
at io.airbyte.workers.temporal.TemporalClient.execute(TemporalClient.java:498) ~[io.airbyte-airbyte-workers-0.37.0-alpha.jar:?]
at io.airbyte.workers.temporal.TemporalClient.submitSync(TemporalClient.java:150) ~[io.airbyte-airbyte-workers-0.37.0-alpha.jar:?]
at io.airbyte.workers.worker_run.TemporalWorkerRunFactory.lambda$createSupplier$0(TemporalWorkerRunFactory.java:49) ~[io.airbyte-airbyte-workers-0.37.0-alpha.jar:?]
at io.airbyte.workers.worker_run.WorkerRun.call(WorkerRun.java:51) [io.airbyte-airbyte-workers-0.37.0-alpha.jar:?]
at io.airbyte.workers.worker_run.WorkerRun.call(WorkerRun.java:22) [io.airbyte-airbyte-workers-0.37.0-alpha.jar:?]
at io.airbyte.commons.concurrency.LifecycledCallable.execute(LifecycledCallable.java:94) [io.airbyte-airbyte-commons-0.37.0-alpha.jar:?]
at io.airbyte.commons.concurrency.LifecycledCallable.call(LifecycledCallable.java:78) [io.airbyte-airbyte-commons-0.37.0-alpha.jar:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
2022-05-10 04:10:45 INFO i.a.w.t.TemporalAttemptExecution(get):108 - Docker volume job log path: /tmp/workspace/101/2/logs.log
2022-05-10 04:10:45 INFO i.a.w.t.TemporalAttemptExecution(get):113 - Executing worker wrapper. Airbyte version: 0.37.0-alpha
2022-05-10 04:10:45 INFO i.a.w.DefaultReplicationWorker(run):104 - start sync worker. job id: 101 attempt id: 2
2022-05-10 04:10:45 INFO i.a.w.DefaultReplicationWorker(run):116 - configured sync modes: {null.salaries=full_refresh - overwrite}
2022-05-10 04:10:45 INFO i.a.w.p.a.DefaultAirbyteDestination(start):69 - Running destination...
2022-05-10 04:10:45 INFO i.a.c.i.LineGobbler(voidCall):82 - Checking if airbyte/destination-snowflake:0.4.25 exists...
2022-05-10 04:10:45 INFO i.a.c.i.LineGobbler(voidCall):82 - airbyte/destination-snowflake:0.4.25 was found locally.
2022-05-10 04:10:45 INFO i.a.w.p.DockerProcessFactory(create):108 - Creating docker job ID: 101
2022-05-10 04:10:45 INFO i.a.w.p.DockerProcessFactory(create):163 - Preparing command: docker run --rm --init -i -w /data/101/2 --log-driver none --name destination-snowflake-write-101-2-bgfwu --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e WORKER_JOB_ATTEMPT=2 -e WORKER_CONNECTOR_IMAGE=airbyte/destination-snowflake:0.4.25 -e AIRBYTE_ROLE= -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_VERSION=0.37.0-alpha -e WORKER_JOB_ID=101 airbyte/destination-snowflake:0.4.25 write --config destination_config.json --catalog destination_catalog.json
2022-05-10 04:10:45 INFO i.a.c.i.LineGobbler(voidCall):82 - Checking if airbyte/source-google-sheets:0.2.12 exists...
2022-05-10 04:10:45 INFO i.a.c.i.LineGobbler(voidCall):82 - airbyte/source-google-sheets:0.2.12 was found locally.
2022-05-10 04:10:45 INFO i.a.w.p.DockerProcessFactory(create):108 - Creating docker job ID: 101
2022-05-10 04:10:45 INFO i.a.w.p.DockerProcessFactory(create):163 - Preparing command: docker run --rm --init -i -w /data/101/2 --log-driver none --name source-google-sheets-read-101-2-ouipp --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e WORKER_JOB_ATTEMPT=2 -e WORKER_CONNECTOR_IMAGE=airbyte/source-google-sheets:0.2.12 -e AIRBYTE_ROLE= -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_VERSION=0.37.0-alpha -e WORKER_JOB_ID=101 airbyte/source-google-sheets:0.2.12 read --config source_config.json --catalog source_catalog.json
2022-05-10 04:10:45 INFO i.a.w.DefaultReplicationWorker(lambda$getDestinationOutputRunnable$6):346 - Destination output thread started.
2022-05-10 04:10:45 INFO i.a.w.DefaultReplicationWorker(lambda$getReplicationRunnable$5):279 - Replication thread started.
2022-05-10 04:10:45 INFO i.a.w.DefaultReplicationWorker(run):158 - Waiting for source and destination threads to complete.
2022-05-10 04:10:46 destination > SLF4J: Class path contains multiple SLF4J bindings.
2022-05-10 04:10:46 destination > SLF4J: Found binding in [jar:file:/airbyte/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2022-05-10 04:10:46 destination > SLF4J: Found binding in [jar:file:/airbyte/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2022-05-10 04:10:46 destination > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2022-05-10 04:10:46 destination > SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.b.IntegrationCliParser(parseOptions):118 - integration args: {catalog=destination_catalog.json, write=null, config=destination_config.json}
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.b.IntegrationRunner(runInternal):121 - Running integration: io.airbyte.integrations.destination.snowflake.SnowflakeDestination
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.b.IntegrationRunner(runInternal):122 - Command: WRITE
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.b.IntegrationRunner(runInternal):123 - Integration config: IntegrationConfig{command=WRITE, configPath='destination_config.json', catalogPath='destination_catalog.json', statePath='null'}
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 WARN c.n.s.JsonMetaSchema(newValidator):338 - Unknown keyword examples - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 WARN c.n.s.JsonMetaSchema(newValidator):338 - Unknown keyword order - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 WARN c.n.s.JsonMetaSchema(newValidator):338 - Unknown keyword airbyte_secret - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 WARN c.n.s.JsonMetaSchema(newValidator):338 - Unknown keyword multiline - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.d.j.c.SwitchingDestination(getConsumer):65 - Using destination type: INTERNAL_STAGING
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.d.s.StagingConsumerFactory(lambda$toWriteConfig$0):99 - Write config: WriteConfig{streamName=salaries, namespace=null, outputSchemaName=shazly_test_googlesheet, tmpTableName=_airbyte_tmp_gfn_salaries, outputTableName=_airbyte_raw_salaries, syncMode=overwrite}
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.d.b.BufferedStreamConsumer(startTracked):116 - class io.airbyte.integrations.destination.buffered_stream_consumer.BufferedStreamConsumer started.
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.d.s.StagingConsumerFactory(lambda$onStartFunction$2):117 - Preparing tmp tables in destination started for 1 streams
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.d.s.StagingConsumerFactory(lambda$onStartFunction$2):125 - Preparing staging area in destination started for schema shazly_test_googlesheet stream salaries: tmp table: _airbyte_tmp_gfn_salaries, stage: 2022/05/10/04/7A8B06E0-100E-4628-91A5-8A32B7E7FA76/
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO c.z.h.HikariDataSource(getConnection):110 - HikariPool-1 - Starting...
2022-05-10 04:10:48 source > Starting syncing spreadsheet 1SiEn_vS_YNRu2uBdYLPF_v4YoiS653kaJg0MFKRf4lo
2022-05-10 04:10:53 source > Unable to find the server at sheets.googleapis.com
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/httplib2/__init__.py", line 1343, in _conn_request
conn.connect()
File "/usr/local/lib/python3.9/site-packages/httplib2/__init__.py", line 1119, in connect
address_info = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
File "/usr/local/lib/python3.9/socket.py", line 954, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Try again
During handling of the above exception, another exception occurred:
Shubham Pinjwani
05/10/2022, 7:18 AM
thestepafter
05/10/2022, 1:38 PM
Ibrahim R
05/10/2022, 6:47 PM
Romain LOPEZ
05/10/2022, 8:46 PM
The connection tests failed.
Failed to load azure://fuXXXrage.blob.core.windows.net/fXXXbi/PTB/FuzeBI_PayCode_Table.CSV: ValueError('Unable to determine account name for shared key credential.')
Traceback (most recent call last):
  File "/airbyte/integration_code/source_file/source.py", line 86, in check
    with client.reader.open(binary=client.binary_source):
  File "/airbyte/integration_code/source_file/client.py", line 78, in open
    self._file = self._open(binary=binary)
  File "/airbyte/integration_code/source_file/client.py", line 93, in _open
    return self._open_azblob_url(binary=binary)
  File "/airbyte/integration_code/source_file/client.py", line 207, in _open_azblob_url
    client = BlobServiceClient(account_url=storage_acc_url, credential=credential)
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/_blob_service_client.py", line 137, in __init__
    super(BlobServiceClient, self).__init__(parsed_url, service='blob', credential=credential, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/_shared/base_client.py", line 89, in __init__
    self.credential = _format_shared_key_credential(self.account_name, credential)
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/_shared/base_client.py", line 351, in _format_shared_key_credential
    raise ValueError("Unable to determine account name for shared key credential.")
ValueError: Unable to determine account name for shared key credential.
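For context, the azure-storage-blob SDK derives the storage account name from the hostname of account_url, and this ValueError is raised when that parse fails while a bare shared-key string is used as the credential. A minimal sketch of a URL shape the parser accepts (account name reused from the masked URL above, purely illustrative):

from azure.storage.blob import BlobServiceClient

# the part before .blob.core.windows.net is what the SDK reads as the account name
account_url = "https://fuXXXrage.blob.core.windows.net"
client = BlobServiceClient(account_url=account_url, credential="<shared_key>")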
Reese Ann
05/11/2022, 12:59 AM
airbyte_server/logs/logs_20220101.log
second, in the output for docker-compose up, i see log messages from a few different services:
⢠airbyte-scheduler
⢠airbyte-server
⢠airbyte-temporal
⢠airbyte-db
⢠airbyte-worker
⢠airbyte-webapp
if i explore the airbyte_workspace volume though, i only see log files for scheduler and server. are the logs for the other services available in log files anywhere?
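A hedged pointer: on docker-compose deployments the remaining services usually write to container stdout rather than to the workspace volume, so their output should still be retrievable through Docker's own log capture (container names as they appear in the compose output above):

docker logs --tail 100 airbyte-worker
docker logs -f airbyte-temporal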
Vaibhav Kumar
05/11/2022, 7:18 AM
ab_cdc_log_file & ab_cdc_log_pos (specific to the mysql source) are the file name and position in the file where the record was retrieved
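That matches what MySQL itself exposes — as a sanity check, the server's current binlog coordinates can be read with:

SHOW MASTER STATUS;
-- returns columns File (e.g. mysql-bin.000003) and Position (e.g. 154); example values only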
Yusuf Qoyum
05/11/2022, 7:43 AM
JSY
05/11/2022, 8:11 AM
next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
these parameters. How do I get the current page? Is it in the response parameter? What is the format of this parameter? How do I access values / variables / parameters in it?
3. What is primary_key at https://docs.airbyte.com/connector-development/tutorials/cdk-tutorial-python-http/declare-schema for?
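On 2., hedged: response is the raw requests.Response of the previous request — there is no separate "current page" argument. You derive the next page from the response body or headers, and whatever mapping you return is handed back to you in request_params on the next call; returning None ends pagination. A sketch, with next_page as an illustrative field name that depends on your API:

from typing import Any, Mapping, Optional
import requests

def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
    next_page = response.json().get("next_page")  # illustrative field name
    return {"page": next_page} if next_page else None

def request_params(self, stream_state, stream_slice=None, next_page_token=None, **kwargs):
    # the mapping returned above arrives here on the next request
    return dict(next_page_token or {})

On 3.: primary_key names the field (or fields) that uniquely identify a record; destinations use it to deduplicate in the incremental dedup sync modes, and it may be None for streams without a natural key.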
Maëlise Castel
05/11/2022, 9:42 AM
Ayush
05/11/2022, 10:55 AM
CREATE INDEX IF NOT EXISTS airbyte_configs_id_idx ON AIRBYTE_CONFIGS(config_id);
]; ERROR: duplicate key value violates unique constraint "pg_type_typname_nsp_index"
Detail: Key (typname, typnamespace)=(airbyte_configs_id_seq, 2200) already exists.
at org.jooq_3.13.4.POSTGRES.debug(Unknown Source)
Akshay Sapra
05/11/2022, 11:14 AM
docker-compose down
wget -N https://raw.githubusercontent.com/airbytehq/airbyte/master/{.env,docker-compose.yaml}
docker-compose up
After restarting Airbyte I am getting "unknown error" in the GUI (screenshot attached).
I have restarted the cloud instance and tried another browser too. Any help on it would be really appreciated.
Logs attached (after running docker-compose up).
Ayush
05/11/2022, 1:15 PM
dasol kim
05/11/2022, 2:10 PM
Jordan Choo
05/11/2022, 5:25 PM
The sync mode tooltip link goes to the incremental deduped history URL when it should go to the Connections and Sync Modes URL. You can see what I mean in the attached screenshot.
Thanks!
Leonardo Gasparotto Menini
05/11/2022, 8:57 PM
I ran the generate.sh script to start up my Python CDK HTTP API source. All went well, but after trying to run the python3 main.py spec command, just for testing, I got an error. The generator created the spec file as a yaml, and the spec command is trying to find spec.json. I know I can just change the spec file to the json model and it should work. Do you know which airbyte version I could use so this doesn't happen?
Francisco García
05/11/2022, 9:36 PM
Francisco García
05/11/2022, 9:37 PM
Lucas Wiley
05/11/2022, 10:18 PM
Vaibhav Kumar
05/12/2022, 5:49 AM
Do the ab_cdc_log_file & ab_cdc_log_pos attributes in airbyte mean the binlog file and binlog position from the DB? @Marcos Marx (Airbyte)
Haitham Alhad
05/12/2022, 10:00 AM
airbyte-server | 2022-05-12 09:59:22 ERROR i.a.s.ServerApp(main):295 - Server failed
airbyte-server | java.lang.NullPointerException: Cannot invoke "org.flywaydb.core.api.MigrationInfo.getVersion()" because the return value of "io.airbyte.db.instance.DatabaseMigrator.getLatestMigration()" is null
airbyte-server | at io.airbyte.db.instance.MinimumFlywayMigrationVersionCheck.assertMigrations(MinimumFlywayMigrationVersionCheck.java:75) ~[io.airbyte.airbyte-db-lib-0.38.2-alpha.jar:?]
airbyte-server | at io.airbyte.server.ServerApp.assertDatabasesReady(ServerApp.java:149) ~[io.airbyte-airbyte-server-0.38.2-alpha.jar:?]
airbyte-server | at io.airbyte.server.ServerApp.getServer(ServerApp.java:182) ~[io.airbyte-airbyte-server-0.38.2-alpha.jar:?]
airbyte-server | at io.airbyte.server.ServerApp.main(ServerApp.java:291) [io.airbyte-airbyte-server-0.38.2-alpha.jar:?]
airbyte-server | 2022-05-12 09:59:22 INFO c.z.h.HikariDataSource(close):350 - HikariPool-1 - Shutdown initiated...
airbyte-server | 2022-05-12 09:59:22 INFO c.z.h.HikariDataSource(close):352 - HikariPool-1 - Shutdown completed.
airbyte-server | 2022-05-12 09:59:22 INFO c.z.h.HikariDataSource(close):350 - HikariPool-2 - Shutdown initiated...
airbyte-server | 2022-05-12 09:59:22 INFO c.z.h.HikariDataSource(close):352 - HikariPool-2 - Shutdown completed.
airbyte-server exited with code 1
airbyte-scheduler | 2022-05-12 09:59:23 INFO i.a.s.a.SchedulerApp(waitForServer):225 - Waiting for server to be
To reproduce I did
docker-compose down
git pull
docker-compose up
The original version was commit d14df187dc
Max
05/12/2022, 12:21 PM
docker run --rm -i -v airbyte/source-mysql read --config /taps_configs/mysql.json --catalog /taps_configs/mysql_cat.json | docker run --rm -i airbyte/destination-redshift write --config /taps_configs/redshift.json --catalog /taps_configs/mysql_cat.json
It all works fine, data is replicated from source to Redshift properly, but it ends up in _airbyte_raw_customers, so I want to trigger normalisation — but I am not sure how/where. Would that be in my redshift.json, or maybe in the catalog.json, where I need to add something like "normalisation": true?
Thanks!
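On the normalization question, a hedged pointer: basic normalization is neither a destination-config nor a catalog flag — when Airbyte runs a sync itself it launches a separate airbyte/normalization container after the destination finishes, so a hand-rolled pipeline has to invoke it as an extra step, roughly like this (flags recalled from the normalization runner and may differ by version):

docker run --rm -v /taps_configs:/data airbyte/normalization run --integration-type redshift --config /data/redshift.json --catalog /data/mysql_cat.json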
Mikhail
05/12/2022, 1:48 PM