Fabrice Simon
05/09/2022, 11:41 AM
Maëlise Castel
05/09/2022, 12:27 PM
Chasen Sherman
05/09/2022, 2:35 PM
docker run -e <secret_token>
to connect to some internal stuff, but I don't want to include these secrets in the Dockerfile for my connector, as keeping secrets in images is generally bad practice. I also have some specific mounts that I'm currently using with --mount.
Is there a way to provide these environment variables when starting up an Airbyte instance with docker-compose so that they're available to the connectors, or what is the suggested workflow for this use case?
I intentionally do not want these to be passed in by the user as part of the spec.json
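One pattern that avoids baking anything into the image, sketched under the assumption that the secret lives in the host environment (MY_TOKEN and the paths are illustrative names, not Airbyte-defined): pass it at container start with -e, alongside the mounts:

docker run --rm \
  -e MY_TOKEN="${MY_TOKEN}" \
  --mount type=bind,source=/host/config,target=/config \
  my-connector-image

With docker-compose the equivalent is an environment: entry on the service plus a value in an untracked .env file next to docker-compose.yaml. One caveat, hedged: the worker builds its own docker run command for each connector (visible in sync logs as "Preparing command: docker run ... -e WORKER_JOB_ATTEMPT=...") and forwards only a fixed set of variables, so whether a custom variable reaches the connector container depends on your Airbyte version.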
Dimitriy Ni
05/09/2022, 4:20 PM
Michael Nguyen
05/09/2022, 6:52 PM
David Mattern
05/09/2022, 7:43 PMdef update_state(self, state):
"sends an update of the state variable to stdout"
output_message = {"type":"STATE","state":{"data":state}}
print(json.dumps(output_message))
def parse_response(self,
response: requests.Response,
stream_state: Mapping[str, Any],
#stream_state,
**kwargs) -> Iterable[Mapping]:
if stream_state is not None:
#stream_state['a'] = 1
print (stream_state)
for key, value in stream_state.items():
print (key)
print (value)
else:
#stream_state['a'] = 1
print (stream_state)
#self.update_state(stream_state)
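For comparison, in the Python CDK incremental state is usually produced by declaring a cursor_field and overriding get_updated_state, rather than printing STATE messages by hand — the framework emits the STATE message itself. A minimal sketch (stream, URL, and field names are illustrative, not taken from the snippet above):

from airbyte_cdk.sources.streams.http import HttpStream

class Readings(HttpStream):
    url_base = "https://example.com/api/"  # illustrative
    primary_key = "id"
    cursor_field = "updated_at"  # the column the CDK tracks between syncs

    def path(self, **kwargs) -> str:
        return "readings"

    def next_page_token(self, response):
        return None  # single-page API for this sketch

    def parse_response(self, response, **kwargs):
        yield from response.json()

    def get_updated_state(self, current_stream_state, latest_record):
        # Called once per record; the dict returned here is what arrives
        # as stream_state on the next incremental run.
        latest = str(latest_record.get(self.cursor_field, ""))
        current = str((current_stream_state or {}).get(self.cursor_field, ""))
        return {self.cursor_field: max(latest, current)}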
Here is this portion of my configured_catalog:
"stream": {
"name": "readings",
"json_schema": {
"properties": {
"column_name": {
"type": "string"
}
},
"type": "object",
"additionalProperties": false
},
"supported_sync_modes": ["incremental"]
},
"sync_mode": "incremental",
"destination_sync_mode": "append"
}
When printing the state, I only get an empty dictionary, whereas I think it should be the simple key value of 'a' and 1.
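Possibly the missing piece here, hedged: state is never read from the configured catalog — the catalog only declares sync modes. When running the connector locally, stream_state stays an empty dict unless a previous sync checkpointed state or you pass a state file explicitly:

python main.py read --config secrets/config.json --catalog configured_catalog.json --state state.json

where state.json holds the per-stream mapping the CDK hands back, e.g. {"readings": {"a": 1}} would reproduce the expected 'a'/1 pair (file name and contents illustrative).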
Jeff Crooks
05/09/2022, 8:30 PM
Shazly Abozeid
05/10/2022, 4:19 AM
2022-05-10 04:10:35 INFO i.a.w.w.WorkerRun(call):49 - Executing worker wrapper. Airbyte version: 0.37.0-alpha
2022-05-10 04:10:45 WARN i.t.i.r.GrpcSyncRetryer(retry):56 - Retrying after failure
io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 9.999791100s. [closed=[], open=[[buffered_nanos=9999932500, waiting_for_connection]]]
at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:262) ~[grpc-stub-1.44.1.jar:1.44.1]
at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:243) ~[grpc-stub-1.44.1.jar:1.44.1]
at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:156) ~[grpc-stub-1.44.1.jar:1.44.1]
at io.temporal.api.workflowservice.v1.WorkflowServiceGrpc$WorkflowServiceBlockingStub.startWorkflowExecution(WorkflowServiceGrpc.java:2631) ~[temporal-serviceclient-1.8.1.jar:?]
at io.temporal.internal.client.external.GenericWorkflowClientExternalImpl.lambda$start$0(GenericWorkflowClientExternalImpl.java:88) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.retryer.GrpcSyncRetryer.retry(GrpcSyncRetryer.java:61) ~[temporal-serviceclient-1.8.1.jar:?]
at io.temporal.internal.retryer.GrpcRetryer.retryWithResult(GrpcRetryer.java:51) ~[temporal-serviceclient-1.8.1.jar:?]
at io.temporal.internal.client.external.GenericWorkflowClientExternalImpl.start(GenericWorkflowClientExternalImpl.java:81) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.client.RootWorkflowClientInvoker.start(RootWorkflowClientInvoker.java:55) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowStubImpl.startWithOptions(WorkflowStubImpl.java:113) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowStubImpl.start(WorkflowStubImpl.java:138) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowInvocationHandler.startWorkflow(WorkflowInvocationHandler.java:192) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowInvocationHandler.access$300(WorkflowInvocationHandler.java:48) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowInvocationHandler$SyncWorkflowInvocationHandler.startWorkflow(WorkflowInvocationHandler.java:314) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowInvocationHandler$SyncWorkflowInvocationHandler.invoke(WorkflowInvocationHandler.java:270) ~[temporal-sdk-1.8.1.jar:?]
at io.temporal.internal.sync.WorkflowInvocationHandler.invoke(WorkflowInvocationHandler.java:178) ~[temporal-sdk-1.8.1.jar:?]
at jdk.proxy2.$Proxy40.run(Unknown Source) ~[?:?]
at io.airbyte.workers.temporal.TemporalClient.lambda$submitSync$3(TemporalClient.java:151) ~[io.airbyte-airbyte-workers-0.37.0-alpha.jar:?]
at io.airbyte.workers.temporal.TemporalClient.execute(TemporalClient.java:498) ~[io.airbyte-airbyte-workers-0.37.0-alpha.jar:?]
at io.airbyte.workers.temporal.TemporalClient.submitSync(TemporalClient.java:150) ~[io.airbyte-airbyte-workers-0.37.0-alpha.jar:?]
at io.airbyte.workers.worker_run.TemporalWorkerRunFactory.lambda$createSupplier$0(TemporalWorkerRunFactory.java:49) ~[io.airbyte-airbyte-workers-0.37.0-alpha.jar:?]
at io.airbyte.workers.worker_run.WorkerRun.call(WorkerRun.java:51) [io.airbyte-airbyte-workers-0.37.0-alpha.jar:?]
at io.airbyte.workers.worker_run.WorkerRun.call(WorkerRun.java:22) [io.airbyte-airbyte-workers-0.37.0-alpha.jar:?]
at io.airbyte.commons.concurrency.LifecycledCallable.execute(LifecycledCallable.java:94) [io.airbyte-airbyte-commons-0.37.0-alpha.jar:?]
at io.airbyte.commons.concurrency.LifecycledCallable.call(LifecycledCallable.java:78) [io.airbyte-airbyte-commons-0.37.0-alpha.jar:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
2022-05-10 04:10:45 INFO i.a.w.t.TemporalAttemptExecution(get):108 - Docker volume job log path: /tmp/workspace/101/2/logs.log
2022-05-10 04:10:45 INFO i.a.w.t.TemporalAttemptExecution(get):113 - Executing worker wrapper. Airbyte version: 0.37.0-alpha
2022-05-10 04:10:45 INFO i.a.w.DefaultReplicationWorker(run):104 - start sync worker. job id: 101 attempt id: 2
2022-05-10 04:10:45 INFO i.a.w.DefaultReplicationWorker(run):116 - configured sync modes: {null.salaries=full_refresh - overwrite}
2022-05-10 04:10:45 INFO i.a.w.p.a.DefaultAirbyteDestination(start):69 - Running destination...
2022-05-10 04:10:45 INFO i.a.c.i.LineGobbler(voidCall):82 - Checking if airbyte/destination-snowflake:0.4.25 exists...
2022-05-10 04:10:45 INFO i.a.c.i.LineGobbler(voidCall):82 - airbyte/destination-snowflake:0.4.25 was found locally.
2022-05-10 04:10:45 INFO i.a.w.p.DockerProcessFactory(create):108 - Creating docker job ID: 101
2022-05-10 04:10:45 INFO i.a.w.p.DockerProcessFactory(create):163 - Preparing command: docker run --rm --init -i -w /data/101/2 --log-driver none --name destination-snowflake-write-101-2-bgfwu --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e WORKER_JOB_ATTEMPT=2 -e WORKER_CONNECTOR_IMAGE=airbyte/destination-snowflake:0.4.25 -e AIRBYTE_ROLE= -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_VERSION=0.37.0-alpha -e WORKER_JOB_ID=101 airbyte/destination-snowflake:0.4.25 write --config destination_config.json --catalog destination_catalog.json
2022-05-10 04:10:45 INFO i.a.c.i.LineGobbler(voidCall):82 - Checking if airbyte/source-google-sheets:0.2.12 exists...
2022-05-10 04:10:45 INFO i.a.c.i.LineGobbler(voidCall):82 - airbyte/source-google-sheets:0.2.12 was found locally.
2022-05-10 04:10:45 INFO i.a.w.p.DockerProcessFactory(create):108 - Creating docker job ID: 101
2022-05-10 04:10:45 INFO i.a.w.p.DockerProcessFactory(create):163 - Preparing command: docker run --rm --init -i -w /data/101/2 --log-driver none --name source-google-sheets-read-101-2-ouipp --network host -v airbyte_workspace:/data -v /tmp/airbyte_local:/local -e WORKER_JOB_ATTEMPT=2 -e WORKER_CONNECTOR_IMAGE=airbyte/source-google-sheets:0.2.12 -e AIRBYTE_ROLE= -e WORKER_ENVIRONMENT=DOCKER -e AIRBYTE_VERSION=0.37.0-alpha -e WORKER_JOB_ID=101 airbyte/source-google-sheets:0.2.12 read --config source_config.json --catalog source_catalog.json
2022-05-10 04:10:45 INFO i.a.w.DefaultReplicationWorker(lambda$getDestinationOutputRunnable$6):346 - Destination output thread started.
2022-05-10 04:10:45 INFO i.a.w.DefaultReplicationWorker(lambda$getReplicationRunnable$5):279 - Replication thread started.
2022-05-10 04:10:45 INFO i.a.w.DefaultReplicationWorker(run):158 - Waiting for source and destination threads to complete.
2022-05-10 04:10:46 destination > SLF4J: Class path contains multiple SLF4J bindings.
2022-05-10 04:10:46 destination > SLF4J: Found binding in [jar:file:/airbyte/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2022-05-10 04:10:46 destination > SLF4J: Found binding in [jar:file:/airbyte/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2022-05-10 04:10:46 destination > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2022-05-10 04:10:46 destination > SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.b.IntegrationCliParser(parseOptions):118 - integration args: {catalog=destination_catalog.json, write=null, config=destination_config.json}
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.b.IntegrationRunner(runInternal):121 - Running integration: io.airbyte.integrations.destination.snowflake.SnowflakeDestination
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.b.IntegrationRunner(runInternal):122 - Command: WRITE
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.b.IntegrationRunner(runInternal):123 - Integration config: IntegrationConfig{command=WRITE, configPath='destination_config.json', catalogPath='destination_catalog.json', statePath='null'}
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 WARN c.n.s.JsonMetaSchema(newValidator):338 - Unknown keyword examples - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 WARN c.n.s.JsonMetaSchema(newValidator):338 - Unknown keyword order - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 WARN c.n.s.JsonMetaSchema(newValidator):338 - Unknown keyword airbyte_secret - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 WARN c.n.s.JsonMetaSchema(newValidator):338 - Unknown keyword multiline - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.d.j.c.SwitchingDestination(getConsumer):65 - Using destination type: INTERNAL_STAGING
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.d.s.StagingConsumerFactory(lambda$toWriteConfig$0):99 - Write config: WriteConfig{streamName=salaries, namespace=null, outputSchemaName=shazly_test_googlesheet, tmpTableName=_airbyte_tmp_gfn_salaries, outputTableName=_airbyte_raw_salaries, syncMode=overwrite}
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.d.b.BufferedStreamConsumer(startTracked):116 - class io.airbyte.integrations.destination.buffered_stream_consumer.BufferedStreamConsumer started.
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.d.s.StagingConsumerFactory(lambda$onStartFunction$2):117 - Preparing tmp tables in destination started for 1 streams
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO i.a.i.d.s.StagingConsumerFactory(lambda$onStartFunction$2):125 - Preparing staging area in destination started for schema shazly_test_googlesheet stream salaries: tmp table: _airbyte_tmp_gfn_salaries, stage: 2022/05/10/04/7A8B06E0-100E-4628-91A5-8A32B7E7FA76/
2022-05-10 04:10:47 destination > 2022-05-10 04:10:47 INFO c.z.h.HikariDataSource(getConnection):110 - HikariPool-1 - Starting...
2022-05-10 04:10:48 source > Starting syncing spreadsheet 1SiEn_vS_YNRu2uBdYLPF_v4YoiS653kaJg0MFKRf4lo
2022-05-10 04:10:53 source > Unable to find the server at sheets.googleapis.com
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/httplib2/__init__.py", line 1343, in _conn_request
conn.connect()
File "/usr/local/lib/python3.9/site-packages/httplib2/__init__.py", line 1119, in connect
address_info = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
File "/usr/local/lib/python3.9/socket.py", line 954, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Try again
During handling of the above exception, another exception occurred:
Shubham Pinjwani
05/10/2022, 7:18 AM
thestepafter
05/10/2022, 1:38 PM
Ibrahim R
05/10/2022, 6:47 PM
Romain LOPEZ
05/10/2022, 8:46 PM
The connection tests failed.
Failed to load azure://fuXXXrage.blob.core.windows.net/fXXXbi/PTB/FuzeBI_PayCode_Table.CSV: ValueError('Unable to determine account name for shared key credential.')
Traceback (most recent call last):
  File "/airbyte/integration_code/source_file/source.py", line 86, in check
    with client.reader.open(binary=client.binary_source):
  File "/airbyte/integration_code/source_file/client.py", line 78, in open
    self._file = self._open(binary=binary)
  File "/airbyte/integration_code/source_file/client.py", line 93, in _open
    return self._open_azblob_url(binary=binary)
  File "/airbyte/integration_code/source_file/client.py", line 207, in _open_azblob_url
    client = BlobServiceClient(account_url=storage_acc_url, credential=credential)
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/_blob_service_client.py", line 137, in __init__
    super(BlobServiceClient, self).__init__(parsed_url, service='blob', credential=credential, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/_shared/base_client.py", line 89, in __init__
    self.credential = _format_shared_key_credential(self.account_name, credential)
  File "/usr/local/lib/python3.9/site-packages/azure/storage/blob/_shared/base_client.py", line 351, in _format_shared_key_credential
    raise ValueError("Unable to determine account name for shared key credential.")
ValueError: Unable to determine account name for shared key credential.
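For context, the azure-storage-blob SDK derives the storage account name from the hostname of account_url, and this ValueError is raised when that parse fails while a bare shared-key string is used as the credential. A minimal sketch of a URL shape the parser accepts (account name reused from the masked URL above, purely illustrative):

from azure.storage.blob import BlobServiceClient

# the part before .blob.core.windows.net is what the SDK reads as the account name
account_url = "https://fuXXXrage.blob.core.windows.net"
client = BlobServiceClient(account_url=account_url, credential="<shared_key>")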
Reese Ann
05/11/2022, 12:59 AM
airbyte_server/logs/logs_20220101.log
second, in the output for docker-compose up, i see log messages from a few different services:
⢠airbyte-scheduler
⢠airbyte-server
⢠airbyte-temporal
⢠airbyte-db
⢠airbyte-worker
⢠airbyte-webapp
if i explore the airbyte_workspace volume though, i only see log files for scheduler and server. are the logs for the other services available in log files anywhere?
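A hedged pointer: on docker-compose deployments the remaining services usually write to container stdout rather than to the workspace volume, so their output should still be retrievable through Docker's own log capture (container names as they appear in the compose output above):

docker logs --tail 100 airbyte-worker
docker logs -f airbyte-temporal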
Vaibhav Kumar
05/11/2022, 7:18 AM
ab_cdc_log_file & ab_cdc_log_pos (specific to the mysql source) are the file name and position in the file where the record was retrieved
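That matches what MySQL itself exposes — as a sanity check, the server's current binlog coordinates can be read with:

SHOW MASTER STATUS;
-- returns columns File (e.g. mysql-bin.000003) and Position (e.g. 154); example values only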
Yusuf Qoyum
05/11/2022, 7:43 AM
JSY
05/11/2022, 8:11 AM
next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
these parameters. How do I get the current page? Is it in the response parameter? What is the format of this parameter? How do I access values / variables / parameters in it?
3. What is primary_key at https://docs.airbyte.com/connector-development/tutorials/cdk-tutorial-python-http/declare-schema for?
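On 2., hedged: response is the raw requests.Response of the previous request — there is no separate "current page" argument. You derive the next page from the response body or headers, and whatever mapping you return is handed back to you in request_params on the next call; returning None ends pagination. A sketch, with next_page as an illustrative field name that depends on your API:

from typing import Any, Mapping, Optional
import requests

def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
    next_page = response.json().get("next_page")  # illustrative field name
    return {"page": next_page} if next_page else None

def request_params(self, stream_state, stream_slice=None, next_page_token=None, **kwargs):
    # the mapping returned above arrives here on the next request
    return dict(next_page_token or {})

On 3.: primary_key names the field (or fields) that uniquely identify a record; destinations use it to deduplicate in the incremental dedup sync modes, and it may be None for streams without a natural key.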
Maëlise Castel
05/11/2022, 9:42 AM
Ayush
05/11/2022, 10:55 AM
CREATE INDEX IF NOT EXISTS airbyte_configs_id_idx ON AIRBYTE_CONFIGS(config_id);
]; ERROR: duplicate key value violates unique constraint "pg_type_typname_nsp_index"
Detail: Key (typname, typnamespace)=(airbyte_configs_id_seq, 2200) already exists.
at org.jooq_3.13.4.POSTGRES.debug(Unknown Source)
Akshay Sapra
05/11/2022, 11:14 AM
docker-compose down
wget -N https://raw.githubusercontent.com/airbytehq/airbyte/master/{.env,docker-compose.yaml}
docker-compose up
After restarting Airbyte I am getting "unknown error" in the GUI (screenshot attached).
I have restarted the cloud instance and tried another browser too. Any help on it would be really appreciated.
Logs attached (after running docker-compose up).
Ayush
05/11/2022, 1:15 PM
dasol kim
05/11/2022, 2:10 PM
Jordan Choo
05/11/2022, 5:25 PM
The sync mode tooltip link goes to the incremental deduped history URL when it should go to the Connections and Sync Modes URL. You can see what I mean in the attached screenshot.
Thanks!
Leonardo Gasparotto Menini
05/11/2022, 8:57 PM
I ran the generate.sh script to start up my Python CDK HTTP API source. All went well, but after trying to run the python3 main.py spec command, just for testing, I got an error. The generator created the spec file as a yaml, and the spec command is trying to find spec.json. I know I can just change the spec file to the json model and it should work. Do you know which airbyte version I could use so this doesn't happen?
Francisco García
05/11/2022, 9:36 PM
Francisco García
05/11/2022, 9:37 PM
Lucas Wiley
05/11/2022, 10:18 PM
Vaibhav Kumar
05/12/2022, 5:49 AM
Do the ab_cdc_log_file & ab_cdc_log_pos attributes in airbyte mean the binlog file and binlog position from the DB? @Marcos Marx (Airbyte)
Haitham Alhad
05/12/2022, 10:00 AM
airbyte-server | 2022-05-12 09:59:22 ERROR i.a.s.ServerApp(main):295 - Server failed
airbyte-server | java.lang.NullPointerException: Cannot invoke "org.flywaydb.core.api.MigrationInfo.getVersion()" because the return value of "io.airbyte.db.instance.DatabaseMigrator.getLatestMigration()" is null
airbyte-server | at io.airbyte.db.instance.MinimumFlywayMigrationVersionCheck.assertMigrations(MinimumFlywayMigrationVersionCheck.java:75) ~[io.airbyte.airbyte-db-lib-0.38.2-alpha.jar:?]
airbyte-server | at io.airbyte.server.ServerApp.assertDatabasesReady(ServerApp.java:149) ~[io.airbyte-airbyte-server-0.38.2-alpha.jar:?]
airbyte-server | at io.airbyte.server.ServerApp.getServer(ServerApp.java:182) ~[io.airbyte-airbyte-server-0.38.2-alpha.jar:?]
airbyte-server | at io.airbyte.server.ServerApp.main(ServerApp.java:291) [io.airbyte-airbyte-server-0.38.2-alpha.jar:?]
airbyte-server | 2022-05-12 09:59:22 INFO c.z.h.HikariDataSource(close):350 - HikariPool-1 - Shutdown initiated...
airbyte-server | 2022-05-12 09:59:22 INFO c.z.h.HikariDataSource(close):352 - HikariPool-1 - Shutdown completed.
airbyte-server | 2022-05-12 09:59:22 INFO c.z.h.HikariDataSource(close):350 - HikariPool-2 - Shutdown initiated...
airbyte-server | 2022-05-12 09:59:22 INFO c.z.h.HikariDataSource(close):352 - HikariPool-2 - Shutdown completed.
airbyte-server exited with code 1
airbyte-scheduler | 2022-05-12 09:59:23 INFO i.a.s.a.SchedulerApp(waitForServer):225 - Waiting for server to be
To reproduce I did
docker-compose down
git pull
docker-compose up
The original version was commit d14df187dc
Max
05/12/2022, 12:21 PM
docker run --rm -i -v airbyte/source-mysql read --config /taps_configs/mysql.json --catalog /taps_configs/mysql_cat.json | docker run --rm -i airbyte/destination-redshift write --config /taps_configs/redshift.json --catalog /taps_configs/mysql_cat.json
It all works fine, data is replicated from source to Redshift properly, but it ends up in _airbyte_raw_customers, so I want to trigger normalisation — but I am not sure how/where. Would that be in my redshift.json, or maybe in the catalog.json, where I need to add something like "normalisation": true?
Thanks!
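On the normalization question, a hedged pointer: basic normalization is neither a destination-config nor a catalog flag — when Airbyte runs a sync itself it launches a separate airbyte/normalization container after the destination finishes, so a hand-rolled pipeline has to invoke it as an extra step, roughly like this (flags recalled from the normalization runner and may differ by version):

docker run --rm -v /taps_configs:/data airbyte/normalization run --integration-type redshift --config /data/redshift.json --catalog /data/mysql_cat.json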
Mikhail
05/12/2022, 1:48 PM