Hi! I have had a working deployment of Airbyte 0.3...
# kubernetes
Hi! I had a working deployment of Airbyte 0.39.14 on GKE (Kubernetes). A couple of days ago all connections started failing, and everything pointed to the local MinIO storage (Cannot publish to S3: Storage backend has reached its minimum free disk threshold). I therefore deployed the (almost) latest version, v0.39.39, created a GCS bucket, and configured the key in the secret file. I can confirm that logging is now directed to the GCS bucket. However, the airbyte-worker throws errors on sync, probably because I have not configured the .env file properly. I am using kustomize, based on the stable template. Below are the relevant variables in the .env file, based on the docs (https://docs.airbyte.com/deploying-airbyte/on-kubernetes/#configure-logs), but I have found no information on the STATE_STORAGE_MINIO_ variables:
# S3/Minio Log Configuration
# S3_LOG_BUCKET=airbyte-dev-logs
# S3_LOG_BUCKET_REGION=
# S3_MINIO_ENDPOINT=http://airbyte-minio-svc:9000
# S3_PATH_STYLE_ACCESS=true
S3_LOG_BUCKET=
S3_LOG_BUCKET_REGION=
S3_MINIO_ENDPOINT=
S3_PATH_STYLE_ACCESS=

# GCS Log Configuration
GCS_LOG_BUCKET=airbyte-logging-prod

# State Storage Configuration
STATE_STORAGE_MINIO_BUCKET_NAME=airbyte-dev-logs
STATE_STORAGE_MINIO_ENDPOINT=http://airbyte-minio-svc:9000
The errors from the log are:
2022-07-27 11:50:07 INFO i.a.c.EnvConfigs(getEnvOrDefault):968 - Using default value for environment variable DEPLOYMENT_MODE: 'OSS'
2022-07-27 11:50:07 INFO i.a.c.EnvConfigs(getEnvOrDefault):968 - Using default value for environment variable CONNECTOR_SPECIFIC_RESOURCE_DEFAULTS_ENABLED: 'false'
2022-07-27 11:50:07 INFO i.a.c.EnvConfigs(getEnvOrDefault):968 - Using default value for environment variable STATE_STORAGE_MINIO_BUCKET_NAME: ''
2022-07-27 11:50:07 INFO i.a.c.EnvConfigs(getEnvOrDefault):968 - Using default value for environment variable STATE_STORAGE_MINIO_ENDPOINT: ''
2022-07-27 11:50:07 INFO i.a.c.EnvConfigs(getEnvOrDefault):968 - Using default value for environment variable TEMPORAL_HISTORY_RETENTION_IN_DAYS: '30'
2022-07-27 11:50:07 INFO i.a.c.EnvConfigs(getEnvOrDefault):968 - Using default value for environment variable ACTIVITY_MAX_ATTEMPT: '5'
2022-07-27 11:50:07 INFO i.a.c.EnvConfigs(getEnvOrDefault):968 - Using default value for environment variable ACTIVITY_INITIAL_DELAY_BETWEEN_ATTEMPTS_SECONDS: '30'
2022-07-27 11:50:07 INFO i.a.c.EnvConfigs(getEnvOrDefault):968 - Using default value for environment variable ACTIVITY_MAX_DELAY_BETWEEN_ATTEMPTS_SECONDS: '600'
2022-07-27 11:50:07 INFO i.a.c.EnvConfigs(getEnvOrDefault):968 - Using default value for environment variable TEMPORAL_CLOUD_ENABLED: 'false'
2022-07-27 11:50:07 INFO i.a.w.t.TemporalUtils(getTemporalClientWhenConnected):232 - Waiting for temporal server...
2022-07-27 11:50:07 WARN i.a.w.t.TemporalUtils(getTemporalClientWhenConnected):243 - Waiting for namespace default to be initialized in temporal...
2022-07-27 11:50:11 INFO i.t.s.WorkflowServiceStubsImpl(<init>):188 - Created GRPC client for channel: ManagedChannelOrphanWrapper{delegate=ManagedChannelImpl{logId=1, target=airbyte-temporal-svc:7233}}
2022-07-27 11:50:16 INFO i.a.w.t.TemporalUtils(getTemporalClientWhenConnected):260 - Temporal namespace default initialized!
2022-07-27 11:50:16 INFO i.a.c.EnvConfigs(getEnvOrDefault):968 - Using default value for environment variable TEMPORAL_CLOUD_ENABLED: 'false'
2022-07-27 11:50:16 INFO i.a.c.EnvConfigs(getEnvOrDefault):968 - Using default value for environment variable TEMPORAL_CLOUD_ENABLED: 'false'
2022-07-27 11:50:16 INFO i.a.w.t.TemporalUtils(configureTemporalNamespace):140 - Workflow execution TTL already set for namespace default. Remains unchanged as: 30 days
2022-07-27 11:50:16 ERROR i.a.w.WorkerApp(main):540 - Worker app failed
java.lang.IllegalArgumentException: null
	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:131) ~[guava-31.0.1-jre.jar:?]
	at io.airbyte.config.storage.DefaultS3ClientFactory.validateBase(DefaultS3ClientFactory.java:38) ~[io.airbyte.airbyte-config-config-models-0.39.39-alpha.jar:?]
	at io.airbyte.config.storage.MinioS3ClientFactory.validate(MinioS3ClientFactory.java:33) ~[io.airbyte.airbyte-config-config-models-0.39.39-alpha.jar:?]
	at io.airbyte.config.storage.MinioS3ClientFactory.<init>(MinioS3ClientFactory.java:27) ~[io.airbyte.airbyte-config-config-models-0.39.39-alpha.jar:?]
	at io.airbyte.workers.storage.S3DocumentStoreClient.minio(S3DocumentStoreClient.java:39) ~[io.airbyte-airbyte-workers-0.39.39-alpha.jar:?]
	at io.airbyte.workers.storage.StateClients.create(StateClients.java:21) ~[io.airbyte-airbyte-workers-0.39.39-alpha.jar:?]
	at io.airbyte.workers.WorkerApp.getContainerOrchestratorConfig(WorkerApp.java:354) ~[io.airbyte-airbyte-workers-0.39.39-alpha.jar:?]
	at io.airbyte.workers.WorkerApp.launchWorkerApp(WorkerApp.java:448) ~[io.airbyte-airbyte-workers-0.39.39-alpha.jar:?]
	at io.airbyte.workers.WorkerApp.main(WorkerApp.java:537) [io.airbyte-airbyte-workers-0.39.39-alpha.jar:?]
If I set the STATE_STORAGE_MINIO_ variables to empty values, I also get errors similar to this:
2022-07-27 11:32:43 INFO i.a.w.p.AsyncOrchestratorPodProcess(copyFilesToKubeConfigVolumeMain):372 - Waiting for kubectl cp to complete
2022-07-27 11:32:43 INFO i.a.w.p.AsyncOrchestratorPodProcess(copyFilesToKubeConfigVolumeMain):379 - kubectl cp complete, closing process
2022-07-27 11:32:48 INFO i.a.w.t.TemporalUtils(withBackgroundHeartbeat):322 - Stopping temporal heartbeating...
2022-07-27 11:32:48 INFO i.a.w.t.TemporalAttemptExecution(lambda$getWorkerThread$2):158 - Completing future exceptionally...
java.lang.RuntimeException: io.airbyte.workers.exception.WorkerException: Running the launcher replication-orchestrator failed
	at io.airbyte.workers.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:320) ~[io.airbyte-airbyte-workers-0.39.39-alpha.jar:?]
	at io.airbyte.workers.temporal.sync.LauncherWorker.run(LauncherWorker.java:90) ~[io.airbyte-airbyte-workers-0.39.39-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalAttemptExecution.lambda$getWorkerThread$2(TemporalAttemptExecution.java:155) ~[io.airbyte-airbyte-workers-0.39.39-alpha.jar:?]
	at java.lang.Thread.run(Thread.java:1589) [?:?]
Caused by: io.airbyte.workers.exception.WorkerException: Running the launcher replication-orchestrator failed
	at io.airbyte.workers.temporal.sync.LauncherWorker.lambda$run$3(LauncherWorker.java:181) ~[io.airbyte-airbyte-workers-0.39.39-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:315) ~[io.airbyte-airbyte-workers-0.39.39-alpha.jar:?]
	... 3 more
Caused by: io.airbyte.workers.exception.WorkerException: Non-zero exit code!
	at io.airbyte.workers.temporal.sync.LauncherWorker.lambda$run$3(LauncherWorker.java:165) ~[io.airbyte-airbyte-workers-0.39.39-alpha.jar:?]
	at io.airbyte.workers.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:315) ~[io.airbyte-airbyte-workers-0.39.39-alpha.jar:?]
	... 3 more
Update: I got this sorted out. The docs are not clear, but based on the code it should be enough to have the variable GCS_LOG_BUCKET defined with a non-blank value:
  private Optional<CloudStorageConfigs> getLogConfiguration() {
    if (getEnv(LogClientSingleton.GCS_LOG_BUCKET) != null && !getEnv(LogClientSingleton.GCS_LOG_BUCKET).isBlank()) {
      return Optional.of(CloudStorageConfigs.gcs(new GcsConfig(
          getEnvOrDefault(LogClientSingleton.GCS_LOG_BUCKET, ""),
          getEnvOrDefault(LogClientSingleton.GOOGLE_APPLICATION_CREDENTIALS, ""))));
    } else if (getEnv(LogClientSingleton.S3_MINIO_ENDPOINT) != null && !getEnv(LogClientSingleton.S3_MINIO_ENDPOINT).isBlank()) {
      return Optional.of(CloudStorageConfigs.minio(new MinioConfig(
          getEnvOrDefault(LogClientSingleton.S3_LOG_BUCKET, ""),
          getEnvOrDefault(LogClientSingleton.AWS_ACCESS_KEY_ID, ""),
          getEnvOrDefault(LogClientSingleton.AWS_SECRET_ACCESS_KEY, ""),
          getEnvOrDefault(LogClientSingleton.S3_MINIO_ENDPOINT, ""))));
    } else if (getEnv(LogClientSingleton.S3_LOG_BUCKET_REGION) != null && !getEnv(LogClientSingleton.S3_LOG_BUCKET_REGION).isBlank()) {
      return Optional.of(CloudStorageConfigs.s3(new S3Config(
          getEnvOrDefault(LogClientSingleton.S3_LOG_BUCKET, ""),
          getEnvOrDefault(LogClientSingleton.AWS_ACCESS_KEY_ID, ""),
          getEnvOrDefault(LogClientSingleton.AWS_SECRET_ACCESS_KEY, ""),
          getEnvOrDefault(LogClientSingleton.S3_LOG_BUCKET_REGION, ""))));
    } else {
      return Optional.empty();
    }
  }
However, I haven't been able to configure the STATE_STORAGE_GCS ... variables yet, and the next method, getStateStorageConfiguration, is the one that uses them.
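For anyone else reading this, here is a minimal standalone sketch of the branch order in the getLogConfiguration snippet quoted above. It is not Airbyte code: the class `LogBackendSketch`, the `Backend` enum, and the helper `isSet` are hypothetical names made up for illustration. It also assumes the `LogClientSingleton` constants resolve to environment variables with the same names (GCS_LOG_BUCKET, S3_MINIO_ENDPOINT, S3_LOG_BUCKET_REGION). It only shows that a non-blank GCS_LOG_BUCKET wins over the MinIO and S3 settings, and says nothing about how getStateStorageConfiguration behaves.

```java
import java.util.Map;

public class LogBackendSketch {

    // Hypothetical enum naming the four outcomes of the quoted method.
    public enum Backend { GCS, MINIO, S3, NONE }

    // Mirrors the "!= null && !isBlank()" check used in the quoted snippet.
    static boolean isSet(Map<String, String> env, String key) {
        String value = env.get(key);
        return value != null && !value.isBlank();
    }

    // Same branch order as the quoted getLogConfiguration: GCS, then MinIO, then S3.
    public static Backend select(Map<String, String> env) {
        if (isSet(env, "GCS_LOG_BUCKET")) {
            return Backend.GCS;
        } else if (isSet(env, "S3_MINIO_ENDPOINT")) {
            return Backend.MINIO;
        } else if (isSet(env, "S3_LOG_BUCKET_REGION")) {
            return Backend.S3;
        }
        return Backend.NONE;
    }

    public static void main(String[] args) {
        // Values taken from the .env shown earlier in this thread.
        Map<String, String> env = Map.of(
                "GCS_LOG_BUCKET", "airbyte-logging-prod",
                "S3_MINIO_ENDPOINT", "",
                "S3_LOG_BUCKET_REGION", "");
        System.out.println(select(env));
    }
}
```

With the values from my .env (GCS bucket set, the S3/MinIO variables blank) this selects the GCS branch. It would still select GCS even if S3_MINIO_ENDPOINT were left pointing at the MinIO service, because the GCS check comes first.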