Yusuf Mirkar

05/25/2023, 5:52 AM
What are the spec worker, connection check worker, sync worker, and discovery worker?
user

05/25/2023, 5:52 AM
A message from kapa.ai
user

05/25/2023, 5:53 AM
In Airbyte, all interactions with connectors are run as jobs performed by a Worker. Each job has a corresponding worker:
1. Spec worker: Retrieves the specification of a connector (the inputs needed to run this connector).
2. Check connection worker: Verifies that the inputs to a connector are valid and can be used to run a sync.
3. Discovery worker: Retrieves the schema of the source underlying a connector.
4. Sync worker: Used to sync data between a source and destination.
Thus, there are generally 4 types of workers in Airbyte.
Note: Workers here refers to Airbyte workers. Temporal, which Airbyte uses under the hood for scheduling, has its own worker concept. This distinction is important.
Source: [Workers & Jobs](https://docs.airbyte.com/understanding-airbyte/jobs)
Yusuf Mirkar

05/25/2023, 5:55 AM
@kapa.ai I have just one source and one destination, so can I keep MAX_SPEC_WORKERS=2, MAX_CHECK_WORKERS=2, MAX_DISCOVERY_WORKERS=2, MAX_SYNC_WORKERS=2?
user

05/25/2023, 5:55 AM
A message from kapa.ai
user

05/25/2023, 5:55 AM
New message text here
Yusuf Mirkar

05/25/2023, 5:57 AM
@kapa.ai If I keep the value as 2, does that mean 2 workers each for the source and the destination, or 2 workers in total, one for the source and one for the destination?
user

05/25/2023, 5:57 AM
A message from kapa.ai
user

05/25/2023, 5:58 AM
The values you set for `MAX_SPEC_WORKERS`, `MAX_CHECK_WORKERS`, `MAX_DISCOVERY_WORKERS`, and `MAX_SYNC_WORKERS` represent the maximum number of workers allowed to run in parallel for each specific type of job, not for sources or destinations. For example, if you set `MAX_SYNC_WORKERS=2`, it means that up to 2 sync jobs can run in parallel, regardless of the sources or destinations involved. Here's the configuration again for reference:
MAX_SPEC_WORKERS=2
MAX_CHECK_WORKERS=2
MAX_DISCOVERY_WORKERS=2
MAX_SYNC_WORKERS=2
For more information on configuring workers, refer to the [Configuring Workers section](https://docs.airbyte.com/understanding-airbyte/jobs#configuring-workers) in the Airbyte documentation.
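To make the "per job type, not per source or destination" point concrete, here is a minimal Python sketch of a per-type concurrency cap. It is not Airbyte's implementation: `JobLimiter`, `fake_sync`, `run_demo`, and the `LIMITS` dictionary are hypothetical names that only mirror the four settings above.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical limits mirroring MAX_SPEC_WORKERS, MAX_CHECK_WORKERS,
# MAX_DISCOVERY_WORKERS and MAX_SYNC_WORKERS from the thread.
LIMITS = {"spec": 2, "check": 2, "discovery": 2, "sync": 2}


class JobLimiter:
    """Caps how many jobs of each type run at once (illustrative only)."""

    def __init__(self, limits):
        # One semaphore per job type: the cap applies to the type,
        # not to any particular source or destination.
        self._semaphores = {
            job_type: threading.BoundedSemaphore(limit)
            for job_type, limit in limits.items()
        }
        self._lock = threading.Lock()
        self._running = {job_type: 0 for job_type in limits}
        self.peak = {job_type: 0 for job_type in limits}

    def run(self, job_type, fn, *args):
        # Blocks while the cap for this job type is already reached.
        with self._semaphores[job_type]:
            with self._lock:
                self._running[job_type] += 1
                self.peak[job_type] = max(
                    self.peak[job_type], self._running[job_type]
                )
            try:
                return fn(*args)
            finally:
                with self._lock:
                    self._running[job_type] -= 1


def fake_sync(connection_id):
    # Stand-in for a real sync job; it just sleeps briefly.
    time.sleep(0.05)
    return connection_id


def run_demo(n_jobs=5):
    """Submit n_jobs sync jobs at once and record peak concurrency."""
    limiter = JobLimiter(LIMITS)
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        results = list(
            pool.map(lambda c: limiter.run("sync", fake_sync, c), range(n_jobs))
        )
    return limiter, results


if __name__ == "__main__":
    limiter, results = run_demo()
    print("completed:", results)
    print("peak concurrent sync jobs:", limiter.peak["sync"])
```

In this sketch, five sync jobs submitted together never run more than two at a time. With a single connection, a cap of 2 simply means the limit is never the bottleneck.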
Yusuf Mirkar

05/25/2023, 5:59 AM
@kapa.ai If I keep MAX_SYNC_WORKERS=1, then one complete sync between one source and one destination will still work, right?
user

05/25/2023, 5:59 AM
A message from kapa.ai
user

05/25/2023, 6:00 AM
New message text here
Yusuf Mirkar

05/25/2023, 6:19 AM
@kapa.ai Is the airbyte-worker container the sync worker?
user

05/25/2023, 6:19 AM
A message from kapa.ai
user

05/25/2023, 6:19 AM
New message text here
Yusuf Mirkar

05/25/2023, 6:21 AM
@kapa.ai So when I start a sync, the airbyte-worker container starts a sync job, and for this it starts two other containers, i.e. source-worker-container and destination-worker-container, correct?
user

05/25/2023, 6:21 AM
A message from kapa.ai
user

05/25/2023, 6:21 AM
New message text here
Yusuf Mirkar

05/25/2023, 6:24 AM
@kapa.ai So the job_main_container_memory_request env variable denotes the memory for the source and destination worker containers and not the airbyte-worker container, right?
user

05/25/2023, 6:24 AM
A message from kapa.ai
user

05/25/2023, 6:24 AM
@Yusuf Mirkar That's a great question, but unfortunately I am only capable of responding to a maximum of five questions per thread. Don't hesitate to create a new thread for any further questions you have though. I'm here to help!