Andrew Morrison
02/21/2022, 3:48 PM
Affects the size limit of an individual Redshift table. Optional. Increase this if syncing tables larger than 100GB. Files are streamed to S3 in parts. This determines the size of each part, in MB. As S3 has a limit of 10,000 parts per file, the part size affects the maximum table size. This is 10MB by default, resulting in a default table size limit of 100GB. Note that a larger part size will result in larger memory requirements; a rule of thumb is to multiply the part size by 10 to get the memory requirement. Modify this with care.
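The limits quoted above come down to simple arithmetic. A quick sketch, assuming the 10,000-part S3 limit and the "part size × 10" memory rule of thumb exactly as stated in the docs (the function names are my own, not Airbyte's):

```python
# Part-size arithmetic from the quoted Redshift/S3 staging docs.
# Assumptions taken from the text above: S3 allows at most 10,000
# parts per file, and memory needed is roughly part size times ten.
S3_MAX_PARTS = 10_000

def max_table_size_gb(part_size_mb: int) -> float:
    """Largest table (in GB) a given part size can sync via S3 staging."""
    return part_size_mb * S3_MAX_PARTS / 1000

def memory_rule_of_thumb_mb(part_size_mb: int) -> int:
    """Rough memory requirement (in MB): part size times ten."""
    return part_size_mb * 10

# Default 10MB part size -> 100GB table limit, ~100MB of memory.
print(max_table_size_gb(10), memory_rule_of_thumb_mb(10))    # 100.0 100
# A ~1TB table needs ~100MB parts -> ~1GB of memory.
print(max_table_size_gb(100), memory_rule_of_thumb_mb(100))  # 1000.0 1000
```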
• I see that the default value is '5' in my local build, though the documentation says '10MB'. Is this just a typo, or am I missing something?
• I want to try syncing a ~1TB table using this method, which would mean that I should increase the part size to ~100MB, correct?
• Which container does the memory requirement rule of thumb apply to? I am working in Kubernetes and want to get my resource limits right.

Augustin Lafanechere (Airbyte)
02/21/2022, 5:42 PM

Augustin Lafanechere (Airbyte)
02/21/2022, 5:49 PM

Andrew Morrison
02/22/2022, 12:19 PM

Augustin Lafanechere (Airbyte)
02/22/2022, 7:19 PMJOB_MAIN_CONTAINER_MEMORY_REQUEST
and JOB_MAIN_CONTAINER_MEMORY_LIMIT
environment variable to change the memory requirements.Andrew Morrison
02/23/2022, 10:01 AM
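The variables named in the reply above are ordinary environment variables on the Airbyte worker; a minimal sketch of setting them, where the 1Gi/2Gi values are illustrative assumptions for a larger part size, not recommendations:

```shell
# Hedged example: request/limit values are assumptions; tune them to
# your part size using the "part size x 10" rule of thumb above.
export JOB_MAIN_CONTAINER_MEMORY_REQUEST=1Gi
export JOB_MAIN_CONTAINER_MEMORY_LIMIT=2Gi
```

On Kubernetes these would typically be passed through to the worker deployment's environment so that the job pods it spawns pick up the request and limit.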