Hi Team I have loaded Zendesk data to s3 in parque...
# troubleshooting
Hi Team, I have loaded Zendesk data to S3 in Parquet format. Note: the data contains JSON columns. Now, when I try to copy this S3 Parquet file to Redshift, I get the following error: SQL Error [XX000]: ERROR: Spectrum Scan Error Detail: ----------------------------------------------- error: Spectrum Scan Error code: 15007 context: Unsupported implicit cast: Column s3://airbyte-sync-tn/data_sync/test_zendesk_super/tickets/2022_02_28_1646039367500_0.parquet.via, FromType: struct<struct<struct<byte_array,byte_array,byte_array,byte_array,byte_array,byte_array,byte_array,byte_array,map<b
Here the via column is JSON, and the corresponding column in Redshift is VARCHAR(65535).
Also, when I changed the via column in Redshift from VARCHAR(65535) to the SUPER data type, I got the following error: SQL Error [XX000]: ERROR: External Catalog Error. Detail: ----------------------------------------------- error: External Catalog Error. code: 16000 context: Unsupported column type found for column: 4. Remove the column from the projection to continue.
Hey can you share the complete log of the sync?
The above is the error message; there is no log as such.
@Harshith (Airbyte) can you please let me know?
This is basically a problem with changing the schema on the destination, right?
From the error message, it looks like the "via" column (JSON data) cannot be copied into Redshift as either the VARCHAR or the SUPER data type. How should we copy this? @Harshith (Airbyte)
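One option that may be worth trying, sketched here under assumptions rather than confirmed in this thread: Redshift's COPY command accepts a SERIALIZETOJSON option for columnar formats, which loads nested Parquet columns into SUPER target columns. The helper below only builds the statement text. The function name `build_parquet_copy`, the table name `zendesk.tickets`, and the IAM role ARN are hypothetical placeholders; the S3 prefix is the one from the error message above.

```python
def build_parquet_copy(table: str, s3_uri: str, iam_role_arn: str) -> str:
    """Build a Redshift COPY statement for Parquet files with nested columns.

    SERIALIZETOJSON asks COPY to load nested Parquet columns (structs, maps,
    arrays) into SUPER columns. The target column (here: via) is assumed to be
    declared as SUPER in the Redshift table.
    """
    # Reject characters that could break out of the quoted literals.
    for value in (table, s3_uri, iam_role_arn):
        if "'" in value or ";" in value:
            raise ValueError(f"unsafe character in argument: {value!r}")

    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET SERIALIZETOJSON;"
    )


# Hypothetical table name and role ARN; replace with your own values.
statement = build_parquet_copy(
    table="zendesk.tickets",
    s3_uri="s3://airbyte-sync-tn/data_sync/test_zendesk_super/tickets/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(statement)
```

The generated statement can be run from any SQL client connected to the cluster. Whether it resolves the error above depends on the table definition and file schema, which are not shown in this thread.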
Can you share the catalog for the connection?
Are you referring to this?
After creating the destination, you create a connection where you choose the streams, right? That one.
Quick question: is it possible to transfer data directly to Redshift using Airbyte rather than loading files from S3?
Yes, but that consumes a lot of Redshift CPU capacity.