# ask-ai
h
Given I have a Postgres source with a Redshift destination, what is the recommended procedure to get the data that Airbyte ingests into Redshift in a Redshift-specific format, with columns such as SUPER?
u
To get the data that Airbyte ingests into Redshift in a Redshift-specific format, with columns such as SUPER, follow these steps:

1. Set up the Redshift destination connector in Airbyte. The exact steps vary slightly depending on whether you're using Airbyte Cloud or Airbyte Open Source, but in general you navigate to the "Destinations" section, click "+ new destination", select "Redshift" from the Destination type dropdown, fill in the required fields for the INSERT or COPY strategy, and click "Set up destination" [source](https://docs.airbyte.com/integrations/destinations/redshift#step-2-set-up-the-destination-connector-in-airbyte).
2. Configure a Postgres source. Create a Postgres source called "Postgres-source" [source](https://airbyte.com/tutorials/full-data-synchronization#step-5).
3. Understand the output schema. Each stream is written to its own raw table in Redshift, and each table contains four columns:
   `_airbyte_raw_id`, `_airbyte_extracted_at`, `_airbyte_loaded_at`, and `_airbyte_data`. The `_airbyte_data` column is a JSON blob representing the event data, and its column type in Redshift is `SUPER` [source](https://docs.airbyte.com/integrations/destinations/redshift#output-schema) (see the query sketch after this list).
4. Understand the data type map. Airbyte maps its data types to Redshift data types; for example, the Airbyte types `OBJECT` and `ARRAY` are mapped to the Redshift type `SUPER` [source](https://docs.airbyte.com/integrations/destinations/redshift#data-type-map).
5. Run the sync. After setting up the source and destination, you can run the sync. Airbyte first uploads the data to an S3 bucket and then issues a COPY command to Redshift [source](https://docs.airbyte.com/integrations/destinations/redshift#redshift).
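Once a sync has landed data, nested fields inside the SUPER `_airbyte_data` column can be read directly with Redshift's PartiQL dot notation. Below is a minimal sketch; the cluster endpoint, credentials, schema, table, and field names are all assumptions for illustration, not values from the docs above.

```python
# Minimal sketch: read nested fields out of the SUPER _airbyte_data column
# using PartiQL dot notation. Connection details and the raw table name are
# hypothetical -- substitute the schema/table your sync actually created.
import psycopg2

conn = psycopg2.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",  # hypothetical endpoint
    port=5439,
    dbname="dev",
    user="airbyte_user",
    password="...",
)

sql = """
SELECT
    _airbyte_raw_id,
    _airbyte_extracted_at,
    _airbyte_data.id    AS record_id,   -- dot notation navigates into the SUPER value
    _airbyte_data.email AS email
FROM my_schema.users_raw                -- hypothetical raw table written by the sync
LIMIT 10;
"""

with conn, conn.cursor() as cur:
    cur.execute(sql)
    for row in cur.fetchall():
        print(row)
```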
Please note that there are some limitations when dealing with the SUPER data type in Redshift. For example, an individual field within a SUPER value cannot exceed 65k characters [source](https://github.com/airbytehq/airbyte/issues/28541). If you encounter issues with large records, you might need to normalize the `_airbyte_data` field from JSON back to a columnar format outside of Redshift before attempting to load it into Redshift [source](https://github.com/airbytehq/airbyte/issues/28541).
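One way to do that, sketched below under the assumption that the source rows are available as newline-delimited JSON, is to flatten the records with pandas and truncate oversized text fields before loading the result into a regular columnar table with COPY. The file names and truncation limit are illustrative; this is not an Airbyte feature.

```python
# Rough sketch: flatten newline-delimited JSON outside Redshift so no single
# field has to fit inside a SUPER value. File names are hypothetical.
import json

import pandas as pd

records = []
with open("users_export.jsonl") as f:           # hypothetical export of the source data
    for line in f:
        records.append(json.loads(line))

flat = pd.json_normalize(records)               # nested keys become dotted column names

# Keep individual string values under the ~65k character SUPER field limit.
for col in flat.columns:
    if flat[col].dtype == object:
        flat[col] = flat[col].map(lambda v: v[:65000] if isinstance(v, str) else v)

flat.to_csv("users_flat.csv", index=False)      # COPY this file into a regular columnar table
```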
h
@kapa.ai how can I define a target schema?
u
Defining a target schema in Airbyte can be done in a few ways, depending on the context. If you're working with a stream in a connector, you define the schema of each stream your connector outputs using JSONSchema. You can do this either by describing the schema of your streams in one `.json` file per stream (see the sketch after the example below), or by generating the schema dynamically in code. The schema of a stream is the return value of `Stream.get_json_schema`. For example, to define your schema in code, override `Stream.get_json_schema` in your stream class to return a `dict` describing the schema using JSONSchema. Here's an example:
```python
def get_json_schema(self):
    schema = super().get_json_schema()
    schema['dynamically_determined_property'] = "property"
    return schema
```
[Source](https://docs.airbyte.com/connector-development/cdk-python/full-refresh-stream#defining-the-streams-schema)
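If you prefer the file-based route mentioned above, each stream ships a plain JSONSchema document instead of code. Here is a minimal sketch of what such a schema could contain, written as the equivalent Python dict; the stream and field names are hypothetical.

```python
# Hypothetical schema for a "customers" stream; the same content could live in
# a schemas/customers.json file bundled with the connector instead of code.
CUSTOMERS_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "email": {"type": ["null", "string"]},
        "created_at": {"type": ["null", "string"], "format": "date-time"},
    },
}
```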
If you're setting up a destination like Postgres, you can specify the target schema during the setup process. For example, when setting up a Postgres destination, you would specify the `schema` parameter as shown below:
```python
postgres_destination = PostgresDestination(
    name="postgres",
    host="localhost",
    port=5432,
    database="postgres",
    schema="public",
    username="postgres",
    password=POSTGRES_PASSWORD,
    ssl_mode=PostgresDestination.Disable(),
)
```
[Source](https://airbyte.com/tutorials/configure-airbyte-with-python-dagster)

Please note that schema names are case sensitive and multiple schemas may be used at one time. If no schemas are set explicitly, all existing schemas will be synced. [Source](https://docs.airbyte.com/integrations/destinations/postgres#naming-conventions-part-2)

In some cases you might encounter permission issues when trying to write to a target schema. If you receive an error stating `Current role does not have permissions on the target schema`, make sure that the destination `SCHEMA` is one that the role you've provided has permissions on. [Source](https://docs.airbyte.com/integrations/destinations/snowflake#troubleshooting)
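In practice the fix is to grant the Airbyte role the needed privileges on that schema before re-running the sync. A hedged sketch follows: the role, schema, and connection details are made up, the quoted error comes from the Snowflake docs, and the GRANT statements below use Redshift/Postgres syntax, so adapt them to your warehouse.

```python
# Hedged sketch: give the Airbyte user access to the target schema.
# Names and connection details are illustrative only.
import psycopg2

conn = psycopg2.connect(host="...", port=5439, dbname="dev", user="admin", password="...")
with conn, conn.cursor() as cur:
    cur.execute("GRANT USAGE ON SCHEMA my_target_schema TO airbyte_user;")
    cur.execute("GRANT CREATE ON SCHEMA my_target_schema TO airbyte_user;")
```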