#ask-community-for-troubleshooting

Duan Uys

07/16/2021, 4:09 AM
Hello! I'm creating a custom source connector and having the hardest time getting from "discover" to "read". I'm following the Python tutorial and can't understand where the configured Airbyte catalog comes from. The output of discover is an Airbyte catalog, but the "read" command takes a --catalog option that reads from a file. The tutorials never write to any file, so where is this configured Airbyte catalog supposed to come from?
s

07/16/2021, 5:36 PM
Hi @Duan Uys, sorry about the confusion! The configured catalog is a slight variation of the “vanilla” catalog. This tutorial (https://docs.airbyte.io/understanding-airbyte/catalog) hopefully clarifies the difference. Typically, when creating a custom connector, in order to run read you would take the output of discover and convert it to a configured catalog. That means additionally configuring each stream with its sync mode and, if required, its cursor field. For example, if you had the following catalog:
{
  "streams": [
    {
      "name": "users",
      "json_schema": {
        "type": "object",
        "properties": ...
      },
      "supported_sync_modes": ["incremental", "full_refresh"],
      "source_defined_cursor": true
    }
  ]
}
you’d convert it to a configured catalog like so:
{
  "streams": [
    {
      "stream": {
        "name": "users",
        "json_schema": {
          "type": "object",
          "properties": ...
        },
        "supported_sync_modes": ["incremental", "full_refresh"],
        "source_defined_cursor": true,
        "primary_key": [["id"]]
      },
      "sync_mode": "incremental",
      "user_configured_primary_key": [["id"]],
      "destination_sync_mode": "overwrite"
    }
  ]
}
thanks for pointing this out, we should definitely make this clearer/easier to work with
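The conversion described in this reply can be sketched in Python. This is a minimal sketch, not part of the CDK: the function names configure_catalog and catalog_from_discover_output are hypothetical, it assumes discover prints JSON lines where one message has "type": "CATALOG" and carries the catalog under "catalog", and it assumes you want incremental sync whenever a stream supports it. Only the keys "stream", "sync_mode" and "destination_sync_mode" from the example above are filled in; cursor fields and primary keys are left for you to add per stream.

```python
import json


def catalog_from_discover_output(lines):
    """Return the catalog from the first CATALOG message in discover's output.

    Assumption: discover emits one JSON message per line, possibly mixed with
    plain-text log lines, and the catalog lives under the "catalog" key.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip log lines that are not JSON
        if isinstance(message, dict) and message.get("type") == "CATALOG":
            return message["catalog"]
    raise ValueError("no CATALOG message found in discover output")


def configure_catalog(catalog, destination_sync_mode="overwrite"):
    """Wrap every discovered stream in a configured-stream entry."""
    configured_streams = []
    for stream in catalog["streams"]:
        supported = stream.get("supported_sync_modes", ["full_refresh"])
        # Assumption: prefer incremental when the stream supports it.
        sync_mode = "incremental" if "incremental" in supported else "full_refresh"
        configured_streams.append(
            {
                "stream": stream,
                "sync_mode": sync_mode,
                "destination_sync_mode": destination_sync_mode,
            }
        )
    return {"streams": configured_streams}


if __name__ == "__main__":
    # Hypothetical discover output: one log line and one CATALOG message.
    discover_output = [
        "starting discover",
        json.dumps(
            {
                "type": "CATALOG",
                "catalog": {
                    "streams": [
                        {
                            "name": "users",
                            "json_schema": {"type": "object", "properties": {}},
                            "supported_sync_modes": ["incremental", "full_refresh"],
                            "source_defined_cursor": True,
                        }
                    ]
                },
            }
        ),
    ]
    catalog = catalog_from_discover_output(discover_output)
    print(json.dumps(configure_catalog(catalog), indent=2))
```

Saving the printed JSON to a file gives you something to pass to read through its --catalog option.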
James Mulholland

01/18/2024, 12:29 PM
I’m following up on this old thread after running into this problem myself 3 years later. Does anyone have advice on how I should go about auto-converting the discover output into the read input? This looks like it needs to factor in some of my stream config. Is this possible with the CDK at all?
Marcos Marx (Airbyte)

01/18/2024, 1:21 PM
@James Mulholland please post in #public-help-connector-development
James Mulholland

01/19/2024, 11:51 AM
I did some more digging and discovered the schema generator tool, which is exactly what I was after. It can be found here: https://github.com/airbytehq/airbyte/blob/d083d156652fa6849bce4946a9897c01edafcdec/tools/schema_generator/README.md#L0-L1