# ask-community-for-troubleshooting
Hello! New to Airbyte, great tool btw. I do have a question: I have set up a connection from an online CSV to a local Postgres. I am unsure how to change the data types and field names of the source before it hits the destination?
Airbyte will mirror the source names for columns and try to infer the data types; at the moment you cannot change the type selected from Airbyte. You can, as already mentioned, use custom dbt to post-process your data.
This is an advanced usage of source-file: in the docs of the source-file connector, the last example shows how to rename field names from the CSV file: https://docs.airbyte.io/integrations/sources/file It's similar to this blog post (and specific to this connector, since in Airbyte in general you can't rename or change types yet, as @[DEPRECATED] Marcos Marx mentioned): https://www.kite.com/python/answers/how-to-set-column-names-when-importing-a-csv-into-a-pandas-dataframe-in-python
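Since the source-file connector forwards `reader_options` to `pandas.read_csv`, renaming works the same way it does in pandas directly. Here is a minimal sketch (the CSV content and new column names are made up for illustration) of what options like `{"names": ["col_x", "col_y"], "header": 0}` do:

```python
import io

import pandas as pd

# Stand-in for the remote CSV file (hypothetical data with a header row).
csv_data = "a,b\n1,2\n3,4\n"

# Equivalent of reader_options {"names": ["col_x", "col_y"], "header": 0}:
# header=0 consumes the original header row, and names= supplies the
# replacement column names.
df = pd.read_csv(io.StringIO(csv_data), names=["col_x", "col_y"], header=0)
print(list(df.columns))  # ['col_x', 'col_y']
```

In the connector you would put the same keys into the `reader_options` JSON field of the source configuration.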
If you make use of the custom dbt transformation to operate on the destination data, you can, of course, change column names and types!
If I use the default parameters with the `epidemiology` dataset example in my source, I get these types (notice the number columns). But if I change the `reader_options` to `{"dtype": "string"}`, I can force all columns to be parsed as strings, which results in the following schema:
> dtype: Type name or dict of column -> type, optional
> Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32, 'c': 'Int64'} Use str or object together with suitable na_values settings to preserve and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion.
from https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
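To make the effect concrete, here is a small sketch (with hypothetical data) contrasting pandas' default type inference with the `{"dtype": "string"}` reader option described above:

```python
import io

import pandas as pd

# Stand-in for the epidemiology-style CSV (hypothetical data).
csv_data = "date,cases\n2021-01-01,10\n2021-01-02,20\n"

# Default behaviour: pandas infers 'cases' as an integer column.
inferred = pd.read_csv(io.StringIO(csv_data))

# Equivalent of reader_options {"dtype": "string"}:
# every column is parsed as a string instead of being inferred.
forced = pd.read_csv(io.StringIO(csv_data), dtype="string")

print(inferred.dtypes["cases"])  # int64
print(forced.dtypes["cases"])    # string
```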
Is it possible to specify the dtype for only one column?
Like `dtype: {'column name': 'string'}`?
Yes, it's in the pandas docs as I quoted:
E.g. {'a': np.float64, 'b': np.int32, 'c': 'Int64'}
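A quick sketch of the per-column form (column names and values are made up): passing a dict to `dtype` forces only the listed columns, while the rest are still inferred. Forcing a column like a zip code to string also preserves leading zeros that integer inference would drop.

```python
import io

import pandas as pd

# Hypothetical CSV with a zip code column that has a leading zero.
csv_data = "id,zip,amount\n1,01234,9.5\n2,02345,3.0\n"

# Equivalent of reader_options {"dtype": {"zip": "string"}}:
# only 'zip' is forced to string; 'id' and 'amount' are still inferred.
df = pd.read_csv(io.StringIO(csv_data), dtype={"zip": "string"})

print(df["zip"].iloc[0])        # '01234' (leading zero kept)
print(df.dtypes["id"])          # int64 (still inferred)
```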
Ok thanks!! It worked! I've been looking into this problem for 2 days haha