# troubleshooting

Gary K

03/11/2022, 7:29 AM
Hi everyone 👋 Apologies if I'm making a few assumptions here (it's Friday afternoon and I've only done minimal searching), but I'm wondering if/how I can change the number -> double precision conversion that appears to be happening with the Postgres connector (0.3.15 in Airbyte 0.35.42-alpha)? I've got a MySQL source bigint column stored with full precision in the _airbyte_data JSON, but the normalization is converting it to a double and I'm losing precision 😱 (Note: I'd rather not have to do a custom normalisation (from raw) of all the connection streams manually, i.e. no heavy lifting on my part if possible 🏋️)
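
To make the precision loss concrete, here is a minimal Python sketch (not Airbyte code) showing that a 64-bit integer above 2^53 does not survive a round trip through an IEEE 754 double. The helper name `survives_double_round_trip` is made up for this illustration.

```python
# Python floats are IEEE 754 doubles with a 53-bit significand,
# so integers above 2**53 cannot all be represented exactly.

def survives_double_round_trip(value: int) -> bool:
    """Return True if converting value to a double and back is lossless."""
    return int(float(value)) == value

small_id = 2**53         # 9007199254740992, exactly representable
large_id = 2**53 + 1     # 9007199254740993, rounds to 2**53 as a double
bigint_max = 2**63 - 1   # largest signed BIGINT value

print(survives_double_round_trip(small_id))   # True
print(survives_double_round_trip(large_id))   # False
print(int(float(bigint_max)))                 # 9223372036854775808, i.e. 2**63
```

Any bigint identifier larger than 2^53 is therefore at risk once the column is typed as a double.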

Augustin Lafanechere (Airbyte)

03/11/2022, 3:28 PM
Hi @Gary K, I think I found the related normalization code:
elif is_number(definition["type"]):
  sql_type = jinja_call("dbt_utils.type_float()")
And in dbt:
{% macro mysql__type_float() %}
    float
{% endmacro %}
BIGINT - and all other int types - are considered as numbers in our JSON schema (code from airbyte-integrations/connectors/source-mysql/src/main/java/io/airbyte/integrations/source/mysql/MySqlSourceOperations.java):
public JsonSchemaType getJsonType(MysqlType mysqlType) {
    return switch (mysqlType) {
      // TINYINT(1) is boolean, but it should have been converted to MysqlType.BOOLEAN in {@link getFieldType}
      case TINYINT, TINYINT_UNSIGNED, SMALLINT, SMALLINT_UNSIGNED, INT, INT_UNSIGNED,
          MEDIUMINT, MEDIUMINT_UNSIGNED, BIGINT, BIGINT_UNSIGNED, FLOAT, FLOAT_UNSIGNED,
          DOUBLE, DOUBLE_UNSIGNED, DECIMAL, DECIMAL_UNSIGNED -> JsonSchemaType.NUMBER;
      case BOOLEAN -> JsonSchemaType.BOOLEAN;
      case NULL -> JsonSchemaType.NULL;
      // BIT(1) is boolean, but it should have been converted to MysqlType.BOOLEAN in {@link getFieldType}
      case BIT, TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB, BINARY, VARBINARY, GEOMETRY -> JsonSchemaType.STRING_BASE_64;
      default -> JsonSchemaType.STRING;
    };
}
which results in float conversion by the normalization (code from airbyte-integrations/bases/base-normalization/normalization/transform_catalog/stream_processor.py):
elif is_number(definition["type"]):
  sql_type = jinja_call("dbt_utils.type_float()")
@Chris Duong [Airbyte] are you aware of this? @Gary K I have no workaround to suggest at the moment, except the custom normalization that you don't feel like tackling 😄
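
The chain described above can be sketched as a simplified Python model. This is not the actual normalization code: `pick_sql_type` and its mapping table are hypothetical stand-ins, and the "integer" entry is an assumption about what a non-lossy declaration would map to. The point is only that the column type follows the JSON schema type the source declares, so a BIGINT declared as "number" ends up as a float.

```python
# Hypothetical, simplified model of "JSON schema type -> SQL column type".
# The real logic lives in stream_processor.py and goes through dbt macros.
JSON_TO_SQL = {
    "number": "float",     # what the quoted snippet does via dbt_utils.type_float()
    "integer": "bigint",   # assumption: an integer type would keep full precision
    "boolean": "boolean",
    "string": "text",
}

def pick_sql_type(json_type: str) -> str:
    """Return the SQL type for a declared JSON schema type (text as a fallback)."""
    return JSON_TO_SQL.get(json_type, "text")

# The MySQL source declares BIGINT as NUMBER, so normalization picks float.
declared_by_source = "number"
print(pick_sql_type(declared_by_source))  # float
print(pick_sql_type("integer"))           # bigint
```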

Chris Duong [Airbyte]

03/11/2022, 3:40 PM
Yes, I am aware of that. It's a problem on the source side, not the normalization side; see https://github.com/airbytehq/airbyte/pull/2600#discussion_r600743814

Gary K

03/15/2022, 10:54 PM
@[DEPRECATED] Augustin Lafanechere ouch! 😬 After some hindbrain thinking I've decided to try using views of the original tables with "safer" data types (i.e. a numeric cast) and sync from those instead. It should work in theory, and if I don't get back to you, assume it worked OK.
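
A rough, untested sketch of the view idea, written as a small Python helper that builds the MySQL DDL. The function name `build_view_ddl`, the `_safe` view suffix, and the example table are all hypothetical. Note that, going by the type mapping quoted earlier in the thread, DECIMAL is also declared as NUMBER, so a numeric cast may still end up as a float; casting to CHAR (which would fall under the STRING default) is offered here as an alternative assumption, not a confirmed fix.

```python
def build_view_ddl(table: str, columns: dict[str, str],
                   cast_type: str = "DECIMAL(20, 0)") -> str:
    """Build a CREATE VIEW statement that casts every BIGINT column.

    columns maps column name -> MySQL type name, e.g. {"id": "BIGINT"}.
    cast_type is the target of the CAST; "CHAR" is an alternative to try.
    """
    select_items = []
    for name, mysql_type in columns.items():
        if mysql_type.upper().startswith("BIGINT"):
            select_items.append(f"CAST(`{name}` AS {cast_type}) AS `{name}`")
        else:
            select_items.append(f"`{name}`")
    return (
        f"CREATE OR REPLACE VIEW `{table}_safe` AS "
        f"SELECT {', '.join(select_items)} FROM `{table}`"
    )

# Hypothetical table with one bigint column and one text column.
print(build_view_ddl("orders", {"id": "BIGINT", "name": "VARCHAR(50)"}, cast_type="CHAR"))
```

The generated statement would be run against the MySQL source, and the connection would then sync from the `_safe` view instead of the base table.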