Meltano #singer-targets

Eni S

10/21/2024, 9:04 AM

Hi everyone, i am trying to intstall target-snowflake on a server which sits behind a proxy and the isntallation fails beacause when installing the polugin a call is made to this url: https://i32s35df22.execute-api.us-west-2.amazonaws.com. Does anyone know what this url is? All the other plugins install without any problems. Thanks!

steven_wang

10/31/2024, 5:19 PM

Is it possible to specify

load_method

for each stream? For example I want one table to be overwritten every time whereas I want another one to be append-only

Danilo Campana Fuchs

11/06/2024, 3:32 PM

Hello! Is anyone using Pipelinewise target-postgres from the new GitHub repository? It seems the package referenced when installing

meltano add loader target-postgres --variant transferwise

is not updated anymore: https://pypi.org/project/pipelinewise-target-postgres/ The original repository is archived https://github.com/transferwise/pipelinewise-target-postgres And was moved to https://github.com/transferwise/pipelinewise/tree/master/singer-connectors/target-postgres but does not publish to pip anymore

Eni S

11/13/2024, 1:15 PM

Hello! I am using target-snowflake as loader to store some data with flattening enabled. Is there a way to skip flattening on arrays? When flattening is enabled these arrays are converted to string. If no, how do you handle this problem? Thnaks!

Paul Heyns

11/24/2024, 10:50 AM

Hi, has anyone come across this: google.api_core.exceptions.TooManyRequests: 429 Exceeded rate limits: too many table update operations for this table. For more information, see https://cloud.google.com/bigquery/docs/troubleshoot-quotas; reason: rateLimitExceeded, location: table.write, message: Exceeded rate limits: too many table update operations for this table.

Paul Heyns

11/24/2024, 10:50 AM

For BigQueryTarget. It almost seems impossible to replicate big tables with a batch_job

Jeff Chen

11/25/2024, 2:59 PM

Hi all, I am new here. I just wanted to post an issue that I had with Meltano that I still cannot figure it out. It is about the add_record_metadata configuration. Basically when I set add_record_metadata=True, I do see all the following columns showing in this link (https://sdk.meltano.com/en/latest/implementation/record_metadata.html) appear. However when I set it to False, I still see the column, __sdc__table_version in my output which is a mssql table. I started to think whether this is a bug. I Just want to ask how I can remove this column from my output. Here is my meltano.yml: version: 1 default_environment: dev project_id: 99290a45-4a7e-42ae-8208-0b2ca725d652 environments: - name: dev config: plugins: extractors: - name: tap-mssql config: database: LiftStaging host: localhost port: '1434' user: sa select: - crm-contact_extract.id - crm-contact_extract.country_code loaders: - name: target-mssql config: database: LiftStaging default_target_schema: crm host: localhost port: '1434' username: sa add_record_metadata: false - name: staging - name: prod plugins: extractors: - name: tap-mssql variant: wintersrd pip_url: tap-mssql loaders: - name: target-mssql variant: meltanolabs pip_url: git+https://github.com/MeltanoLabs/target-mssql.git mappers: - name: meltano-map-transformer variant: meltano pip_url: git+https://github.com/MeltanoLabs/meltano-map-transform.git mappings: - name: table-map config: stream_maps: crm-contact_extract: __alias__: contact_loaded

Matthew Podolak

11/25/2024, 4:11 PM

Hey guys, has anyone encountered the psycopg out of memory error with the postgres target? This is my meltano config:

Copy code

version: 1
default_environment: prod
project_id: ownr-wh-meltano
environments:
  - name: prod

database_uri: $DATABASE_URL?options=-csearch_path%3Dmeltano

plugins:
  extractors:
  - name: tap-postgres
    variant: meltanolabs
    pip_url: git+<https://github.com/MeltanoLabs/tap-postgres.git>
    select:
      - '!public-pg_stat_statements.*'
      - '!public-pg_stat_statements_info.*'
      - public-*.*
  - name: tap-sage-db
    inherit_from: tap-postgres
    config:
      sqlalchemy_url: $SAGE_DB_URL
      default_replication_method: INCREMENTAL
      filter_schemas: 
        - public
    metadata:
      '*':
        replication-key: updated_at
      '*campaign_history*':
        replication-key: id
      '*email_history':
        replication-key: id
      '*leaving_reasons':
        replication-key: id
  loaders:
  - name: target-postgres
    variant: meltanolabs
    pip_url: meltanolabs-target-postgres
    config:
      sqlalchemy_url: $DATABASE_URL
      load_method: upsert
      validate_records: false
  - name: target-sage-warehouse-schema
    inherit_from: target-postgres
    config:
      default_target_schema: sage
      batch_size_rows: 5000

👀 1

Massimiliano Marcon

11/27/2024, 9:07 AM

Hello! 👋 I'm using

target-postgres

with the following schema definition (shortened):

Copy code

{"type":"SCHEMA","stream":"data_platform_raw-tbii_daily_gmv_nmv","schema":{"properties":{"platform":{"type":["string","null"]},"at_brand_id":{"type":["integer","null"]} ...

The field

at_brand_id

gets created in Postgres with type bigint (i.e int(8)). How can I make

target-postgres

create it as a standard integer (i.e. int(4)) ?

Deepanshu Mishra

11/27/2024, 5:33 PM

Hi everyone, I'm running into an issue with

target-snowflake

variant=

meltanolabs

and a custom Shopify GraphQL tap that we've built. While the tap works fine with target-jsonl, I'm getting the error "Loader failed, Block run completed." when loading into Snowflake. I've attached the relevant logs and config files. Any pointers or troubleshooting tips would be greatly appreciated! Thanks in advance!

meltano-support-logs.txt meltano.yml

✅ 1

Abednego Santoso

11/29/2024, 7:41 AM

hi everyone, is there anyway to use MinIO as target?

Tymo van Rijn

12/02/2024, 11:24 AM

Hi everyone, I'm seeking advice on associating contacts with their primary companies during data ingestion into HubSpot using Meltano. Specifically, I'm utilizing the csv-tap (Im giving the company domain and firstname,lastname,email of the contact person) and the hubspot-target. My goal is to link contacts to their respective primary companies as the data is loaded. Could anyone share best practices or conventional methods to achieve this within this setup? Any guidance would be greatly appreciated!"

Michal Ondras

12/18/2024, 6:49 PM

Hi, seeming some issues with some of my target-snowflake(s). anybody having an issue with timestamp columns i.e.

TIMESTAMP

type instead of

TIMESTAMP_NTZ

? anybody else. just popped recently... thinking it's the new tag possibly?

Tymo van Rijn

01/07/2025, 12:49 PM

Hello guys, I have a question about my target-snowflake, how can I config the target to send my data to a specific table in my database? Right now it just makes a new table with the streamname of my tab. Thanks in advance!

Samuel Nogueira Farrus

01/08/2025, 7:53 PM

Greetings, everyone! I am trying to use target-postgres with load_method overwrite for a project, and it is not actually overwriting, but appending data to the target table, without excluding the previously inserted data. Can someone help me?

Pawel Plaszczak

01/13/2025, 6:33 PM

What is the correct format of target schema and table (schema mapping) in meltano.yml? I am trying to copy a simple table from Oracle source (schema PLASZPA:SIMPLE_SOURCE) to Postgres destination (schema public.simple_target). I know that everything is correct except, I think, the schema-mapping, for which I am not finding any example anywhere. I did correctly test Oracle target connection using:

meltano invoke tap-oracle > out

And this works. Regarding the destination, I did several experiments and I think meltano correctly logs in to the target, because if I change the password to something wrong it throws an error. With the configuration below, it completes "successfully", that is, no error. Even if I set log-level debug, there are no errors. I run:

meltano run tap-oracle target-postgres

And I receive "Block run completed". However, the target does not get populated with any data. What's more, if I change the values of schema-mapping below to something silly, it gets silently ignored. This is how I know that the problem must lie in the final schema-mapping part of the config file below, but I could not find the correct format in the documentation (by the way, why isn't there an error message?)

version: 1

default_environment: dev

project_id: e3cb5953-cc6c-4546-8311-a4d841f8d5b8

environments:

- name: dev

- name: staging

- name: prod

plugins:

extractors:

- name: tap-github

variant: meltanolabs

pip_url: git+<https://github.com/MeltanoLabs/tap-github.git>

- name: tap-oracle

variant: s7clarke10

pip_url: git+<https://github.com/s7clarke10/pipelinewise-tap-oracle.git>

config:

host: ***

port: 1597

common_service_name: ***

user: PLASZPA

service_name: ***

filter_schemas: PLASZPA

filter_tables:

- PLASZPA-SIMPLE_SOURCE

metadata:

PLASZPA-SIMPLE_SOURCE:

replication-method: INCREMENTAL

replication-key: KW_UPDATE_DATE

- name: tap-csv

variant: meltanolabs

pip_url: git+<https://github.com/MeltanoLabs/tap-csv.git>

config:

files:

- ./test.csv

loaders:

- name: target-postgres

variant: meltanolabs

pip_url: meltanolabs-target-postgres

settings:

- name: host

value: localhost

- name: port

value: 5432

- name: user

value: postgres

- name: password

value: mypassword

- name: database

value: mydatabase

- name: schema_mapping

value:

PLASZPA:

target_schema: public

tables:

SIMPLE_SOURCE: simple_target

Update: In the log, I see two entries that may give some clue:

Copy code

025-01-14T08:25:07.708595Z [info     ] time=2025-01-14 09:25:07 name=singer level=INFO message=Resuming Incremental replication from KW_UPDATE_DATE = 2024-06-15T00:00:00.00+00:00 + INTERVAL '0' SECOND cmd_type=elb consumer=False job_name=dev:tap-oracle-to-target-postgres name=tap-oracle producer=True run_id=43b148ed-d4c9-481e-a087-b04c276a6bee stdio=stderr string_id=tap-oracle

2025-01-14T08:25:07.851158Z [info     ] 2025-01-14 09:25:07,850 | INFO     | target-postgres.PLASZPA-SIMPLE_SOURCE | Inserting with SQL: INSERT INTO b7c75d2f_a7e4_4e65_b28b_7bd4ab721870 ("KW_UPDATE_DATE", _sdc_extracted_at, _sdc_received_at, _sdc_batched_at, _sdc_deleted_at, _sdc_sequence, _sdc_table_version, _sdc_sync_started_at) VALUES (:KW_UPDATE_DATE, :_sdc_extracted_at, :_sdc_received_at, :_sdc_batched_at, :_sdc_deleted_at, :_sdc_sequence, :_sdc_table_version, :_sdc_sync_started_at) cmd_type=elb consumer=True job_name=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=43b148ed-d4c9-481e-a087-b04c276a6bee stdio=stderr string_id=target-postgres

I think the first entry above indicates that the reason nothing gets written is that incremental load somehow did set the state to the youngest record of 2024-06-15, eventhough the older records never got written. I wonder how to reset the state but also understand why this have happened and prevent this from happening. The second entry, if I am correct, indicates that the insert statement is is also faulty. I would expect

INSERT INTO public.simple_target ("KW_UPDATE_DATE") VALUES...

Update 2: For the moment, to isolate the problem better, I changed the replication method to FULL_TABLE. I also added to meltano.yml the variable add_record_metadata, following the hints from here. The data still does not get populated, but it seems to me there are many clues in the log. The relevant log entries are:

2025-01-14T10:16:50.359653Z [info     ] time=2025-01-14 11:16:50 name=singer level=INFO message=select SELECT to_char( "KW_UPDATE_DATE" ), ORA_ROWSCN cmd_type=elb consumer=Fals

e job_name=dev:tap-oracle-to-target-postgres name=tap-oracle producer=True run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=tap-oracle

2025-01-14T10:16:50.359847Z [info     ]                                 FROM PLASZPA.SIMPLE_SOURCE cmd_type=elb consumer=False job_name=dev:tap-oracle-to-target-postgres name=t

ap-oracle producer=True run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=tap-oracle

2025-01-14T10:16:50.359980Z [info     ]                                ORDER BY ORA_ROWSCN ASC cmd_type=elb consumer=False job_name=dev:tap-oracle-to-target-postgres name=tap-o

racle producer=True run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=tap-oracle

2025-01-14T10:16:50.458891Z [info     ] time=2025-01-14 11:16:50 name=singer level=INFO message=METRIC: b'{"type":"counter","metric":"record_count","value":2,"tags":{"schema":"

PLASZPA","table":"SIMPLE_SOURCE"}}' cmd_type=elb consumer=False job_name=dev:tap-oracle-to-target-postgres name=tap-oracle producer=True run_id=f98306be-36d6-4ad6-8178-28876657

38c2 stdio=stderr string_id=tap-oracle

2025-01-14T10:16:50.459451Z [info     ] 2025-01-14 11:16:50,459 | INFO     | target-postgres.PLASZPA-SIMPLE_SOURCE | An activate version message for 'PLASZPA-SIMPLE_SOURCE' was

received. Draining... cmd_type=elb consumer=True job_name=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=f98306be-36d6-4ad6-8178-2887665738c2 std

io=stderr string_id=target-postgres

2025-01-14T10:16:50.479410Z [info     ] 2025-01-14 11:16:50,478 | INFO     | target-postgres.PLASZPA-SIMPLE_SOURCE | Inserting with SQL: INSERT INTO "93902c9f_a35d_4875_9ecf_3b

1dfa13b059" ("KW_UPDATE_DATE", _sdc_extracted_at, _sdc_received_at, _sdc_batched_at, _sdc_deleted_at, _sdc_sequence, _sdc_table_version, _sdc_sync_started_at) VALUES (:KW_UPDAT

E_DATE, :_sdc_extracted_at, :_sdc_received_at, :_sdc_batched_at, :_sdc_deleted_at, :_sdc_sequence, :_sdc_table_version, :_sdc_sync_started_at) cmd_type=elb consumer=True job_na

me=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=target-postgres

2025-01-14T10:16:50.505376Z [info     ] 2025-01-14 11:16:50,505 | INFO     | target-postgres.PLASZPA-SIMPLE_SOURCE | Hard delete: False cmd_type=elb consumer=True job_name=dev:

tap-oracle-to-target-postgres name=target-postgres producer=False run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=target-postgres

2025-01-14T10:16:50.519219Z [info     ] 2025-01-14 11:16:50,519 | INFO     | target-postgres      | Target 'target-postgres' completed reading 6 lines of input (1 schemas, 2 re

cords, 0 batch manifests, 2 state messages). cmd_type=elb consumer=True job_name=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=f98306be-36d6-4ad6

-8178-2887665738c2 stdio=stderr string_id=target-postgres

2025-01-14T10:16:50.532380Z [info     ] 2025-01-14 11:16:50,532 | INFO     | target-postgres.PLASZPA-SIMPLE_SOURCE | Cleaning up PLASZPA-SIMPLE_SOURCE cmd_type=elb consumer=Tru

e job_name=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=target-postgres

2025-01-14T10:16:50.532713Z [info     ] 2025-01-14 11:16:50,532 | INFO     | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 2, "tags": {"

stream": "PLASZPA-SIMPLE_SOURCE", "pid": 463639}} cmd_type=elb consumer=True job_name=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=f98306be-36d6

-4ad6-8178-2887665738c2 stdio=stderr string_id=target-postgres

2025-01-14T10:16:50.532847Z [info     ] 2025-01-14 11:16:50,532 | INFO     | target-postgres      | Emitting completed target state {"bookmarks": {"PLASZPA-SIMPLE_SOURCE": {"la

st_replication_method": "FULL_TABLE", "version": 1736849810358, "ORA_ROWSCN": null}}, "currently_syncing": null} cmd_type=elb consumer=True job_name=dev:tap-oracle-to-target-po

stgres name=target-postgres producer=False run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=target-postgres

2025-01-14T10:16:50.542641Z [info     ] Incremental state has been updated at 2025-01-14 10:16:50.542620+00:00.

2025-01-14T10:16:50.626925Z [info     ] Block run completed.           block_type=ExtractLoadBlocks err=None set_number=0 success=True

Pawel Plaszczak

01/14/2025, 11:35 AM

Update: I got partial solution: I found that the data replicated correctly, but it was stored in separate schema called tap_oracle. I am now looking to see how to explicitly tell to the tap-postgres loader what target schema/table it should be using.

Samuel Nogueira Farrus

01/14/2025, 11:38 AM

Greetings! A question for the community: since target-postgres load_method is not implemented yet, anyone has any idea on how to truncate/empty the target table before inserting new data??

✅ 1

Pawel Plaszczak

01/15/2025, 3:13 PM

I am unable to copy data of type NUMBER to target-postgres. I try:

cat input|meltano invoke target-postgres

Where input file has this:

{"type":"SCHEMA","stream":"MYTABLE","schema":{"properties":{"EX_ID":{"multipleOf":1e-38,"type":["null","number"]}, "KW_UPDATE_DATE":{"description":"date","format":"date-time","type":["null","string"]}},"type":"object"},"key_properties":[],"bookmark_properties":["KW_UPDATE_DATE"]}

{"type":"RECORD","stream":"MYTABLE","record":{"EX_ID":1098236,"KW_UPDATE_DATE":"2025-01-14T22:03:49.00+00:00"},"version":1736868956608,"time_extracted":"2025-01-15T13:51:52.393818Z"}

And I receive:

decimal.InvalidOperation: [<class 'decimal.DivisionImpossible'>]

The error disappears if I remove

EX_ID":1098236

which is of type number. Is this a bug? Is this linked to this issue, which seems to be saying that some numeric types are not supported? feat: Support integer types other than `BIGINT` #484 I believe @edgar_ramirez_mondragon is mentioned in this issue. If it is, then what is a solution for now? I am pulling data from Oracle so I can't do much about the incoming data type:

meltano run tap-oracle target-postgres

More generally, I am evaluating Meltano for production project and won't be able to proceed if issues like this block me. For data coming from Oracle, can someone recommend some Meltano targets with reasonable level of maturity which require less troubleshooting? Perhaps other postgres variants, or if not, other relational databases?

✅ 1

Balder Huerta

01/28/2025, 3:49 AM

Hello everyone. I hope you are having a fantastic day! I'm a new user of the tool, and having difficulties while trying to build a pipeline: We’re mirroring data from MS SQL Server to PostgreSQL using

tap-mssql

→

target-postgres

. A table with 1.1B rows and 50+ columns (

detalle_prod_emitidas

) is causing severe performance issues during upserts or incremental syncs. Here is the execution history: Attempt Method Duration Result 1 append-only 10 days Initial load (1.1B rows, 10k rows batches) 2 upsert 6 days Cancelled (timeout, 50M new rows, 10k rows batches) 3 append-only 4+ days Cancelled (timeout, 50M new rows, 10k rows batches) 4 upsert 1+ days Ongoing (30M new rows, testing new index in primary key and 100k batches) Questions for the Community 1. Upsert Optimization: ◦ How to configure

target-postgres

for faster merges? ◦ Recommended indexes/PostgreSQL tuning for tables >1B rows? 2. Incremental Extraction: ◦ Best practices for configuring

tap-mssql

to extract only new inserts (no updates) if

detalle_prod_emitidas

is append-only? 3. Alternative Approaches: ◦ Can we combine

append-only

for this table +

upsert

for others? ◦ Would using `COPY`/`pg_bulkload` for this table outside Meltano be advisable? Infrastructure: ◦ Meltano + PostgreSQL: Ubuntu server (same machine), installed on virtualenv, no containers. ◦ MS SQL: Windows Server (separate machine, same site). Here is the YML file config:

Copy code

version: 1
default_environment: dev
project_id: XXXXXXX-d7a5-4bc3-8f98-XXXXX
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-mssql
    variant: wintersrd
    pip_url: tap-mssql
    config:
      cursor_array_size: 100000
      database: db_source
      host: XXXXXX
      user: XXXX
    select:
    - facturacion-cat_clave_prod_serv.*
    - facturacion-cat_clave_unidad.*
    - facturacion-cat_regimen_fiscal.*
    - facturacion-cp33.*
    - facturacion-cp33_det_imp.*
    - facturacion-detalle_prod_emitidas.*
    - facturacion-nc_emitidas.*
  loaders:
  - name: target-postgres
    variant: meltanolabs
    pip_url: meltanolabs-target-postgres
    config:
      batch_size_rows: 100000
      database: dl_main
      default_target_schema: raw_data
      host: localhost
      load_method: upsert
      port: 5500
      user: postgres

A heads snap of the current log:

Copy code

2025-01-25T18:48:43.632638Z [info     ] time=2025-01-25 19:48:43 name=singer level=INFO message=Server Parameters: version: Microsoft SQL Server 2017 (RTM) - 14.0.1000.169 (X64) cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.632965Z [info     ]     Aug 22 2017 17:04:49          cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.633119Z [info     ]     Copyright (C) 2017 Microsoft Corporation cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.633246Z [info     ]     Enterprise Edition (64-bit) on Windows Server 2022 Standard 10.0 <X64> (Build 20348: ) cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.633398Z [info     ] , lock_timeout: -1,            cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.633546Z [info     ] time=2025-01-25 19:48:43 name=singer level=INFO message=Beginning sync cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.842427Z [info     ] time=2025-01-25 19:48:43 name=singer level=INFO message=Preparing Catalog cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.944933Z [info     ] 2025-01-25 19:48:43,944 | INFO     | target-postgres      | Target 'target-postgres' is listening for input from tap. cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-postgres name=target-postgres producer=False run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=target-postgres
2025-01-25T18:48:44.052279Z [info     ] time=2025-01-25 19:48:44 name=singer level=INFO message=Fetching tables cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.121602Z [info     ] time=2025-01-25 19:48:44 name=singer level=INFO message=Tables fetched, fetching columns cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.511912Z [info     ] time=2025-01-25 19:48:44 name=singer level=INFO message=ARRAYSIZE=100000 cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.564381Z [info     ] time=2025-01-25 19:48:44 name=singer level=INFO message=Columns Fetched cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.599467Z [info     ] time=2025-01-25 19:48:44 name=singer level=INFO message=Catalog ready cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.601060Z [info     ] time=2025-01-25 19:48:44 name=singer level=INFO message=selected-by-default: False cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.601220Z [info     ] time=2025-01-25 19:48:44 name=singer level=INFO message=database-name: facturacion cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.601382Z [info     ] time=2025-01-25 19:48:44 name=singer level=INFO message=is-view: False cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.601573Z [info     ] time=2025-01-25 19:48:44 name=singer level=INFO message=table-key-properties: ['id'] cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.601742Z [info     ] time=2025-01-25 19:48:44 name=singer level=INFO message=selected: True cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.601876Z [info     ] time=2025-01-25 19:48:44 name=singer level=INFO message=replication-method: FULL_TABLE cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql

And a tails snap:

Copy code

2025-01-28T03:41:39.148292Z [info     ] 2025-01-28 04:41:39,148 | INFO     | target-postgres      | Target sink for 'facturacion-detalle_prod_emitidas' is full. Current size is '100000'. Draining... cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-postgres name=target-postgres producer=False run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=target-postgres
2025-01-28T03:41:41.015557Z [info     ] 2025-01-28 04:41:41,012 | INFO     | target-postgres.facturacion-detalle_prod_emitidas | Inserting with SQL: INSERT INTO ec70724a_6f93_4609_90e7_371371dbbc70 (id, idfactura, idproducto, idunidad, cantidad, subtotal, descuento, total, importe, "valorUnitario", iva_tra, tasa_iva_tra, iva_ret, tasa_iva_ret, isr_tra, tasa_isr_tra, isr_ret, tasa_isr_ret, ieps_tra, tasa_ieps_tra, ieps_ret, tasa_ieps_ret, estatus, fecha_alta, descripcion, unidad, tipo_factor_isr_ret, tipo_factor_iva_ret, tipo_factor_iva_tra, tipo_factor_ieps_ret, tipo_factor_ieps_tra, objeto_imp, _sdc_extracted_at, _sdc_received_at, _sdc_batched_at, _sdc_deleted_at, _sdc_sequence, _sdc_table_version, _sdc_sync_started_at) VALUES (:id, :idfactura, :idproducto, :idunidad, :cantidad, :subtotal, :descuento, :total, :importe, :valorUnitario, :iva_tra, :tasa_iva_tra, :iva_ret, :tasa_iva_ret, :isr_tra, :tasa_isr_tra, :isr_ret, :tasa_isr_ret, :ieps_tra, :tasa_ieps_tra, :ieps_ret, :tasa_ieps_ret, :estatus, :fecha_alta, :descripcion, :unidad, :tipo_factor_isr_ret, :tipo_factor_iva_ret, :tipo_factor_iva_tra, :tipo_factor_ieps_ret, :tipo_factor_ieps_tra, :objeto_imp, :_sdc_extracted_at, :_sdc_received_at, :_sdc_batched_at, :_sdc_deleted_at, :_sdc_sequence, :_sdc_table_version, :_sdc_sync_started_at) cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-postgres name=target-postgres producer=False run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=target-postgres
2025-01-28T03:41:56.105391Z [info     ] 2025-01-28 04:41:54,907 | INFO     | singer_sdk.metrics   | METRIC: {"type": "timer", "metric": "batch_processing_time", "value": 15.758918285369873, "tags": {"stream": "facturacion-detalle_prod_emitidas", "pid": 3457125, "status": "succeeded"}} cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-postgres name=target-postgres producer=False run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=target-postgres
2025-01-28T03:42:23.409094Z [info     ] 2025-01-28 04:42:23,408 | INFO     | singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 91050, "tags": {"stream": "facturacion-detalle_prod_emitidas", "pid": 3457125}} cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-postgres name=target-postgres producer=False run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=target-postgres
2025-01-28T03:42:23.861373Z [info     ] time=2025-01-28 04:42:23 name=singer level=INFO message=METRIC: b'{"type":"counter","metric":"record_count","value":156011,"tags":{"database":"facturacion","table":"detalle_prod_emitidas"}}' cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql

Thank you very much in advance! - Balder Huerta.

TomasB

01/29/2025, 3:47 PM

I'm using

tap-postgres

(

meltanolabs

variant) and

target-bigquery

(

youcruit

variant) . Having an issue when the column datatype from postgres side is

date

. Getting the below error

Copy code

joblib.externals.loky.process_executor._RemoteTraceback:                     
Traceback (most recent call last): 
File "/project/.meltano/loaders/target-bigquery/venv/lib/python3.10/site-packages/joblib/_utils.py", line 72, in __call__ 
    return self.func(**kwargs) 
   File "/project/.meltano/loaders/target-bigquery/venv/lib/python3.10/site-packages/joblib/parallel.py", line 598, in __call__ 
     return [func(*args, **kwargs) 
   File "/project/.meltano/loaders/target-bigquery/venv/lib/python3.10/site-packages/joblib/parallel.py", line 598, in <listcomp> 
     return [func(*args, **kwargs) 
   File "/project/.meltano/loaders/target-bigquery/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 470, in _drain_sink 
     self.drain_one(sink)       
   File "/project/.meltano/loaders/target-bigquery/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 460, in drain_one 
     sink.process_batch(draining_status) 
   File "/project/.meltano/loaders/target-bigquery/venv/lib/python3.10/site-packages/target_bigquery/sinks.py", line 354, in process_batch 
     writer(tempfile, self.parsed_schema, avro_records) 
   File "fastavro/_write.pyx", line 796, in fastavro._write.writer 
   File "fastavro/_write.pyx", line 738, in fastavro._write.Writer.write 
   File "fastavro/_write.pyx", line 465, in fastavro._write.write_data 
   File "fastavro/_write.pyx", line 409, in fastavro._write.write_record 
   File "fastavro/_write.pyx", line 463, in fastavro._write.write_data 
   File "fastavro/_write.pyx", line 343, in fastavro._write.write_union 
 ValueError: datetime.datetime(2024, 12, 17, 0, 0) (type <class 'datetime.datetime'>) do not match ['null', 'string'] on field <column_name>

and fails the sync. Is it possible to overwrite the datatype on bigquery side?

✅ 1

Pawel Plaszczak

01/31/2025, 4:03 PM

How do I ensure that the oracle-target uses thin mode, rather than cx_oracle? My problem is that I have table that contains CLOB data type, and that CLOB gets somehow interpreted as VARCHAR. I think that this is because I am using cx_oracle, which can be spotted in the following line of error stream after running "time meltano invoke target-oracle <out": cx_Oracle.DatabaseError: ORA-00910: specified length too long for its datatype of which the root cause can be seen a few lines later:

CREATE TABLE mytable (

description VARCHAR(<tel:4294967295|4294967295> CHAR),

But the field description should be CLOB, as can be seen in the out file (generated earlier by tap-oracle ingest) that I am using as input:

{"type":"SCHEMA","stream": [... omitting not relevant stuff...]

"PL_ID":{"multipleOf":1e-38,"type":["null","number"]},"PROJECT_NBR":{"multipleOf":1e-38,"type":["null","number"]},

I suppose the problem would disappear (hopefully) if I was using thin client rather than cx_Oracle both in tap and target. I achieved success in tap-oracle, it reports using thin. But how to force target-oracle to use thin client? I have already set these env variables, and ensured that oracledb is installed while cx_oracle is not installed:

$ env|grep thin

TARGET_ORACLE_DRIVER_NAME=oracle+thin

ORA_PYTHON_DRIVER_TYPE=thin

TAP_ORACLE_ORA_PYTHON_DRIVER_TYPE=thin

$ python -c "import oracledb; print(oracledb.version)"

2.5.1

$ python -c "import cx_Oracle; print(cx_Oracle.version)"

Traceback (most recent call last):

File "<string>", line 1, in <module>

ModuleNotFoundError: No module named 'cx_Oracle'

I also experimented with various contents of the variable TARGET_ORACLE_DRIVER_NAME, trying to set "thin" or "oracle+thin", for the documentation is unclear, with no effect - cx_oracle is still being used, I think. The documentation is here: https://hub.meltano.com/loaders/target-oracle Any hints?

Tymo van Rijn

02/06/2025, 9:16 AM

Goodmorning people! I have a very important question, I am using a custom tap, which outputs 2 different streams : "Contacts" and "Companies", it works well when using it with a target-jsonl, a Contacts.jsonl is made and a Companies.jsonl. The problem I am having is, when I try to make a pipeline from my custom tap to 'target-hubspot', I can't guide my 2 streams to the correct hubspot columns. How can I manage the column_mapping for my 2 different streams? Thanks in advance!

Samuel Nogueira Farrus

02/11/2025, 12:14 PM

Greetings (again)! I am creating a

tap-postgres target-postgres

EL to read and store a table with 5,500,000+ rows and 45+ columns (and unfortunately, I need them all). With

use_copy: true

, the data ingestion takes an hour to complete, while in other solutions it would take less than 10 minutes (as expected of a

COPY

statement). Any ideas on how can I optimise this??

Reuben (Matatika)

02/12/2025, 2:25 PM

Do SDK targets that process

BATCH

messages support manifests referencing public HTTPS URLs? Or it is still only local filesystem/S3 as per https://sdk.meltano.com/en/v0.44.3/batch.html?

✅ 1

Jun Pei Liang

02/15/2025, 12:42 AM

hi all, i have issue with target-s3. hope you can help! When i ran

meltano run tap-oracle target-parquet

, it executed fine without issue. When i ran

meltano run tap-oracle target-s3

, i got

Run invocation could not be completed as block failed: Loader failed

Lior Naim Alon

03/27/2025, 3:51 PM

question about

target-s3

- I want to build an environment that has multiple sources that all write to the same S3 bucket ideally, i would define the AWS credentials + bucket specifications once, however, i do want each source to write to a separate path in the bucket, for example slackSource -> s3://my-data/data/slack/... Github -> s3://my-data/data/github/.... etc... how can this be achieved without duplicating many s3-targets ? can the prefix setting somehow be changed dynamically per source or per EL operation ?

✅ 1

Michael Sandt

04/28/2025, 1:52 PM

Is there a good pattern for configuring destination column types with the

target-redshift

repos that meltano endorses? Or at least graceful upgrading of columns over time? Context: previously we were using https://github.com/transferwise/pipelinewise-target-redshift - which makes an interesting assumption that all numeric columns should get persisted as

double precision

. This isn't exactly ideal for financial information - which is the use case we're solving for. I took a look at the version that's maintained https://github.com/TicketSwap/target-redshift - and it looks like it also makes the same assumption. https://github.com/TicketSwap/target-redshift/blob/main/target_redshift/connector.py#L237

Ashish Bhat

04/29/2025, 11:28 PM

Is target-s3 installation broken for anyone ? There's some circular dep issue between these packages : aiobotocore, s3fs, idna

visch

05/13/2025, 6:30 PM

@Edgar Ramírez (Arch.dev) can we get https://pypi.org/project/meltanolabs-tap-snowflake/ bumped https://github.com/MeltanoLabs/tap-snowflake/blob/main/pyproject.toml , a release made for it as https://status.snowflake.com/incidents/txclg2cyzq32 impacts anyone that is currently using

meltanolabs-tap-snowflake