Maksymilian Lumenn Stachowiak
10/15/2024, 9:10 PM
Eni S
10/21/2024, 9:04 AM
steven_wang
10/31/2024, 5:19 PM
Is it possible to set a different load_method for each stream? For example, I want one table to be overwritten every time, whereas I want another one to be append-only.
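A sketch of one common way to get per-stream behavior, assuming the meltanolabs target-postgres variant (whose load_method accepts append-only, upsert, or overwrite); the second loader name is illustrative. Define an inherited loader that overrides only load_method:

loaders:
- name: target-postgres
  variant: meltanolabs
  config:
    load_method: overwrite
# hypothetical inherited loader for the append-only streams
- name: target-postgres--append
  inherit_from: target-postgres
  config:
    load_method: append-only

Then route each set of streams through the loader it needs, e.g. meltano run tap-foo target-postgres tap-bar target-postgres--append (tap names illustrative), with each extractor's select limited accordingly.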
11/06/2024, 3:32 PMmeltano add loader target-postgres --variant transferwise
is not updated anymore:
https://pypi.org/project/pipelinewise-target-postgres/
The original repository is archived https://github.com/transferwise/pipelinewise-target-postgres
And was moved to https://github.com/transferwise/pipelinewise/tree/master/singer-connectors/target-postgres but does not publish to pip anymoreEni S
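If anyone else lands here: one workaround, assuming the MeltanoLabs variant covers your needs, is to switch variants in place; settings differ between variants, so expect to review your config afterwards.

meltano remove loader target-postgres
meltano add loader target-postgres --variant meltanolabs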
Eni S
11/13/2024, 1:15 PM
Paul Heyns
11/24/2024, 10:50 AM
Paul Heyns
11/24/2024, 10:50 AM
Jeff Chen
11/25/2024, 2:59 PM
Matthew Podolak
11/25/2024, 4:11 PM
version: 1
default_environment: prod
project_id: ownr-wh-meltano
environments:
- name: prod
database_uri: $DATABASE_URL?options=-csearch_path%3Dmeltano
plugins:
  extractors:
  - name: tap-postgres
    variant: meltanolabs
    pip_url: git+https://github.com/MeltanoLabs/tap-postgres.git
    select:
    - '!public-pg_stat_statements.*'
    - '!public-pg_stat_statements_info.*'
    - public-*.*
  - name: tap-sage-db
    inherit_from: tap-postgres
    config:
      sqlalchemy_url: $SAGE_DB_URL
      default_replication_method: INCREMENTAL
      filter_schemas:
      - public
    metadata:
      '*':
        replication-key: updated_at
      '*campaign_history*':
        replication-key: id
      '*email_history':
        replication-key: id
      '*leaving_reasons':
        replication-key: id
  loaders:
  - name: target-postgres
    variant: meltanolabs
    pip_url: meltanolabs-target-postgres
    config:
      sqlalchemy_url: $DATABASE_URL
      load_method: upsert
      validate_records: false
  - name: target-sage-warehouse-schema
    inherit_from: target-postgres
    config:
      default_target_schema: sage
      batch_size_rows: 5000
Massimiliano Marcon
11/27/2024, 9:07 AM
I'm using target-postgres with the following schema definition (shortened):
{"type":"SCHEMA","stream":"data_platform_raw-tbii_daily_gmv_nmv","schema":{"properties":{"platform":{"type":["string","null"]},"at_brand_id":{"type":["integer","null"]} ...
The field at_brand_id gets created in Postgres with type bigint (i.e. int8). How can I make target-postgres create it as a standard integer (i.e. int4)?
Deepanshu Mishra
11/27/2024, 5:33 PM
I'm using target-snowflake (variant=meltanolabs) and a custom Shopify GraphQL tap that we've built. While the tap works fine with target-jsonl, I'm getting the error "Loader failed, Block run completed." when loading into Snowflake.
I've attached the relevant logs and config files. Any pointers or troubleshooting tips would be greatly appreciated!
Thanks in advance!
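A hedged first step, since the generic "Loader failed" message hides the underlying exception: rerun with debug logging and read the loader's stderr for the real traceback. The tap name below is a placeholder for the custom Shopify tap.

meltano --log-level=debug run tap-shopify-graphql target-snowflake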
Abednego Santoso
11/29/2024, 7:41 AM
Tymo van Rijn
12/02/2024, 11:24 AM
Michal Ondras
12/18/2024, 6:49 PM
Is anyone else seeing columns created with the TIMESTAMP type instead of TIMESTAMP_NTZ? Just popped up recently... thinking it's possibly the new tag?
Tymo van Rijn
01/07/2025, 12:49 PM
Samuel Nogueira Farrus
01/08/2025, 7:53 PM
Pawel Plaszczak
01/13/2025, 6:33 PM
To verify the tap, I run:
meltano invoke tap-oracle > out
And this works. Regarding the destination, I did several experiments, and I think Meltano correctly logs in to the target, because if I change the password to something wrong it throws an error. With the configuration below, it completes "successfully", that is, with no error. Even if I set log-level to debug, there are no errors. I run:
meltano run tap-oracle target-postgres
And I receive "Block run completed". However, the target does not get populated with any data. What's more, if I change the values of schema-mapping below to something silly, it gets silently ignored. This is how I know that the problem must lie in the final schema-mapping part of the config file below, but I could not find the correct format in the documentation (by the way, why isn't there an error message?).
version: 1
default_environment: dev
project_id: e3cb5953-cc6c-4546-8311-a4d841f8d5b8
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-github
    variant: meltanolabs
    pip_url: git+https://github.com/MeltanoLabs/tap-github.git
  - name: tap-oracle
    variant: s7clarke10
    pip_url: git+https://github.com/s7clarke10/pipelinewise-tap-oracle.git
    config:
      host: ***
      port: 1597
      common_service_name: ***
      user: PLASZPA
      service_name: ***
      filter_schemas: PLASZPA
      filter_tables:
      - PLASZPA-SIMPLE_SOURCE
    metadata:
      PLASZPA-SIMPLE_SOURCE:
        replication-method: INCREMENTAL
        replication-key: KW_UPDATE_DATE
  - name: tap-csv
    variant: meltanolabs
    pip_url: git+https://github.com/MeltanoLabs/tap-csv.git
    config:
      files:
      - ./test.csv
  loaders:
  - name: target-postgres
    variant: meltanolabs
    pip_url: meltanolabs-target-postgres
    settings:
    - name: host
      value: localhost
    - name: port
      value: 5432
    - name: user
      value: postgres
    - name: password
      value: mypassword
    - name: database
      value: mydatabase
    - name: schema_mapping
      value:
        PLASZPA:
          target_schema: public
          tables:
            SIMPLE_SOURCE: simple_target
Update: In the log, I see two entries that may give a clue:
2025-01-14T08:25:07.708595Z [info ] time=2025-01-14 09:25:07 name=singer level=INFO message=Resuming Incremental replication from KW_UPDATE_DATE = 2024-06-15T00:00:00.00+00:00 + INTERVAL '0' SECOND cmd_type=elb consumer=False job_name=dev:tap-oracle-to-target-postgres name=tap-oracle producer=True run_id=43b148ed-d4c9-481e-a087-b04c276a6bee stdio=stderr string_id=tap-oracle
2025-01-14T08:25:07.851158Z [info ] 2025-01-14 09:25:07,850 | INFO | target-postgres.PLASZPA-SIMPLE_SOURCE | Inserting with SQL: INSERT INTO b7c75d2f_a7e4_4e65_b28b_7bd4ab721870 ("KW_UPDATE_DATE", _sdc_extracted_at, _sdc_received_at, _sdc_batched_at, _sdc_deleted_at, _sdc_sequence, _sdc_table_version, _sdc_sync_started_at) VALUES (:KW_UPDATE_DATE, :_sdc_extracted_at, :_sdc_received_at, :_sdc_batched_at, :_sdc_deleted_at, :_sdc_sequence, :_sdc_table_version, :_sdc_sync_started_at) cmd_type=elb consumer=True job_name=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=43b148ed-d4c9-481e-a087-b04c276a6bee stdio=stderr string_id=target-postgres
I think the first entry above indicates that the reason nothing gets written is that the incremental load somehow set the state to the youngest record (2024-06-15), even though the older records never got written. I wonder how to reset the state, but I also want to understand why this happened and how to prevent it from happening again.
The second entry, if I read it correctly, indicates that the insert statement is also faulty. I would expect INSERT INTO public.simple_target ("KW_UPDATE_DATE") VALUES...
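On resetting the state: Meltano keys it by the state ID that appears as job_name in the log, and exposes it through the state subcommands, e.g.:

meltano state list
meltano state clear dev:tap-oracle-to-target-postgres

Clearing drops the stored bookmark, so the next incremental run starts from scratch.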
Update 2: For the moment, to isolate the problem better, I changed the replication method to FULL_TABLE. I also added the variable add_record_metadata to meltano.yml, following the hints from here. The data still does not get populated, but it seems to me there are many clues in the log. The relevant log entries are:
2025-01-14T10:16:50.359653Z [info ] time=2025-01-14 11:16:50 name=singer level=INFO message=select SELECT to_char( "KW_UPDATE_DATE" ), ORA_ROWSCN cmd_type=elb consumer=False job_name=dev:tap-oracle-to-target-postgres name=tap-oracle producer=True run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=tap-oracle
2025-01-14T10:16:50.359847Z [info ] FROM PLASZPA.SIMPLE_SOURCE cmd_type=elb consumer=False job_name=dev:tap-oracle-to-target-postgres name=tap-oracle producer=True run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=tap-oracle
2025-01-14T10:16:50.359980Z [info ] ORDER BY ORA_ROWSCN ASC cmd_type=elb consumer=False job_name=dev:tap-oracle-to-target-postgres name=tap-oracle producer=True run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=tap-oracle
2025-01-14T10:16:50.458891Z [info ] time=2025-01-14 11:16:50 name=singer level=INFO message=METRIC: b'{"type":"counter","metric":"record_count","value":2,"tags":{"schema":"PLASZPA","table":"SIMPLE_SOURCE"}}' cmd_type=elb consumer=False job_name=dev:tap-oracle-to-target-postgres name=tap-oracle producer=True run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=tap-oracle
2025-01-14T10:16:50.459451Z [info ] 2025-01-14 11:16:50,459 | INFO | target-postgres.PLASZPA-SIMPLE_SOURCE | An activate version message for 'PLASZPA-SIMPLE_SOURCE' was received. Draining... cmd_type=elb consumer=True job_name=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=target-postgres
2025-01-14T10:16:50.479410Z [info ] 2025-01-14 11:16:50,478 | INFO | target-postgres.PLASZPA-SIMPLE_SOURCE | Inserting with SQL: INSERT INTO "93902c9f_a35d_4875_9ecf_3b1dfa13b059" ("KW_UPDATE_DATE", _sdc_extracted_at, _sdc_received_at, _sdc_batched_at, _sdc_deleted_at, _sdc_sequence, _sdc_table_version, _sdc_sync_started_at) VALUES (:KW_UPDATE_DATE, :_sdc_extracted_at, :_sdc_received_at, :_sdc_batched_at, :_sdc_deleted_at, :_sdc_sequence, :_sdc_table_version, :_sdc_sync_started_at) cmd_type=elb consumer=True job_name=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=target-postgres
2025-01-14T10:16:50.505376Z [info ] 2025-01-14 11:16:50,505 | INFO | target-postgres.PLASZPA-SIMPLE_SOURCE | Hard delete: False cmd_type=elb consumer=True job_name=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=target-postgres
2025-01-14T10:16:50.519219Z [info ] 2025-01-14 11:16:50,519 | INFO | target-postgres | Target 'target-postgres' completed reading 6 lines of input (1 schemas, 2 records, 0 batch manifests, 2 state messages). cmd_type=elb consumer=True job_name=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=target-postgres
2025-01-14T10:16:50.532380Z [info ] 2025-01-14 11:16:50,532 | INFO | target-postgres.PLASZPA-SIMPLE_SOURCE | Cleaning up PLASZPA-SIMPLE_SOURCE cmd_type=elb consumer=True job_name=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=target-postgres
2025-01-14T10:16:50.532713Z [info ] 2025-01-14 11:16:50,532 | INFO | singer_sdk.metrics | METRIC: {"type": "counter", "metric": "record_count", "value": 2, "tags": {"stream": "PLASZPA-SIMPLE_SOURCE", "pid": 463639}} cmd_type=elb consumer=True job_name=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=target-postgres
2025-01-14T10:16:50.532847Z [info ] 2025-01-14 11:16:50,532 | INFO | target-postgres | Emitting completed target state {"bookmarks": {"PLASZPA-SIMPLE_SOURCE": {"last_replication_method": "FULL_TABLE", "version": 1736849810358, "ORA_ROWSCN": null}}, "currently_syncing": null} cmd_type=elb consumer=True job_name=dev:tap-oracle-to-target-postgres name=target-postgres producer=False run_id=f98306be-36d6-4ad6-8178-2887665738c2 stdio=stderr string_id=target-postgres
2025-01-14T10:16:50.542641Z [info ] Incremental state has been updated at 2025-01-14 10:16:50.542620+00:00.
2025-01-14T10:16:50.626925Z [info ] Block run completed. block_type=ExtractLoadBlocks err=None set_number=0 success=True
Pawel Plaszczak
01/14/2025, 11:35 AM
Samuel Nogueira Farrus
01/14/2025, 11:38 AM
Pawel Plaszczak
01/15/2025, 3:13 PM
I run:
cat input | meltano invoke target-postgres
where the input file has this:
{"type":"SCHEMA","stream":"MYTABLE","schema":{"properties":{"EX_ID":{"multipleOf":1e-38,"type":["null","number"]}, "KW_UPDATE_DATE":{"description":"date","format":"date-time","type":["null","string"]}},"type":"object"},"key_properties":[],"bookmark_properties":["KW_UPDATE_DATE"]}
{"type":"RECORD","stream":"MYTABLE","record":{"EX_ID":1098236,"KW_UPDATE_DATE":"2025-01-14T22:03:49.00+00:00"},"version":1736868956608,"time_extracted":"2025-01-15T13:51:52.393818Z"}
And I receive:
decimal.InvalidOperation: [<class 'decimal.DivisionImpossible'>]
The error disappears if I remove "EX_ID":1098236, which is of type number.
Is this a bug?
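For what it's worth, the failure reproduces in plain Python, assuming the target quantizes incoming numbers to the schema's multipleOf of 1e-38 under the default decimal context:

from decimal import Decimal

# Quantizing any integer to 38 fractional digits needs a coefficient
# longer than the default context precision of 28 digits, so this
# raises: decimal.InvalidOperation: [<class 'decimal.DivisionImpossible'>]
Decimal("1098236").quantize(Decimal("1e-38"))

So it is the multipleOf of 1e-38 in the SCHEMA message, combined with a plain integer value, that trips the error.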
Is this linked to this issue, which seems to say that some numeric types are not supported? feat: Support integer types other than `BIGINT` #484
I believe @edgar_ramirez_mondragon is mentioned in this issue.
If it is, what is the workaround for now? I am pulling data from Oracle, so I can't do much about the incoming data type:
meltano run tap-oracle target-postgres
More generally, I am evaluating Meltano for a production project and won't be able to proceed if issues like this block me. For data coming from Oracle, can someone recommend Meltano targets with a reasonable level of maturity that require less troubleshooting? Perhaps other postgres variants or, failing that, other relational databases?
Balder Huerta
01/28/2025, 3:49 AM
I'm running tap-mssql → target-postgres. A table with 1.1B rows and 50+ columns (detalle_prod_emitidas) is causing severe performance issues during upserts and incremental syncs.
Here is the execution history:
Attempt | Method      | Duration | Result
1       | append-only | 10 days  | Initial load (1.1B rows, 10k-row batches)
2       | upsert      | 6 days   | Cancelled (timeout, 50M new rows, 10k-row batches)
3       | append-only | 4+ days  | Cancelled (timeout, 50M new rows, 10k-row batches)
4       | upsert      | 1+ days  | Ongoing (30M new rows, testing a new index on the primary key and 100k batches)
Questions for the Community
1. Upsert Optimization:
   ◦ How can target-postgres be configured for faster merges?
   ◦ Recommended indexes/PostgreSQL tuning for tables >1B rows?
2. Incremental Extraction:
   ◦ Best practices for configuring tap-mssql to extract only new inserts (no updates), given that detalle_prod_emitidas is append-only?
3. Alternative Approaches:
   ◦ Can we combine append-only for this table + upsert for others? (A sketch follows this list.)
   ◦ Would using `COPY`/`pg_bulkload` for this table outside Meltano be advisable?
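A sketch of the combination from question 3, assuming the meltanolabs variant; the inherited plugin names and select patterns are illustrative. Split the extractor with inherit_from and add an append-only loader, so the big table bypasses upsert:

plugins:
  extractors:
  - name: tap-mssql-big
    inherit_from: tap-mssql
    select:
    - facturacion-detalle_prod_emitidas.*
  - name: tap-mssql-rest
    inherit_from: tap-mssql
    select:
    - '!facturacion-detalle_prod_emitidas.*'
    - facturacion-*.*
  loaders:
  - name: target-postgres-append
    inherit_from: target-postgres
    config:
      load_method: append-only

Both pairs can then be chained in one invocation: meltano run tap-mssql-big target-postgres-append tap-mssql-rest target-postgres.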
Infrastructure:
◦ Meltano + PostgreSQL: Ubuntu server (same machine), installed in a virtualenv, no containers.
◦ MS SQL: Windows Server (separate machine, same site).
Here is the YAML config:
version: 1
default_environment: dev
project_id: XXXXXXX-d7a5-4bc3-8f98-XXXXX
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-mssql
    variant: wintersrd
    pip_url: tap-mssql
    config:
      cursor_array_size: 100000
      database: db_source
      host: XXXXXX
      user: XXXX
    select:
    - facturacion-cat_clave_prod_serv.*
    - facturacion-cat_clave_unidad.*
    - facturacion-cat_regimen_fiscal.*
    - facturacion-cp33.*
    - facturacion-cp33_det_imp.*
    - facturacion-detalle_prod_emitidas.*
    - facturacion-nc_emitidas.*
  loaders:
  - name: target-postgres
    variant: meltanolabs
    pip_url: meltanolabs-target-postgres
    config:
      batch_size_rows: 100000
      database: dl_main
      default_target_schema: raw_data
      host: localhost
      load_method: upsert
      port: 5500
      user: postgres
A snippet from the head of the current log:
2025-01-25T18:48:43.632638Z [info ] time=2025-01-25 19:48:43 name=singer level=INFO message=Server Parameters: version: Microsoft SQL Server 2017 (RTM) - 14.0.1000.169 (X64) cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.632965Z [info ] Aug 22 2017 17:04:49 cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.633119Z [info ] Copyright (C) 2017 Microsoft Corporation cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.633246Z [info ] Enterprise Edition (64-bit) on Windows Server 2022 Standard 10.0 <X64> (Build 20348: ) cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.633398Z [info ] , lock_timeout: -1, cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.633546Z [info ] time=2025-01-25 19:48:43 name=singer level=INFO message=Beginning sync cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.842427Z [info ] time=2025-01-25 19:48:43 name=singer level=INFO message=Preparing Catalog cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:43.944933Z [info ] 2025-01-25 19:48:43,944 | INFO | target-postgres | Target 'target-postgres' is listening for input from tap. cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-postgres name=target-postgres producer=False run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=target-postgres
2025-01-25T18:48:44.052279Z [info ] time=2025-01-25 19:48:44 name=singer level=INFO message=Fetching tables cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.121602Z [info ] time=2025-01-25 19:48:44 name=singer level=INFO message=Tables fetched, fetching columns cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.511912Z [info ] time=2025-01-25 19:48:44 name=singer level=INFO message=ARRAYSIZE=100000 cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.564381Z [info ] time=2025-01-25 19:48:44 name=singer level=INFO message=Columns Fetched cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.599467Z [info ] time=2025-01-25 19:48:44 name=singer level=INFO message=Catalog ready cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.601060Z [info ] time=2025-01-25 19:48:44 name=singer level=INFO message=selected-by-default: False cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.601220Z [info ] time=2025-01-25 19:48:44 name=singer level=INFO message=database-name: facturacion cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.601382Z [info ] time=2025-01-25 19:48:44 name=singer level=INFO message=is-view: False cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.601573Z [info ] time=2025-01-25 19:48:44 name=singer level=INFO message=table-key-properties: ['id'] cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.601742Z [info ] time=2025-01-25 19:48:44 name=singer level=INFO message=selected: True cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
2025-01-25T18:48:44.601876Z [info ] time=2025-01-25 19:48:44 name=singer level=INFO message=replication-method: FULL_TABLE cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
And from the tail:
2025-01-28T03:41:39.148292Z [info ] 2025-01-28 04:41:39,148 | INFO | target-postgres | Target sink for 'facturacion-detalle_prod_emitidas' is full. Current size is '100000'. Draining... cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-postgres name=target-postgres producer=False run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=target-postgres
2025-01-28T03:41:41.015557Z [info ] 2025-01-28 04:41:41,012 | INFO | target-postgres.facturacion-detalle_prod_emitidas | Inserting with SQL: INSERT INTO ec70724a_6f93_4609_90e7_371371dbbc70 (id, idfactura, idproducto, idunidad, cantidad, subtotal, descuento, total, importe, "valorUnitario", iva_tra, tasa_iva_tra, iva_ret, tasa_iva_ret, isr_tra, tasa_isr_tra, isr_ret, tasa_isr_ret, ieps_tra, tasa_ieps_tra, ieps_ret, tasa_ieps_ret, estatus, fecha_alta, descripcion, unidad, tipo_factor_isr_ret, tipo_factor_iva_ret, tipo_factor_iva_tra, tipo_factor_ieps_ret, tipo_factor_ieps_tra, objeto_imp, _sdc_extracted_at, _sdc_received_at, _sdc_batched_at, _sdc_deleted_at, _sdc_sequence, _sdc_table_version, _sdc_sync_started_at) VALUES (:id, :idfactura, :idproducto, :idunidad, :cantidad, :subtotal, :descuento, :total, :importe, :valorUnitario, :iva_tra, :tasa_iva_tra, :iva_ret, :tasa_iva_ret, :isr_tra, :tasa_isr_tra, :isr_ret, :tasa_isr_ret, :ieps_tra, :tasa_ieps_tra, :ieps_ret, :tasa_ieps_ret, :estatus, :fecha_alta, :descripcion, :unidad, :tipo_factor_isr_ret, :tipo_factor_iva_ret, :tipo_factor_iva_tra, :tipo_factor_ieps_ret, :tipo_factor_ieps_tra, :objeto_imp, :_sdc_extracted_at, :_sdc_received_at, :_sdc_batched_at, :_sdc_deleted_at, :_sdc_sequence, :_sdc_table_version, :_sdc_sync_started_at) cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-postgres name=target-postgres producer=False run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=target-postgres
2025-01-28T03:41:56.105391Z [info ] 2025-01-28 04:41:54,907 | INFO | singer_sdk.metrics | METRIC: {"type": "timer", "metric": "batch_processing_time", "value": 15.758918285369873, "tags": {"stream": "facturacion-detalle_prod_emitidas", "pid": 3457125, "status": "succeeded"}} cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-postgres name=target-postgres producer=False run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=target-postgres
2025-01-28T03:42:23.409094Z [info ] 2025-01-28 04:42:23,408 | INFO | singer_sdk.metrics | METRIC: {"type": "counter", "metric": "record_count", "value": 91050, "tags": {"stream": "facturacion-detalle_prod_emitidas", "pid": 3457125}} cmd_type=elb consumer=True job_name=dev:tap-mssql-to-target-postgres name=target-postgres producer=False run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=target-postgres
2025-01-28T03:42:23.861373Z [info ] time=2025-01-28 04:42:23 name=singer level=INFO message=METRIC: b'{"type":"counter","metric":"record_count","value":156011,"tags":{"database":"facturacion","table":"detalle_prod_emitidas"}}' cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-postgres name=tap-mssql producer=True run_id=03a6a07c-6906-459e-ac92-9cc176b0aaf9 stdio=stderr string_id=tap-mssql
Thank you very much in advance!
- Balder Huerta.
TomasB
01/29/2025, 3:47 PM
I'm using tap-postgres (meltanolabs variant) and target-bigquery (youcruit variant). Having an issue when the column datatype on the Postgres side is date. Getting the below error:
joblib.externals.loky.process_executor._RemoteTraceback:
Traceback (most recent call last):
  File "/project/.meltano/loaders/target-bigquery/venv/lib/python3.10/site-packages/joblib/_utils.py", line 72, in __call__
    return self.func(**kwargs)
  File "/project/.meltano/loaders/target-bigquery/venv/lib/python3.10/site-packages/joblib/parallel.py", line 598, in __call__
    return [func(*args, **kwargs)
  File "/project/.meltano/loaders/target-bigquery/venv/lib/python3.10/site-packages/joblib/parallel.py", line 598, in <listcomp>
    return [func(*args, **kwargs)
  File "/project/.meltano/loaders/target-bigquery/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 470, in _drain_sink
    self.drain_one(sink)
  File "/project/.meltano/loaders/target-bigquery/venv/lib/python3.10/site-packages/singer_sdk/target_base.py", line 460, in drain_one
    sink.process_batch(draining_status)
  File "/project/.meltano/loaders/target-bigquery/venv/lib/python3.10/site-packages/target_bigquery/sinks.py", line 354, in process_batch
    writer(tempfile, self.parsed_schema, avro_records)
  File "fastavro/_write.pyx", line 796, in fastavro._write.writer
  File "fastavro/_write.pyx", line 738, in fastavro._write.Writer.write
  File "fastavro/_write.pyx", line 465, in fastavro._write.write_data
  File "fastavro/_write.pyx", line 409, in fastavro._write.write_record
  File "fastavro/_write.pyx", line 463, in fastavro._write.write_data
  File "fastavro/_write.pyx", line 343, in fastavro._write.write_union
ValueError: datetime.datetime(2024, 12, 17, 0, 0) (type <class 'datetime.datetime'>) do not match ['null', 'string'] on field <column_name>
and fails the sync. Is it possible to override the datatype on the BigQuery side?
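One hedged workaround, assuming the meltanolabs tap-postgres (SDK-based, so it supports stream maps, whose simpleeval expressions include str()); the stream and column names are illustrative. Casting the date column to a string on the tap side sidesteps the Avro ['null', 'string'] mismatch in the loader, at the cost of losing the date typing in BigQuery:

plugins:
  extractors:
  - name: tap-postgres
    config:
      stream_maps:
        public-my_table:
          my_date_col: str(my_date_col)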
Pawel Plaszczak
01/31/2025, 4:03 PM
target-oracle creates the table as:
CREATE TABLE mytable (
description VARCHAR(4294967295 CHAR),
But the field description should be CLOB, as can be seen in the out file (generated earlier by the tap-oracle ingest) that I am using as input:
{"type":"SCHEMA","stream": [... omitting not relevant stuff ...]
"PL_ID":{"multipleOf":1e-38,"type":["null","number"]},"PROJECT_NBR":{"multipleOf":1e-38,"type":["null","number"]},
I suppose the problem would (hopefully) disappear if I were using the thin client rather than cx_Oracle in both the tap and the target. I achieved success with tap-oracle; it reports using thin. But how do I force target-oracle to use the thin client? I have already set these environment variables and ensured that oracledb is installed while cx_Oracle is not:
$ env|grep thin
TARGET_ORACLE_DRIVER_NAME=oracle+thin
ORA_PYTHON_DRIVER_TYPE=thin
TAP_ORACLE_ORA_PYTHON_DRIVER_TYPE=thin
$ python -c "import oracledb; print(oracledb.version)"
2.5.1
$ python -c "import cx_Oracle; print(cx_Oracle.version)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'cx_Oracle'
I also experimented with various values of the variable TARGET_ORACLE_DRIVER_NAME, trying "thin" or "oracle+thin" (the documentation is unclear), with no effect: cx_Oracle is still being used, I think. The documentation is here: https://hub.meltano.com/loaders/target-oracle
Any hints?
Tymo van Rijn
02/06/2025, 9:16 AM
Samuel Nogueira Farrus
02/11/2025, 12:14 PM
I have a tap-postgres target-postgres EL pipeline that reads and stores a table with 5,500,000+ rows and 45+ columns (and unfortunately, I need them all). With use_copy: true, the data ingestion takes an hour to complete, while other solutions would take less than 10 minutes (as expected of a COPY statement). Any ideas on how I can optimise this?
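One knob worth checking, hedged because the bottleneck isn't confirmed: if each drained batch translates to one COPY, a small batch_size_rows means many small COPY statements plus per-batch overhead. Raising it is a cheap experiment; the value below is illustrative, and memory use grows with batch size:

loaders:
- name: target-postgres
  variant: meltanolabs
  config:
    use_copy: true
    batch_size_rows: 200000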
Reuben (Matatika)
02/12/2025, 2:25 PM
Do BATCH messages support manifests referencing public HTTPS URLs? Or is it still only local filesystem/S3, as per https://sdk.meltano.com/en/v0.44.3/batch.html?
Jun Pei Liang
02/15/2025, 12:42 AM
When I ran meltano run tap-oracle target-parquet, it executed fine without issue. When I ran meltano run tap-oracle target-s3, I got "Run invocation could not be completed as block failed: Loader failed".
Lior Naim Alon
03/27/2025, 3:51 PM
A question about target-s3: I want to build an environment with multiple sources that all write to the same S3 bucket.
Ideally, I would define the AWS credentials + bucket specification once; however, I do want each source to write to a separate path in the bucket, for example:
slackSource -> s3://my-data/data/slack/...
Github -> s3://my-data/data/github/....
etc...
How can this be achieved without duplicating many s3 targets? Can the prefix setting somehow be changed dynamically per source or per EL operation?
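A sketch using inherit_from, so the shared settings live once in a base loader and each source gets a thin child overriding only the path; the prefix-style setting name varies by target-s3 variant, so treat the config keys as illustrative:

loaders:
- name: target-s3
  config:
    bucket: my-data            # shared bucket + credentials, illustrative keys
- name: target-s3--slack
  inherit_from: target-s3
  config:
    prefix: data/slack         # illustrative setting name
- name: target-s3--github
  inherit_from: target-s3
  config:
    prefix: data/github

Each source is then paired with its child, e.g. meltano run tap-slack target-s3--slack tap-github target-s3--github.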
Michael Sandt
04/28/2025, 1:52 PM
Are there any target-redshift repos that Meltano endorses? Or at least ones that handle graceful upgrading of columns over time?
Context: previously we were using https://github.com/transferwise/pipelinewise-target-redshift, which makes an interesting assumption that all numeric columns should get persisted as double precision.
This isn't exactly ideal for financial information, which is the use case we're solving for.
I took a look at the version that's maintained, https://github.com/TicketSwap/target-redshift, and it looks like it makes the same assumption: https://github.com/TicketSwap/target-redshift/blob/main/target_redshift/connector.py#L237
Ashish Bhat
04/29/2025, 11:28 PM