visch
08/23/2023, 2:27 PM
{"type": "SCHEMA", "stream": "Tenant__History", "schema": {"type": "object", "additionalProperties": false, "properties": {"Id": {"type": "string"}, "CreatedDate": {"anyOf": [{"type": "string", "format": "date-time"}]}, "OldValue": {}, "NewValue": {}}}, "key_properties": ["Id"], "bookmark_properties": ["CreatedDate"]}
Specifically OldValue and NewValue: should we just cast these to strings, or should this fail?
https://github.com/singer-io/getting-started/blob/master/docs/SPEC.md#schema-message isn't super clear to me. I think the JSON Schema spec says {} just means anything goes. So maybe we should accept this and just convert the value to a string?
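If we do go the "convert to string" route, it could look roughly like this. This is an illustrative stdlib sketch of the idea, not the Singer SDK's actual record-conforming logic:

```python
import json

def conform_empty_schema(value, prop_schema):
    """Sketch: when a property's schema is {} (JSON Schema's "anything
    goes"), serialize non-string values to a JSON string so the
    destination column gets one stable type."""
    if prop_schema == {}:
        if value is None or isinstance(value, str):
            return value
        return json.dumps(value)
    return value
```

Strings and nulls pass through unchanged; objects, arrays, and numbers become their JSON text.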
visch
08/31/2023, 8:25 PM
meltano run tap-name target-postgres autoidm-utility
the utility just uses pandas to select * from xyz
on the table I wrote to, but really I only do this as I have a lot of other things I use the data for in the DB; the use case you were talking about wouldn't 🤷

visch
09/27/2023, 5:02 PM
databasename from database taps. Question from @burton_dewilde

Henning Holgersen
12/01/2023, 1:38 PM

Kristjan K
01/31/2024, 3:52 PM

Reuben (Matatika)
02/03/2024, 12:56 AM
/home/reuben/Documents/taps/tap-msaccess/tests/test_core.py::TestTapMSAccess::test_tap_stream_transformed_catalog_schema_matches_record failed with error: Test failed with exception
file /home/reuben/.cache/pypoetry/virtualenvs/tap-msaccess-XilzlSH0-py3.8/lib/python3.8/site-packages/singer_sdk/testing/templates.py, line 173
def run( # type: ignore[override]
E fixture 'stream' not found
> available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, config, doctest_namespace, monkeypatch, pytestconfig, record_property, record_testsuite_property, record_xml_attribute, recwarn, resource, runner, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
> use 'pytest --fixtures [testpath]' for help on them.
/home/reuben/.cache/pypoetry/virtualenvs/tap-msaccess-XilzlSH0-py3.8/lib/python3.8/site-packages/singer_sdk/testing/templates.py:173
Anyone run into this issue before? I wonder if there is some condition that needs to be met by the tap that makes the stream fixture available internally... The tap is doing dynamic stream discovery, if that helps narrow it down.

william chaplin
02/05/2024, 7:54 PM

Ian OLeary
03/08/2024, 2:30 PM
from __future__ import annotations

from datetime import date
from typing import Any

class UsersListStream(JobDivaStream):
    def get_new_paginator(self):
        return None

    def get_url_params(
        self,
        context: dict | None,  # noqa: ARG002
        next_page_token: date | None,  # noqa: ANN401
    ) -> dict[str, Any]:
        return {
            "onlyInternalUsers": True,
        }
I added this to my stream class - what else do I need to do? Or am I doing this wrong lol
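As far as I know, the SDK expects `get_new_paginator` to return a paginator object rather than `None`; returning `singer_sdk.pagination.SinglePagePaginator()` is the usual way to say "no pagination". The request loop works roughly like the sketch below, where `OnePagePaginator` is a hypothetical stand-in for the SDK class, not the real implementation:

```python
class OnePagePaginator:
    """Hypothetical stand-in for the SDK's SinglePagePaginator:
    reports finished after the first page."""

    def __init__(self):
        self.finished = False

    def advance(self, response):
        # A single-page paginator is done as soon as one page arrives.
        self.finished = True

def request_pages(paginator, pages):
    """Rough sketch of the SDK's request loop: keep fetching pages
    until the paginator says it is finished."""
    fetched = []
    it = iter(pages)
    while not paginator.finished:
        page = next(it)
        fetched.append(page)
        paginator.advance(page)
    return fetched

# With a one-page paginator, only the first "page" is ever fetched.
```

Returning `None` instead of a paginator is what tends to break this loop, since the SDK has nothing to ask `finished` of.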
Ruben Vereecken
03/27/2024, 3:36 PM
My target (target-airtable) can be installed using poetry install but not using pip install, so meltano install fails.
Apparently pip doesn't have access to the singer-sdk version that is specified, but poetry does. For singer-sdk = "^0.3.2", poetry installs 0.3.18, which doesn't show up under pip index versions singer-sdk. Investigating, thought I'd share as I go.
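For what it's worth, `^0.3.2` resolving to `0.3.18` is just caret semantics: Poetry (like Cargo) reads `^X.Y.Z` as ">= X.Y.Z, < next bump of the leftmost non-zero component". A rough stdlib sketch of that rule (not Poetry's actual resolver code):

```python
def caret_bounds(version: str):
    """Return (lower, upper) version tuples for a caret constraint,
    following the "bump leftmost non-zero component" rule."""
    parts = [int(p) for p in version.split(".")]
    for i, p in enumerate(parts):
        if p != 0:
            upper = parts[:i] + [p + 1] + [0] * (len(parts) - i - 1)
            break
    else:  # all components zero, e.g. "0.0.0"
        upper = parts[:-1] + [parts[-1] + 1]
    return tuple(parts), tuple(upper)

def satisfies_caret(candidate: str, spec: str) -> bool:
    lower, upper = caret_bounds(spec.lstrip("^"))
    v = tuple(int(p) for p in candidate.split("."))
    return lower <= v < upper

# "^0.3.2" allows any 0.3.x with x >= 2, so 0.3.18 qualifies
# while 0.4.0 does not.
```

So Poetry picking 0.3.18 is expected; the real puzzle is why `pip index versions` doesn't list it (possibly a pre-release or index-filtering difference, but that's a guess).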
Mishank Gehlot
03/28/2024, 10:30 AM
2024-03-28T09:32:13.284781Z [info ] 2024-03-28 09:32:13,284 | INFO | root | tablename: "TableName_a3", shortname: A3 string_id=mapper-companyspecifcmapper
2024-03-28T09:32:13.293916Z [info ] time=2024-03-28 09:32:13 name=target_postgres level=INFO message=Table '""tablename_a3""' does not exist. Creating... CREATE TABLE IF NOT EXISTS schema1.""tablename_a3"" ("_sdc_batched_at" timestamp without time zone, "_sdc_deleted_at" character varying, "_sdc_extracted_at" timestamp without time zone, "id" bigint, PRIMARY KEY ("id"))
Traceback (most recent call last):
  File "/home/meltano/.meltano/loaders/target-postgres/venv/bin/target-postgres", line 8, in <module>
    sys.exit(main())
  File "/home/meltano/.meltano/loaders/target-postgres/venv/lib/python3.9/site-packages/target_postgres/__init__.py", line 373, in main
    persist_lines(config, singer_messages)
  File "/home/meltano/.meltano/loaders/target-postgres/venv/lib/python3.9/site-packages/target_postgres/__init__.py", line 219, in persist_lines
    stream_to_sync[stream].sync_table()
  File "/home/meltano/.meltano/loaders/target-postgres/venv/lib/python3.9/site-packages/target_postgres/db_sync.py", line 589, in sync_table
    self.query(query)
  File "/home/meltano/.meltano/loaders/target-postgres/venv/lib/python3.9/site-packages/target_postgres/db_sync.py", line 311, in query
    cur.execute(
  File "/home/meltano/.meltano/loaders/target-postgres/venv/lib/python3.9/site-packages/psycopg2/extras.py", line 146, in execute
    return super().execute(query, vars)
psycopg2.errors.SyntaxError: zero-length delimited identifier at or near """"
LINE 1: CREATE TABLE IF NOT EXISTS schema1.""tablename_a3""...
2024-03-28T09:32:13.359303Z [error ] Loader failed
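A guess at what is happening here: the first log line shows the mapper's tablename as "TableName_a3" with the quotes included, so the configured alias likely carries literal double-quote characters, and when the target wraps that in quotes again Postgres sees `""tablename_a3""`, which starts with the zero-length identifier `""`. A stdlib illustration of the quoting rules (psycopg2's `sql.Identifier` handles this properly):

```python
def naive_quote(name: str) -> str:
    """What you get if a quoted name is wrapped again without escaping."""
    return '"' + name + '"'

def quote_ident(name: str) -> str:
    """Standard SQL identifier quoting: double any embedded quotes,
    then wrap the whole name in double quotes."""
    return '"' + name.replace('"', '""') + '"'

# An alias that already carries literal quote characters reproduces the
# exact identifier from the log when naively wrapped:
broken = naive_quote('"tablename_a3"')   # '""tablename_a3""'
# Dropping the quotes from the mapper config gives a valid identifier:
fixed = quote_ident('tablename_a3')      # '"tablename_a3"'
```

If that's right, the fix is to set the mapper's tablename to `TableName_a3` without the surrounding quote characters and let the target do the quoting.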
Anita Bajariya
03/29/2024, 7:09 AM

Siddu Hussain
05/16/2024, 8:00 PM

Nir Diwakar (Nir)
06/12/2024, 8:54 AM

Reuben (Matatika)
06/17/2024, 4:22 PM
I'm bumping target-mssql to a newer SDK version, where it is not playing nicely with the SQLConnection refactor back in v0.20.0... I'm getting an Invalid object name error, which suggests to me some issue with the target engine/connection implementation (works fine on v0.19.0, breaks on v0.20.0). Would appreciate some pointers if anyone has some experience here. 🙂
connector.py: https://github.com/storebrand/target-mssql/blob/56968e01ab9f7295ef9a0aeeec96459353185d70/target_mssql/connector.py
SDK v0.19.0 <- v0.20.0 diff: https://github.com/meltano/sdk/compare/v0.19.0...v0.20.0?diff=split&w=#diff-db20b40a2eb1ac17938f49d9757779cdb9998129d7fc075a964d20096ddb4b48

Reuben (Matatika)
06/18/2024, 1:58 PM

visch
08/20/2024, 6:59 PM
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "target-hubspot Python Debug",
            "type": "debugpy",
            "request": "launch",
            "module": "target_hubspot",
            "console": "integratedTerminal",
            "args": [
                "--config",
                "config.json",
                "--input",
                "data.singer",
            ]
        }
    ]
}
I've gotten away for this long with pdb and some remote pdb telnet trickery.

ashish
09/18/2024, 4:02 PM
Is there a way to pass private_key in target-snowflake? I see we can use private_key_path though.

Conner Panarella (SpaceCondor)
09/27/2024, 4:34 PM
for record in records:
    insert_record = {column.name: record.get(column.name) for column in columns}
    # No need to check for a KeyError here because the SDK already
    # guarantees that all key properties exist in the record.
    primary_key_value = "".join([str(record[key]) for key in primary_keys])
    insert_records[primary_key_value] = insert_record
Wouldn't this potentially cause records to be erroneously thrown out? We are just concatenating the values of the primary keys to check for duplication, so the following two records would be treated as equivalent:
Record 1:
• Primary Key #1: AB
• Primary Key #2: C
Record 2:
• Primary Key #1: A
• Primary Key #2: BC
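The collision is easy to demonstrate, and keying on a tuple of the key values (or joining with a separator that cannot appear in the data) avoids it. A sketch with hypothetical field names, not the target's actual code:

```python
primary_keys = ["key1", "key2"]
record_1 = {"key1": "AB", "key2": "C"}
record_2 = {"key1": "A", "key2": "BC"}

def concat_key(record):
    """The pattern from the snippet above: bare concatenation."""
    return "".join(str(record[k]) for k in primary_keys)

def tuple_key(record):
    """Collision-safe alternative: a tuple is hashable and preserves
    the per-column boundaries."""
    return tuple(str(record[k]) for k in primary_keys)

# concat_key maps both records to "ABC"; tuple_key keeps them apart.
```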
Conner Panarella (SpaceCondor)
09/30/2024, 8:27 PM
target-postgres
Rafael Santana
10/01/2024, 7:59 PM
I'm syncing data from tap-google-sheets to target-bigquery, but I'm having authentication problems with the target. A JSON service account is requested, but my app only obtains OAuth credentials from users. Is there a way to use target-bigquery only with the credentials obtained from social login?

Nir Diwakar (Nir)
10/02/2024, 2:47 PM
loaders:
  - name: target-elasticsearch
    namespace: target_elasticsearch
    pip_url: -e ./targets/target-elasticsearch
    executable: target-elasticsearch
    config:
      url: ${ES_URL}
      scheme: https
      slack_channel: ${SLACK_CHANNEL}
      slack_token: ${SLACK_TOKEN}
      dagit_url: ${DAGIT_BASE_URL}
      environment: ${ENV}
      tap: ${TAP}
    env:
      MELTANO_ELT_BUFFER_SIZE: 20000000
or this:
loaders:
  - name: target-elasticsearch
    namespace: target_elasticsearch
    pip_url: -e ./targets/target-elasticsearch
    executable: target-elasticsearch
    config:
      url: ${ES_URL}
      scheme: https
      slack_channel: ${SLACK_CHANNEL}
      slack_token: ${SLACK_TOKEN}
      dagit_url: ${DAGIT_BASE_URL}
      environment: ${ENV}
      tap: ${TAP}
    elt:
      buffer_size: 20000000
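For what it's worth, buffer_size belongs to Meltano's own elt settings group, which (as far as I recall, unverified) is configured at the top level of meltano.yml rather than nested under an individual loader, so a third shape may be closer to what Meltano expects:

```yaml
# meltano.yml, top level (not inside the loader definition) - unverified sketch
elt:
  buffer_size: 20000000
```

The MELTANO_ELT_BUFFER_SIZE environment variable form should likewise be set in the shell or a top-level env block, not under the plugin.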
David Peterson
10/29/2024, 8:40 PM

Edgar Ramírez (Arch.dev)
10/30/2024, 9:49 PM

visch
11/13/2024, 3:03 PM
New release of target-apprise (a notifications target): https://github.com/AutoIDM/target-apprise/releases/tag/v0.1.0
It now allows you to dynamically change your Apprise URIs based on incoming data (https://github.com/AutoIDM/target-apprise?tab=readme-ov-file#dynamically-providing-target-emails), so if you want different from/to emails based on data, e.g. your data warehouse decides whether notifications go to the marketing team or the accounting team, you can do that now without having separate notification targets.

mark_johnston
11/14/2024, 8:31 PM
Opened a discussion about target-snowflake and the difference between the default_schema_name and schema config properties, after we had permissions issues with file formats:
https://github.com/MeltanoLabs/target-snowflake/discussions/92#discussioncomment-11259580

joshua_janicas
12/02/2024, 3:21 PM
1. I'm confused by the append-only vs upsert method descriptions for load_method. They sound the same to me... or maybe I need more coffee.
"The method to use when loading data into the destination. append-only will always write all input records whether that record already exists or not."
vs
"upsert will update existing records and insert new records."
Does append-only delete matching records and then just re-insert them, vs a traditional upsert?
2. How would (1) interact with hard_delete? Wouldn't choosing overwrite as the load method already act as a hard_delete? What about the other two options?
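On (1): the two differ in how they treat duplicates. append-only never deletes or updates anything; it just inserts every incoming record, so reruns produce duplicate rows. A toy sketch of the semantics (plain Python, not target-postgres code):

```python
def append_only(table, records):
    """append-only: write every incoming record, even when a row with
    the same key already exists."""
    table.extend(records)

def upsert(table, records, key="id"):
    """upsert: replace the existing row that has the same key,
    otherwise insert a new row."""
    index = {row[key]: i for i, row in enumerate(table)}
    for rec in records:
        if rec[key] in index:
            table[index[rec[key]]] = rec
        else:
            index[rec[key]] = len(table)
            table.append(rec)

# Loading {"id": 1, "v": "new"} into a table already holding id=1:
# append_only ends with two id=1 rows; upsert ends with one, updated.
```

On (2), as I understand it hard_delete is orthogonal: it controls what happens to records flagged as deleted upstream (via _sdc_deleted_at), i.e. whether they are physically removed or only soft-deleted, regardless of the load method.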
Pedro Ceriotti
12/03/2024, 8:36 PM
Does anyone know how the load_method built-in setting works for target-postgres (https://github.com/MeltanoLabs/target-postgres)? It's documented in the README file, but I have been testing different configurations and it doesn't seem to work as expected. Curious if there's support for the overwrite method and/or if there's any alternative to achieve that other than implementing it by myself.
Thanks!

Senne Vanstraelen
12/13/2024, 3:31 PM

Chase Brammer
02/26/2025, 2:52 AM
tap = TapGitLab(config=TEST_CONFIG, catalog=TEST_CATALOG, parse_env_config=True)
Could someone point me to some code that passes the catalog object? I can't find it defined in a file or declared in code. And/or how do I call a specific catalog item from a test? All I can see is tap.sync_all(). TYIA
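A Singer catalog is just JSON, so for tests you can construct a minimal one inline. Roughly (hypothetical stream name and schema, not TapGitLab's actual catalog):

```python
TEST_CATALOG = {
    "streams": [
        {
            "tap_stream_id": "projects",
            "stream": "projects",
            "schema": {
                "type": "object",
                "properties": {"id": {"type": "integer"}},
            },
            "metadata": [
                # Root breadcrumb: mark the whole stream as selected.
                {"breadcrumb": [], "metadata": {"selected": True}},
            ],
        }
    ]
}

# Streams whose root metadata entry is marked selected:
selected = [
    s["tap_stream_id"]
    for s in TEST_CATALOG["streams"]
    if any(
        m["breadcrumb"] == [] and m["metadata"].get("selected")
        for m in s["metadata"]
    )
]
```

With the SDK you can then, from memory (worth checking against the SDK docs), pass this dict as catalog= and sync a single stream via tap.streams["projects"].sync() instead of tap.sync_all().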
Lior Naim Alon
04/27/2025, 10:03 AM
allowed_values=["0-1", "0-2", "0-3", "0-5", "0-48", "0-49", "0-47", "0-4", "0-27", "0-7", "0-8", "0-18", "0-116", "0-54", "0-19"] + [f"2-{i}" for i in range(1, 100000)]
In general I didn't understand the schema of the column_mapping configuration until I encountered errors, so I think the logging aspect is important. I'm not an expert on collaborating in open-source development, so forgive me if my attitude is naive or misguided.
Thanks
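One alternative to enumerating ~100k allowed_values is validating the two shapes with a pattern. A sketch, assuming the valid ids really are the fixed "0-…" list above plus "2-1" through "2-99999":

```python
import re

FIXED_IDS = {
    "0-1", "0-2", "0-3", "0-4", "0-5", "0-7", "0-8", "0-18",
    "0-19", "0-27", "0-47", "0-48", "0-49", "0-54", "0-116",
}
# "2-1" .. "2-99999": one to five digits, no leading zero.
DYNAMIC_ID = re.compile(r"2-[1-9]\d{0,4}")

def is_allowed(value: str) -> bool:
    return value in FIXED_IDS or bool(DYNAMIC_ID.fullmatch(value))
```

This keeps the config schema small and makes the error message ("must match 2-<n>") easier to log than a truncated 100k-element list.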