# best-practices

    Siddu Hussain

    04/11/2025, 10:28 AM
Hi All, I am trying to understand how Meltano internals work. I can't figure out why one of the cases below works but the other doesn't.
• I am using multiprocessing to call a child stream with parent keys. The data sent to the target hits a race condition, and the posted data is sometimes broken:
  ◦ Expected sample (I haven't written it out in full Singer format, but consider these to be RECORD messages):
    {"key1": "value1" }, {"key2" : "value2"}
  ◦ Race-condition sample: the second record is emitted and written to the target before the first record finishes writing:
    {"key1": {"key2" : "value2"}
  ◦ I assumed this was happening because of batching at the tap and data being written to JSONL, but it happens even with batching removed at the tap.
• I also tried multiprocessing outside the tap, calling `meltano el` as a subprocess for each chunk of data, as below. This works without any race condition.
import os
import shlex
import subprocess

def run(time_range):
    try:
        is_backfill = os.environ.get("TAP_ZOOM_IS_BACKFILL")
        start_time, end_time = time_range
        start_time = shlex.quote(start_time)
        end_time = shlex.quote(end_time)
        cmd = (
            # note the "; " separators -- without one after the first export,
            # the two f-strings would run together into a single broken command
            f"export MELTANO_STATE_BACKEND_URI='s3://iflow-prod/state/backfill/zoom/'; "
            f"export TAP_ZOOM_FROM_DATE={start_time} TAP_ZOOM_TO_DATE={end_time} TAP_ZOOM_IS_BACKFILL={is_backfill}; "
            f"source .venv/bin/activate; "
            f"meltano el tap-zoom target-s3-zoom --force;"
        )
        subprocess.run(cmd, shell=True, check=True)
        return {"time_range": time_range, "status": "success"}
    except Exception as e:
        return {"time_range": time_range, "status": "error", "error": str(e)}
I was wondering: if both approaches spin up an individual stdout pipe for each process, why does case 1 hit the race condition while case 2 doesn't? My understanding is that Meltano sends data to stdout as per the target emit code.
• Might be a silly question, but how does Meltano differentiate between emitted logs and emitted Singer records?
• When I spin up a separate process, its stdout should be different from the main process's stdout, right? Or is it the same pipe?
Thanks for taking the time to read through; any help is much appreciated. Have a great day!
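A likely explanation for the difference: in case 1 every worker process inherits the same stdout pipe, and POSIX only guarantees atomic pipe writes up to PIPE_BUF (typically 4 KB), so two workers flushing large Singer messages at the same time can interleave mid-record. In case 2 each `meltano el` subprocess gets its own tap-to-target pipes, so nothing is shared. (Meltano tells logs and records apart because Singer messages go to a plugin's stdout while its logs go to stderr.) Below is a minimal sketch of serializing emits with a shared lock, assuming the multiprocessing-inside-the-tap approach is kept; all names are illustrative:

import json
import multiprocessing
import sys

def emit_record(lock, stream, record):
    # Build the full line first, then write it under the lock so the message
    # reaches the shared stdout pipe as one uninterrupted unit.
    line = json.dumps({"type": "RECORD", "stream": stream, "record": record})
    with lock:  # writes larger than PIPE_BUF are not atomic without this
        sys.stdout.write(line + "\n")
        sys.stdout.flush()

if __name__ == "__main__":
    lock = multiprocessing.Lock()
    workers = [
        multiprocessing.Process(target=emit_record, args=(lock, "child_stream", {"key": i}))
        for i in range(4)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()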

    Anthony Shook

    04/17/2025, 5:06 PM
Random thought problem: let's say I have a table with 1 billion+ rows, and for the longest time I've been replicating it on an auto-incrementing `id` column. However, the table is mutable at the source and has an `updated_at` column, so I'm not catching changes to a row once I've pulled its at-the-moment value. So my situation is this:
• I want to update my Meltano config from using `id` as my replication-key to using `updated_at` as my replication key, with `id` as a value in `table-key-properties`.
• I don't want to start from the beginning of time, because it's absolutely too much data to handle, so I've got to manually set a date.
So the question is — how would you go about it?
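One hedged recipe, with all identifiers invented for illustration: switch the stream's metadata to `updated_at` (keeping `id` in `table-key-properties`), then seed the new bookmark by hand with the `meltano state` commands so the first incremental run starts from your chosen date instead of the beginning of time. The exact bookmark shape varies by tap, so check `meltano state get` on your pipeline first:

# meltano.yml metadata for the stream (names illustrative):
#   metadata:
#     '*-big_table':
#       replication-method: INCREMENTAL
#       replication-key: updated_at
#       table-key-properties: [id]

# find the pipeline's state ID, then overwrite its bookmark
meltano state list
meltano state set --force dev:tap-mysql-to-target-postgres \
  '{"singer_state": {"bookmarks": {"public-big_table": {"replication_key": "updated_at", "replication_key_value": "2025-01-01T00:00:00+00:00"}}}}'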

    Tanner Wilcox

    05/01/2025, 6:50 PM
We need to run a `show arp` command on all routers at our ISP and get that data into our warehouse. Ansible is really good at communicating with network devices: it has profiles for each device type, recognizes when a command starts and ends, and parses that data for you. I don't think there's an equivalent tap for that in Meltano, so I'm wondering what the best way is to merge the two. Maybe I could make a tap that calls out to Ansible; Ansible writes what I want to a JSON file, and my Meltano tap reads that JSON and pumps it into a raw table. That seems kind of weird at that point, because all my tap is doing is reading from a file. I could have Ansible write directly to my Postgres DB, but that feels like it'd be stepping on Meltano's toes. Looking for input.
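If the tap really is just a file reader plus a subprocess call, the glue can stay small. A hypothetical sketch (playbook and file names invented) that shells out to Ansible and re-emits its JSON dump as Singer messages on stdout:

import json
import subprocess

# Run a playbook that collects `show arp` from every router and dumps the
# parsed entries to arp.json (playbook name and output path are made up).
subprocess.run(["ansible-playbook", "collect_arp.yml"], check=True)

# Emit a permissive SCHEMA, then one RECORD per ARP entry.
print(json.dumps({"type": "SCHEMA", "stream": "arp", "key_properties": [],
                  "schema": {"type": "object", "properties": {}}}))
with open("arp.json") as f:
    for entry in json.load(f):
        print(json.dumps({"type": "RECORD", "stream": "arp", "record": entry}))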

    Don Venardos

    05/05/2025, 10:57 PM
What is the best practice for setting up a different sync schedule for a subset of tables in a database? Should that be a separate project, or maybe just set up as an environment in the same project? The use case is that we have a table that receives periodic large bulk inserts, and we don't want that to interfere with the other tables, which have small changes that we want to replicate on a faster interval.
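A single project can usually handle this without a second repo: `inherit_from` gives you two logical extractors over the same connection, each with its own `select` and schedule. A sketch with invented plugin and table names:

plugins:
  extractors:
  - name: tap-mssql              # base plugin holds the connection config
  - name: tap-mssql--fast
    inherit_from: tap-mssql
    select:
    - small_table_*.*
  - name: tap-mssql--bulk
    inherit_from: tap-mssql
    select:
    - big_bulk_table.*
jobs:
- name: fast-sync
  tasks: [tap-mssql--fast target-postgres]
- name: bulk-sync
  tasks: [tap-mssql--bulk target-postgres]
schedules:
- name: fast-sync-hourly
  interval: '@hourly'
  job: fast-sync
- name: bulk-sync-nightly
  interval: '@daily'
  job: bulk-sync

Each pipeline keeps its own state ID, so the two cadences don't trample each other's bookmarks.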

    Siba Prasad Nayak

    05/08/2025, 1:42 PM
Hi Team, I am getting an issue with `meltano invoke tap-mysql`. MySQL is installed on my local machine (localhost).
- name: tap-mysql
  namespace: tap_mysql
  pip_url: ./connectors/tap-mysql
  executable: tap-mysql
  capabilities:
  - about
  - batch
  - stream-maps
  - schema-flattening
  - discover
  - catalog
  - state
  settings:
  - name: host
    kind: string
    value: localhost
  - name: port
    kind: integer
    value: 3306  # or whatever port your MySQL is running on
  - name: user
    value: root
  - name: password
    kind: string
    value: *******  # use an environment variable!
    sensitive: true
  - name: database
    kind: string
    value: world
  - name: is_vitess
    kind: boolean
    value: false
    Error:
    (sibaVenv) PS C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend> meltano invoke tap-mysql
    2025-05-08T13:38:51.175173Z [warning  ] Failed to create symlink to 'meltano.exe': administrator privilege required
    2025-05-08T13:38:51.189778Z [info     ] Environment 'dev' is active   
    Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
    join our friendly Slack community.
    
    Catalog discovery failed: command ['C:\\Siba_\\Work\\POC_ConnectorFactory\\Gerrit\\Connector_Factory_Development\\meltano-backend\\.meltano\\extractors\\tap-mysql\\venv\\Scripts\\tap-mysql.exe', '--config', 'C:\\Siba_\\Work\\POC_ConnectorFactory\\Gerrit\\Connector_Factory_Development\\meltano-backend\\.meltano\\run\\tap-mysql\\tap.79a26421-4773-4b39-a35d-577aa37522b8.config.json', '--discover'] returned 1 with stderr:
     Traceback (most recent call last):
      File "<frozen runpy>", line 198, in _run_module_as_main
      File "<frozen runpy>", line 88, in _run_code
      File "C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-mysql\venv\Scripts\tap-mysql.exe\__main__.py", line 7, in <module>
      File "C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-mysql\venv\Lib\site-packages\click\core.py", line 1161, in __call__
        return self.main(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-mysql\venv\Lib\site-packages\click\core.py", line 1081, in main
        with self.make_context(prog_name, args, **extra) as ctx:
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-mysql\venv\Lib\site-packages\click\core.py", line 949, in make_context
        self.parse_args(ctx, args)
      File "C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-mysql\venv\Lib\site-packages\click\core.py", line 1417, in parse_args
        value, args = param.handle_parse_result(ctx, opts, args)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-mysql\venv\Lib\site-packages\click\core.py", line 2403, in handle_parse_result
        value = self.process_value(ctx, value)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-mysql\venv\Lib\site-packages\click\core.py", line 2365, in process_value
        value = self.callback(ctx, self, value)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-mysql\venv\Lib\site-packages\singer_sdk\tap_base.py", line 554, in cb_discover
        tap.run_discovery()
      File "C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-mysql\venv\Lib\site-packages\singer_sdk\tap_base.py", line 309, in run_discovery
        catalog_text = self.catalog_json_text
                       ^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-mysql\venv\Lib\site-packages\singer_sdk\tap_base.py", line 329, in catalog_json_text
        return dump_json(self.catalog_dict, indent=2)
                         ^^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-mysql\venv\Lib\site-packages\tap_mysql\tap.py", line 333, in catalog_dict
        result["streams"].extend(self.connector.discover_catalog_entries())
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-mysql\venv\Lib\site-packages\singer_sdk\connectors\sql.py", line 998, in discover_catalog_entries
        (reflection.ObjectKind.TABLE, False),
         ^^^^^^^^^^^^^^^^^^^^^
    AttributeError: module 'sqlalchemy.engine.reflection' has no attribute 'ObjectKind'
    
    (sibaVenv) PS C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend>
    (sibaVenv) PS C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend> python -m pip show SQLAlchemy
    Name: SQLAlchemy
    Version: 2.0.39
    Summary: Database Abstraction Library
Home-page: https://www.sqlalchemy.org
Author: Mike Bayer
Author-email: mike_mp@zzzcomputing.com
    License: MIT
    Location: C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\sibaVenv\Lib\site-packages
    Requires: greenlet, typing-extensions
    Required-by: alembic, meltano
    (sibaVenv) PS C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend> meltano --version
    meltano, version 3.7.4
    (sibaVenv) PS C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend>
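Worth noting: `ObjectKind` only exists in SQLAlchemy 2.x, and the `pip show` above inspects sibaVenv (the Meltano venv), not the tap's isolated venv, which is where the AttributeError is raised. A way to confirm and pin it (version spec illustrative):

# check the tap's own virtualenv, which is separate from sibaVenv
.\.meltano\extractors\tap-mysql\venv\Scripts\pip.exe show sqlalchemy

# if it resolved SQLAlchemy 1.4, constrain it in meltano.yml, e.g.
#   pip_url: ./connectors/tap-mysql 'sqlalchemy>=2.0,<3'
# then rebuild the tap's venv
meltano install extractor tap-mysql --clean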

    Tanner Wilcox

    05/09/2025, 10:36 PM
I need to drop my raw schema every time before running a specific tap. In my research I learned about macros, and I wrote this one based on a macro I saw in a blog post:
    {%- macro drop_schema() -%}
        {%- set drop_query -%}
            drop schema {{ target.schema }}
        {%- endset -%}
        {% do run_query(drop_query) %}
    {%- endmacro -%}
I'm assuming I should be able to do something like this:
meltano run dbt:run-operation:drop_schema sonar warehouse
but I get an error saying it can't find `drop_schema`. I have it in `./macros/`. I'm assuming I need to put it in my meltano.yml under my dbt transformer section. Maybe it should go under utilities? I'm at a loss.
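`dbt run-operation` only sees macros inside the dbt project's `macro-paths` (for the Meltano dbt plugins that's typically the macros/ folder inside your transform directory, not ./macros/ at the Meltano project root), and the macro name is passed as an argument rather than a colon-separated command. A sketch, with the plugin name assumed to be dbt-postgres:

# move the macro to where dbt looks for it
mv macros/drop_schema.sql transform/macros/drop_schema.sql

# invoke it as an argument to run-operation
meltano invoke dbt-postgres run-operation drop_schema

If you want it runnable inside a pipeline, you can also wrap it as a `commands:` entry on the plugin and call `meltano run dbt-postgres:drop-raw-schema`.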

    Siba Prasad Nayak

    05/14/2025, 6:21 AM
Hi Team, I am facing an issue with tap-sftp.
    (venv) PS C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend>
     meltano --log-level=debug invoke tap-sftp
    2025-05-14T06:09:15.617690Z [warning  ] Failed to create symlink to 'meltano.exe': administrator privilege required
    2025-05-14T06:09:15.624021Z [debug    ] Meltano 3.6.0, Python 3.12.3, Windows (AMD64)
    2025-05-14T06:09:15.632565Z [debug    ] Looking up time zone info from registry
    2025-05-14T06:09:15.651188Z [info     ] Environment 'dev' is active   
    2025-05-14T06:09:15.707195Z [debug    ] Creating DB engine for project at 'C:\\Siba_\\Work\\POC_ConnectorFactory\\Gerrit\\Connector_Factory_Development\\meltano-backend' with DB URI 'sqlite:/C:\\Siba_\\Work\\POC_ConnectorFactory\\Gerrit\\Connector_Factory_Development\\meltano-backend\\.meltano/meltano.db'
    2025-05-14T06:09:16.045430Z [debug    ] Skipped installing extractor 'tap-sftp'
    2025-05-14T06:09:16.046429Z [debug    ] Skipped installing 1/1 plugins
    2025-05-14T06:09:16.231095Z [debug    ] Created configuration at C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\run\tap-sftp\tap.31d90bf6-66b2-4bc2-9d17-9305905bbcdf.config.json
    2025-05-14T06:09:16.235112Z [debug    ] Could not find tap.properties.json in C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-sftp\tap.properties.json, skipping.
    2025-05-14T06:09:16.238126Z [debug    ] Could not find tap.properties.cache_key in C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-sftp\tap.properties.cache_key, skipping.
    2025-05-14T06:09:16.240124Z [debug    ] Could not find state.json in C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-sftp\state.json, skipping.
    2025-05-14T06:09:16.248129Z [debug    ] Invoking: ['C:\\Siba_\\Work\\POC_ConnectorFactory\\Gerrit\\Connector_Factory_Development\\meltano-backend\\.meltano\\extractors\\tap-sftp\\venv\\Scripts\\tap-sftp.exe', '--config', 'C:\\Siba_\\Work\\POC_ConnectorFactory\\Gerrit\\Connector_Factory_Development\\meltano-backend\\.meltano\\run\\tap-sftp\\tap.31d90bf6-66b2-4bc2-9d17-9305905bbcdf.config.json']
    C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-sftp\venv\Lib\site-packages\paramiko\pkey.py:59: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from cryptography.hazmat.primitives.ciphers.algorithms in 48.0.0.
      "cipher": algorithms.TripleDES,
    C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-sftp\venv\Lib\site-packages\paramiko\transport.py:219: CryptographyDeprecationWarning: Blowfish has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.Blowfish and will be removed from cryptography.hazmat.primitives.ciphers.algorithms in 45.0.0.
      "class": algorithms.Blowfish,
    C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\extractors\tap-sftp\venv\Lib\site-packages\paramiko\transport.py:243: CryptographyDeprecationWarning: TripleDES has been moved to cryptography.hazmat.decrepit.ciphers.algorithms.TripleDES and will be removed from cryptography.hazmat.primitives.ciphers.algorithms in 48.0.0.
      "class": algorithms.TripleDES,
    2025-05-14T06:09:17.479363Z [debug    ] Deleted configuration at C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend\.meltano\run\tap-sftp\tap.31d90bf6-66b2-4bc2-9d17-9305905bbcdf.config.json
    (venv) PS C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend>
I have created a basic configuration in the meltano.yml file.
- name: tap-sftp
  namespace: tap_sftp
  pip_url: ./connectors/tap-sftp
  executable: tap-sftp
  config:
    host: 10.148.155.30
    port: 22
    username: ubuntu
    start_date: 2025-05-13
    private_key_file: bridgex.pem
    tables:
    - table_name: single_file_test
      search_prefix: /home/ubuntu
      search_pattern: 'wget-log'
    (venv) PS C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend>
     meltano config tap-sftp list
    2025-05-14T06:16:25.653934Z [warning  ] Failed to create symlink to 'meltano.exe': administrator privilege required
    2025-05-14T06:16:25.677427Z [info     ] The default environment 'dev' will be ignored for `meltano config`. To configure a specific environment, please use the option `--environment=<environment name>`.
    
    Custom, possibly unsupported by the plugin:
    host [env: TAP_SFTP_HOST] current value: '10.148.155.30' (from `meltano.yml`)
    port [env: TAP_SFTP_PORT] current value: 22 (from `meltano.yml`)
    username [env: TAP_SFTP_USERNAME] current value: 'ubuntu' (from `meltano.yml`)
    start_date [env: TAP_SFTP_START_DATE] current value: '2025-05-13' (from `meltano.yml`)
    private_key_file [env: TAP_SFTP_PRIVATE_KEY_FILE] current value: 'bridgex.pem' (from `meltano.yml`)
    tables [env: TAP_SFTP_TABLES] current value: [{'table_name': 'single_file_test', 'search_prefix': '/home/ubuntu', 'search_pattern': 'wget-log'}] (from `meltano.yml`)
    (venv) PS C:\Siba_\Work\POC_ConnectorFactory\Gerrit\Connector_Factory_Development\meltano-backend>
Can anyone please help?

    Andy Carter

    05/21/2025, 12:13 PM
    When I started with my meltano deployment I didn't have a separate systemdb, so used Azure for state store. Now I've seen the light, and moved to a system db. Is there any good reason to maintain a separate state store in Azure storage? Or is it just unnecessary complication?

    Krisna Aditya

    05/22/2025, 4:03 AM
Hi everyone, I've been using the MeltanoLabs version of tap-postgres for hourly log-based replication. I wonder, is there any way to parameterize the replication slot name? Instead of the hardcoded `tappostgres`, can it be named something else? I might run into a scenario where I must replicate the same source to two separate destinations while a data migration is happening. Thank you!

    Siba Prasad Nayak

    05/23/2025, 10:38 AM
Hi Team, I am using tap-sftp from Singer (https://github.com/singer-io/tap-sftp) and getting an issue:
    paramiko.ssh_exception.SSHException: Incompatible ssh peer (no acceptable host key)
To address it, I made a change in client.py:
    self.transport._preferred_keys = ('ssh-rsa', 'ecdsa-sha2-nistp256', 'ecdsa-sha2-nistp384', 'ecdsa-sha2-nistp521', 'ssh-ed25519', 'ssh-dss')
def __try_connect(self):
    if not self.__active_connection:
        try:
            self.transport = paramiko.Transport((self.host, self.port))
            self.transport.use_compression(True)
            self.transport._preferred_keys = ('ssh-rsa', 'ecdsa-sha2-nistp256', 'ecdsa-sha2-nistp384', 'ecdsa-sha2-nistp521', 'ssh-ed25519', 'ssh-dss')
            self.transport.connect(username=self.username, pkey=self.key)
            self.sftp = paramiko.SFTPClient.from_transport(self.transport)
        except (AuthenticationException, SSHException) as ex:
            # retry without the private key if key-based auth fails
            self.transport.close()
            self.transport = paramiko.Transport((self.host, self.port))
            self.transport.use_compression(True)
            self.transport._preferred_keys = ('ssh-rsa', 'ecdsa-sha2-nistp256', 'ecdsa-sha2-nistp384', 'ecdsa-sha2-nistp521', 'ssh-ed25519', 'ssh-dss')
            self.transport.connect(username=self.username, pkey=None)
            self.sftp = paramiko.SFTPClient.from_transport(self.transport)
        self.__active_connection = True
        # get the underlying channel ('socket') to set the request timeout
        socket = self.sftp.get_channel()
        socket.settimeout(self.request_timeout)
Even after making this change, it doesn't resolve the issue.
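"no acceptable host key" means the two sides negotiated zero common host key algorithms, so a `_preferred_keys` list only helps if it actually intersects what the server offers (running `ssh -vv` against the host shows the server's list). A sketch using paramiko's public security-options API instead of the private attribute, reusing the host, user, and key file from the config above:

import paramiko

transport = paramiko.Transport(("10.148.155.30", 22))
opts = transport.get_security_options()
# keep only host key algorithms the server actually advertises; note that
# modern servers usually offer rsa-sha2-* rather than plain ssh-rsa
opts.key_types = ("ssh-ed25519", "ecdsa-sha2-nistp256", "rsa-sha2-512", "rsa-sha2-256")
transport.connect(username="ubuntu",
                  pkey=paramiko.RSAKey.from_private_key_file("bridgex.pem"))
sftp = paramiko.SFTPClient.from_transport(transport)
print(sftp.listdir("/home/ubuntu"))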

    Steven Searcy

    05/30/2025, 8:58 PM
Looking to modularize my Meltano project. Is using `pkl` modules still considered a good practice here? I know there are some limitations to `include_paths`.

    Siba Prasad Nayak

    06/06/2025, 5:14 AM
Team, do we have a tap-onedrive on the Meltano side? Has anyone ever tried it?

    Bruno Arnabar

    06/06/2025, 3:35 PM
Is there a best-practice approach to applying a conditional when using an extractor?

    Siba Prasad Nayak

    06/06/2025, 5:47 PM
Hi Team, I would like to introduce data encryption to my pipeline. Is there any default encryption mechanism available from Meltano?
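Meltano itself pipes plain Singer JSON between tap and target, so there's no built-in payload encryption; at-rest encryption is usually the destination's job. For field-level protection in flight, inline stream maps can hash or drop sensitive columns. A sketch with invented stream and field names:

plugins:
  extractors:
  - name: tap-example
    stream_maps:
      users:
        email: md5(email)   # one-way hash of a PII column
        ssn: __NULL__       # or null the value out entirely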

    Tanner Wilcox

    06/10/2025, 9:07 PM
    https://github.com/MeltanoLabs/tap-csv Is there a way to configure this tap to drop the table before it runs every time?
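Dropping is really the loader's side of the contract. SDK-based loaders such as MeltanoLabs target-postgres expose a `load_method` setting whose `overwrite` mode replaces existing rows on each run, which may be close enough to a drop; the loader name below is an assumption about the pipeline:

plugins:
  loaders:
  - name: target-postgres
    config:
      load_method: overwrite   # instead of the default append/upsert behavior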

    Sac

    06/25/2025, 3:05 PM
    Hi everyone, Is there a simple way to change the timezone for the logging? I’d like to keep everything as it is—just use my local timezone instead of the default. Could you give me a hint or point me in the right direction? Thank you in advance!

    Steven Searcy

    06/25/2025, 6:26 PM
    Hello! I am heavily considering running Meltano inside of Dagster. I've never used Dagster before, so all tips and recommendations are welcome if anyone has any. Or maybe things I should know before proceeding down this route? I am currently just running Meltano standalone inside of an Ubuntu instance.

    Siba Prasad Nayak

    06/27/2025, 8:11 AM
Hi everyone, I'm planning to build a common module to validate tap and target configurations (credentials, connection, etc.) before a pipeline runs. The main challenge is that every connector has a different API or method for testing a connection. My first thought is a large `switch` statement to handle each connector individually, but that doesn't seem scalable. Has anyone approached this problem before? I'd love to hear any recommendations or design patterns you'd suggest instead of a massive `switch` case. And is there anything Meltano offers that could solve this? Thanks!
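Two pointers: `meltano config <plugin> test` already performs a basic connection check for many plugins, and for a custom module a registry keyed by connector name tends to scale better than one big switch, since each connector contributes a single function. A minimal sketch (validator bodies invented):

from typing import Callable

VALIDATORS: dict[str, Callable[[dict], None]] = {}

def validator(name: str):
    # decorator that registers a connection checker for one connector
    def wrap(fn: Callable[[dict], None]):
        VALIDATORS[name] = fn
        return fn
    return wrap

@validator("tap-mysql")
def check_mysql(config: dict) -> None:
    import pymysql  # hypothetical dependency for this validator
    pymysql.connect(host=config["host"], port=config["port"],
                    user=config["user"], password=config["password"]).close()

def validate(plugin: str, config: dict) -> None:
    if plugin not in VALIDATORS:
        raise ValueError(f"no validator registered for {plugin!r}")
    VALIDATORS[plugin](config)  # raises on connection failure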

    Ellis Valentiner

    07/11/2025, 2:34 PM
Does anyone have best-practice recommendations for managing their meltano.yml file? Specifically, ours is very verbose and contains a lot of duplication. For instance, we have an inline stream map on every table to add a source database identifier. If we try to define that with a `*` to apply to all tables, it errors because it doesn't respect the contents of `select`. This means that for each extractor we have the same table identified in the `stream_maps`, `select`, and `metadata` blocks, so we're constantly jumping around the YAML to make updates, and it's very easy for devs to miss one of the three places that need updating.
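One partial mitigation inside plain YAML: define the stream-map fragment once with an anchor and alias it per table, so the duplicated body at least lives in one place (table names and the stream-map expression are invented):

plugins:
  extractors:
  - name: tap-postgres
    select:
    - public-users.*
    - public-orders.*
    stream_maps:
      public-users: &add_source_db
        source_db: "'db_main'"   # constant column via a quoted string literal
      public-orders: *add_source_db

It doesn't remove the need to list each table in `select`, `stream_maps`, and `metadata`, but it keeps the repeated content single-sourced.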

    Siba Prasad Nayak

    07/15/2025, 4:18 PM
Hi Team, I have created a custom utility like below.
- name: salesforce-discover
  namespace: salesforce_discover
  commands:
    hello:
      executable: echo
      args: ["Hello World"]
    verify:
      executable: python
      args: ["bin/salesforce_discover.py", "$CONFIG_PATH", "verify"]
The requirement is that I want to override the discovery output for Salesforce instead of using the existing `--discover` functionality. But when I execute these commands, I get this error:
    (sibaVenv) PS C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304> meltano invoke salesforce-discover:hello
    2025-07-15T16:09:16.508600Z [warning  ] Failed to create symlink to 'meltano.exe': administrator privilege required
    2025-07-15T16:09:16.536909Z [info     ] Environment 'dev' is active   
    Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
    join our friendly Slack community.
    
    'CommentedSeq' object has no attribute 'read'
    (sibaVenv) PS C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304> meltano invoke salesforce-discover:verify
    2025-07-15T16:17:59.700345Z [warning  ] Failed to create symlink to 'meltano.exe': administrator privilege required
    2025-07-15T16:17:59.743028Z [info     ] Environment 'dev' is active   
    Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
    join our friendly Slack community.
    
    'CommentedSeq' object has no attribute 'read'
    (sibaVenv) PS C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304>
From a quick search on the internet, it seems there is some issue with parsing the YAML file, but I'm not sure where exactly. If anyone has any idea about this, please help.
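`CommentedSeq` is ruamel.yaml's list type, so the error is consistent with Meltano finding a YAML list where it expects a string: plugin command `args` are written as a single shell-style string that Meltano splits itself. A sketch of the same commands with string args, worth trying if this diagnosis is right:

- name: salesforce-discover
  namespace: salesforce_discover
  commands:
    hello:
      executable: echo
      args: Hello World
    verify:
      executable: python
      args: bin/salesforce_discover.py $CONFIG_PATH verify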

    Siba Prasad Nayak

    07/17/2025, 8:11 PM
Hi Team, I am working on a new target, target-jira, and facing an issue when the target processes records. Even though my messages are in the proper order (first the schema, then the records), I still get an error:
singer_sdk.exceptions.RecordsWithoutSchemaException: A record for stream 'accounts' was encountered before a corresponding schema. Check that the Tap correctly implements the Singer spec.
Full error log:
    $ dos2unix out_cleaned.jsonl
    dos2unix: converting file out_cleaned.jsonl to Unix format...
    (sibaVenv)
    SiNayak@INBAWN172239 MINGW64 /c/Siba_/Work/LocalSetup/Backend/backend_05062025/backend_2304
    $ cat out_cleaned.jsonl | meltano invoke target-jira
    2025-07-17T20:07:23.746313Z [warning  ] Failed to create symlink to 'meltano.exe': administrator privilege required
    2025-07-17T20:07:23.761502Z [info     ] Environment 'dev' is active
    INFO:target-jira:target-jira v0.0.1, Meltano SDK v0.42.1
    INFO:target-jira:Skipping parse of env var settings...
    2025-07-18 01:37:28,712 | INFO     | target-jira          | Target 'target-jira' is listening for input from tap.
    2025-07-18 01:37:28,713 | INFO     | target-jira          | Adding sink for stream: accounts with schema keys: ['Id', 'Name']
    2025-07-18 01:37:28,713 | INFO     | target-jira.accounts | Initializing target sink for stream 'accounts'...
    2025-07-18 01:37:28,716 | INFO     | target-jira          | Sink added: <target_jira.sinks.JiraSink object at 0x000001B75A72D070>
    Traceback (most recent call last):
      File "<frozen runpy>", line 198, in _run_module_as_main
      File "<frozen runpy>", line 88, in _run_code
      File "C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304\.meltano\loaders\target-jira\venv\Scripts\target-jira.exe\__main__.py", line 7, in <module>
      File "C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304\.meltano\loaders\target-jira\venv\Lib\site-packages\click\core.py", line 1442, in __call__
        return self.main(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304\.meltano\loaders\target-jira\venv\Lib\site-packages\click\core.py", line 1363, in main
        rv = self.invoke(ctx)
             ^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304\.meltano\loaders\target-jira\venv\Lib\site-packages\singer_sdk\plugin_base.py", line 84, in invoke
        return super().invoke(ctx)
               ^^^^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304\.meltano\loaders\target-jira\venv\Lib\site-packages\click\core.py", line 1226, in invoke
        return ctx.invoke(self.callback, **ctx.params)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304\.meltano\loaders\target-jira\venv\Lib\site-packages\click\core.py", line 794, in invoke
        return callback(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304\.meltano\loaders\target-jira\venv\Lib\site-packages\singer_sdk\target_base.py", line 572, in invoke
        target.listen(file_input)
      File "C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304\.meltano\loaders\target-jira\venv\Lib\site-packages\singer_sdk\_singerlib\encoding\_base.py", line 48, in listen
        self._process_lines(file_input or self.default_input)
      File "C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304\.meltano\loaders\target-jira\venv\Lib\site-packages\singer_sdk\target_base.py", line 304, in _process_lines
        counter = super()._process_lines(file_input)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304\.meltano\loaders\target-jira\venv\Lib\site-packages\singer_sdk\_singerlib\encoding\_base.py", line 70, in _process_lines
        self._process_record_message(line_dict)
      File "C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304\.meltano\loaders\target-jira\venv\Lib\site-packages\singer_sdk\target_base.py", line 344, in _process_record_message
        self._assert_sink_exists(stream_map.stream_alias)
      File "C:\Siba_\Work\LocalSetup\Backend\backend_05062025\backend_2304\.meltano\loaders\target-jira\venv\Lib\site-packages\singer_sdk\target_base.py", line 280, in _assert_sink_exists
        raise RecordsWithoutSchemaException(msg)
    singer_sdk.exceptions.RecordsWithoutSchemaException: A record for stream 'accounts' was encountered before a corresponding schema. Check that the Tap correctly implements the Singer spec.
I have attached a few files for reference.
out_cleaned.jsonl
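When the input looks ordered but the target still raises RecordsWithoutSchemaException, a common culprit is the stream name differing between the SCHEMA and RECORD lines (or a stream map renaming one and not the other). A quick ordering check over the attached file:

import json

# Verify every RECORD in out_cleaned.jsonl is preceded by a SCHEMA
# for the same stream name the target will look up.
seen = set()
with open("out_cleaned.jsonl") as f:
    for n, line in enumerate(f, 1):
        msg = json.loads(line)
        if msg["type"] == "SCHEMA":
            seen.add(msg["stream"])
        elif msg["type"] == "RECORD" and msg["stream"] not in seen:
            print(f"line {n}: RECORD for {msg['stream']!r} arrived before its SCHEMA")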

    David Dobrinskiy

    07/18/2025, 12:25 PM
Does anyone have examples of a Meltano project with `target-clickhouse`? I'm having some difficulties setting mine up; e.g., I want new tables to be deduplicated on `id` with sorting by a non-nullable `updated_at`, but the target-clickhouse module always casts dates as Nullable: https://github.com/shaped-ai/target-clickhouse/blob/a04758cff46b429bab6615a15d662e25c5a96db9/target_clickhouse/connectors.py#L120 Since this is my first Meltano ETL pipeline, I'm getting confused between the layers of abstraction here and how to properly configure my target ClickHouse tables.

    Tanner Wilcox

    07/22/2025, 4:52 PM
    I'm confused about how to add a replication key to my tables with tap-snowflake. Code snippet in thread

    Ellis Valentiner

    07/25/2025, 1:19 PM
Does anyone have suggestions for how best to perform a one-off sync of historical data? I have a table that we sync with incremental updates, using a timestamp column for the replication key. The table has "new" records that are not synced to our destination because their timestamp value is before the current replication-key bookmark.
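One approach that leaves the regular job's bookmark untouched: run the historical pull under a dedicated state ID whose bookmark you rewind by hand. Plugin, stream, and ID names below are invented:

# seed a rewound bookmark under a throwaway state ID
meltano state set --force backfill:historical \
  '{"singer_state": {"bookmarks": {"public-events": {"replication_key": "updated_at", "replication_key_value": "2020-01-01T00:00:00+00:00"}}}}'

# run the sync against that state ID only
meltano el tap-postgres target-snowflake --state-id backfill:historical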

    Steven Searcy

    08/04/2025, 5:01 PM
    Hello! Is it considered best practice to drop or not drop the raw tables when Meltano pipelines run? I could see not dropping the raw tables making sense for auditability and traceability, but I’m dealing with a spreadsheet that has millions of rows and gets re-ingested regularly (contains historical rows but also new and updated rows). Would retaining all of that raw data over time be problematic in terms of performance or storage?

    Adam Wegscheid

    08/05/2025, 7:41 PM
    Hello! Does anyone have an example implementation of a custom state backend? I have used my own custom implementation and just saw that you can integrate it naturally into Meltano with release 3.7.0. I must admit that I am struggling to understand the provided documentation: https://docs.meltano.com/guide/custom-state-backend/

    Rob Norman

    08/07/2025, 7:02 AM
I'm trying to use tap-mysql to do an initial load (into target-postgres) of a few tables. Most of the tables extract just fine, but one table is just over 35M rows. When Meltano hits that table, it sits there appearing to do nothing for five minutes and then seemingly carries on. In the logs I can see "One or more records have exceeded the max age of 5 minutes. Draining all sinks.", and it proceeds to write about 300 MB into the destination table (it should be about 6 GB) before erroring out. When I examine the processlist on MySQL, I can see a query running that's trying to select the entire table and order by the incremental column. That column has an index on it (it's a datetime, for reference), but `explain`ing the query says that not only is MySQL not going to use the index, it doesn't even consider it a possible index.
SELECT <list of columns> FROM <table> ORDER BY `created_at` ASC;
My tap configuration is pretty basic; there's nothing weird going on:
plugins:
  extractors:
  - name: tap-mysql
    variant: transferwise
    pip_url:
      git+https://github.com/transferwise/pipelinewise.git#subdirectory=singer-connectors/tap-mysql
    select:
    - ${TAP_MYSQL_DATABASE}-<table>.*
    settings:
    - name: engine
      value: mysql
    config:
      session_sqls:
      - SET @@session.wait_timeout=28800
      - SET @@session.net_read_timeout=3600
      - SET @@session.innodb_lock_wait_timeout=3600
    metadata:
      '*-<table>':
        replication-method: INCREMENTAL
        replication-key: created_at
Am I just fundamentally missing something? Trying to read the entire table in one go seems insane to me. Do I need to set the replication-method to FULL_TABLE for the first load and manually fiddle with the state, or something?

    Tanner Wilcox

    08/11/2025, 10:04 PM
I would like one DAG covering the staging/intermediate/mart files in my project. I'd like the mart DAG to know that it depends on x, y, and z intermediate models, that those depend on a, b, and c staging models, and to rerun all of them to regenerate the mart. From what I've looked into, there isn't a solution for this, so I'm planning on writing my own generator. Before I head down that route, I was wondering if there's already some solution for this.
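If these are dbt models, dbt's graph selectors already encode this dependency walk: a `+` prefix selects a node plus everything upstream of it, so the staging and intermediate models rebuild before the mart. Plugin and model names here are illustrative:

meltano invoke dbt-postgres run --select +my_mart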

    Tanner Wilcox

    08/13/2025, 6:55 PM
Some of my taps are on Python 3.12.10, which is fine when I'm developing on Fedora, but the production Ubuntu server doesn't have that installed yet. What's the recommended way to fix this? I could build it from source and point Meltano to it (via meltano.yml, I think), but I don't want a hard-coded path in there, as that would stop it from working on my computer. Does `uv` handle this?
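Meltano has a `python` setting at both the project and plugin level for pinning the interpreter used to build plugin venvs, and with the uv-based installer a missing interpreter can be provisioned with `uv python install 3.12`. A sketch (version and plugin name are examples):

# meltano.yml
python: python3.12        # default interpreter for every plugin venv
plugins:
  extractors:
  - name: tap-example     # illustrative; the setting can be overridden per plugin
    python: python3.12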

    Quoc Nguyen

    08/19/2025, 5:37 AM
Hello, is anyone seeing different log behavior after upgrading Meltano from `3.6` to `3.7`? With `3.6`, running something like `meltano run tap-postgres target-redshift` would show both Meltano's internal logs and the external logs from the plugins (`tap-postgres`/`target-redshift`) on the console. After upgrading to `3.7`, it only shows Meltano's internal logs. In our use case the plugin logs are really important, because they're inputs to our monitoring systems. Is there any way to make it behave like version `3.6`? Not sure if this is a dumb question; if so, is there a doc I can read to learn more, since I'm new to Meltano? 😄 Thank you all a lot in advance.