# meltano-plugin-development
  • s

    swetha_mandanapu

    06/06/2024, 10:20 PM
    Hi, I have a general question about choosing between a Singer tap and Meltano's. Can someone highlight the benefits of using one over the other? Thanks
    👀 1
  • v

    Viet Vu Danh

    06/25/2024, 3:29 AM
    I am trying to override a tap's schema (type, maxLength, ...) so that the change is reflected in the target. Example:
    • I have a source column, say `phone_number`, in MySQL with type `VARCHAR(20)`. It's PII, so I applied an inline mapping hash and the length increased to 32. It then failed record validation (max length > 20, which is taken from the source length).
    • I would like to override the schema so that it passes validation.
    • I can work around it by mapping to a new field, `phone_number_hash`, and setting the original field `phone_number` to null or dropping it. However, there are other cases that a schema override would also solve.
    So I have some questions, and it would be great if someone could help me answer them:
    1/ Is it possible? It looks like it's not. I found this issue and it's still open: https://github.com/meltano/meltano/issues/2424
    2/ Where in the code is the `schema` extra applied to catalog discovery? I failed to find it in the SDK code.
    3/ How can I apply the same logic as catalog discovery to the stream? By overwriting the `SCHEMA` message, by modifying `tap_base`?
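A minimal sketch of the kind of override asked about above, assuming the SCHEMA message is available as a plain dict. `override_property` and the `customers` stream name are hypothetical and not part of Meltano or the SDK; the sketch only shows the shape of the change (replacing `maxLength` for one property) that would let a 32-character hash pass validation.

```python
import copy


def override_property(schema_message: dict, prop: str, overrides: dict) -> dict:
    """Return a copy of a Singer SCHEMA message with one property patched.

    Hypothetical helper for illustration; not an SDK or Meltano API.
    """
    patched = copy.deepcopy(schema_message)
    # Merge the overrides into the existing JSON Schema for the property.
    patched["schema"]["properties"][prop].update(overrides)
    return patched


schema_message = {
    "type": "SCHEMA",
    "stream": "customers",  # hypothetical stream name
    "schema": {
        "type": "object",
        "properties": {
            "phone_number": {"type": ["string", "null"], "maxLength": 20},
        },
    },
    "key_properties": [],
}

patched = override_property(schema_message, "phone_number", {"maxLength": 32})
print(patched["schema"]["properties"]["phone_number"])
```

The original message is left untouched, so the same helper could be applied at discovery time and at stream time without the two interfering.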
  • f

    Feroze Ahmad

    07/05/2024, 5:55 AM
    I am using the 'tap-salesforce' extractor to pull data from Salesforce. I want to customize the behavior of this tap to get data as per my requirements. Should I modify the 'tap-salesforce' extractor to meet my requirements, or is there an approach for creating a custom tap that customizes a parent tap? What would be the best approach?
  • p

    Pedro Ceriotti

    07/06/2024, 12:59 AM
    Did anyone find a way to ‘persist’ a changing refresh token between tap runs? Some APIs invalidate the refresh token every time it’s used to generate a new access token. I found this issue on GitHub, but it doesn’t seem like we have reached a consensus so far about the most reasonable approach. For context, I’m running Meltano in an ephemeral container, so a potential solution would involve storing and retrieving the new tokens from an external system or, preferably, using Meltano’s DB to store and persist the newly generated token between runs. Did anyone face this same issue? Any viable workarounds using Meltano’s SDK for this particular use case?
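A minimal sketch of the "store and retrieve from an external system" idea, using a JSON file as a stand-in for that system. `FileTokenStore` is a hypothetical helper, not a Meltano or SDK API, and it does not use Meltano's system database; on an ephemeral container the file would have to live on a mounted volume or be swapped for a real external store.

```python
import json
import tempfile
from pathlib import Path


class FileTokenStore:
    """Persist the latest refresh token in a JSON file (hypothetical helper)."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def load(self, default: str) -> str:
        # Fall back to the configured token on the very first run.
        if not self.path.exists():
            return default
        return json.loads(self.path.read_text())["refresh_token"]

    def save(self, refresh_token: str) -> None:
        # Overwrite the stored token after every successful refresh.
        self.path.write_text(json.dumps({"refresh_token": refresh_token}))


with tempfile.TemporaryDirectory() as tmp:
    store = FileTokenStore(Path(tmp) / "token.json")
    first = store.load(default="token-from-config")
    store.save("rotated-token")  # value returned by the API on refresh
    second = store.load(default="token-from-config")

print(first, second)
```

The tap's authenticator would call `load` before requesting an access token and `save` immediately after the API hands back the rotated refresh token.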
    ➕ 1
    👀 1
  • h

    Haruno izumi

    07/12/2024, 1:11 PM
    Hi. If a variable needs to be read from the context, how can we use it in the code? For example, with "context": {"acc_id": "678827388"}, I need to read this ID as id = acc_id. I have tried overriding path:
    @property
    def path(self):
        path = "api_url/{acc_id}"
        return path
    But it doesn't seem to work. What could be the reason?
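As far as I understand the SDK's REST streams, `{name}` placeholders in `path` are filled from the tap config and the stream context when the URL is built, so the placeholder has to match a context key exactly and the context has to be non-empty for that stream. The sketch below mimics that substitution on its own; `fill_path` and the `/accounts/...` path are hypothetical and only illustrate the mechanism.

```python
from typing import Optional
from urllib.parse import quote


def fill_path(path: str, config: dict, context: Optional[dict]) -> str:
    """Replace {key} placeholders with values from config and context.

    Standalone imitation of placeholder substitution; not the SDK's own code.
    """
    values = {**config, **(context or {})}  # context wins over config
    for key, value in values.items():
        placeholder = "{" + key + "}"
        if placeholder in path:
            path = path.replace(placeholder, quote(str(value), safe=""))
    return path


context = {"acc_id": "678827388"}
with_context = fill_path("/accounts/{acc_id}", {}, context)
without_context = fill_path("/accounts/{acc_id}", {}, None)
print(with_context)
print(without_context)
```

If the stream is invoked with no context (for example, no parent stream or partitions supply one), the placeholder stays unresolved, which is one possible explanation for the behaviour described.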
  • f

    Frederic

    07/17/2024, 8:07 AM
    Hi, I'm currently looking at building a custom connector that uses an HTTP POST with a payload. I've been following the example in https://docs.meltano.com/tutorials/custom-extractor/. However, this only covers an HTTP GET. Could you let me know the best practice for a 'tap.py' and/or 'streams.py' implementation? Thank you very much for your support.
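A minimal sketch of a POST request with a JSON payload, built with the standard library only. The URL and payload fields are hypothetical. In an SDK-based tap the same idea is, to my knowledge, expressed by setting the stream's HTTP method to POST and returning the body from a `prepare_request_payload` override; the exact attribute names should be checked against the SDK reference for the version in use.

```python
import json
import urllib.request


def build_post_request(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and paging fields, for illustration only.
request = build_post_request(
    "https://api.example.com/search",
    {"page": 1, "page_size": 100},
)
print(request.get_method(), request.data)
```

Pagination for POST endpoints usually means changing a field in the payload between requests, so the payload builder is the natural place to read the next-page token.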
    ✅ 1
  • f

    Frederic

    07/29/2024, 1:39 PM
    Hi, I am trying to build a tap 'tap-jsonplaceholder' for the following schema:
    { "timezones": [ { "timezone": { "local_time": "", "description": "[GMT-10:00] Pacific/Tahiti", "name": "Pacific/Tahiti", "ref_timezone_id": 3, "utc_offset": -36000 } } ] }
    for which I'm trying to create a stream schema:
    class CommentsStream(jsonplaceholderStream):
        ....
        schema = th.PropertiesList(
            th.Property(
                "timezones",
                th.ArrayType(
                    th.Property(
                        "timezone",
                        th.ObjectType(
                            th.Property("local_time", th.StringType),
                            th.Property("description", th.StringType),
                            th.Property("name", th.StringType),
                            th.Property("ref_timezone_id", th.IntegerType),
                            th.Property("utc_offset", th.IntegerType),
                        ),
                    ),
                ),
            ),
        ).to_dict()
        .....
    • I run the meltano command line: meltano run tap-jsonplaceholder target-json
    • I get the following output: {"timezones": [{}, {}, {}, ...]} (a long list in which every element is an empty object)
    The logging of the schema object gives me this output:
    2024-07-29 15:03:55,462 - root - INFO - {'type': 'object', 'properties': {'count': {'type': ['integer', 'null']}, 'execution_time': {'type': ['string', 'null']}, 'timezones': {'type': ['array', 'null'], 'items': {'type': 'object', 'properties': {'local_time': {'type': ['string', 'null']}, 'description': {'type': ['string', 'null']}, 'name': {'type': ['string', 'null']}, 'ref_timezone_id': {'type': ['integer', 'null']}, 'utc_offset': {'type': ['integer', 'null']}}}}}}
    I can see the 'items' key but I cannot see the 'timezone' key (defined as a property above). I think this is the source of the issue. However, I am not sure how to correct it. I cannot see whether my PropertiesList definition is wrong. Could you enlighten me, please? Thank you very much for your time.
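One reading of the logged schema: the array's `items` describe the inner fields directly, while each record element wraps them in a `timezone` object, so nothing inside an element matches a declared property. The sketch below reproduces that effect with plain JSON Schema dicts and a simplified `prune` function, which is a hypothetical stand-in for whatever filtering happens between tap and target and not the SDK's code. With the SDK's typing helpers, the nested variant would presumably correspond to passing `th.ObjectType(th.Property("timezone", th.ObjectType(...)))` to `th.ArrayType` rather than a bare `th.Property`.

```python
def prune(value, schema):
    """Drop object keys that the schema does not declare (simplified)."""
    if isinstance(value, dict):
        props = schema.get("properties", {})
        return {k: prune(v, props[k]) for k, v in value.items() if k in props}
    if isinstance(value, list):
        return [prune(item, schema.get("items", {})) for item in value]
    return value


TZ_FIELDS = {
    "local_time": {"type": ["string", "null"]},
    "description": {"type": ["string", "null"]},
    "name": {"type": ["string", "null"]},
    "ref_timezone_id": {"type": ["integer", "null"]},
    "utc_offset": {"type": ["integer", "null"]},
}


def timezones_schema(item_properties: dict) -> dict:
    """Wrap the given item properties in the outer 'timezones' array schema."""
    items = {"type": "object", "properties": item_properties}
    return {
        "type": "object",
        "properties": {"timezones": {"type": ["array", "null"], "items": items}},
    }


# Schema as logged: the fields sit directly under 'items'.
flat_schema = timezones_schema(TZ_FIELDS)
# Schema matching the payload: each item has a 'timezone' object.
nested_schema = timezones_schema(
    {"timezone": {"type": ["object", "null"], "properties": TZ_FIELDS}}
)

record = {
    "timezones": [
        {
            "timezone": {
                "local_time": "",
                "description": "[GMT-10:00] Pacific/Tahiti",
                "name": "Pacific/Tahiti",
                "ref_timezone_id": 3,
                "utc_offset": -36000,
            }
        }
    ]
}

flat_result = prune(record, flat_schema)
nested_result = prune(record, nested_schema)
print(flat_result)
print(nested_result == record)
```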
  • f

    Frederic

    07/29/2024, 8:40 PM
    Hi, I am looking to build a workflow of Meltano/Singer taps. I am thinking of two main use cases: i) running two taps in parallel, and ii) running them sequentially, with one feeding the next. Could you advise whether there is a Python framework or methodology that could help? Thank you, Fred
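A minimal sketch of both use cases with the standard library only, assuming each pipeline is started as a `meltano run <tap> <target>` subprocess. The plugin names are hypothetical, and a dedicated orchestrator would add retries, scheduling, and logging that this sketch leaves out.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_run_command(tap: str, target: str) -> list:
    """Command line for one extract-load pipeline."""
    return ["meltano", "run", tap, target]


def run_command(command: list) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(command, check=False).returncode


def run_sequential(commands: list, runner=run_command) -> list:
    """Run commands one after another, stopping at the first failure."""
    codes = []
    for command in commands:
        code = runner(command)
        codes.append(code)
        if code != 0:
            break
    return codes


def run_parallel(commands: list, runner=run_command) -> list:
    """Run all commands at the same time and collect their exit codes."""
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(runner, commands))


commands = [
    build_run_command("tap-a", "target-x"),  # hypothetical plugin names
    build_run_command("tap-b", "target-x"),
]
# Dry run: drop the runner argument to actually invoke meltano.
print(run_parallel(commands, runner=lambda command: 0))
```

For the "one feeding the next" case, `run_sequential` only guarantees ordering; the second pipeline would still need to read what the first one loaded.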
  • h

    haleemur_ali

    08/01/2024, 5:03 PM
    I created a small pull request to improve column quoting in the Meltano SDK: https://github.com/meltano/sdk/pull/2582 I would appreciate your reviews of it. In general, object quoting feels a bit brittle, because in some methods custom queries are constructed using parts generated by SQLAlchemy, while in other methods SQLAlchemy is used entirely. I'll try to make some improvements in that regard as well.
    👀 1
  • m

    Max McKenzie

    08/03/2024, 6:51 AM
    Hello, I hope this is the correct channel for this question. I'm trying to use a fork of tap-plausible. I just want to add an additional API URL setting to the plugin; this seems to have been missed in the original, so I forked it and added the setting: https://github.com/maxmckenzie/airbyte/blob/master/airbyte-integrations/connectors/source-plausible/source_plausible/manifest.yaml#L43 I tried to install it from the URL in Meltano but could not work out how to do it. I also tried editing the tap-plausible entry to point at my GitHub URL and then ran `meltano install ...` to reinstall it. But when I run the config test, it errors on discover:
    2024-08-03T06:43:05.516064Z [debug    ] Creating DB engine for project at '/Users/xam/Development/etl-pipelines' with DB URI '<sqlite://Users/xam/Development/etl-pipelines/.meltano/meltano.db>'
    2024-08-03T06:43:05.542581Z [debug    ] Found plugin parent            parent=tap-plausible plugin=tap-plausible source=LOCKFILE
    2024-08-03T06:43:05.576936Z [debug    ] Skipped installing extractor 'tap-plausible'
    2024-08-03T06:43:05.577150Z [debug    ] Skipped installing 1/1 plugins
    2024-08-03T06:43:05.645618Z [debug    ] Created configuration at /Users/xam/Development/etl-pipelines/.meltano/run/tap-plausible/tap.b684be6e-5652-4a71-847f-1310aca0342e.config.json
    2024-08-03T06:43:05.645841Z [debug    ] Could not find tap.properties.json in /Users/xam/Development/etl-pipelines/.meltano/extractors/tap-plausible/tap.properties.json, skipping.
    2024-08-03T06:43:05.645943Z [debug    ] Could not find tap.properties.cache_key in /Users/xam/Development/etl-pipelines/.meltano/extractors/tap-plausible/tap.properties.cache_key, skipping.
    2024-08-03T06:43:05.646029Z [debug    ] Could not find state.json in /Users/xam/Development/etl-pipelines/.meltano/extractors/tap-plausible/state.json, skipping.
    2024-08-03T06:43:05.646745Z [debug    ] Invoking: ['/Users/xam/Development/etl-pipelines/.meltano/extractors/tap-plausible/venv/bin/tap-airbyte', '--config', '/Users/xam/Development/etl-pipelines/.meltano/run/tap-plausible/tap.b684be6e-5652-4a71-847f-1310aca0342e.config.json', '--discover']
    Traceback (most recent call last):
      File "/Users/somedude/Development/etl-pipelines/.meltano/extractors/tap-plausible/venv/bin/tap-airbyte", line 8, in <module>
        sys.exit(TapAirbyte.cli())
                 ^^^^^^^^^^^^^^^^
    I've really got no idea if I'm even doing this correctly. Can I just edit plugins inside my project? Do I need to go and release my own tap? Do I have to pull this all locally somehow? It's pretty frustrating that Airbyte have missed this one setting 😄
  • f

    Frederic

    08/06/2024, 2:25 PM
    Hi, I am trying to use tap-postgres: https://hub.meltano.com/extractors/tap-postgres/
    1. I have set up the config in this fashion with the correct username, password, etc. (as shown below):
    - name: tap-postgres
      variant: meltanolabs
      pip_url: git+https://github.com/MeltanoLabs/tap-postgres.git
      config:
        sqlalchemy_url: "postgresql://[username]:[password]@localhost:5432/[db_name]"
    2. I have verified that with this sqlalchemy_url I can connect to the DB in question and load the data via the SQLAlchemy 'create_engine' object.
    3. However, when I test the tap in Meltano, I'm getting the following error:
    "info ] The default environment 'dev' will be ignored for `meltano config`. To configure a specific environment, please use the option `--environment=<environment name>`. Need help fixing this problem? Visit http://melta.no/ for troubleshooting steps, or to join our friendly Slack community. Plugin configuration is invalid (Background on this error at: https://sqlalche.me/e/20/f405) "
    Could you advise please? Thank you so much for your time. Fred
  • m

    Max McKenzie

    08/07/2024, 1:54 AM
    Hello, following on from my problems above, I've had my PR merged into the Airbyte GitHub repo: https://github.com/airbytehq/airbyte/pull/43048 This allows you to add a custom URL for the Plausible extractor. I have one question: when will this update be available via tap-plausible in Meltano? Do I need to update it for Meltano? Is there a tap repo somewhere?
  • j

    Jens Christian Hillerup

    08/09/2024, 6:08 PM
    Where do I find the secret sauce that maps environment variables to configuration parameters? I'm currently having some issues with `tap-postgres`, which doesn't seem to find my `TAP_POSTGRES_SQLALCHEMY_URL` variable and barfs a KeyError because it expected `user` to be defined. Thanks in advance for any assistance!
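A small sketch of the naming convention as I understand it from Meltano's documentation: the plugin name and setting name are joined with an underscore, dashes and dots become underscores, and the result is upper-cased. `env_var_name` is a hypothetical helper written for illustration, and the exact handling of nested settings is an assumption; running `meltano config <plugin> list` should show the variable name Meltano actually expects for each setting.

```python
def env_var_name(plugin_name: str, setting_name: str) -> str:
    """Derive the environment variable name for a plugin setting.

    Illustrates the convention only; not Meltano's own implementation.
    """
    raw = f"{plugin_name}_{setting_name}"
    return raw.replace("-", "_").replace(".", "_").upper()


print(env_var_name("tap-postgres", "sqlalchemy_url"))
```

Under this convention the variable in the question has the expected name, so the problem is more likely in how the variable reaches the process (for example, a missing `.env` entry or a shell that does not export it) than in the name itself.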
    ✅ 1
  • h

    Haruno izumi

    08/23/2024, 9:24 AM
    Hi, can anyone suggest how I can use get_starting_replication_value(context) in a method, for use in the property below?
    @property
    def partitions(self):
        end_date = datetime.now()
        start_date =  # TODO read from state
        # TODO process dates
        return  # processed date value partitions
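A minimal sketch of the "process dates" step, assuming the start date has already been obtained from state. I believe the SDK exposes methods such as `get_starting_timestamp(context)` or `get_starting_replication_key_value(context)` for that, but the exact name should be verified against the SDK reference. `date_partitions` and the `start_date`/`end_date` keys are hypothetical.

```python
from datetime import date, timedelta


def date_partitions(start: date, end: date, step_days: int = 30) -> list:
    """Split [start, end) into consecutive windows of at most step_days."""
    partitions = []
    cursor = start
    while cursor < end:
        window_end = min(cursor + timedelta(days=step_days), end)
        partitions.append(
            {"start_date": cursor.isoformat(), "end_date": window_end.isoformat()}
        )
        cursor = window_end
    return partitions


windows = date_partitions(date(2024, 1, 1), date(2024, 1, 10), step_days=4)
for window in windows:
    print(window)
```

Each dict would become one partition context, so the stream is requested once per date window.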
  • d

    Daniel Luo

    09/05/2024, 8:45 PM
    I created a custom mapper using cookiecutter and copying the code in meltano_map_transformer with a custom PluginMapper implementation. Debugging the code works great and I have the expected output showing up. Now I'm trying to add it to our project, and I'm wondering how to correctly do that. Here is the relevant part in my meltano.yml, which was added when I ran
    meltano add --custom mapper column_mapper
    mappers:
      - name: column-mapper
        namespace: column_mapper
        pip_url: -e map/column-mapper
        executable: column-mapper
    And my project structure attached. When I run
    meltano run tap-mssql column-mapper target-snowflake --state-id-suffix dbo-test
    , I get the following error:
    Environment 'default' is active
    Found unexpected mapper plugin name.  plugin_name=column-mapper
    Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
    join our friendly Slack community.
    
    block violates set requirements: Expected unique mappings name not the mapper plugin name: column-mapper.
    What am I doing wrong here?
  • d

    Daniel Luo

    09/16/2024, 7:07 PM
    What environment do people usually develop on? Mac? Windows? Windows with WSL? Or running some Linux distro? We have people using different OS and wondering what the best option is to avoid platform specific issues.
  • d

    Daniel Luo

    09/17/2024, 5:44 PM
    I'm running into an error with invalid Unicode characters and I'm wondering where the best place to tackle it would be. My initial thought was to do it as part of my mapper with a custom eval function, but after trying that, I realized that the error occurs before eval even gets called. It seems to happen while the file is being streamed. I have a function that fixes the problem; I'm just not sure where to put it to avoid unnecessary processing and potentially breaking something else. Perhaps this should happen as part of the tap, processing each value before it gets saved? An example of this is ” vs ". The former shows up as a Unicode sequence in the file. My latest example is `\xe2\x80\x90`, which should be a `-` but isn't. In the database it looks like a hyphen visually, but when copied out and searched for with a regular hyphen, it doesn't match.
    Exception has occurred: UnicodeDecodeError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
    'charmap' codec can't decode byte 0x90 in position 2548: character maps to <undefined>
      File "C:\Users\daniell\.rye\py\cpython@3.12.4\Lib\encodings\cp1252.py", line 23, in decode
        return codecs.charmap_decode(input,self.errors,decoding_table)[0]
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\git\dagster-hybrid\src\elt_projects\meltano\custom-plugins\map\column-mapper\.venv\Lib\site-packages\singer_sdk\_singerlib\encoding\_base.py", line 61, in _process_lines
        for line in file_input:
      File "C:\git\dagster-hybrid\src\elt_projects\meltano\custom-plugins\map\column-mapper\.venv\Lib\site-packages\singer_sdk\_singerlib\encoding\_base.py", line 48, in listen
        self._process_lines(file_input or self.default_input)
      File "C:\git\dagster-hybrid\src\elt_projects\meltano\custom-plugins\map\column-mapper\.venv\Lib\site-packages\singer_sdk\mapper_base.py", line 135, in invoke
        mapper.listen(file_input)
      File "C:\git\dagster-hybrid\src\elt_projects\meltano\custom-plugins\map\column-mapper\.venv\Lib\site-packages\click\core.py", line 783, in invoke
        return __callback(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\git\dagster-hybrid\src\elt_projects\meltano\custom-plugins\map\column-mapper\.venv\Lib\site-packages\click\core.py", line 1434, in invoke
        return ctx.invoke(self.callback, **ctx.params)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\git\dagster-hybrid\src\elt_projects\meltano\custom-plugins\map\column-mapper\.venv\Lib\site-packages\singer_sdk\plugin_base.py", line 82, in invoke
        return super().invoke(ctx)
               ^^^^^^^^^^^^^^^^^^^
      File "C:\git\dagster-hybrid\src\elt_projects\meltano\custom-plugins\map\column-mapper\.venv\Lib\site-packages\click\core.py", line 1078, in main
        rv = self.invoke(ctx)
             ^^^^^^^^^^^^^^^^
      File "C:\git\dagster-hybrid\src\elt_projects\meltano\custom-plugins\map\column-mapper\.venv\Lib\site-packages\click\core.py", line 1157, in __call__
        return self.main(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\git\dagster-hybrid\src\elt_projects\meltano\custom-plugins\map\column-mapper\column_mapper\__main__.py", line 7, in <module>
        ColumnMapperMapper.cli()
      File "C:\Users\daniell\.rye\py\cpython@3.12.4\Lib\runpy.py", line 88, in _run_code
        exec(code, run_globals)
      File "C:\Users\daniell\.rye\py\cpython@3.12.4\Lib\runpy.py", line 198, in _run_module_as_main (Current frame)
        return _run_code(code, main_globals, None,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 2548: character maps to <undefined>
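The bytes `\xe2\x80\x90` are the UTF-8 encoding of U+2010 (HYPHEN), and the traceback shows them being decoded with cp1252, where byte 0x90 is undefined. The sketch below reproduces both facts and shows one possible normalization; `normalize_punctuation` and its mapping are hypothetical. Since the failure happens while the mapper reads its input, forcing UTF-8 for the process (for example with Python's `PYTHONUTF8=1` environment variable) may be worth trying before changing any data, though I have not verified it in this setup.

```python
RAW = b"\xe2\x80\x90"

# Decoding as UTF-8 yields a single character: U+2010 HYPHEN.
as_utf8 = RAW.decode("utf-8")


def cp1252_fails(data: bytes) -> bool:
    """Return True if the bytes cannot be decoded as cp1252."""
    try:
        data.decode("cp1252")
    except UnicodeDecodeError:
        return True
    return False


# Hypothetical mapping from typographic characters to ASCII equivalents.
PUNCTUATION = {
    0x2010: "-",  # hyphen
    0x2011: "-",  # non-breaking hyphen
    0x201C: '"',  # left double quotation mark
    0x201D: '"',  # right double quotation mark
}


def normalize_punctuation(text: str) -> str:
    """Replace typographic hyphens and quotes with plain ASCII."""
    return text.translate(PUNCTUATION)


print(hex(ord(as_utf8)), cp1252_fails(RAW), normalize_punctuation(as_utf8))
```

Normalizing in the tap changes the stored data, so it is a separate decision from fixing the decoding error itself.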
  • j

    Jens Christian Hillerup

    09/25/2024, 9:31 AM
    I'm trying to figure out a good structure for handling incoming invoices within my Meltano project. We are reselling services from physiotherapists, so we get a large number of incoming invoices that need to be parsed before we can invoice upstream. I want to use an external OCR tool for that (no specific tool yet, but I'd appreciate suggestions if anyone has experience).
    1. We can get URLs for the invoices using the API provided by our accounting software. That sounds like a good job for an extractor.
    2. Then the OCR tool will provide information about the downstream service provider (tax ID etc.) as well as a list of lines on the invoice.
    3. That data then needs to be loaded into our data warehouse.
    Currently I'm implementing it as a `utility` in Meltano parlance, but would such a "heavy" operation be suited for the stream mapping API? If not, what is the alternative? I suppose I could have the utility output something adhering to the Singer spec, but what is it then, if not a tap? Is `meltano run foo_utility target-postgres` a well-defined operation?
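For reference, "output adhering to the Singer spec" boils down to writing JSON lines of type SCHEMA and RECORD to stdout. A minimal sketch follows; the `invoice_lines` stream and its fields are hypothetical, and STATE messages are left out.

```python
import io
import json
import sys


def schema_message(stream: str, schema: dict, key_properties: list) -> dict:
    """Build a Singer SCHEMA message."""
    return {
        "type": "SCHEMA",
        "stream": stream,
        "schema": schema,
        "key_properties": key_properties,
    }


def record_message(stream: str, record: dict) -> dict:
    """Build a Singer RECORD message."""
    return {"type": "RECORD", "stream": stream, "record": record}


def emit(message: dict, out=None) -> None:
    """Write one message as a single JSON line (stdout by default)."""
    (out or sys.stdout).write(json.dumps(message) + "\n")


buffer = io.StringIO()
emit(
    schema_message(
        "invoice_lines",  # hypothetical stream name
        {
            "type": "object",
            "properties": {
                "invoice_id": {"type": "string"},
                "amount": {"type": "number"},
            },
        },
        ["invoice_id"],
    ),
    out=buffer,
)
emit(record_message("invoice_lines", {"invoice_id": "A-1", "amount": 120.5}), out=buffer)
lines = buffer.getvalue().splitlines()
print(len(lines))
```

A program that emits these messages behaves like a tap from the target's point of view, whatever plugin type it is registered as.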
  • a

    Ayoub Fakir

    10/25/2024, 9:15 AM
    Hello guys! I have a question about the twitter ads tap - since the change of the API, is there some ongoing work suggesting that the tap would support the new API soon? Thank you!
  • j

    joshua_janicas

    12/02/2024, 2:46 PM
    Hey all, I'm not sure if this is the right channel for it, but I'm trying to chase down a critical vulnerability flagged in our local DependencyTrack: https://nvd.nist.gov/vuln/detail/CVE-2024-53899 which was fixed via: https://github.com/pypa/virtualenv/pull/2771 I was trying to find the dependency in my package but I don't see any references, which makes me think it's being used in a separate package. I'm not sure if this vuln is coming from Meltano/Dagster/DBT, but my initial hunch is Meltano because of https://github.com/meltano/meltano/blob/main/docs/docs/concepts/python_virtual_environments.md Would someone from the Meltano team be able to confirm?
  • m

    mykola_zavada

    12/10/2024, 10:00 PM
    Hello everyone, I'm developing a target connector with the SDK and I need to add some specific logic when it's called with the full-refresh parameter. Is there any method that I should override for this purpose?
  • r

    Reuben (Matatika)

    02/20/2025, 10:58 AM
    We are developing a custom mapper plugin that transforms data to the Fivetran format. It appears we have to define at least one `mapping` in the `meltano.yml` for it to work with `meltano run`:
    mappers:
      - name: mapper-fivetran
        namespace: mapper_fivetran
        variant: matatika
        pip_url: -e /home/reuben/Documents/mappers/mapper-fivetran
        mappings:
        - name: fivetran
    Otherwise we get the error message `block violates set requirements: Expected unique mappings name not the mapper plugin name: mapper-fivetran`. Would it be possible, and does it make sense, to reference a mapper plugin with predefined behaviours directly like this?
  • t

    tim_schwartz

    03/11/2025, 5:04 PM
    has anyone developed a tap for Google Ad Manager (not google ads not DV360)?
  • t

    tim_schwartz

    03/11/2025, 5:37 PM
    looks like we have to develop our own 😞
  • z

    Zahir Alward

    03/23/2025, 8:08 AM
    Hi all, I’m working on a custom Meltano target using Singer SDK v0.44.3, and I’ve hit an issue where the finalize method in my RecordSink subclass isn’t being called. I’m buffering records and flushing them to S3, expecting finalize to handle the final flush and log a summary, but it never triggers. Here’s the setup:
    • My sink subclasses singer_sdk.sinks.RecordSink.
    • process_record buffers records and flushes at a threshold (e.g., 1000 records).
    • finalize sets force_flush = True, calls flush_buffer, and logs a summary.
    • Logging is at DEBUG level, but I don’t see “Entering finalize method” in the logs.
    From the docs, I understand finalize should be called by the Target class after all records are processed, but it’s not happening. Has anyone run into this? Could it be related to:
    • The tap not signaling stream completion?
    • An exception killing the process early? (No obvious errors in logs, though.)
    • A misstep in how I’ve set up the target or sink?
    I’ve fixed a bug in my buffer logic (it was comparing len(buffer) to a boolean), but that didn’t resolve it. I’d appreciate any pointers or examples of working finalize implementations! Happy to share more code/logs if needed. Thanks!
  • z

    Zahir Alward

    03/23/2025, 8:11 AM
    def process_record(self, record: dict, context: dict) -> None:
        logger.info(f"Processing record: {record}")
        logger.debug(f"Context in process_record: {context}")
        logger.debug(f"Stream name from self: {self.stream_name}")
        try:
            if self.stream_schema is None:
                self.stream_schema = self.schema
            self.buffer.append(record)
            logger.debug(f"Buffer size: {len(self.buffer)}")
            # Flush when the buffer is full or a flush has been forced.
            if len(self.buffer) >= self.max_buffer_size or self.force_flush:
                self.flush_buffer(context)
                self.force_flush = False
        except Exception as e:
            logger.error(f"Error processing record: {e}")
            raise
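I am not aware of a `finalize` hook on SDK sinks; my understanding is that the end-of-run hook the target calls on each sink is `clean_up()`, which would explain why a method named `finalize` never runs. That is an assumption to check against the SDK reference for v0.44.3. The standalone sketch below shows the buffering pattern with the final flush placed in `clean_up`; `BufferedSink` is hypothetical, does not subclass anything from the SDK, and the loop at the bottom only simulates the lifecycle.

```python
class BufferedSink:
    """Buffer records and flush in batches (standalone illustration)."""

    def __init__(self, max_buffer_size: int = 1000) -> None:
        self.max_buffer_size = max_buffer_size
        self.buffer = []
        self.flushed_batches = []  # stands in for uploads to S3

    def process_record(self, record: dict) -> None:
        self.buffer.append(record)
        if len(self.buffer) >= self.max_buffer_size:
            self.flush_buffer()

    def flush_buffer(self) -> None:
        # Skip empty flushes so no empty batch is ever written.
        if self.buffer:
            self.flushed_batches.append(list(self.buffer))
            self.buffer.clear()

    def clean_up(self) -> None:
        # Final flush for whatever is left below the threshold.
        self.flush_buffer()


sink = BufferedSink(max_buffer_size=2)
for i in range(5):
    sink.process_record({"id": i})
sink.clean_up()
print([len(batch) for batch in sink.flushed_batches])  # [2, 2, 1]
```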
  • t

    Tanner Wilcox

    04/07/2025, 8:42 PM
    I wrote a plugin that SCPs a file. I call it like this
    poetry run scp get-file tanner password url /remote/path ./
    I'm struggling to add this to my meltano project. Here's the utility section of my meltano.yml
    utilities:
        - name: dbt-postgres
          variant: dbt-labs
          pip_url: dbt-core dbt-postgres meltano-dbt-ext~=0.3.0
        - name: scp-ext
          pip_url: '../scp-ext/scp_ext'
          executable: scp
    I get this error:
    [tanner@sato ubb-meltano]$ meltano lock --update --all
    Utility 'scp-ext' is not known to Meltano. Check <https://hub.meltano.com/> for available plugins.
  • t

    tim_schwartz

    05/08/2025, 2:28 PM
    👋 Hi all (cc @Reuben (Matatika)) - I finally finished writing my first plugin. It’s a tap for Google Ad Manager (not 360), and I’ve never written one before. It’s working great for us internally, but it could likely use another set of eyes, or I could review a standards doc 🙂 What is the general flow for a plugin? Should I submit it for review or something? Also, what is the general standard for logging messages? I left a lot of debugging in one of my streams because it’s pretty tricky. I’m using a lot of self.logger.info, but I could shift it to something else or delete it. Comments or PRs welcome! https://github.com/The-Daily-Upside/tap-google-ad-manager
    🙌 2