# singer-taps

    Rafał

    06/02/2025, 1:19 PM
I'm writing a tap for reading OpenOffice Calc (ODS) files from a versioned S3 bucket. I know there's tap-spreadsheets-anywhere, but it supports neither ODS nor versioned buckets, is architecturally incompatible with pyexcel, and is unmaintained. The ODS files have multiple sheets, but the same schema across files, so all files produce the same streams and every stream spans multiple files. Naturally, the tap wouldn't produce records stream by stream, but file by file, with streams interleaved. That's not how the singer-sdk wants me to do things. Is this mode supported by the SDK (I'm assuming Meltano will be fine with it)? Is there an example tap I could look at?
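For reference, the base Singer spec itself doesn't forbid interleaving: a RECORD only has to follow the SCHEMA for its stream, so a file-by-file reader can emit all schemas up front and then records in whatever order the files yield them. A minimal stdlib-only sketch of the output side (file names, stream names, and fields are made up for illustration; whether the singer-sdk lets you drive sync in this order is the open question):

```python
import json

def interleaved_messages(files):
    """Emit Singer messages file by file, with streams interleaved.

    `files` maps a file name to rows of {"stream": ..., "record": ...}.
    """
    msgs = []
    streams = {r["stream"] for rows in files.values() for r in rows}
    # One SCHEMA per stream, up front, so every later RECORD is legal.
    for stream in sorted(streams):
        msgs.append({"type": "SCHEMA", "stream": stream,
                     "schema": {"type": "object"}, "key_properties": []})
    # Then walk the files in order, emitting records as they come.
    for rows in files.values():
        for row in rows:
            msgs.append({"type": "RECORD", "stream": row["stream"],
                         "record": row["record"]})
    return msgs

if __name__ == "__main__":
    files = {
        "2025-01.ods": [{"stream": "sales", "record": {"id": 1}},
                        {"stream": "costs", "record": {"id": 2}}],
        "2025-02.ods": [{"stream": "sales", "record": {"id": 3}}],
    }
    for m in interleaved_messages(files):
        print(json.dumps(m))
```

This only shows that the message sequence is spec-legal; how to plug it into the SDK's Tap/Stream classes is what the thread asks about.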

    Rafał

    06/05/2025, 10:36 AM
    I find the role of
    def discover_streams
unclear if it's called even when --catalog is passed. I'd expect it to be called only without a catalog, or with --discover, when discovery is actually needed

    Siba Prasad Nayak

    06/06/2025, 5:15 AM
Team, do we have a tap-onedrive on the Singer side? Has anyone ever tried it?

    Ayush

    06/06/2025, 5:44 AM
hello, I have a question, not sure where to ask it: 1. I'm currently trying to extract data from Salesforce (tap-salesforce) and load it into a JSONL file (target-jsonl). 2. Then I want to extract from the JSONL (tap-singer-jsonl) and load it into MongoDB (target-mongodb). However, tap-singer-jsonl does not successfully extract the data from the file. I also tried using target-singer-jsonl instead of target-jsonl (in step 1) to test the Singer config, but it says "run invocation could not be completed as block failed: Loader failed". Seems like a Singer issue, since I was able to do step 1 with target-jsonl

    Ayush

    06/06/2025, 5:45 AM
    Does anyone know what’s going on?

    Chinmay

    06/06/2025, 8:13 AM
Hello team, we are using tap-quickbooks (https://github.com/hotgluexyz/tap-quickbooks) to fetch QBO data, but this repo does not provide the RefundReceipt, Check, and CreditCardCredit record types. How can we get these records? Can you help us add these record types?

    azhar

    06/10/2025, 9:06 AM
Hello team, we are using the LinkedIn Singer tap (https://github.com/MeltanoLabs/tap-linkedin-ads). Since June 1 we have been getting a 426 client error, as it seems LinkedIn has deprecated old API endpoints. We also noticed this tap uses LinkedIn-Version 2024 in the headers.
    Copy code
error:          2025-06-10T02:55:07.873581Z [info     ] 2025-06-10 02:55:07,872 | ERROR    | tap-linkedin-ads.accounts | An unhandled error occurred while syncing 'accounts' cmd_type=elb consumer=False job_name=prod:tap-linkedin-ads-to-target-clickhouse:UMOJn5gijo name=tap-linkedin-ads producer=True run_id=ab987cd0-89aa-4d5a-b179-8fb04e6d3f7d stdio=stderr string_id=tap-linkedin-ads
2025-06-10T02:55:07.875835Z [info     ]     raise FatalAPIError(msg)   cmd_type=elb consumer=False job_name=prod:tap-linkedin-ads-to-target-clickhouse:UMOJn5gijo name=tap-linkedin-ads producer=True run_id=ab987cd0-89aa-4d5a-b179-8fb04e6d3f7d stdio=stderr string_id=tap-linkedin-ads
2025-06-10T02:55:07.875945Z [info     ] singer_sdk.exceptions.FatalAPIError: 426 Client Error: Upgrade Required for path: /rest/adAccounts cmd_type=elb consumer=False job_name=prod:tap-linkedin-ads-to-target-clickhouse:UMOJn5gijo name=tap-linkedin-ads producer=True run_id=ab987cd0-89aa-4d5a-b179-8fb04e6d3f7d stdio=stderr string_id=tap-linkedin-ads
2025-06-10T02:55:07.880461Z [info     ]     raise FatalAPIError(msg)   cmd_type=elb consumer=False job_name=prod:tap-linkedin-ads-to-target-clickhouse:UMOJn5gijo name=tap-linkedin-ads producer=True run_id=ab987cd0-89aa-4d5a-b179-8fb04e6d3f7d stdio=stderr string_id=tap-linkedin-ads
2025-06-10T02:55:07.880569Z [info     ] singer_sdk.exceptions.FatalAPIError: 426 Client Error: Upgrade Required for path: /rest/adAccounts cmd_type=elb consumer=False job_name=prod:tap-linkedin-ads-to-target-clickhouse:UMOJn5gijo name=tap-linkedin-ads producer=True run_id=ab987cd0-89aa-4d5a-b179-8fb04e6d3f7d stdio=stderr string_id=tap-linkedin-ads
2025-06-10T02:55:16.772779Z [error    ] Extractor failed
2025-06-10T02:55:16.772957Z [error    ] Block run completed.           block_type=ExtractLoadBlocks err=RunnerError('Extractor failed') exit_codes={: 1} set_number=0 success=False

    hammad_khan

    06/23/2025, 11:59 AM
Hello team, is anyone using the Snowflake tap https://github.com/MeltanoLabs/tap-snowflake? I noticed it's not maintaining any bookmarks in state.json for the tables. Also, I can't seem to find a setting for start_date. For instance, here is state.json after a first successful pull:
    Copy code
    {
      "completed": {
        "singer_state": {
          "bookmarks": {
            "dw_hs-dim_accounts": {},
            "dw_hs-dim_activities": {
              "starting_replication_value": null
            }
          }
        }
      },
      "partial": {}
    }

    Nathan Sooter

    06/27/2025, 6:16 PM
    I'm using
    tap-salesforce
and am looking for the config to pass WHERE clauses into the SOQL that Meltano generates. I need to filter on particular values in a particular column of the Account object to make sure specific records aren't extracted. ChatGPT is leading me astray with configs that don't actually exist... does one exist?

    Florian Bergmann

    07/03/2025, 9:23 AM
Hi all, I got errors during my last run extracting data with tap-oracle (variant s7clarke10, replication method log_based) and I want to debug them. For that purpose I have a rather basic question: how do I change the log level for tap-oracle? I tried running meltano with --log-level=debug or --log-level=info, but that has no effect on the output from tap-oracle. I figured out that tap-oracle uses singer.get_logger, so I suppose I have to adjust its settings. Any hints on how to do so? I'd like to get those info messages either printed to the terminal or into a log file, e.g. LOGGER.info("Running in thick mode")
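For context: singer-python's get_logger configures the root logger via logging.config.fileConfig from a logging.conf bundled with the package, and newer singer-python versions let you point it at your own file through the LOGGING_CONF_FILE environment variable (worth verifying against the singer-python version your tap pins). A sketch of such a file, dialed to DEBUG:

```ini
; logging.conf — fileConfig format, consumed when singer.get_logger() runs
[loggers]
keys=root

[handlers]
keys=stderr

[formatters]
keys=plain

[logger_root]
level=DEBUG
handlers=stderr

[handler_stderr]
class=StreamHandler
level=DEBUG
formatter=plain
args=(sys.stderr,)

[formatter_plain]
format=%(asctime)s %(levelname)s %(message)s
```

If the env-var route is supported by your pinned version, something like `LOGGING_CONF_FILE=/path/to/logging.conf meltano run ...` would apply it; point the handler at a FileHandler instead to capture the messages in a log file.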

    Emwinghare Kelvin

    07/23/2025, 6:57 PM
    Hello everyone, I’m experiencing an issue with 
    tap-rest-api-msdk
    when making POST requests. Has anyone resolved this or can suggest an alternative tap I could use?

    Chandana S

    07/24/2025, 6:06 AM
Hi everyone, I have a use case where I need to extract data from MySQL and load it into BigQuery. I want to use Meltano's tap-mysql for this. One doubt I have: the incremental load should happen through the updated_at column, and according to standard ETL practice, checking the destination before proceeding with the load is good. Is there a way I can pass a WHERE condition, or even a filter condition, to the tap before I run the job?

    Evan Guyot

    07/25/2025, 10:18 AM
    Hey, I hope I'm reaching out in the right channel. I've created a custom catalog from a tap (based on the existing one) to add a new field. However, in some cases, this field is not returned at all by the REST API — not even as
    null
    , but completely missing — which leads to a Singer exception. I was wondering if there's a catalog's property designed to handle this kind of situation? I’ve already tried defining the field as nullable and using
    additionalProperties
, but I’m still encountering the Singer error when the field is absent from the object. Here is the Singer error:
    2025-07-25T10:06:57.048305Z [error  ] Loading failed        code=1 message="singer_sdk.exceptions.InvalidRecord: Record Message Validation Error: {'sub_prop_1': 'abc', 'sub_prop_2': 'def'} is not of type 'string'"
Here is what I have tried in the catalog:
    Copy code
    {
      "streams": [
        {
          "tap_stream_id": "obj",
          ...,
          "schema": {
            "properties": {
              "prop_1": {
                "type": ["array", "null"],
                "items": {
                  "type": "object",
                  "properties": {
                    "sub_prop_1": { "type": ["string", "null"] },
                    "sub_prop_2": { "type": ["string", "null"] },
                    "optional_sub_prop_3": { "type": ["string", "null"] }
                  },
                  "additionalProperties": true
                }
              }
            }
          }
        }
      ]
    }
    Thanks in advance to anyone who takes the time to help ☺️
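One thing worth noting about the error text: JSON Schema validation does not fail on an absent property (unless it is listed in required); the message shows a value that is present failing a 'string' type check because it is an object. A stdlib-only toy check illustrates the distinction (the validator and property names are just for illustration, mirroring the catalog snippet above):

```python
def type_ok(value, allowed_types):
    """Tiny stand-in for a JSON Schema type check."""
    py = {"string": str, "null": type(None), "object": dict, "array": list}
    return any(isinstance(value, py[t]) for t in allowed_types)

schema = {"sub_prop_1": ["string", "null"]}

record_missing = {}                                 # property absent entirely
record_object = {"sub_prop_1": {"nested": "abc"}}   # object where string declared

# An absent property has nothing to check, so it passes.
assert all(type_ok(v, schema[k]) for k, v in record_missing.items())
# A present object fails the ["string", "null"] check — the kind of error reported.
assert not all(type_ok(v, schema[k]) for k, v in record_object.items())
```

If that matches what the API actually returns, the fix would be widening the failing sub-property's declared type (e.g. adding "object") rather than additionalProperties, which only governs keys not declared in the schema at all.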

    Reuben (Matatika)

    08/01/2025, 2:10 PM
    What is the point of
    select_filter
    ? Isn't
    select
    a kind of filtering mechanism by definition? Why would I need a filter for a filter? 😅

    Sac

    08/05/2025, 1:03 PM
    Hi everyone 👋 I'm working with the community-managed
    tap-quickbooks
    and noticed that some secrets (like API keys or tokens) seem to be logged in plain text during execution. From what I understand, there’s a
    _make_request
    method in the tap that logs the URL and the full body of the POST request used to request a token — which includes API secrets.
    Copy code
    [...]
    
def _make_request(self, http_method, url, headers=None, body=None, stream=False, params=None, sink_name=None):
    if http_method == "GET":
        LOGGER.info("Making %s request to %s with params: %s", http_method, url, params)
        resp = self.session.get(url, headers=headers, stream=stream, params=params)
    elif http_method == "POST":
        LOGGER.info("Making %s request to %s with body %s", http_method, url, body)
        resp = self.session.post(url, headers=headers, data=body)
    else:
        raise TapQuickbooksException("Unsupported HTTP method")
    
    [...]
    Is there a way in Meltano to prevent secrets from being written to log files if the logging is done by the tap itself? Or is this considered a tap-specific issue that should be addressed on GitHub? 🤷‍♂️ Thanks in advance for any insights!
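To my knowledge Meltano has no generic scrubber for secrets that a tap itself writes to its log lines, so this is ultimately a tap-side fix. That said, when you control the process that configures logging, a stdlib logging.Filter can mask known secret values before any handler formats them — a sketch of the idea (the logger name and secret are illustrative):

```python
import logging

class RedactSecrets(logging.Filter):
    """Replace known secret values in log messages before they are emitted."""

    def __init__(self, secrets):
        super().__init__()
        self.secrets = [s for s in secrets if s]

    def filter(self, record):
        msg = record.getMessage()  # merge msg % args first
        for s in self.secrets:
            msg = msg.replace(s, "***REDACTED***")
        # Store the scrubbed text so formatting args can't re-leak the secret.
        record.msg, record.args = msg, ()
        return True

logger = logging.getLogger("tap-quickbooks")  # illustrative logger name
handler = logging.StreamHandler()
handler.addFilter(RedactSecrets(["super-secret-token"]))
logger.addHandler(handler)
logger.warning("POST body: client_secret=%s", "super-secret-token")
```

Since the QuickBooks tap does its own logging in a subprocess, this only helps if you wrap or fork it; reporting the plain-text logging upstream on GitHub is probably the right long-term move.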

    Sac

    08/08/2025, 7:25 PM
    Hello everyone, I need some advice on the QuickBooks tap. QuickBooks uses OAuth2, where the refresh token gets updated roughly every day. Although this connector has a mechanism to capture the new refresh token when it’s updated, since there’s no write-back capability to the tap and target settings (as far as I understand – see issue #2660), the new refresh token value just gets lost. I wanted to ask: if someone has experience with this tap, how do you handle this? The only workaround I can think of is an additional helper script that runs right after the pipeline. This script would fetch the new token from the logs, where it’s stored as plain text (which isn’t ideal, but in this case, it’s useful). Currently, I’m running Meltano in a container, so what I’m trying now is to: 1. Pack the additional Python script in the same container. 2. Mount the
    .env
    file with the token. 3. Let the pipeline run, capturing the new token if there is one, and saving it to the log. 4. Have the Python script fetch it as soon as the pipeline is done. 5. Update the value in the
    .env
    file so the next sync uses the new valid token. I don’t have a better idea at the moment, apart from forking the connector and modifying the logic there, which I’d prefer to avoid. Has anyone faced a similar scenario? What do you think of this solution? Any advice or suggestions? Many thanks in advance!
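For what it's worth, the log-scraping helper in step 4 is simple to sketch with the stdlib. The log-line pattern and the .env key name here are assumptions you would adapt to what the tap actually prints:

```python
import re

# Assumed shape of the log line that exposes the rotated token.
TOKEN_RE = re.compile(r"refresh_token['\"=:\s]+([A-Za-z0-9._-]+)")

def latest_refresh_token(log_text: str):
    """Return the last refresh token mentioned in the log, if any."""
    matches = TOKEN_RE.findall(log_text)
    return matches[-1] if matches else None

def update_env(env_text: str, token: str) -> str:
    """Rewrite the token line in a .env body (key name is an assumption)."""
    key = "TAP_QUICKBOOKS_REFRESH_TOKEN"
    line = f"{key}={token}"
    if re.search(rf"^{key}=.*$", env_text, flags=re.M):
        return re.sub(rf"^{key}=.*$", line, env_text, flags=re.M)
    return env_text.rstrip("\n") + "\n" + line + "\n"
```

Usage would be reading the pipeline's log file after the run, calling latest_refresh_token, and writing update_env's result back to the mounted .env so the next sync picks it up.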

    steven_wang

    08/26/2025, 9:21 PM
I'm looking to sync data from MongoDB and noticed there are several MongoDB tap variants on Meltano Hub: https://hub.meltano.com/extractors/tap-mongodb/ Has anyone tried these, and do you have opinions on which one to use? I noticed the default one hasn't been updated in 2 years and incremental replication is not working.

    Jazmin Velazquez

    09/09/2025, 7:45 PM
I want to use
tap-google-sheets
to extract data from multiple Google Sheets (with different sheet IDs). How do I configure Meltano for this?
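One pattern worth trying is Meltano's plugin inheritance: define one extractor per spreadsheet, each inheriting from the base plugin with its own sheet ID. A sketch (the inherited plugin names, placeholder IDs, and the exact setting name for the sheet ID are illustrative; check your variant's settings page for the real key):

```yaml
plugins:
  extractors:
  - name: tap-google-sheets
    variant: matatika        # whichever variant you installed
  - name: tap-google-sheets--sales
    inherit_from: tap-google-sheets
    config:
      sheet_id: <first-spreadsheet-id>
  - name: tap-google-sheets--ops
    inherit_from: tap-google-sheets
    config:
      sheet_id: <second-spreadsheet-id>
```

Each inherited plugin then runs as its own extractor, e.g. `meltano run tap-google-sheets--sales target-x`, with separate config and state.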

    Luca Capra

    09/10/2025, 10:25 AM
Hello, I am looking for guidance on incremental handling in taps. I have developed some code and would like to get incremental updates right. Right now I am using a plain filesystem directory over fsspec and planning to add S3-compatible storage. I have files like the following, with monthly/daily additions: 2025-01.csv, 2025-02.csv. I followed https://sdk.meltano.com/en/latest/incremental_replication.html and https://sdk.meltano.com/en/latest/implementation/state.html so far. What I have been working on is here: https://github.com/celine-eu/tap-spreadsheets/blob/main/tap_spreadsheets/stream.py#L33-L41 Basically I'm using a custom _updated_at field to track row-level progress, and tracking the reference file's mtime. So I suspect I have reinvented the wheel :) My question: what is already managed by the SDK, and what should I do in my own code? Thank you

    Tanner Wilcox

    09/25/2025, 8:00 PM
Is there any way to disable SSL verification for tap-rest-api-msdk?

    steven_wang

    09/26/2025, 6:39 PM
    In the Salesforce tap, does anyone know how to sync objects other than Account, Opportunity, Opportunityhistory, Lead, User, and Contact? I'm trying to sync the Task object in our Salesforce account but can't seem to select it. Here is my yaml config:
    Copy code
- name: tap-salesforce
  variant: meltanolabs
  config:
    select_fields_by_default: true
    login_domain: ${TAP_SALESFORCE_LOGIN_DOMAIN}
    streams_to_discover: ["Task"]
  select_filter:
  - 'Task.*'
    https://github.com/MeltanoLabs/tap-salesforce/issues/89

    Kevin Phan

    10/10/2025, 8:02 PM
hey folks, I'm using the REST API tap to retrieve info from a Chainalysis endpoint, but I keep getting errors about validating 'type' in the schema. An example error I have is:
    Copy code
    2025-10-10T19:56:07.945399Z [info     ] Failed validating 'type' in schema['properties']['service']: cmd_type=elb consumer=True job_name=dev:tap-chainalysis-alerts-to-target-jsonl name=target-jsonl producer=False run_id=2a2e07ff-7928-4500-847f-5f58e7e96baf stdio=stderr string_id=target-jsonl
    2025-10-10T19:56:07.949526Z [info     ]     {'type': 'string'}         cmd_type=elb consumer=True job_name=dev:tap-chainalysis-alerts-to-target-jsonl name=target-jsonl producer=False run_id=2a2e07ff-7928-4500-847f-5f58e7e96baf stdio=stderr string_id=target-jsonl
    2025-10-10T19:56:07.952388Z [info     ]                                cmd_type=elb consumer=True job_name=dev:tap-chainalysis-alerts-to-target-jsonl name=target-jsonl producer=False run_id=2a2e07ff-7928-4500-847f-5f58e7e96baf stdio=stderr string_id=target-jsonl
    2025-10-10T19:56:07.955352Z [info     ] On instance['service']:        cmd_type=elb consumer=True job_name=dev:tap-chainalysis-alerts-to-target-jsonl name=target-jsonl producer=False run_id=2a2e07ff-7928-4500-847f-5f58e7e96baf stdio=stderr string_id=target-jsonl
    2025-10-10T19:56:07.957711Z [info     ]     None
where it expects a string, but the value can also be null. Is there a way to do schema overrides for this tap? I did not see such an option here. I can probably do it with mappers, but I'd rather not if there is a way inside the tap configs

    Lior Naim Alon

    10/16/2025, 1:27 PM
hello, I'm using tap-hubspot --variant "airbyte" to extract data from several HubSpot streams. The tap takes about 45 minutes to extract a very small amount of data (~80MB) to S3, and the log is flooded with lots of errors along the lines of
    Copy code
    2025-10-16T13:05:43.487376Z [info     ] {'level': 'WARN', 'message': "Couldn't parse date/datetime string in hs_lifecyclestage_lead_date, trying to parse timestamp... Field value: 1709470649329. Ex: Unable to parse string [1709470649329]"} cmd_type=elb consumer=False job_name=staging:tap-hubspot-to-target-s3--raw-crm:eu-west-1-20251016 name=tap-hubspot producer=True run_id=0199ed1f-676c-7a87-ba25-9ddc70d8434c stdio=stderr string_id=tap-hubspot
Since the amount of data is very low and other ETLs run considerably faster, I imagine the issue is the sheer number of parsing errors: the parsing attempts, logging the error, etc. It looks like there is a log entry for each row in the source data. I tried (to no avail) to filter out the specific fields using selection / custom mappers, but the errors persist. It is crucial for me to use the airbyte variant, as it is the only variant that supports custom HubSpot objects out of the box. I'm looking for ways to tackle this issue; the goal is to make the ETL run in a few minutes instead of 45.

    Otto Enholm

    10/23/2025, 8:19 AM
Hello! I'm new to Meltano and just learning how to use it. It seems my team set up a tap for Adyen data that has started failing recently, and the repo appears to have been removed. Do you have any suggestions for ways to work around this for tap-adyen? https://hub.meltano.com/extractors/tap-adyen/

    mark_estey

    10/23/2025, 2:43 PM
    I'm trying to set up the Meltanolabs
    tap-snowflake
    to read a single table but running into an issue where it keeps trying to look at other schemas in the database that it does not have permission to. This is how my config looks (with values changed):
    Copy code
    plugins:
      extractors:
      - name: tap-snowflake
        variant: meltanolabs
        config:
          account: ...
          role: ...
          user: ...
          warehouse: ...
          database: my_database
          schema: my_schema
          tables:
            - my_schema.my_table
        select:
          - my_schema-my_table.*
    And this is the error I keep getting:
    Copy code
    sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 002043 (02000): 01bfe764-3203-6517-0000-120d27b7901e: SQL compilation error:
    Object does not exist, or operation cannot be performed.
    [SQL: SHOW /* sqlalchemy:get_schema_tables_info */ TABLES IN SCHEMA some_other_schema]
    The database user does not have permission to
    some_other_schema
and will not be granted permission to that schema. I read that setting the tables config would limit the tap's discovery to only the listed objects; how do I get it to stop inspecting the other schemas in the database?

    Andy Carter

    11/03/2025, 9:28 AM
    Any happy users of
tap-redshift
in production? I will need the Monad-Inc variant, which hasn't had a PR in two years and is now basically archived, which makes me slightly nervous. Anything to watch out for?

    Kevin Phan

    11/10/2025, 6:59 PM
hey folks, just a general Q about the pipelinewise variant of tap-postgres: if we add a new column to a Postgres table and backfill the existing rows with values for that new column, will Meltano pick up these changes as part of CDC? It's going to Snowflake.

    Kevin Phan

    11/13/2025, 9:02 PM
Another quick Q: we have a meltano-map-transformer that does some data type conversions for Postgres to Snowflake. In that custom transformer we list a bunch of tables that get the transformation. When we get an error like this:
    Copy code
    2025-11-13T20:55:29.337678Z [info     ] TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType' cmd_type=elb consumer=True job_name=prod:tap-core-db-source-to-target-snowflake-core-db-source:prod name=meltano-map-transformer producer=True run_id=e91286eb-6df0-440c-9a19-8a61a4dc6d7c stdio=stderr string_id=meltano-map-transformer
Is there a way to see which table was causing this? We can always query each table that the transformer touches, but that takes time. The logs do not tell us which table is causing the error.

    Théo Lapido

    11/26/2025, 10:24 PM
    Hey everybody! I need some help with
    tap-googleads
. I'm having trouble using the custom query feature; I've added the required configs to the yml, but the stream is not being recognized.
    Copy code
- name: tap-googleads
  variant: matatika
  pip_url: git+https://github.com/Matatika/tap-googleads
  config:
    start_date: '2025-10-28'
  custom_queries:
  - name: gads_ads_performance_custom
    query: 'SELECT ad_group_ad.ad.expanded_text_ad.headline_part1, ad_group_ad.ad.expanded_text_ad.headline_part2, ad_group_ad.ad.expanded_text_ad.headline_part3, a>
    primary_keys: [ad_group_ad.ad.name, segments.date]
    replication_key: segments.date
    Does anybody use this feature or has had similar trouble before?
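One thing that stands out in the snippet: custom_queries sits at the plugin level, as a sibling of config. If it is an ordinary tap setting in the Matatika variant (worth checking on the variant's settings page), Meltano will only pass it to the tap when it is nested under config, roughly like this (the GAQL query below is a shortened illustration, not the original one):

```yaml
- name: tap-googleads
  variant: matatika
  pip_url: git+https://github.com/Matatika/tap-googleads
  config:
    start_date: '2025-10-28'
    custom_queries:
    - name: gads_ads_performance_custom
      query: >-
        SELECT ad_group_ad.ad.expanded_text_ad.headline_part1,
               segments.date
        FROM ad_group_ad
      primary_keys: [ad_group_ad.ad.name, segments.date]
      replication_key: segments.date
```

Running `meltano config tap-googleads list` should show whether the setting actually reached the tap.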

    Leandro Vieira

    11/27/2025, 4:36 PM
    Hello everyone! I've been using the
    tap-facebook
    extractor to get data from the
    adsinsight_default
data stream, but all my data has been duplicated for some reason, as seen in the screenshot. Can anyone help me figure out what's happening? I've selected only this specific stream in my yml file, and it seems the first day of the extraction isn't duplicated. It feels like it's extracting data for Day 1 and Day 2, then Day 2 and Day 3, then Day 3 and Day 4, and so on...