Peter Clemenko
01/19/2025, 2:23 PM

Ian OLeary
02/04/2025, 6:52 PM
2025-02-04T18:44:21.577497Z [info ] 2025-02-04 13:44:21,576 | INFO | tap-litmos.lt_UserDetails | Pagination stopped after 0 pages because no records were found in the last response
Where do I need to alter this to continue paginating even if no records were returned for a particular week? I looked into the BaseAPIPaginator and I didn't find where this message is printed.
Here's my current paginator:
from datetime import date, datetime, timedelta
from urllib.parse import parse_qsl, urlparse
from singer_sdk.pagination import BaseAPIPaginator

class LitmosPaginator(BaseAPIPaginator):
    def __init__(self, *args, **kwargs):
        super().__init__(None, *args, **kwargs)

    def has_more(self, response):
        return self.get_next(response) < date.today()

    def get_next(self, response):
        params = dict(parse_qsl(urlparse(response.request.url).query))
        # Note: adding timedelta(seconds=1) to a date is a no-op (dates only carry whole days); timedelta(days=1) may be what's intended.
        return datetime.strptime(params["to"], OUTPUT_DATE_FORMAT).date() + timedelta(seconds=1)
I'm paginating via 1-week date ranges, and even if there were no records I still want to move on to the next range.
edit: would it be in "advance"?
def advance(self, response: requests.Response) -> None:
    ...
    ...
    # Stop if new value None, empty string, 0, etc.
    if not new_value:
        self._finished = True
    else:
        self._value = new_value
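For reference, that log line doesn't come from the paginator at all: as far as I can tell from the SDK source, the empty-response check (and the early break) lives in RESTStream.request_records, so advance() never even runs for an empty week. A minimal sketch of overriding it to keep walking the date windows, assuming a recent singer-sdk (stream boilerplate like url_base and schema omitted; the class name is a placeholder):

import typing as t

from singer_sdk.streams import RESTStream


class LitmosStream(RESTStream):
    def request_records(self, context: dict | None) -> t.Iterable[dict]:
        # Simplified copy of RESTStream.request_records without the
        # "no records were found in the last response" early break
        # (metrics bookkeeping omitted).
        paginator = self.get_new_paginator()
        decorated_request = self.request_decorator(self._request)

        while not paginator.finished:
            prepared_request = self.prepare_request(
                context,
                next_page_token=paginator.current_value,
            )
            resp = decorated_request(prepared_request, context)
            # Yield whatever came back (possibly nothing) and advance anyway.
            yield from self.parse_response(resp)
            paginator.advance(resp)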
Andy Carter
02/24/2025, 2:52 PM
get_records in the normal way.
Instead of checking and rechecking for each new file in storage, I've discovered I can check the pipeline logs to see when each table / stream file is complete, then just read the file once I know it's saved to storage. However, I don't want to replace my 'file check' code with 'pipeline log check' code in each stream, as the REST call takes a while.
Is there a process I can run asynchronously at the tap level every 10 seconds or so, and in my stream.get_records() check the tap's cached version of the logs from ADF, and emit records if appropriate?
Ideally I don't want to wait for the whole pipeline to finish before I start emitting records - some data is ready in seconds but others take minutes.
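Not sure what the idiomatic SDK answer is, but one sketch that matches what's described: start a daemon thread on the Tap that refreshes a shared cache of ADF pipeline-run logs every ~10 seconds, so the slow REST call happens once rather than per stream. Everything below (TapADF, fetch_pipeline_logs) is a placeholder, not real tap code:

import threading
import time

from singer_sdk import Tap


class TapADF(Tap):
    name = "tap-adf"

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.pipeline_logs: list[dict] = []  # shared cache read by every stream
        threading.Thread(target=self._poll_logs, daemon=True).start()

    def _poll_logs(self) -> None:
        while True:
            # Placeholder for the single slow REST call to ADF's pipeline-run API.
            self.pipeline_logs = self.fetch_pipeline_logs()
            time.sleep(10)

    def fetch_pipeline_logs(self) -> list[dict]:
        return []  # call the ADF REST API here

    def discover_streams(self):
        return []  # the file streams would go here

Each stream's get_records could then poll self._tap.pipeline_logs (assuming your SDK version keeps the tap reference on the stream; otherwise pass the cache in when constructing the streams) until its table shows up, read the file, and yield rows without waiting for the whole pipeline.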
Tanner Wilcox
02/27/2025, 11:38 PM
The docs (for request_records()) on RESTStream say: "If pagination is detected, pages will be recursed automatically." but I'm not seeing how it's detecting pagination in this case.
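In case it helps, "detected" there effectively means the stream's paginator keeps producing a next value: with the default SinglePagePaginator the loop in request_records runs once and stops, and setting next_page_token_jsonpath (or returning your own paginator) is what makes it recurse. A minimal sketch, assuming a recent singer-sdk (the JSONPath below is a made-up response field):

from singer_sdk.pagination import JSONPathPaginator
from singer_sdk.streams import RESTStream


class MyStream(RESTStream):
    def get_new_paginator(self):
        # request_records keeps fetching pages for as long as this paginator
        # finds a token; if no token is ever found, only one page is fetched,
        # which looks like "no pagination detected".
        return JSONPathPaginator("$.next_cursor")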
Jun Pei Liang
03/07/2025, 12:14 AM
- name: tap-oracle
  variant: s7clarke10
  pip_url: git+https://github.com/s7clarke10/pipelinewise-tap-oracle.git
  config:
    default_replication_method: LOG_BASED
    filter_schemas: IFSAPP
    filter_tables:
      - IFSAPP-ABC_CLASS_TAB
    host: xxxx
    port: 1521
    service_name: xxxx
    user: ifsapp
  metadata:
    IFSAPP-ABC_CLASS_TAB:
      replication-method: INCREMENTAL
      replication-key: ROWVERSION
hawkar_mahmod
03/23/2025, 3:06 PM
I've built a custom tap-growthbook using the Meltano SDK. When testing without Meltano it runs fine, but when I invoke it with meltano run via uv run I get a discovery-related error, and I don't know how to debug it. Here's the error:
hawkar_mahmod
03/24/2025, 3:48 PM
When I override the parent stream's get_records method, the child stream produces the expected number of records, but the parent stream just stops producing any data in the destination. I am only overriding get_records on the parent, not on the child stream.
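Hard to say without seeing the override, but one generic gotcha worth ruling out (a sketch, not specific to this tap): the parent's get_records has to stay a generator that re-yields each record; building a list or filtering without yielding means the parent stream emits nothing even though it still ran.

from singer_sdk.streams import RESTStream


class ParentStream(RESTStream):  # hypothetical parent stream
    def get_records(self, context):
        for record in super().get_records(context):
            # ... custom filtering / enrichment would go here ...
            yield record  # keep yielding, or nothing reaches the target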
Gordon Klundt
03/24/2025, 7:57 PM
Stéphane Burwash
03/25/2025, 5:51 PM
I have a payload column which contains WAY too much PII, and I was hoping to be able to sync only the datapoints I need (only grab values from payload.xyz if payload.xyz exists).
Thanks!
cc @visch since you're my tap-postgres guru
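One possible angle, hedged (stream and field names below are placeholders, and it assumes payload comes through as a JSON object): since the MeltanoLabs tap-postgres is SDK-based, stream maps in meltano.yml may let you pull out just the keys you need and drop the raw payload column before it ever reaches the target:

plugins:
  extractors:
  - name: tap-postgres
    config:
      stream_maps:
        public-events:                              # placeholder stream name
          payload_xyz: record['payload'].get('xyz') # keep only the field you need
          payload: null                             # drop the PII-heavy column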
hawkar_mahmod
03/28/2025, 5:25 PM

Reuben (Matatika)
04/10/2025, 3:19 AM
request_records entirely). I thought about setting up a date-range parent stream to pass dates through as context, which I think would solve the state-update part of my problem, but it felt counter-intuitive to how parent-child streams should work and to incremental replication in general (I would end up with a bunch of interim dates stored in state as context, as well as the final date as replication_key_value).
visch
04/10/2025, 8:21 PM

Stéphane Burwash
04/15/2025, 2:06 PM
I'm using the SDK's built-in test functions, but I'd like to understand a bit more about how they work (source: https://sdk.meltano.com/en/latest/testing.html#singer_sdk.testing.get_tap_test_class).
What do the tests actually DO? My goal is for them to only test that the tap CAN run, not to run it to completion. Most of my taps are full_table, so running their tests would take WAY too long.
Thanks 😄
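Broadly, the generated class runs the tap's CLI and discovery checks and then syncs each stream to validate the emitted records against the schema; by default there is no record cap, which is why full-table taps take so long. A sketch that caps the sync, assuming a reasonably recent singer-sdk (TapExample and its import path are placeholders):

from singer_sdk.testing import SuiteConfig, get_tap_test_class

from tap_example.tap import TapExample  # placeholder import

TestTapExample = get_tap_test_class(
    tap_class=TapExample,
    config={"start_date": "2025-01-01"},  # placeholder config
    suite_config=SuiteConfig(
        max_records_limit=25,    # stop each stream after a handful of records
        ignore_no_records=True,  # don't fail streams that return nothing
    ),
)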
hawkar_mahmod
05/08/2025, 9:16 AM
- Parent stream (SegmentsStream) correctly fetches all segments, and my override of get_child_context(record) returns {"segment_id": record["id"]} for the one segment I'm targeting.
- Child stream (SegmentMembersStream) has schema including "segment_id", no replication_key, and overrides parse_response(response) to yield only the member fields:
python
def parse_response(self, response):
    for identifier in response.json().get("identifiers", []):
        yield {
            "member_id": identifier["id"],
            "cio_id": identifier.get("cio_id"),
            "email": identifier.get("email"),
        }
- According to the docs, the SDK should automatically merge in the segment_id from context after parse_response (and before shipping the record out), as long as it’s in the schema. But in practice I only see segment_id in the separate context argument — it never appears in the actual record unless I manually inject it in `post_process`:
python
def post_process(self, row, context):
    row["segment_id"] = context["segment_id"]
    return row
Has anyone else seen this? Should the SDK be automatically adding parent-context fields into the record dict before emit, or is manual injection (in post_process) the expected approach here? Any pointers or workaround suggestions are much appreciated! 🙏
Siddu Hussain
05/10/2025, 1:59 AM

Nico Deunk
05/23/2025, 7:05 AM

Rafał
05/27/2025, 8:35 AM

mark_estey
06/02/2025, 9:06 PM
Andy Carter
06/04/2025, 10:34 AM
I have a Buildings stream and a Tenants stream, and a ServiceRequests stream that requires a building_id AND a tenant_id in the body of a POST request (there is no GET / list ServiceRequests endpoint).
So I would iterate Buildings, then Tenants, then iterate ServiceRequests over every combination of the two parents. Is that possible?
https://www.linen.dev/s/meltano/t/16381950/hi-all-i-have-built-a-custom-tap-to-extract-data-from-an-api looks like a possible approach: basically I make my Tenants stream an artificial child of Buildings and then make ServiceRequests have Tenants as its parent.
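That chaining approach should work; a rough sketch (class, path, and field names are guesses, and note that making Tenants a child of Buildings means the tenants endpoint is hit once per building, so you effectively get every building/tenant pair rather than a real join):

from singer_sdk.streams import RESTStream


class ExampleStream(RESTStream):
    url_base = "https://api.example.com"  # placeholder


class BuildingsStream(ExampleStream):
    name = "buildings"
    path = "/buildings"

    def get_child_context(self, record, context):
        return {"building_id": record["id"]}


class TenantsStream(ExampleStream):
    name = "tenants"
    path = "/tenants"
    parent_stream_type = BuildingsStream  # artificial parent, per the linked thread

    def get_child_context(self, record, context):
        # Carry both parent keys down so the grandchild sees the full pair.
        return {"building_id": context["building_id"], "tenant_id": record["id"]}


class ServiceRequestsStream(ExampleStream):
    name = "service_requests"
    path = "/service_requests/search"  # placeholder for the POST endpoint
    rest_method = "POST"
    parent_stream_type = TenantsStream

    def prepare_request_payload(self, context, next_page_token):
        # building_id and tenant_id arrive via the context chain above.
        return {
            "building_id": context["building_id"],
            "tenant_id": context["tenant_id"],
        }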
Rafał
06/08/2025, 8:28 AM

Tanner Wilcox
06/10/2025, 9:04 PM
I'm adding replication_method = "FULL_TABLE" to my stream class, but that doesn't seem right.
Mindaugas Nižauskas
06/30/2025, 5:06 AM

Florian Bergmann
07/07/2025, 8:57 AM
I'm working on a custom tap-oracle, based on s7clarke10 / pipelinewise, to cover some special cases of our source DB. During tests I noticed that the log_based replication method uses continuous mine, which has been deprecated since Oracle 12c and desupported since Oracle 19c, i.e. for six years now.
- Does anyone know an alternative I could use for LogMiner functionality?
- Trigger-based instead of log-based is currently not an option for us.
Bruno Arnabar
08/07/2025, 9:21 PM
meltano select tap-canvas --list --all
the tap supports discovery and catalog features
Matthew Wiseman
08/11/2025, 7:32 AM

Don Venardos
08/20/2025, 6:28 PM
meltano run tap-mssql target-jsonl
The tap correctly reports an error:
2025-08-20T18:11:00.567018Z [info ] FATAL [main] tap-mssql.core - Fatal Error Occured - Stream rss_test_dbo_c_logical_field_user_values has unsupported primary key(s): logical_field_sid cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-jsonl name=tap-mssql producer=True run_id=c8e6cbbf-26d4-415c-83d0-c420ccbf706c stdio=stderr string_id=tap-mssql
2025-08-20T18:11:00.567219Z [info ] ERROR [main] #error { cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-jsonl name=tap-mssql producer=True run_id=c8e6cbbf-26d4-415c-83d0-c420ccbf706c stdio=stderr string_id=tap-mssql
2025-08-20T18:11:00.567439Z [info ] :cause Stream rss_test_dbo_c_logical_field_user_values has unsupported primary key(s): logical_field_sid cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-jsonl name=tap-mssql producer=True run_id=c8e6cbbf-26d4-415c-83d0-c420ccbf706c stdio=stderr string_id=tap-mssql
But then Meltano gets a stack dump.
2025-08-20T18:11:00.905173Z [error ] Extractor failed
2025-08-20T18:11:00.905403Z [error ] Block run completed block_type=ExtractLoadBlocks duration_seconds=89.355 err=RunnerError('Extractor failed') exit_codes={<PluginType.EXTRACTORS: 'extractors'>: 1} run_id=c8e6cbbf-26d4-415c-83d0-c420ccbf706c set_number=0 success=False
2025-08-20T18:11:00.906956Z [info ] Run completed duration_seconds=89.357 run_id=c8e6cbbf-26d4-415c-83d0-c420ccbf706c status=failure
2025-08-20T18:11:00.907373Z [error ] Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to join our friendly Slack community.
Run invocation could not be completed as block failed: Extractor failed
╭─────────────────────────────── Traceback (most recent call last) ...
Tanner Wilcox
08/21/2025, 5:44 PM

Adam Wegscheid
10/14/2025, 8:48 PM
What's the difference between the config_jsonschema attribute for a Tap and the settings block in meltano.yml? At a high level, they seem to perform the same job. When I view the many extractors on Meltano Hub, some have config_jsonschema filled to the brim but settings that are relatively bare-bones (no description, title, etc.), and other taps are the exact opposite. I also find it odd that running tap-example --about doesn't print out nested properties of objects regardless of additionalProperties being set to true, so I can't tell if they even matter. As you can see from this mess, my head is all over the place on this topic!
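For what it's worth, my understanding (so treat this as hedged): config_jsonschema is what the tap itself knows about its settings, driving config validation and the --about output, while the settings block in meltano.yml / MeltanoHub is a separate declaration that Meltano uses for meltano config, environment-variable mapping, and Hub documentation. They describe the same settings but are maintained separately, which is why the two look so different from one extractor to the next. A minimal sketch of the tap side (names are placeholders):

from singer_sdk import Tap
from singer_sdk import typing as th  # JSON Schema typing helpers


class TapExample(Tap):
    name = "tap-example"

    # Printed by `tap-example --about` and used to validate runtime config.
    config_jsonschema = th.PropertiesList(
        th.Property(
            "api_key",
            th.StringType,
            required=True,
            secret=True,
            description="API key used to authenticate.",
        ),
        th.Property(
            "start_date",
            th.DateTimeType,
            description="Earliest record date to sync.",
        ),
    ).to_dict()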
Andy Carter
10/15/2025, 7:22 AM