matt_elgazar
12/05/2024, 6:58 PM
Is there a way to get the settings and select streams from meltano.yml in the tap itself? In the tap-mongodb codebase there is a part that hits all collections in the database, but this is unnecessary if I'm only running a select on one collection:
for collection in self.database.list_collection_names(authorizedCollections=True, nameOnly=True):
...
I was thinking I could add a configuration option for this behavior:
if self.discovery_mode == 'select':
collections = <get current selected streams>
else:
collections = self.database.list_collection_names(authorizedCollections=True, nameOnly=True)
I can force it in a way that’s probably super bad practice and wouldn’t generalize across different env configurations:
selected_collections = yaml.safe_load(open('meltano.yml')).get('plugins').get('extractors')[0].get('select')
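For what it's worth, a less meltano.yml-coupled sketch: read selection from the catalog Meltano passes to the tap instead of parsing the project file. The helper below is hypothetical and treats the catalog as a plain dict; in a Singer catalog, the root metadata entry (empty breadcrumb) carries the selected flag:

```python
def selected_streams(catalog: dict) -> list:
    """Return stream names marked selected in a Singer catalog dict.

    Hypothetical helper: when Meltano invokes a tap it passes a catalog
    (--catalog); each stream's root metadata entry (breadcrumb == [])
    carries the selection flag, so the tap never has to read meltano.yml.
    """
    names = []
    for entry in catalog.get("streams", []):
        for meta in entry.get("metadata", []):
            # The root breadcrumb [] holds stream-level (not property) metadata
            if meta.get("breadcrumb") == [] and meta.get("metadata", {}).get("selected"):
                names.append(entry["stream"])
    return names
```

With a catalog in hand, the list returned here could replace the full list_collection_names() scan when discovery_mode == 'select'.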
Andres Felipe Huertas Suarez
12/11/2024, 7:44 AM
I'm trying to add the target-parquet
loader, which I have done in the past for other repos, but now I seem to have some problems with the build of pyarrow (not quite sure what is happening). Using uvx meltano add loader target-parquet
yields the following error.
It has something to do with the pyarrow wheels, maybe some conflicting versions. I tried using uv pip install pyarrow
and uv pip install pyarrow==14.0.0
This is what the uv.lock file looks like. I'm using Python 3.9, as that's the supported version for the tap I want to use (tap-shopify). Any ideas what is wrong?
version = 1
requires-python = "==3.9.20"
[[package]]
name = "dataops"
version = "0.1.0"
source = { virtual = "." }
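A pyarrow build-from-source usually means no prebuilt wheel matched the interpreter that was resolved; note that uvx may run Meltano under a newer Python than the project's 3.9. Meltano 3 supports pinning the interpreter used to build plugin virtualenvs, which may help here. A sketch, assuming python3.9 is on PATH and a Meltano >= 3.0 install:

```yaml
# meltano.yml
python: python3.9        # default interpreter for all plugin venvs
plugins:
  loaders:
  - name: target-parquet
    python: python3.9    # can also be pinned per plugin
```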
Andres Felipe Huertas Suarez
12/11/2024, 1:45 PM
I'm running uvx meltano install
and the installation fails; it was working before and now I don't quite understand what is going wrong. It seems it is trying to install tap-awin using a Python 3.13 env that I don't know where it's coming from. I have the tap in a local repo, and the pyproject doesn't point to Python 3.13, but:
[tool.poetry.dependencies]
python = "<3.10,>=3.6.2"
requests = "^2.25.1"
singer-sdk = "^0.3.16"
and my meltano project should be running on Python 3.9 (that is what I see when I do uv run python --version
). Any ideas here? Also, if I go directly to the tap repo and run poetry install
it works without issues. Clues? Thanks! 🙂
josh_lloyd
12/13/2024, 8:18 PM<https://meltano.slack.com/join/shared_invite/zt-2mslb6jbl-5n1DlD_1mFudiJLGBWqA2Q#/shared-invite/error>
Alexander Trauzzi
12/16/2024, 6:27 PM
Samson Eromonsei
01/12/2025, 11:57 PM
I'm using the tap-rest-api-msdk
extractor and encountering a JSON decoding error. The API endpoint I'm using points to a CSV file hosted on AWS API Gateway.
When I tested the API endpoint in Postman, it returned a 200 OK status and successfully provided the text data.
However, when I wrote my first meltano.yml file to replicate the same operation, I ran into an error related to JSON decoding. The API is supposed to return a CSV file, and I've specified the Content-Type in the headers as text/csv, but I'm unsure if I've configured it correctly.
Here’s the error I’m seeing below
Any guidance or suggestions to resolve this would be greatly appreciated!
Thank you!
File "site-packages/singer_sdk/tap_base.py", line 134, in streams
for stream in self.load_streams():
File "site-packages/singer_sdk/tap_base.py", line 358, in load_streams
for stream in self.discover_streams():
File "site-packages/tap_rest_api_msdk/tap.py", line 494, in discover_streams
schema = self.get_schema()
File "site-packages/tap_rest_api_msdk/tap.py", line 615, in get_schema
extract_jsonpath(records_path, input=_json())
File "site-packages/requests/models.py", line 978, in json
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Here is my first-ever meltano.yml. Not sure if I am using a wrong version or doing the wrong thing; I just followed the basic instructions:
version: 1
default_environment: dev
project_id: fcde5f5-df01-438f-9b43-dd0e0f50e48a
environments:
- name: dev
  config:
    plugins:
      extractors:
      - name: tap-rest-api-msdk
        config:
          api_url: https://api.stormvistawxmodels.com/v1/model-data/ecmwf-eps/20250109/12
          streams:
          - name: ercot-solargen-forecast
            path: ~/home/file.csv
            headers:
              content-type: text/csv
            # api_keys:
            #   X-API-KEY: <your-api-key>
      loaders:
      - name: target-azureblobstorage
        config:
          account_name: dlsdevgbaz1527foan1st
          container_name: xxxxxxxxxxxx
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-rest-api-msdk
    variant: widen
    pip_url: tap-rest-api-msdk
  loaders:
  - name: target-azureblobstorage
    variant: shrutikaponde-vc
    pip_url: git+https://github.com/shrutikaponde-vc/target-azureblobstorage.git
liat katzav
01/16/2025, 1:16 PM
In our meltano.yml
file, we have configured the S3 path. However, when I run:
meltano --environment=dev run tap-stripe target-snowflake
I get the following error message:
boto3 required but not installed. Install meltano[s3] to use S3 as a state backend. state_backend=AWS S3
2025-01-16T13:13:02.565031Z [error] Cannot start plugin tap-stripe: Failed to retrieve state
Can you please advise on how to resolve this issue?
Anton Kuiper
01/18/2025, 3:20 PM
version: 1
default_environment: dev
project_id: 2910bb16-b01d-469c-8454-1c401537fe4c
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-github
    variant: meltanolabs
    pip_url: meltanolabs-tap-github
    config:
      start_date: '2024-01-01'
      repositories:
      - meltano/meltano
    select:
    - commits.url
    - commits.sha
    - commits.commit_timestamp
  loaders:
  - name: target-jsonl
    variant: andyh1203
    pip_url: target-jsonl
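The select entries above are glob patterns over stream.property names. A minimal sketch of how such matching behaves (hypothetical helper, not Meltano's actual implementation; the '!' exclusion prefix is part of Meltano's select syntax):

```python
import fnmatch

def is_selected(stream: str, prop: str, patterns: list) -> bool:
    """Return True if stream.prop matches the select patterns.

    Patterns are processed in order: a plain pattern selects on match,
    a '!'-prefixed pattern deselects on match. Sketch only.
    """
    target = f"{stream}.{prop}"
    selected = False
    for pattern in patterns:
        if pattern.startswith("!"):
            if fnmatch.fnmatch(target, pattern[1:]):
                selected = False
        elif fnmatch.fnmatch(target, pattern):
            selected = True
    return selected
```

So with the config above, is_selected("commits", "url", [...]) is True while any property not listed stays excluded.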
The output runs in the shell (terminal); I use Visual Studio Code. I've included the GitHub personal key; I can find that in the .env file. Any help would be nice. I've tried to get some advice from ChatGPT too, but that one is far off right now.
Yasmim
01/25/2025, 8:05 PM
Kurt Snyder
01/27/2025, 11:36 PM
I ran meltano add extractor tap-mysql
and ran into this error (all other steps seemed to have worked):
Building wheel for pendulum (pyproject.toml): started
error: subprocess-exited-with-error
× Building wheel for pendulum (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.
This is on an M2 Pro, macOS 14.7, with pyenv running 3.12.6, right after installing Meltano with
pipx install "meltano"
installed package meltano 3.6.0, installed using Python 3.13.1
These apps are now globally available
- meltano
Any suggestions appreciated.
Denis Gribanov
01/28/2025, 8:28 PM
Can I keep my custom plugins in the /extract
and /load
directories? I'd like to keep all taps and targets in the same repository. What potential issues might I encounter if I take this approach?
Jean Paul Azzopardi
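On the monorepo question above: Meltano installs each plugin into its own virtualenv from its pip_url, which can point at a local editable path, so keeping taps and targets in the project repo generally works. A sketch, with hypothetical plugin name and path:

```yaml
plugins:
  extractors:
  - name: tap-custom          # hypothetical local tap
    namespace: tap_custom
    pip_url: -e ./extract/tap-custom   # editable install from the repo
```

One caveat to watch for: local-path installs aren't re-resolved automatically, so after changing the tap's dependencies a meltano install --clean (or similar reinstall) may be needed.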
01/28/2025, 9:28 PM
I have target-snowflake configured in my meltano.yml
file but keep receiving a "loader failed" error. Using key-pair auth with the private key in the .env
file. Any thoughts? Tried debugging but the logs are unclear to me. Thanks!
loaders:
- name: target-snowflake
  variant: meltanolabs
  pip_url: meltanolabs-target-snowflake
  config:
    account: xxxx
    add_record_metadata: false
    database: production
    default_target_schema: public
    role: xxxx
    schema: xxxx
    user: xxxx
    warehouse: default
Jay
01/31/2025, 9:16 AM
Chris Walker
02/13/2025, 9:43 PM
dbname
02/17/2025, 8:20 PM
Kavin Srithongkham
03/04/2025, 11:46 AM
I am trying to install tap-sharepointsites
and I am getting this error:
2025-03-04T11:39:22.288098Z [error ] Extractor 'tap-sharepointsites' could not be installed: Failed to install plugin 'tap-sharepointsites'.
2025-03-04T11:39:22.288141Z [info ] ERROR: Ignored the following versions that require a different python version: 0.0.1 Requires-Python >=3.7.1,<3.11
ERROR: Could not find a version that satisfies the requirement tap-sharepointsites (from versions: none)
ERROR: No matching distribution found for tap-sharepointsites
Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
join our friendly Slack community.
Failed to install plugin(s)
Originally, I had Python 3.11 so I used pyenv to downgrade to Python 3.10 but it still seems like it doesn't want to find the right version.
Does anyone have any ideas about what I should try out?
Tanner Wilcox
03/05/2025, 6:19 PM
When I run meltano invoke dbt-postgres:run
can I specify just one source to transform?
Chad Bell
03/06/2025, 3:56 AM
I'm running meltano run tap-postgres target-bigquery
Is there a way to load the data into BigQuery columns directly, instead of one JSON "data" column?
Allan Whatmough
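Re the question above about loading into real columns instead of one JSON "data" field: if the loader is the z3z1ma target-bigquery variant, its denormalized option controls exactly that. A sketch (the variant is my assumption; the question doesn't say which one is installed):

```yaml
loaders:
- name: target-bigquery
  variant: z3z1ma
  config:
    denormalized: true   # unpack record fields into BigQuery columns
```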
03/07/2025, 5:38 AM
I'm getting No matching distribution found for meltano==2.10.0
Was this version yanked? I can still see it on PyPI.
Juan Pablo Herrera
03/12/2025, 10:41 PM
version: 1
default_environment: dev
project_id: 2fc4aa94-ed4d-49cd-9b6b-c1644bf4608e
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-spreadsheets-anywhere
    variant: ets
    pip_url: git+https://github.com/ets/tap-spreadsheets-anywhere.git
    config:
      tables:
      - path: 'file:///Users/juanherrera/Desktop/subway-monthly-data'
        name: 'subway_monthly_data'
        pattern: 'MTA_Subway_Hourly_Ridership_small.csv'
        start_date: '2025-03-12T15:30:00Z'
        prefer_schema_as_string: true
        key_properties: ['id']
        format: csv
  loaders:
  - name: target-parquet
    variant: automattic
    pip_url: git+https://github.com/Automattic/target-parquet.git
    config:
      destination_path: data/subway_data
      compression_method: snappy
      logging_level: info
      disable_collection: true
Nivetha
03/17/2025, 7:32 PM
Oren Teich
03/20/2025, 4:21 AM
Alejandro Rodriguez
03/20/2025, 2:04 PM
2025-03-20T13:59:43.766320Z [debug ] Could not find state.json in /projects/.meltano/extractors/tap-mysql/state.json, skipping.
2025-03-20T13:59:43.793158Z [warning ] No state was found, complete import.
and then every table says it requires a full resync. Even when I manually copy the state to the location in the first log line, it doesn't pick it up. Any ideas?
Don Venardos
04/22/2025, 1:03 AM
I'm running meltano run tap-mssql target-jsonl
The state gets updated with each run, but the bookmark stays empty: {"bookmarks": {"dbo-c_logical_field_user_values": {}}}
Extractor config:
extractors:
- name: tap-mssql
  config:
    host: PROJECT01
    port: 60065
    database: rss_test
    username: svcTestAccount
    default_replication_method: LOG_BASED
    sqlalchemy_url_query_options:
    - key: driver
      value: ODBC Driver 18 for SQL Server
    - key: TrustServerCertificate
      value: yes
  select:
  - dbo-c_logical_field_user_values.*
I think this might be a configuration issue; I'm not sure, but perhaps it isn't picking up the default replication method?
{"event": "Visiting CatalogNode.STREAM at '.streams[352]'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.910101Z"}
{"event": "Setting '.streams[352].selected' to 'False'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910162Z"}
{"event": "Setting '.streams[352].selected' to 'True'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910211Z"}
{"event": "Skipping node at '.streams[352].tap_stream_id'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910259Z"}
{"event": "Skipping node at '.streams[352].table_name'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910306Z"}
{"event": "Skipping node at '.streams[352].replication_method'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910354Z"}
{"event": "Skipping node at '.streams[352].key_properties[0]'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910402Z"}
{"event": "Visiting CatalogNode.PROPERTY at '.streams[352].schema.properties.logical_field_sid'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.910457Z"}
{"event": "Visiting CatalogNode.PROPERTY at '.streams[352].schema.properties.enabled_flag'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.910513Z"}
{"event": "Skipping node at '.streams[352].schema.properties.enabled_flag.maxLength'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910604Z"}
{"event": "Visiting CatalogNode.PROPERTY at '.streams[352].schema.properties.modified_by_user_sid'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.910779Z"}
{"event": "Visiting CatalogNode.PROPERTY at '.streams[352].schema.properties.modified_datetime'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.910924Z"}
{"event": "Skipping node at '.streams[352].schema.properties.modified_datetime.format'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910988Z"}
{"event": "Visiting CatalogNode.PROPERTY at '.streams[352].schema.properties.timestamp'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.911099Z"}
{"event": "Visiting CatalogNode.PROPERTY at '.streams[352].schema.properties.system_modified_datetime'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.911160Z"}
{"event": "Skipping node at '.streams[352].schema.properties.system_modified_datetime.format'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911212Z"}
{"event": "Skipping node at '.streams[352].schema.type'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911262Z"}
{"event": "Skipping node at '.streams[352].schema.required[0]'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911312Z"}
{"event": "Skipping node at '.streams[352].schema.$schema'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911361Z"}
{"event": "Skipping node at '.streams[352].is_view'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911410Z"}
{"event": "Skipping node at '.streams[352].stream'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911458Z"}
{"event": "Visiting CatalogNode.METADATA at '.streams[352].metadata[0]'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.911509Z"}
{"event": "Visiting metadata node for tap_stream_id 'dbo-c_logical_field_user_values', breadcrumb '['properties', 'logical_field_sid']'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911558Z"}
{"event": "Setting '.streams[352].metadata[0].metadata.selected' to 'False'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911616Z"}
{"event": "Setting '.streams[352].metadata[0].metadata.selected' to 'True'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911665Z"}
Anyone have suggestions on troubleshooting?
No errors like the previous question about not finding the state.
SQL Server tables have Change Tracking enabled in SQL Server as:
ALTER TABLE dbo.' + @ls_table_name + N'
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = OFF);
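One thing worth trying for the empty LOG_BASED bookmark above: set the replication method explicitly on the stream via metadata rather than relying only on default_replication_method, since support for the tap-level default varies by tap variant. A sketch using the stream name from the config:

```yaml
plugins:
  extractors:
  - name: tap-mssql
    metadata:
      dbo-c_logical_field_user_values:
        replication-method: LOG_BASED
```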
jack yang
04/24/2025, 2:15 AM
Rafael Rotter
04/28/2025, 6:47 PM
When I run a test (meltano config tap-mongodb test
) I get the message:
m-meltano:~/prj-mdb-gbq$ meltano config tap-mongodb test
2025-04-28T17:50:03.990046Z [info ] The default environment 'dev' will be ignored for `meltano config`. To configure a specific environment, please use the option `--environment=<environment name>`.
2025-04-28T18:03:11.496374Z [warning ] Stream `classe` was not found in the catalog
Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to join our friendly Slack community.
Plugin configuration is invalid
No RECORD or BATCH message received. Verify that at least one stream is selected using 'meltano select tap-mongodb --list'.
The meltano.yml looks like this:
version: 1
default_environment: dev
project_id: c1ac854b-545d
environments:
- name: dev
plugins:
  extractors:
  - name: tap-mongodb
    variant: z3z1ma
    pip_url: git+https://github.com/z3z1ma/tap-mongodb.git
    config:
      mongo:
        host: 12.34.5.678
        port: 27017
        directConnection: true
        readPreference: primary
        username: datalake
        password: ****
        authSource: db
        tls: false
      strategy: infer
    select:
    - classe.*
    metadata:
      dbprocapi_classe:
        replication_key: replication_key
        replication-method: LOG_BASED
For testing purposes I am trying to load only the "classe" collection (- classe.*) from the db database.
When I use the command "meltano select tap-mongodb --list --all" I get:
Enabled patterns: classe.*
but the fields also appear as excluded:
[excluded ] db_classe.field1
[excluded ] db_classe.field2
[excluded ] db_classe.field3
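A hint in the output above: the discovered stream appears to be named db_classe (database-prefixed), not classe, so the select pattern and the metadata key may need to use that name. A sketch of the adjusted config (stream name inferred from the excluded-field list; also note this tap's LOG_BASED strategy generally relies on MongoDB change streams, which require a replica set):

```yaml
select:
- db_classe.*
metadata:
  db_classe:
    replication-method: LOG_BASED
```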
It is important to note that MongoDB does not have replicas.
I'm using:
• a VM on Google Cloud to access MongoDB, both on the same network;
• the tap-mongodb extractor (z3z1ma).
Could someone please help me?
Thank you.
Tanner Wilcox
04/29/2025, 8:37 PM
Jordan Lee
04/30/2025, 1:29 AM
I ran meltano add files files-docker-compose
, but this adds a broken docker-compose.yml
definition that doesn't start, throwing Error: No such command 'ui'.
Steven Searcy
05/02/2025, 3:17 PM
Rafael Rotter
05/09/2025, 2:41 PM
I'm using target-bigquery
(z3z1ma) to receive data from MongoDB into BigQuery.
I managed to send some collections to the target (not all), but some questions arose. If you could help when you have a chance, I would appreciate it:
1. How can I specify in target-bigquery
some tables in BigQuery that should be partitioned by field X and clustered by Y, Z?
2. Why are two tables created in BigQuery: one with the execution time suffix, with data, and another without suffix and without data? Is a new table created with each load? (attached file)
3. I would like to confirm: normally there is no change in the MongoDB schema, but it can occur in case of an update. I am using denormalized: true. In case of a change, this can impact the load, correct?
4. The last error I got was "`ParseError: null is not allowed to be used as an element in a repeated field at processo.prioridade[0]`". Is it possible to handle this in stream-maps?
Thanks!