hallowed-quill-28385
10/31/2023, 10:41 PM
We have a project running v1.105.0 that we're preparing to migrate to v3.0.0. v1 is currently using a Postgres database as its state backend. I've gone ahead and created a new project running v3, rather than modifying the existing one. My question is: would we be able to simply point the new v3 project at the same backend URI while maintaining the existing bookmarks? I've been running v3 locally and inspecting the tables in the local filesystem database (.meltano/meltano.db) vs the tables in our Postgres instance, and it looks like v3 has some new job-related tables, state and runs, vs v1, which only has the job table. Would someone be able to give me more info on how state was maintained in v1 vs any differences in v3?
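For anyone comparing the two layouts: as I understand it, the v2/v3 systemdb migrations are what split the old job table into runs plus a dedicated state table, and meltano upgrade applies those migrations in place. A minimal sketch for verifying that bookmarks survived after pointing v3 at the Postgres URI (the state ID shown is hypothetical):
meltano state list                                   # list state IDs in the configured backend
meltano state get dev:tap-example-to-target-example  # inspect one bookmark payload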
magnificent-artist-73954
11/01/2023, 12:20 AM
best-jewelry-71796
11/01/2023, 11:03 AM
I have a question about stream_maps. My tap is tap-saasoptics and my target is target-mysql.
When I do this config on the target:
..
stream_maps:
  transactions:
    __alias__: my_prefix_transactions
I get:
singer_sdk.exceptions.RecordsWithoutSchemaException: A record for stream 'transactions' was encountered before a corresponding schema. Check that the Tap correctly implements the Singer spec.
When I do this config on the target:
..
stream_maps:
  my_prefix_transactions:
    __source__: transactions
I get two tables created in MySQL: my_prefix_transactions and transactions.
Any advice on how to achieve this, i.e. get only a single table called my_prefix_transactions created in the MySQL db?
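One pattern that may get a single table, assuming the Singer SDK's __NULL__ convention applies to target-mysql's stream maps (a sketch, not a confirmed fix):
stream_maps:
  my_prefix_transactions:
    __source__: transactions   # new stream cloned from `transactions`
  transactions: __NULL__       # drop the original so only the alias is loaded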
alert-easter-79768
11/01/2023, 2:34 PM
I'm running tap-google-search-console but I'm getting duplicate data. I know Meltano says there can be duplicate data because of the at-least-once delivery guarantee between tap and target, but I'm not sure that's the case here. Is this a recurring theme for users of tap-google-search-console?
acoustic-kilobyte-81969
11/01/2023, 8:37 PM
Is there a way to keep a tap's output out of the logs when using meltano run? We have a pipeline of this form:
meltano run tap mapper target
where the mapper is responsible for obfuscating data in one of the fields emitted by the tap. However, the output of the tap is leaking sensitive information into the logs, which has to be avoided.
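One hedged mitigation, based on Meltano echoing plugin stderr into its own logs at info level: raise the CLI log level so that echo is suppressed (this won't help if the tap itself prints records at warning or above; plugin names below are placeholders):
meltano --log-level=warning run tap mapper target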
witty-sunset-64721
11/02/2023, 8:43 AM
My meltano.yml looks like this:
version: 1
project_id: <project_id>
default_environment: ${MELTANO_ENVIRONMENT}
include_paths:
- ./environments/environments.meltano.yml
- ./extract/*.meltano.${MELTANO_ENVIRONMENT}.yml
- ./load/*.meltano.${MELTANO_ENVIRONMENT}.yml
- ./orchestrate/*.meltano.${MELTANO_ENVIRONMENT}.yml
but it doesn't read the environment variable. I tried setting it in environments.meltano.yml as well, but that doesn't work either.
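A workaround worth trying, since the CLI reads MELTANO_ENVIRONMENT directly (whether interpolation is supported inside include_paths at all is an assumption I haven't verified): export the variable in the invoking shell and keep default_environment static:
export MELTANO_ENVIRONMENT=dev          # same effect as passing --environment=dev
meltano run tap-example target-example  # plugin names hypothetical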
bland-iron-35300
11/02/2023, 11:58 AM
broad-pillow-19255
11/02/2023, 2:17 PM
average-librarian-17951
11/02/2023, 4:27 PM
I'm getting this message from meltano config:
To configure a specific environment, please use the option --environment=<environment name>
Is "meltano --environment=dev" no good? Trying to understand this environment issue; I already had this working in one dev environment.
rough-microphone-18889
11/02/2023, 6:01 PM
I'm getting this error:
jsonschema.exceptions.ValidationError: '"{\"uuid\":\"b0f01821-5ced-4ad1-a430-9c3765c1e777\",\"email\":\"xyz@email.com\"}"' is not of type 'null', 'object'
bright-fountain-11354
11/02/2023, 6:17 PM
meltano select custom-snowflake-tap --list
is returning the selected attributes for customer_schema-customer_table.*.
However, when I run meltano config custom-snowflake-tap test, I get the error:
Plugin configuration is invalid
No RECORD message received
Here are the configurations for your reference:
- config:
    account: <redacted>
    database: CUSTOMER_DB
    password: <redacted>
    schema: customer_schema
    user: <redacted>
    warehouse: CUSTOMER_WH
    tables:
      - customer_schema.table_name
  inherit_from: tap-snowflake
  metadata: {}
  name: custom-snowflake-tap
  schema: {}
  select:
    - customer_schema-table_name.*
  select_filter: []
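One hedged check, since the tables entry, the select pattern, and the --list output above don't all agree on the entity name: list every discoverable entity and confirm the select pattern matches exactly one of them:
meltano select custom-snowflake-tap --list --all   # --all shows unselected entities too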
bitter-byte-97186
11/03/2023, 12:07 AM
I'm having trouble setting the Airflow executor to LocalExecutor.
Setup info:
• Meltano version: 3.1.0
• Airflow (in meltano.yaml):
utilities:
  - name: airflow
    variant: apache
    pip_url: psycopg2 apache-airflow-providers-postgres git+https://github.com/meltano/airflow-ext.git@main apache-airflow==2.3.3 --constraint https://raw.githubusercontent.com/apache/airflow/constraints-2.3.3/constraints-no-providers-${MELTANO__PYTHON_VERSION}.txt
    config:
      core:
        executor: LocalExecutor
      webserver:
        web_server_port: '8973'
      database:
        sql_alchemy_conn: postgresql://jobyatp:xM2jaGOzfsOTe3o71TPBU1H8IzHvT6@172.29.1.3:5432/airflow
• Double-checked that there is no orchestrator version of Airflow present.
• I ran meltano config airflow set core.executor LocalExecutor
However, this is what I get when I run `meltano invoke airflow config list`:
[core]
...
executor = SequentialExecutor
I just switched from using the orchestrator version because it looks like the utilities version is preferred. Any help is appreciated!
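One hedged workaround, relying on Airflow giving AIRFLOW__SECTION__KEY environment variables precedence over airflow.cfg: inject the executor through the plugin's env mapping (a sketch, not a confirmed airflow-ext fix):
utilities:
  - name: airflow
    env:
      AIRFLOW__CORE__EXECUTOR: LocalExecutor  # read by Airflow ahead of the generated airflow.cfg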
average-librarian-17951
11/03/2023, 5:10 AM
brief-market-98541
11/03/2023, 11:05 AM
dazzling-intern-75621
11/03/2023, 1:19 PM
busy-florist-58991
11/03/2023, 4:48 PM
I'm running tap-braintree through Airbyte, and getting the following error when trying to load payment transactions:
ValueError: No format in ['%Y-%m-%d %H:%M:%S'] matching 2023-10-03
big-match-80461
11/03/2023, 5:15 PM
tap-surveymonkey is failing with:
CRITICAL Config is missing required keys: ['start_date', 'access_token']
I tried troubleshooting a couple of ways, not sure either is correct.
1. I modified plugins/extractors/tap-surveymonkey--singer.io.lock to include the above (the access token is aliased).
2. I added the above as config items for the extractor in our environments config file, environments/environment.yml:
environments:
  - name: env
    config:
      plugins:
        extractors:
          - name: tap-surveymonkey
            config:
              start_date: '2023-01-01T00:00:00Z'
              access_token: $(alias)
I am now met with:
Plugin configuration is invalid
Exception: key_properties must be a string or list of strings
Any thoughts are appreciated.
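A gentler route than editing the lock file, assuming stock CLI behavior, is to set the missing keys as plugin config so they land in meltano.yml (or .env for the secret) rather than in the lock definition; the env var name here is hypothetical:
meltano config tap-surveymonkey set start_date '2023-01-01T00:00:00Z'
meltano config tap-surveymonkey set access_token $SURVEYMONKEY_ACCESS_TOKEN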
ancient-television-56417
11/04/2023, 5:57 PM
I'm looking into tap-oracle usage. Is this still the recommended go-to? I see it was updated in August, but its dependency on a PipelineWise fork is from March. Is this generally pretty normal for Meltano taps? I'm looking for some basic replication from Oracle and hoping that Meltano fits the bill here, but I'm a bit concerned thinking of a production pipeline relying on stale packages.
busy-activity-37964
11/05/2023, 8:28 PM
I'm running
$ meltano install extractor tap-singer-jsonl
but am getting the error message Failed to parse JSON array from string: 'None'.
Python: 3.10.10
Meltano: 3.1.0
How can I get around this issue?
EDIT: It seems to originate here:
/Users/user/data-project/meltano_pipeline/venv/lib/python3.10/site-packages/meltano/core/settings_service.py:238 in config_with_metadata
  235       if prefix and not setting_def.name.startswith(prefix):
  236           continue
  237
❱ 238       value, metadata = self.get_with_metadata(
  239           setting_def.name,
  240           setting_def=setting_def,
  241           source=source,
/Users/user/data-project/meltano_pipeline/venv/lib/python3.10/site-packages/meltano/core/settings_service.py:408 in get_with_metadata
  405               value = object_value
  406               metadata["source"] = object_source
  407
❱ 408           cast_value = setting_def.cast_value(value)
  409           if cast_value != value:
  410               metadata["uncast_value"] = value
  411               value = cast_value
/Users/user/data-project/meltano_pipeline/venv/lib/python3.10/site-packages/meltano/core/setting_definition.py:441 in cast_value
  438           )
  439       elif self.kind == SettingKind.ARRAY:
  440           value = list(
❱ 441               self._parse_value(value, "array", Sequence),  # type: ignore
  442           )
  443
  444       processor = self.value_processor
/Users/user/data-project/meltano_pipeline/venv/lib/python3.10/site-packages/meltano/core/setting_definition.py:416 in _parse_value
  413       ) as ex:
  414           raise parse_error from ex
  415       if not isinstance(parsed, expected_type):
❱ 416           raise parse_error
  417       return parsed
  418
  419   def cast_value(self, value: t.Any) -> t.Any:  # noqa: C901
colossal-zebra-32658
11/06/2023, 2:05 PM
better-diamond-57621
11/06/2023, 6:02 PM
most-eve-59826
11/06/2023, 10:20 PM
After upgrading to dbt 1.7, we're hitting a problem: the dbt target directory (.meltano/transformers/dbt/target/) is outside the dbt project directory (transformer/). As a result, any run gives an error:
2023-11-06T18:57:57.133172Z [info ] Environment 'dev' is active
Extension executing `dbt clean`...
18:57:58 Running with dbt=1.7.0
18:57:58 Encountered an error:
Runtime Error
dbt will not clean the following directories outside the project: ['.meltano/transformers/dbt/target']
This is similar to, but (afaict) not the same as, this open issue from last year.
Just wondering if anyone else has had this issue and/or if the Meltano folks have specific recommendations about how to handle this change in dbt's functionality. Poking around dbt's code suggests that there may be a straightforward workaround -- adding a new env var / CLI flag -- but maybe there's a better or less clunky way to address this. Thanks in advance for your help!
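For what it's worth, the env var / CLI flag hinted at above appears to have shipped in dbt 1.7 as clean_project_files_only; a sketch of the dbt_project.yml form, worth verifying against your dbt version:
flags:
  clean_project_files_only: false   # lets `dbt clean` remove clean-targets outside the project dir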
average-librarian-17951
11/08/2023, 12:51 AM
fast-fireman-17565
11/08/2023, 3:11 PM
I'm using FULL_TABLE replication.
Problem: I'm dealing with a table that has a size of 2.1 GB and contains over 11,000 rows, with two fields populated with extensive JSON data. Whenever Meltano processes this table, I encounter the following error message:
[error ] Extraction failed code=-9 message=
ELT could not be completed: Extractor failed.
Curiously, it starts loading some records into my data warehouse but subsequently crashes with the above error.
To provide context, this pipeline is orchestrated by Airflow and is built from an image featuring Meltano v2.12.0.
• Database: PostgreSQL
• Target: Snowflake
• Tap: tap-postgres (transferwise variant)
Has anyone encountered something similar while loading large JSON fields?
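Exit code -9 means the extractor was killed with SIGKILL, which in a container is almost always the OOM killer, so memory is the first suspect. One hedged mitigation is shrinking the loader's batch size so fewer of the wide JSON rows are buffered at once (batch_size_rows is the pipelinewise-style setting name; check your target variant):
loaders:
  - name: target-snowflake
    config:
      batch_size_rows: 1000   # hypothetical value, well below the usual default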
11/08/2023, 5:21 PM2023-11-08T03:22:48.638466Z [info ] time=2023-11-08 03:22:48 name=tap_s3_csv level=CRITICAL message=("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer')) cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.638900Z [info ] Traceback (most recent call last): cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.639177Z [info ] File "/opt/dagster/dagster_home/meltano/.meltano/extractors/tap-s3-csv/venv/lib/python3.9/site-packages/urllib3/response.py", line 444, in _error_catcher cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.639491Z [info ] yield cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.639744Z [info ] File "/opt/dagster/dagster_home/meltano/.meltano/extractors/tap-s3-csv/venv/lib/python3.9/site-packages/urllib3/response.py", line 567, in read cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.639976Z [info ] data = self._fp_read(amt) if not fp_closed else b"" cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.640201Z [info ] File "/opt/dagster/dagster_home/meltano/.meltano/extractors/tap-s3-csv/venv/lib/python3.9/site-packages/urllib3/response.py", line 533, in _fp_read cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.640433Z [info ] return self._fp.read(amt) if amt is not None else self._fp.read() cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.640650Z [info ] File "/usr/local/lib/python3.9/http/client.py", line 463, in read cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.640875Z [info ] n = self.readinto(b) cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.641086Z [info ] File "/usr/local/lib/python3.9/http/client.py", line 507, in readinto cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.641397Z [info ] n = self.fp.readinto(b) cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.641677Z [info ] File "/usr/local/lib/python3.9/socket.py", line 704, in readinto cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.641922Z [info ] return self._sock.recv_into(b) cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.642150Z [info ] File "/usr/local/lib/python3.9/ssl.py", line 1275, in recv_into cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.642369Z [info ] return self.read(nbytes, buffer) cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.642587Z [info ] File "/usr/local/lib/python3.9/ssl.py", line 1133, in read cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.642805Z [info ] return self._sslobj.read(len, buffer) cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.643025Z [info ] ConnectionResetError: [Errno 104] Connection reset by peer cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.643242Z [info ] cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.643466Z [info ] During handling of the above exception, another exception occurred: cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
2023-11-08T03:22:48.643682Z [info ] cmd_type=elb consumer=False name=tap-s3-csv producer=True stdio=stderr string_id=tap-s3-csv
gifted-zoo-4913
11/08/2023, 5:23 PM
environments:
  - name: dev
    config:
      plugins:
        extractors:
          - name: tap-mongodb
            config:
              strategy: envelope
              mongo:
                host: mongodb://localhost:27017/
            select:
              - t_ent_393_entity_safety-report.*
        loaders:
          - name: target-snowflake
            config:
              account: XXX.eu-west-2.aws
              database: DEV
              schema: NET
              user: MELTANO
              warehouse: ELT
              role: DATALOADER
              default_target_schema: NET
              file_format: DEV.NET.MELTANO_CSV
              hard_delete: false
  - name: staging
  - name: prod
plugins:
  extractors:
    - name: tap-mongodb
      variant: z3z1ma
      pip_url: git+https://github.com/z3z1ma/tap-mongodb.git
  loaders:
    - name: target-jsonl
      variant: andyh1203
      pip_url: target-jsonl
    - name: target-snowflake
      variant: meltanolabs
      pip_url: meltanolabs-target-snowflake
billions-musician-96923
11/08/2023, 8:07 PM
Can this be done with the meltano config <plugin> set command? Or some other way?
busy-activity-37964
11/09/2023, 12:47 PM
I have a tap-singer-jsonl -> target-redshift pipeline, and although I have the S3 state backend configured, it doesn't seem to be used. Every run of the pipeline results in all of the files in the S3 bucket being read. Anyone else encountered this?
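Worth double-checking that the backend is declared at the project root of meltano.yml, since that's the documented shape (bucket and prefix hypothetical); note the backend only stores bookmarks, so the tap still has to emit and honor state for files to be skipped:
state_backend:
  uri: s3://my-meltano-state/state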
happy-rocket-1528
11/09/2023, 2:47 PM
Our source has country schemas (us, gb, fr) and other miscellaneous schemas. I want to replicate all country schemas to Snowflake as-is (us -> us, gb -> gb) but all other schemas should go to misc. How can I achieve this?
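One way this is commonly handled with pipelinewise-style Snowflake targets is schema_mapping plus a default_target_schema fallback; a sketch under that assumption (check that your target variant supports these settings):
loaders:
  - name: target-snowflake
    config:
      default_target_schema: misc   # anything not mapped below lands here
      schema_mapping:
        us: { target_schema: us }
        gb: { target_schema: gb }
        fr: { target_schema: fr }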