straight-tiger-68114
05/09/2025, 11:55 AM
METAFLOW_ARGO_EVENTS_INTERNAL_WEBHOOK_URL for sending events from the flows to the outside world, and another that is exposed for other services to push events into Argo.
Adding the @trigger decorator and trying to deploy the workflow returns an error:
An Argo Event name hasn't been configured for your deployment yet. Please see this article for more details on event names - <https://argoproj.github.io/argo-events/eventsources/naming/>. It is very likely that all events for your deployment share the same name. You can configure it by executing `metaflow configure kubernetes` or setting METAFLOW_ARGO_EVENTS_EVENT in your configuration. If in doubt, reach out for support at <http://chat.metaflow.org>
My question is:
If I configure the METAFLOW_ARGO_EVENTS_EVENT
will that interfere with the existing outgoing events data flow?
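For what it's worth (an assumption from reading the error, not a verified answer): METAFLOW_ARGO_EVENTS_EVENT only names the event that deployed flows listen to, so it sits alongside, rather than replacing, the webhook URL settings. A sketch of the relevant keys in ~/.metaflowconfig/config.json; every value below is a placeholder, not your deployment's:

```json
{
  "METAFLOW_ARGO_EVENTS_EVENT": "metaflow-event",
  "METAFLOW_ARGO_EVENTS_EVENT_SOURCE": "argo-events-webhook",
  "METAFLOW_ARGO_EVENTS_EVENT_BUS": "default",
  "METAFLOW_ARGO_EVENTS_WEBHOOK_URL": "http://argo-events-webhook-eventsource-svc.argo-events:12000/metaflow-event",
  "METAFLOW_ARGO_EVENTS_INTERNAL_WEBHOOK_URL": "http://argo-events-webhook-eventsource-svc.argo-events:12000/metaflow-event"
}
```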
Alternatively, I could set up Sensors to trigger each flow, but I was hoping to keep this configuration piece in the Metaflow realm.
fast-advantage-42097
05/09/2025, 3:46 AM
lively-lunch-9285
05/09/2025, 2:08 AM
hundreds-wire-22547
05/08/2025, 7:19 PM
from metaflow import DeployedFlow
# use the identifier saved above..
deployed_flow = DeployedFlow.from_deployment(identifier=identifier)
triggered_run = deployed_flow.trigger()
hundreds-football-74720
05/08/2025, 10:10 AM
Traceback (most recent call last):
File "/usr/local/bin/ui_backend_service", line 33, in <module>
sys.exit(load_entry_point('metadata-service', 'console_scripts', 'ui_backend_service')())
File "/root/services/ui_backend_service/ui_server.py", line 152, in main
loop.run_forever()
File "/usr/local/lib/python3.11/asyncio/base_events.py", line 607, in run_forever
self._run_once()
File "/usr/local/lib/python3.11/asyncio/base_events.py", line 1922, in _run_once
handle._run()
File "/usr/local/lib/python3.11/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.11/site-packages/aiohttp/web_protocol.py", line 452, in _handle_request
resp = await request_handler(request)
File "/usr/local/lib/python3.11/site-packages/aiohttp/web_app.py", line 543, in _handle
resp = await handler(request)
File "/usr/local/lib/python3.11/site-packages/aiohttp/web_middlewares.py", line 114, in impl
return await handler(request)
File "/root/services/utils/__init__.py", line 89, in wrapper
err_trace = getattr(err, 'traceback_str', None) or get_traceback_str()
File "/root/services/utils/__init__.py", line 84, in wrapper
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/services/ui_backend_service/api/log.py", line 113, in get_task_log_stderr
return await self.get_task_log(request, STDERR)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/services/ui_backend_service/api/log.py", line 220, in get_task_log
lines, page_count = await read_and_output(self.cache, task, logtype, limit, page, reverse_order)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/services/ui_backend_service/api/log.py", line 261, in read_and_output
raise LogException("Cache returned None for log content and raised no errors. \
services.ui_backend_service.api.log.LogException: Cache returned None for log content and raised no errors. The cache server might be experiencing issues.
adorable-oxygen-86530
05/06/2025, 3:43 PM
2025-05-06 15:39:35.945 [759/start/6468 (pid 139695)] Transient S3 failure (attempt #1) -- total success: 2, last attempt 2/4 -- remaining: 2
2025-05-06 15:39:39.125 [759/start/6468 (pid 139695)] Transient S3 failure (attempt #2) -- total success: 2, last attempt 0/2 -- remaining: 2
2025-05-06 15:39:44.889 [759/start/6468 (pid 139695)] Transient S3 failure (attempt #3) -- total success: 2, last attempt 0/2 -- remaining: 2
2025-05-06 15:39:53.515 [759/start/6468 (pid 139695)] Transient S3 failure (attempt #4) -- total success: 2, last attempt 0/2 -- remaining: 2
which eventually brings the workflow to a complete halt.
Our setup consists of an on-prem MinIO S3 instance and Metaflow, all running on a Kubernetes cluster.
Switching back to 2.15.7 magically resolves the error. Any ideas?
Cheers
able-battery-82852
05/06/2025, 7:38 AM
dry-beach-38304
05/06/2025, 7:26 AM
metaflow environment
command, which is somewhat useful 🙂 (metaflow environment --help will provide some more info). You also have myflow.py --environment=conda environment --help, and lastly I think I also have Runner("myflow.py", environment="conda").environment type of stuff (that last one may actually still be pending and not yet in the open-source version).
able-battery-82852
05/06/2025, 7:23 AM
dry-beach-38304
05/06/2025, 7:21 AM
able-battery-82852
05/06/2025, 7:18 AM
able-battery-82852
05/06/2025, 7:17 AM
able-battery-82852
05/06/2025, 7:16 AM
dry-beach-38304
05/06/2025, 7:13 AM
dry-beach-38304
05/06/2025, 7:13 AM
dry-beach-38304
05/06/2025, 7:12 AM
able-battery-82852
05/06/2025, 5:24 AM
metaflow-nflx-ext
and this works flawlessly!
Just one question:
It doesn't seem to be possible to combine a custom image with @batch and add additional libraries with @pypi or @conda. Is this because the extension handles these environments differently?
According to the original Metaflow docs this should be possible, so I'm wondering why it is not possible with the extension.
enough-article-90757
05/05/2025, 9:57 PM
pypi
decorator for package management, but when I run this in the metaflow-dev
stack, I get an error saying that Metaflow can't find Conda artifacts. Has anyone seen this before, and is it expected?
This is my invocation + output:
❯ python ubuntu_updates.py --environment=pypi run --with kubernetes
Metaflow 2.15.10 executing UbuntuUpdatesFlow for user:coder
Validating your flow...
The graph looks good!
Running pylint...
Pylint not found, so extra checks are disabled.
2025-05-05 21:50:24.808 Bootstrapping virtual environment(s) ...
2025-05-05 21:50:24.881 Virtual environment(s) bootstrapped!
2025-05-05 21:50:25.263 Workflow starting (run-id 4), see it in the UI at <http://localhost:3000/UbuntuUpdatesFlow/4>
2025-05-05 21:50:25.673 [4/start/8 (pid 434281)] Task is starting.
2025-05-05 21:50:26.493 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] Task is starting (Pod is pending, Container is waiting - ContainerCreating)...
2025-05-05 21:50:27.218 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] Setting up task environment.
2025-05-05 21:50:32.327 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] Downloading code package...
2025-05-05 21:50:32.938 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] Code package downloaded.
2025-05-05 21:50:32.977 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] Task is starting.
2025-05-05 21:50:33.839 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] Bootstrapping virtual environment...
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] Bootstrap failed while executing: set -e;
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] tmpfile=$(mktemp);
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] echo "@EXPLICIT" > "$tmpfile";
2025-05-05 21:50:37.212 [4/start/8 (pid 434281)] Kubernetes error:
2025-05-05 21:50:37.212 [4/start/8 (pid 434281)] Error: Setting up task environment.
2025-05-05 21:50:37.212 [4/start/8 (pid 434281)] Downloading code package...
2025-05-05 21:50:37.212 [4/start/8 (pid 434281)] Code package downloaded.
2025-05-05 21:50:37.212 [4/start/8 (pid 434281)] Task is starting.
2025-05-05 21:50:37.213 [4/start/8 (pid 434281)] Bootstrapping virtual environment...
2025-05-05 21:50:37.213 [4/start/8 (pid 434281)] Bootstrap failed while executing: set -e;
2025-05-05 21:50:37.213 [4/start/8 (pid 434281)] tmpfile=$(mktemp);
2025-05-05 21:50:37.213 [4/start/8 (pid 434281)] echo "@EXPLICIT" > "$tmpfile";
2025-05-05 21:50:37.213 [4/start/8 (pid 434281)] ls -d /metaflow/.pkgs/conda/*/* >> "$tmpfile";
2025-05-05 21:50:37.345 [4/start/8 (pid 434281)] export PATH=$PATH:$(pwd)/micromamba;
2025-05-05 21:50:37.346 [4/start/8 (pid 434281)] export CONDA_PKGS_DIRS=$(pwd)/micromamba/pkgs;
2025-05-05 21:50:37.346 [4/start/8 (pid 434281)] export MAMBA_NO_LOW_SPEED_LIMIT=1;
2025-05-05 21:50:37.346 [4/start/8 (pid 434281)] export MAMBA_USE_INDEX_CACHE=1;
2025-05-05 21:50:37.346 [4/start/8 (pid 434281)] export MAMBA_NO_PROGRESS_BARS=1;
2025-05-05 21:50:37.346 [4/start/8 (pid 434281)] export CONDA_FETCH_THREADS=1;
2025-05-05 21:50:37.346 [4/start/8 (pid 434281)] micromamba create --yes --offline --no-deps --safety-checks=disabled --no-extra-safety-checks --prefix /metaflow/linux-64/f35ade658f6977a --file "$tmpfile" --no-pyc --no-rc --always-copy;
2025-05-05 21:50:37.346 [4/start/8 (pid 434281)] rm "$tmpfile"
2025-05-05 21:50:37.346 [4/start/8 (pid 434281)] Stdout:
2025-05-05 21:50:37.346 [4/start/8 (pid 434281)] Stderr: ls: cannot access '/metaflow/.pkgs/conda/*/*': No such file or directory
2025-05-05 21:50:37.346 [4/start/8 (pid 434281)]
2025-05-05 21:50:37.346 [4/start/8 (pid 434281)] (exit code 1). This could be a transient error. Use @retry to retry.
2025-05-05 21:50:37.346 [4/start/8 (pid 434281)]
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] ls -d /metaflow/.pkgs/conda/*/* >> "$tmpfile";
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] export PATH=$PATH:$(pwd)/micromamba;
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] export CONDA_PKGS_DIRS=$(pwd)/micromamba/pkgs;
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] export MAMBA_NO_LOW_SPEED_LIMIT=1;
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] export MAMBA_USE_INDEX_CACHE=1;
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] export MAMBA_NO_PROGRESS_BARS=1;
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] export CONDA_FETCH_THREADS=1;
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] micromamba create --yes --offline --no-deps --safety-checks=disabled --no-extra-safety-checks --prefix /metaflow/linux-64/f35ade658f6977a --file "$tmpfile" --no-pyc --no-rc --always-copy;
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] rm "$tmpfile"
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] Stdout:
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v] Stderr: ls: cannot access '/metaflow/.pkgs/conda/*/*': No such file or directory
2025-05-05 21:50:35.659 [4/start/8 (pid 434281)] [pod t-840c2468-jfv52-wt56v]
2025-05-05 21:50:37.362 [4/start/8 (pid 434281)] Task failed.
2025-05-05 21:50:37.381 Workflow failed.
2025-05-05 21:50:37.382 Terminating 0 active tasks...
2025-05-05 21:50:37.382 Flushing logs...
Step failure:
Step start (task-id 8) failed.
Any info would be useful, thanks!!
white-helicopter-28706
05/02/2025, 5:25 PM
prehistoric-waiter-14304
05/02/2025, 3:19 PM
2.15.0
as well as 2.15.9
E1101: Instance of 'Deployer' has no 'step_functions' member (no-member)
It doesn't seem like this should be happening?
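This is likely a static-analysis artifact rather than a bug: if Deployer resolves sub-deployers such as step_functions dynamically (e.g. via __getattr__), pylint cannot see the member even though it exists at runtime. A minimal stand-in illustrating the pattern and the usual suppression; this is not Metaflow's actual implementation:

```python
class Deployer:
    # Stand-in for an object that serves attributes dynamically;
    # pylint's static analysis cannot discover members resolved this way.
    def __getattr__(self, name):
        if name == "step_functions":
            return lambda: "step-functions sub-deployer"
        raise AttributeError(name)

d = Deployer()
# Works at runtime, but pylint flags it as no-member; suppress inline:
result = d.step_functions()  # pylint: disable=no-member
```

Alternatively, `generated-members=step_functions` in the pylint config silences it project-wide.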
https://docs.metaflow.org/api/deployer#Deployer.step_functions
elegant-painter-46407
05/01/2025, 12:21 PM
string
. This string is a comma-separated list of strings which will be split on ,
by the flow (exact same method used).
The method simply does:
out_list = [out.strip() for out in input_string.split(delimeter)]
return out_list
---
Now, I have two flows:
1. A metaflow flow that has been converted to an argo workflow e.g python parameter_flow.py --with retry argo-workflows create
. When using the argo client to submit the workflow with parameters = {"comma_sep_list_of_strings": "string1,string2"}
the application is successfully splitting this into a list
2. In contrast, when submitting the native argo workflow with the same parameter parameters = {"comma_sep_list_of_strings": "string1,string2"}
, the list will end up being ['"string1', 'string2"']
• The default value for this comma_sep_list_of_strings
parameter in both cases is ""
• (JSON file) The representation of the default value for the parameter in the Metaflow-created Argo workflow is "\"\""
• (Argo UI) The representation of the default value for the parameter is ""
It should be noted that I don't have the same issue when submitting the native argo workflow from the UI.
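One plausible explanation (an assumption, not verified against the Metaflow source): the Metaflow-generated workflow stores parameter values JSON-encoded, which is why the default "" appears as "\"\"" in the JSON file, and the paths that work apply the matching decoding. A raw native submission would then deliver the JSON-encoded string verbatim, which reproduces the exact split reported above:

```python
import json

raw = "string1,string2"
encoded = json.dumps(raw)  # '"string1,string2"' -- quotes become part of the value

# If the flow splits the encoded form without JSON-decoding it first,
# the stray quotes stick to the first and last elements:
broken = [s.strip() for s in encoded.split(",")]
# broken == ['"string1', 'string2"'] -- matching the reported output

# Decoding before splitting restores the expected list:
fixed = [s.strip() for s in json.loads(encoded).split(",")]
# fixed == ['string1', 'string2']
```

If that is the cause, either JSON-encoding the value in the native submission or JSON-decoding it in the flow should make both paths agree.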
Is there anything I am missing on how to submit the native argo workflow? Thanks
rhythmic-beach-70913
04/30/2025, 9:15 AM
"note": "Internal representation of IncludeFile(…)"
from the parameters
Is there any reason not to do that? (I can't see how it could be used anyway.) Or at least allow it to be set rather than always defaulting.
The main problem is the length of
"ContainerOverrides": {
"Command": [
"bash",
"-c",
"true && mkdir -p $PWD/.logs && ....
hallowed-soccer-94479
04/28/2025, 8:48 PM
@checkpoint
decorator when using @parallel
steps? The docs I found here say TODO
https://github.com/outerbounds/metaflow-checkpoint-examples/blob/master/documentation/checkpoint_deco/checkpoint_usage.md#saving--loa[…]rallel-steps
cold-balloon-7686
04/25/2025, 6:43 PM
from metaflow import FlowSpec, step, NBRunner

class HelloFlow(FlowSpec):

    @step
    def start(self):
        self.x = 1
        self.next(self.end)

    @step
    def end(self):
        self.x += 1
        print("Hello world! The value of x is", self.x)

run = NBRunner(HelloFlow).nbrun(decospecs=["argo-workflows"])
• Any idea how to achieve this?
• Any idea how to default to using Argo Workflows?
Thanks
able-battery-82852
04/25/2025, 7:45 AM
@conda
.
Whether run locally on arm64 or pushing it to batch, both give the same error.
mammoth-rainbow-82717
04/25/2025, 7:07 AM
foreach
loop on Kubernetes/Argo. Will the order of the steps respect the order of the list provided to the foreach
? Or can the order be random somehow?
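Not an authoritative answer, but a defensive pattern regardless of what the scheduler guarantees: each branch of the fan-out can carry its foreach index, and the join can sort by it. A Metaflow-free sketch of the idea (plain Python standing in for the foreach/join steps):

```python
items = ["a", "b", "c", "d"]

# Each branch would record (foreach_index, result); here we simulate
# branches completing in a scrambled order:
finished = [(i, items[i].upper()) for i in (2, 0, 3, 1)]

# The join sorts by the recorded index to restore list order,
# independent of completion order on the cluster:
ordered = [result for _, result in sorted(finished)]
# ordered == ["A", "B", "C", "D"]
```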
TIA
acoustic-van-30942
04/24/2025, 6:58 PM
nutritious-forest-60017
04/24/2025, 3:15 PM
few-salesmen-35936
04/23/2025, 10:54 PM
Service Unavailable: connection error: desc = "transport: authentication handshake failed: credentials: cannot check peer: missing selected ALPN property. If you upgraded from a grpc-go version earlier than 1.67, your TLS connections may have stopped working due to ALPN enforcement. For more details, see: <https://github.com/grpc/grpc-go/issues/434>"
we deployed the entire stack using https://github.com/outerbounds/metaflow-tools/tree/master/gcp/terraform on our GCP project.
Thanks a lot in advance!
dazzling-garden-6456
04/22/2025, 3:37 PM