# opal
o
Hey @Jack Geek, I’ll try to investigate it with my team 🙂
a
Are you sure "enteries" is not a typo? It should be "entries".
also @Ro'e Katz assuming @Jack Geek’s typo was in the question and not in his actual config, this might be related to your changes in 0.5.0
r
Honestly that would surprise me, as I don’t see how my changes would do something like that. But let’s wait for Jack’s response. @Jack Geek, if the typo is not the issue, it would be great to get the full error.
j
Hello, I fixed the typo and still have the same error:
opal-server | [2023-02-28 17:50:51 +0000] [1] [INFO] Starting gunicorn 20.1.0
opal-server | [2023-02-28 17:50:51 +0000] [1] [INFO] Listening at: http://0.0.0.0:7002 (1)
opal-server | [2023-02-28 17:50:51 +0000] [1] [INFO] Using worker: uvicorn.workers.UvicornWorker
opal-server | [2023-02-28 17:50:51 +0000] [7] [INFO] Booting worker with pid: 7
opal-server | Failed parsing config key- OPAL_DATA_CONFIG_SOURCES
opal-server | [2023-02-28 17:50:51 +0000] [7] [ERROR] Exception in worker process
opal-server | Traceback (most recent call last):
opal-server |   File "pydantic/main.py", line 540, in pydantic.main.BaseModel.parse_raw
opal-server |   File "pydantic/parse.py", line 37, in pydantic.parse.load_str_bytes
opal-server |   File "/usr/local/lib/python3.10/json/__init__.py", line 346, in loads
opal-server |     return _default_decoder.decode(s)
opal-server |   File "/usr/local/lib/python3.10/json/decoder.py", line 337, in decode
opal-server |     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
opal-server |   File "/usr/local/lib/python3.10/json/decoder.py", line 353, in raw_decode
opal-server |     obj, end = self.scan_once(s, idx)
opal-server | json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
opal-server |
opal-server | During handling of the above exception, another exception occurred:
opal-server |
opal-server | Traceback (most recent call last):
opal-server |   File "/usr/local/lib/python3.10/site-packages/gunicorn/arbiter.py", line 589, in spawn_worker
opal-server |     worker.init_process()
opal-server |   File "/usr/local/lib/python3.10/site-packages/uvicorn/workers.py", line 66, in init_process
opal-server |     super(UvicornWorker, self).init_process()
opal-server |   File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/base.py", line 134, in init_process
opal-server |     self.load_wsgi()
opal-server |   File "/usr/local/lib/python3.10/site-packages/gunicorn/workers/base.py", line 146, in load_wsgi
opal-server |     self.wsgi = self.app.wsgi()
opal-server |   File "/usr/local/lib/python3.10/site-packages/gunicorn/app/base.py", line 67, in wsgi
opal-server |     self.callable = self.load()
opal-server |   File "/usr/local/lib/python3.10/site-packages/gunicorn/app/wsgiapp.py", line 58, in load
opal-server |     return self.load_wsgiapp()
opal-server |   File "/usr/local/lib/python3.10/site-packages/gunicorn/app/wsgiapp.py", line 48, in load_wsgiapp
opal-server |     return util.import_app(self.app_uri)
opal-server |   File "/usr/local/lib/python3.10/site-packages/gunicorn/util.py", line 359, in import_app
opal-server |     mod = importlib.import_module(module)
opal-server |   File "/usr/local/lib/python3.10/importlib/__init__.py", line 126, in import_module
opal-server |     return _bootstrap._gcd_import(name[level:], package, level)
opal-server |   File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
opal-server |   File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
opal-server |   File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
opal-server |   File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
opal-server |   File "<frozen importlib._bootstrap_external>", line 883, in exec_module
opal-server |   File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
opal-server |   File "/usr/local/lib/python3.10/site-packages/opal_server-0.5.0-py3.10.egg/opal_server/main.py", line 8, in <module>
opal-server |     app = create_app()
opal-server |   File "/usr/local/lib/python3.10/site-packages/opal_server-0.5.0-py3.10.egg/opal_server/main.py", line 2, in create_app
opal-server |     from .server import OpalServer
opal-server |   File "/usr/local/lib/python3.10/site-packages/opal_server-0.5.0-py3.10.egg/opal_server/server.py", line 26, in <module>
opal-server |     from opal_server.config import opal_server_config
opal-server |   File "/usr/local/lib/python3.10/site-packages/opal_server-0.5.0-py3.10.egg/opal_server/config.py", line 288, in <module>
opal-server |     opal_server_config = OpalServerConfig(prefix="OPAL_")
opal-server |   File "/usr/local/lib/python3.10/site-packages/opal_common-0.5.0-py3.10.egg/opal_common/confi/confi.py", line 128, in __init__
opal-server |     value = self._eval_and_save_entry(name, entry)
opal-server |   File "/usr/local/lib/python3.10/site-packages/opal_common-0.5.0-py3.10.egg/opal_common/confi/confi.py", line 162, in _eval_and_save_entry
opal-server |     value = self._eval_entry(entry)
opal-server |   File "/usr/local/lib/python3.10/site-packages/opal_common-0.5.0-py3.10.egg/opal_common/confi/confi.py", line 168, in _eval_entry
opal-server |     res = self._evaluate(whole_key, entry.default, entry.cast, **entry.kwargs)
opal-server |   File "/usr/local/lib/python3.10/site-packages/opal_common-0.5.0-py3.10.egg/opal_common/confi/confi.py", line 205, in _evaluate
opal-server |     res = config(key, default=passed_default, cast=safe_cast_func, **kwargs)
opal-server |   File "/usr/local/lib/python3.10/site-packages/decouple.py", line 245, in __call__
opal-server |     return self.config(*args, **kwargs)
opal-server |   File "/usr/local/lib/python3.10/site-packages/decouple.py", line 107, in __call__
opal-server |     return self.get(*args, **kwargs)
opal-server |   File "/usr/local/lib/python3.10/site-packages/decouple.py", line 101, in get
opal-server |     return cast(value)
opal-server |   File "/usr/local/lib/python3.10/site-packages/opal_common-0.5.0-py3.10.egg/opal_common/confi/confi.py", line 74, in wrapped_cast
opal-server |     return cast_func(value, *args, **kwargs)
opal-server |   File "/usr/local/lib/python3.10/site-packages/opal_common-0.5.0-py3.10.egg/opal_common/confi/confi.py", line 55, in cast_pydantic_by_model
opal-server |     return model.parse_raw(value)
opal-server |   File "pydantic/main.py", line 549, in pydantic.main.BaseModel.parse_raw
opal-server | pydantic.error_wrappers.ValidationError: 1 validation error for ServerDataSourceConfig
opal-server | __root__
opal-server |   Expecting property name enclosed in double quotes: line 1 column 2 (char 1) (type=value_error.jsondecode; msg=Expecting property name enclosed in double quotes; doc={\"entries\":[]}; pos=1; lineno=1; colno=2)
opal-server | [2023-02-28 17:50:51 +0000] [7] [INFO] Worker exiting (pid: 7)
BTW: it was working with the typo "enteries".
o
Hey @Jack Geek, I see a few problems here. The first one is that the entries key needs to be nested under the config key; you can see the model definition here. A valid JSON would be:
{
  "config": {
    "entries": []
  }
}
The second one is that it looks like it fails to decode the JSON (possibly because of the \" escaping). Try removing the escaping; if this doesn’t work, please provide us your docker run command (without any private data) so we can check it.
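For reference, both problems can be reproduced with Python's standard library alone. This is a minimal sketch, assuming the server receives the value with the backslashes still in it (as the doc={\"entries\":[]} field in the traceback suggests); check_sources is a hypothetical helper and not part of OPAL, and its "config" check is a simplification of the real model.

```python
import json


def check_sources(raw: str) -> str:
    """Classify a raw OPAL_DATA_CONFIG_SOURCES value (hypothetical helper)."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        return f"invalid JSON: {exc}"
    if not isinstance(parsed, dict) or "config" not in parsed:
        return "valid JSON, but 'entries' is not nested under 'config'"
    return "ok"


# Backslashes reached the parser: the character after '{' is a backslash,
# not the double quote that JSON requires before a property name.
print(check_sources(r'{\"entries\":[]}'))

# Unescaped, but still missing the 'config' wrapper.
print(check_sources('{"entries":[]}'))

# The shape suggested above.
print(check_sources('{"config":{"entries":[]}}'))
```

The first call fails with the same "Expecting property name enclosed in double quotes" message that appears in the traceback, which is consistent with the escaping being the cause.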
j
Hi @Omer Zuarets, is that "config" format due to a new version of OPAL? It used to work last week, and now the latest version won't install for me.
Here is my working Docker Compose file from last week
It is working now with: OPAL_DATA_CONFIG_SOURCES={"config":{"entries":[]}} But in the K8s deployment, I still need to test whether I have to escape it or not.
o
Sorry for the delay. It hasn’t changed in the past few months, so I’m not sure how it worked for you last week, but I’m glad it does now 🙂
j
Hello, indeed, the server is not working anymore on K8s, even with the updated OPAL_DATA_CONFIG_SOURCES. I will post some logs here.
@Or Weis @Asaf Cohen @Omer Zuarets
a
Hi @Jack Geek please do and also provide the config vars that are not working. Is this only in 0.5.0?
Please provide us with as much information (configs and logs) as possible. I will make sure someone investigates this. cc @Oded Bd
j
I'm not sure if it's an env variable problem. Yes, I use the latest tag, so I assume it's due to version 0.5.0.
a
Can you please check if the same configuration works on 0.4.0 (the previous release)? There was a confi change and now I have a very good suspicion.
j
I confirm that it's working with the version 0.4.0
a
Amazing, I have a lead! Can you provide your OPAL_DATA_CONFIG_SOURCES again that is failing? not sure if you are using
OPAL_DATA_CONFIG_SOURCES={"config":{"entries":[]}}
or something else
once you do i'll try to repro, if it's easy i might be able to push a patch version quickly
j
It's a simple k8s deployment file, working with the image tag 0.4.0 and not with the tag "latest"
a
Hi @Jack Geek, it's definitely a change in the latest release, but I am not sure it's a regression per se. It looks like the config reader (confi) is now stricter when parsing values, and you might be passing an invalid JSON. In the past it used to just ignore the bad value, which is bad because you don't know you passed a bad value. I can revert this if needed, but I am trying to see if there's another way to encode the JSON string value within the YAML. btw this works fine in docker compose:
version: "3.8"
services:
  # When scaling the opal-server to multiple nodes and/or multiple workers, we use
  # a *broadcast* channel to sync between all the instances of opal-server.
  # Under the hood, this channel is implemented by encode/broadcaster (see link below).
  # At the moment, the broadcast channel can be either: postgresdb, redis or kafka.
  # The format of the broadcaster URI string (the one we pass to opal server as `OPAL_BROADCAST_URI`) is specified here:
  # <https://github.com/encode/broadcaster#available-backends>
  broadcast_channel:
    image: postgres:alpine
    environment:
      - POSTGRES_DB=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
  opal_server:
    # by default we run opal-server from latest official image
    image: permitio/opal-server:0.5.0
    environment:
      # the broadcast backbone uri used by opal server workers (see comments above for: broadcast_channel)
      - OPAL_BROADCAST_URI=postgres://postgres:postgres@broadcast_channel:5432/postgres
      # number of uvicorn workers to run inside the opal-server container
      - UVICORN_NUM_WORKERS=4
      # the git repo hosting our policy
      # - if this repo is not public, you can pass an ssh key via `OPAL_POLICY_REPO_SSH_KEY`)
      # - the repo we pass in this example is *public* and acts as an example repo with dummy rego policy
      # - for more info, see: <https://docs.opal.ac/tutorials/track_a_git_repo>
      - OPAL_POLICY_REPO_URL=https://github.com/permitio/opal-example-policy-repo
      # in this example we will use a polling interval of 30 seconds to check for new policy updates (git commits affecting the rego policy).
      # however, it is better to utilize a git *webhook* to trigger the server to check for changes only when the repo has new commits.
      # for more info see: <https://docs.opal.ac/tutorials/track_a_git_repo>
      - OPAL_POLICY_REPO_POLLING_INTERVAL=30
      # configures from where the opal client should initially fetch data (when it first goes up, after disconnection, etc).
      # the data sources represents from where the opal clients should get a "complete picture" of the data they need.
      # after the initial sources are fetched, the client will subscribe only to update notifications sent by the server.
      - OPAL_DATA_CONFIG_SOURCES={"config":{"entries":[]}}
      - OPAL_LOG_FORMAT_INCLUDE_PID=true
    ports:
      # exposes opal server on the host machine, you can access the server at: <http://localhost:7002>
      - "7002:7002"
    depends_on:
      - broadcast_channel
  opal_client:
    # by default we run opal-client from 0.5.0 official image
    image: permitio/opal-client:0.5.0
    environment:
      - OPAL_SERVER_URL=http://opal_server:7002
      - OPAL_LOG_FORMAT_INCLUDE_PID=true
      - OPAL_INLINE_OPA_LOG_FORMAT=http
    ports:
      # exposes opal client on the host machine, you can access the client at: <http://localhost:7000>
      - "7766:7000"
      # exposes the OPA agent (being run by OPAL) on the host machine
      # you can access the OPA api that you know and love at: <http://localhost:8181>
      # OPA api docs are at: <https://www.openpolicyagent.org/docs/latest/rest-api/>
      - "8181:8181"
    depends_on:
      - opal_server
    # this command is not necessary when deploying OPAL for real, it is simply a trick for dev environments
    # to make sure that opal-server is already up before starting the client.
    command: sh -c "./wait-for.sh opal_server:7002 --timeout=20 -- ./start.sh"
while this produces an error:
version: "3.8"
services:
  # When scaling the opal-server to multiple nodes and/or multiple workers, we use
  # a *broadcast* channel to sync between all the instances of opal-server.
  # Under the hood, this channel is implemented by encode/broadcaster (see link below).
  # At the moment, the broadcast channel can be either: postgresdb, redis or kafka.
  # The format of the broadcaster URI string (the one we pass to opal server as `OPAL_BROADCAST_URI`) is specified here:
  # <https://github.com/encode/broadcaster#available-backends>
  broadcast_channel:
    image: postgres:alpine
    environment:
      - POSTGRES_DB=postgres
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
  opal_server:
    # by default we run opal-server from latest official image
    image: permitio/opal-server:0.5.0
    environment:
      # the broadcast backbone uri used by opal server workers (see comments above for: broadcast_channel)
      - OPAL_BROADCAST_URI=postgres://postgres:postgres@broadcast_channel:5432/postgres
      # number of uvicorn workers to run inside the opal-server container
      - UVICORN_NUM_WORKERS=4
      # the git repo hosting our policy
      # - if this repo is not public, you can pass an ssh key via `OPAL_POLICY_REPO_SSH_KEY`)
      # - the repo we pass in this example is *public* and acts as an example repo with dummy rego policy
      # - for more info, see: <https://docs.opal.ac/tutorials/track_a_git_repo>
      - OPAL_POLICY_REPO_URL=https://github.com/permitio/opal-example-policy-repo
      # in this example we will use a polling interval of 30 seconds to check for new policy updates (git commits affecting the rego policy).
      # however, it is better to utilize a git *webhook* to trigger the server to check for changes only when the repo has new commits.
      # for more info see: <https://docs.opal.ac/tutorials/track_a_git_repo>
      - OPAL_POLICY_REPO_POLLING_INTERVAL=30
      # configures from where the opal client should initially fetch data (when it first goes up, after disconnection, etc).
      # the data sources represents from where the opal clients should get a "complete picture" of the data they need.
      # after the initial sources are fetched, the client will subscribe only to update notifications sent by the server.
      - OPAL_DATA_CONFIG_SOURCES="{\"config\":{\"entries\":[]}}"
      - OPAL_LOG_FORMAT_INCLUDE_PID=true
    ports:
      # exposes opal server on the host machine, you can access the server at: <http://localhost:7002>
      - "7002:7002"
    depends_on:
      - broadcast_channel
  opal_client:
    # by default we run opal-client from 0.5.0 official image
    image: permitio/opal-client:0.5.0
    environment:
      - OPAL_SERVER_URL=http://opal_server:7002
      - OPAL_LOG_FORMAT_INCLUDE_PID=true
      - OPAL_INLINE_OPA_LOG_FORMAT=http
    ports:
      # exposes opal client on the host machine, you can access the client at: <http://localhost:7000>
      - "7766:7000"
      # exposes the OPA agent (being run by OPAL) on the host machine
      # you can access the OPA api that you know and love at: <http://localhost:8181>
      # OPA api docs are at: <https://www.openpolicyagent.org/docs/latest/rest-api/>
      - "8181:8181"
    depends_on:
      - opal_server
    # this command is not necessary when deploying OPAL for real, it is simply a trick for dev environments
    # to make sure that opal-server is already up before starting the client.
    command: sh -c "./wait-for.sh opal_server:7002 --timeout=20 -- ./start.sh"
@Jack Geek are you using helm by any chance?
helm allows you to do something like this:
- name: SPRING_APPLICATION_JSON
  value: {{ .Values.service.spring_application_json | toJson | quote }}
to encode a json
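For plain manifests without helm, roughly the same encoding can be produced by hand. This is a minimal sketch, relying on the fact that a JSON string literal is generally also a valid YAML double-quoted scalar; the sources dict mirrors the empty value used in this thread, and yaml_quoted is a hypothetical helper name.

```python
import json

sources = {"config": {"entries": []}}

# Compact JSON text: this is what the server should end up parsing.
json_text = json.dumps(sources, separators=(",", ":"))


def yaml_quoted(text: str) -> str:
    """Quote text as a JSON string literal, usable as a YAML double-quoted scalar."""
    return json.dumps(text)


# Print an env entry in the shape of a Kubernetes container spec.
print("- name: OPAL_DATA_CONFIG_SOURCES")
print(f"  value: {yaml_quoted(json_text)}")

# Decoding the quoted scalar gives back the original JSON text.
assert json.loads(yaml_quoted(json_text)) == json_text
```

In a name/value mapping like this, the YAML parser removes the quoting layer, so the container should see plain JSON. In the list form used in the failing compose file (KEY="..."), the quotes and backslashes sit inside a plain YAML scalar and are likely passed through literally, which would explain why the escaped variant fails there.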
j
Hi Asaf, I don't use helm for the moment, but I can take a look. Yes, it works fine in Docker Compose without encoding the JSON, but in the k8s env it must be a string.
But I think there must be a solution for regular deployments (not everyone will use helm).
a
definitely, i am trying to find the correct syntax for you
maybe this will work?
- name: FOO
   value: |
   {"foo":"bar"}
j
Implicit map keys need to be followed by map values
Maybe it's just the yaml linter
a
maybe...
i think i'll just try to make the parser less sensitive
it's a bit late in my timezone, i think i'm going to try and come up with a hot fix first thing in the morning
thanks for helping me reproduce the issue!
j
great, thanks Asaf !
a
sure thing 🙂
Hi @Jack Geek, I tried investigating and fixing this bug, but I am missing some information. In essence:
• This PR is doing a good thing: it stops OPAL from starting if the user provides an invalid value to a config var.
• I am not sure what exactly you are passing that is incorrect.
• When I look at your error, it's not the same exception I am getting from passing an incorrect value:
  ◦ You are getting: invalid literal for int() with base 10: 'tcp://10.116.5.135:80'
  ◦ Where is tcp://10.116.5.135:80 passed? Is it part of OPAL_DATA_CONFIG_SOURCES?
  ◦ It looks like this value is passed instead of something that is expected to be an int.
  ◦ Can you extract all the environment variables you are passing to OPAL and check where this value is located?
If you give me the exact config var that is problematic and its value (if it's sensitive, redact the sensitive parts), I will be able to reproduce this on my end.
Sorry I couldn't fix this faster. In the meantime, please use 0.4.0 until we can narrow it down.
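One way to do that search from inside the container is a few lines of Python. This is a minimal sketch: find_env_vars is a hypothetical helper, and the sample environment below is made up for illustration (OPAL_EXAMPLE_SETTING is not a real OPAL option).

```python
import os
from typing import Dict, Mapping, Optional


def find_env_vars(needle: str,
                  environ: Optional[Mapping[str, str]] = None,
                  prefix: str = "OPAL_") -> Dict[str, str]:
    """Return the prefixed variables whose value contains `needle`."""
    env = os.environ if environ is None else environ
    return {
        name: value
        for name, value in env.items()
        if name.startswith(prefix) and needle in value
    }


# Made-up environment for illustration only.
sample = {
    "OPAL_DATA_CONFIG_SOURCES": '{"config":{"entries":[]}}',
    "OPAL_EXAMPLE_SETTING": "tcp://10.116.5.135:80",  # hypothetical variable name
    "PATH": "/usr/bin",
}
print(find_env_vars("tcp://", sample))
```

Inside the pod, calling find_env_vars("tcp://") without the sample argument scans the real process environment.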
j
Hi @Asaf Cohen, I'm not passing it; it's the IP address of the deployed OPAL server (so I assume this is an internal problem). For OPAL_DATA_CONFIG_SOURCES, it is always the empty value: "{\"config\":{\"entries\":[]}}" The same deployment file works on 0.4.0 but not on 0.5.0.
a
That's weird, i passed this exact same value, and everything works in docker, but granted not in kubernetes. I do get a different error though, i cannot reproduce the invalid literal for int() error 😞
We will keep investigating this, thanks for the clarification regarding the ip address. cc @Ro'e Katz
o
@Jack Geek do you get the same error both in K8s and in docker-compose?
j
@Or Weis No it's working fine on Docker Compose (even the Gitlab webhooks because I use 0.5.0)
o
Sounds like there’s something odd with your K8s setup. I’d highly suggest you investigate that. But what we’ll do in the meantime is add an option to disable the strict configuration parsing that was added in 0.5.0, so it would work for you like it did with 0.4.0. Can you share with me the full exception traceback you’re getting, so I can be sure I’m covering your case?
j
It's Kubernetes on Google (GKE). I just pass the deployment yml file, that's it. I attached my deployment.yml (working with 0.4.0) and the error that I get (image attached, plus the text in a file).
o
I’ve created this PR to allow non-strict config parsing via OPAL_IS_STRICT_CONFIG=False: https://github.com/permitio/opal/pull/399 @Asaf Cohen for your review
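The difference between strict and non-strict parsing can be pictured roughly as follows. This only illustrates the behaviour described in this thread and is not OPAL's actual code; evaluate, its parameters, and the default of 7002 are hypothetical placeholders.

```python
from typing import Any, Callable


def evaluate(raw: str, cast: Callable[[str], Any], default: Any,
             strict: bool = True) -> Any:
    """Cast a raw config value; on failure either raise (strict) or use the default."""
    try:
        return cast(raw)
    except (ValueError, TypeError):
        if strict:
            raise  # strict: refuse to start on a bad value
        return default  # non-strict: silently fall back to the default


# Non-strict mode hides the bad value and returns the placeholder default.
print(evaluate("tcp://10.116.5.135:80", int, default=7002, strict=False))

# Strict mode surfaces the problem instead.
try:
    evaluate("tcp://10.116.5.135:80", int, default=7002, strict=True)
except ValueError as exc:
    print(f"startup aborted: {exc}")
```

The trade-off is the one discussed above: non-strict parsing keeps old deployments running, but a mistyped value goes unnoticed.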
r
Hi @Jack Geek, I think I understand what’s going on in your deployment: Kubernetes automatically sets environment variables inside pods (so workloads have the addresses of each other). Your OPAL pod probably has the env var OPAL_SERVER_PORT='tcp://10.116.5.135:80', which is unfortunately one of OPAL’s configuration options and is expected to be an integer (as mentioned, since 0.5.0 we don’t fall back to default values on parsing errors). An easy fix would be setting OPAL_SERVER_PORT='7002' explicitly, or renaming your deployment/service to another name 🙂 Disabling strict parsing as @Or Weis suggested would also work of course, but we recommend trying those options first. (We’re going to discuss the best way to avoid this issue in the future, but I’m letting you know first so you can move forward.)
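To make the collision concrete, here is a minimal sketch of how Kubernetes derives these variable names from a Service. It assumes the Service is named opal-server (inferred from the variable name, not confirmed in this thread); service_link_vars is a hypothetical helper and lists only a subset of the variables Kubernetes injects.

```python
def service_link_vars(service_name: str, cluster_ip: str, port: int) -> dict:
    """Approximate a subset of the variables Kubernetes injects for a Service."""
    # The service name is upper-cased and dashes become underscores.
    prefix = service_name.upper().replace("-", "_")
    return {
        f"{prefix}_SERVICE_HOST": cluster_ip,
        f"{prefix}_SERVICE_PORT": str(port),
        f"{prefix}_PORT": f"tcp://{cluster_ip}:{port}",
    }


env = service_link_vars("opal-server", "10.116.5.135", 80)
print(env["OPAL_SERVER_PORT"])

# The injected value is a URL, so casting it to an integer fails.
try:
    int(env["OPAL_SERVER_PORT"])
except ValueError as exc:
    print(exc)
```

Because the generated name happens to match the OPAL_ prefix plus SERVER_PORT, the injected URL is read as OPAL's port setting, which is why renaming the Service or setting the variable explicitly avoids the clash.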
j
@Ro'e Katz thank you, that indeed fixes the problem of installing version 0.5.0. @Or Weis But I still get the problem cloning the repo with the HTTPS repo URL.
I use it so that I can match the git_http_url coming from the GitLab webhook request.
o
Hi @Jack Geek - It seems that OPAL-SERVER thinks your REPO_URL is just “https://gitlab.com/”. Could this be another ENV_VAR override issue?
j
It's me, I deleted the full URL text
o
Oh, lol. Can you try an SSH address instead of HTTPS?
The webhook issue should now be resolved, so you can use SSH for the REPO_URL and HTTP for the webhook.
j
@Or Weis, it's working now with webhooks !
o
Yeyy!
Quite a journey 😄
j
Indeed 😄 thank you all for your support ! @Or Weis @Asaf Cohen @Ro'e Katz
I will update you in the upcoming weeks