# orm-help
t
Can one of the devs also take a look at this? https://github.com/prismagraphql/prisma/issues/2574 It's a very strange problem that totally breaks continuous delivery 😞
d
Looking at it.
👍 1
t
I think issue 2623 is about the same problem
d
I assume that this issue is talking about the passive connector introspection. Do you use an existing database with data?
t
No
I use a multi-tenancy DB
And deploying manually using the cluster API works OK. The project is created and the migrations run. But the project's API doesn't work until I reload the server (recreate the container in Kubernetes)
d
What does your ingress look like on the Kubernetes cluster (if you use ingress)?
t
{
  "kind": "Ingress",
  "apiVersion": "extensions/v1beta1",
  "metadata": {
    "name": "ingress-prisma",
    "namespace": "default",
    "selfLink": "/apis/extensions/v1beta1/namespaces/default/ingresses/ingress-prisma",
    "uid": "e5705f08-6811-11e8-b424-4eaef66759da",
    "resourceVersion": "4573807",
    "generation": 3,
    "creationTimestamp": "2018-06-04T16:11:10Z",
    "annotations": {
      "certmanager.k8s.io/cluster-issuer": "letsencrypt-prod",
      "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"extensions/v1beta1\",\"kind\":\"Ingress\",\"metadata\":{\"annotations\":{\"certmanager.k8s.io/cluster-issuer\":\"letsencrypt-prod\",\"kubernetes.io/ingress.class\":\"nginx\",\"kubernetes.io/tls-acme\":\"true\"},\"name\":\"ingress-prisma\",\"namespace\":\"default\"},\"spec\":{\"rules\":[{\"host\":\"prisma.dev.dosvit.org.ua\",\"http\":{\"paths\":[{\"backend\":{\"serviceName\":\"prisma\",\"servicePort\":4466},\"path\":\"/\"}]}}],\"tls\":[{\"hosts\":[\"prisma.dev.dosvit.org.ua\"],\"secretName\":\"prisma-tls\"}]}}\n",
      "kubernetes.io/ingress.class": "nginx",
      "kubernetes.io/tls-acme": "true"
    }
  },
  "spec": {
    "tls": [
      {
        "hosts": [
          "prisma.dev.dosvit.org.ua"
        ],
        "secretName": "prisma-tls"
      }
    ],
    "rules": [
      {
        "host": "prisma.dev.dosvit.org.ua",
        "http": {
          "paths": [
            {
              "path": "/",
              "backend": {
                "serviceName": "prisma",
                "servicePort": 4466
              }
            }
          ]
        }
      }
    ]
  },
  "status": {
    "loadBalancer": {
      "ingress": [
        {}
      ]
    }
  }
}
d
That looks standard to me.
If you dump the kubernetes pod logs is there anything problematic showing up? Also, what do you mean by multi-tenant DB?
t
Logs... I recreated a pod recently, so there are no logs, but as far as I remember nothing showed up. I'll try to deploy a new app now and take a look
d
You can get logs from the previous container instance of a pod with the -p flag in kubectl
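A minimal sketch of that lookup, assuming the pod object itself still exists (`--previous` is the long form of `-p` and returns the logs of the last terminated container in the pod; it cannot recover logs from a pod that was deleted and recreated). The function name `previous_logs` and the pod name in the usage comment are hypothetical.

```shell
# Print the logs of the previously terminated container in a pod.
# $1 = pod name, $2 = namespace (defaults to "default").
previous_logs() {
  kubectl logs "$1" --namespace "${2:-default}" --previous
}

# Example call (pod name is a placeholder):
# previous_logs prisma-0
```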
Just to understand your setup: Do you use MySQL or PostgreSQL?
t
MySQL
and current Prisma is 1.9 (stable)
d
I’m fairly sure multi-tenancy is not working the way the docs imply and I’m not sure why this is documented the way it is. Not 100% sure if this is related to the issue at hand, but I will look at it a bit more.
t
Docs in general are a big problem for Prisma 🙂
d
Agreed. Your problem is indeed a strange one, I can’t directly think of why it should behave the way it does. However, if you can capture some logs from a pod that behaves this way and shoot them my way it could help. Additionally, I will deploy a Prisma pod on a k8s cluster and see what happens later.
t
I'm currently deploying a fresh app to get fresh logs
While it is building, I'll mention that, IMO, the design strategy (since Graphcool) targeted at "a single user who deploys projects via the CLI to AWS" was kinda... strange)). Now, as I see it, you are making Prisma Server more generic, but the old design still gets in the way) I'm trying to use it as part of a huge and complex system and I need fully custom control via CI/CD etc., and it sometimes just kicks me in the head))
d
Interesting, it would greatly help us if you just write up your thoughts somewhere (doesn’t have to be super coherent, random thoughts are good as well) so we can improve Prisma for your use cases.
t
Well, I suppose most of these things come down to a lack of docs for these cases (e.g. for the cluster API and other low-level things). To get things up and running I was reverse-engineering all your stuff and continuously asking everyone in chats, forums and GitHub for several weeks 🙂 Another critical thing is the migration strategy. I'm currently thinking through some ideas about this and will write them up on GitHub. Now, at Prisma 1.9, it looks like other things are way better
Ok, here we go
App deployed and migrated
Oh. logs are full, one moment
added log to gist
While "project already exists" is OK (when the container restarts it tries to run the add and migrate commands again), the second error (DeploymentInProgress) is strange... I run these tasks via curl in the container (application) startup script:
#!/usr/bin/env bash
# Restrict envsubst to these variables so any other "$" in the templates stays untouched.
REPLACE='$MUNICIPALITY_NAME:$APPLICATION_NAME:$APPLICATION_VERSION_SAFE:$PRISMA_STAGE'
# Create the project.
cat .dosvit/prisma/mysql.project.request.data | envsubst "$REPLACE" | curl \
          --request POST --url "$PRISMA_HOST:$PRISMA_PORT/cluster" \
          --header 'accept: application/json' --header 'content-type: application/json' \
          --data @-
# Run the migration.
cat .dosvit/prisma/mysql.migrate.request.data | envsubst "$REPLACE" | curl \
          --request POST --url "$PRISMA_HOST:$PRISMA_PORT/cluster" \
          --header 'accept: application/json' --header 'content-type: application/json' \
          --data @-
I don't get how an overlap can appear
And despite these exceptions, the database is in the correct state. But the app does not work until the server is restarted
d
Thanks, I will look at it tomorrow.
t
@dpetrick hi. did you have any ideas about this?)
d
I have nothing conclusive at the moment unfortunately, I have not been able to reproduce the issue.
t
This is very strange. I've already added sleeps in script, but it still errors:
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
{
  "data" : {
    "addProject" : null
  },
  "errors" : [ {
    "locations" : [ {
      "line" : 2,
      "column" : 5
    } ],
    "path" : [ "addProject" ],
    "code" : 4005,
    "message" : "Service with name 'stub-documents-0-0-2-beta-3' and stage 'staging' already exists",
    "requestId" : "local:management:cjiok967m001x0986hebna25p"
  } ]

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   680  100   351  100   329  10609   9944 --:--:-- --:--:-- --:--:-- 10968
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  3124  100   429  100  2695   3379  21233 --:--:-- --:--:-- --:--:-- 21388
}{
  "data" : {
    "deploy" : null
  },
  "errors" : [ {
    "locations" : [ {
      "line" : 2,
      "column" : 5
    } ],
    "path" : [ "deploy" ],
    "code" : 4008,
    "message" : "You can not deploy to a service stage while there is a deployment in progress or a pending deployment scheduled already. Please try again after the deployment finished.",
    "requestId" : "local:management:cjiok9a42001y0986ke0bkooc"
  } ]
That is with a 5-second delay between the operations
It looks like it does not clear the deployment queue
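Since curl exits successfully even when the response body carries a GraphQL error, a startup script could inspect the body itself. The sketch below is one possible way to tell the two error codes seen above apart (4005 "already exists", 4008 "deployment in progress"); the function name `classify_response` and the labels it prints are hypothetical, and the grep patterns assume the response layout shown in this thread.

```shell
# Classify a raw response body from the cluster/management API.
# Prints: exists (code 4005), busy (code 4008), failed (any other error), ok.
classify_response() {
  body="$1"
  if printf '%s' "$body" | grep -Eq '"code" *: *4005'; then
    echo "exists"
  elif printf '%s' "$body" | grep -Eq '"code" *: *4008'; then
    echo "busy"
  elif printf '%s' "$body" | grep -Eq '"errors" *: *\[ *\{'; then
    # A non-empty errors array with an unrecognized code.
    echo "failed"
  else
    echo "ok"
  fi
}
```

A script could then treat `exists` as harmless, retry on `busy`, and abort on `failed`.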
d
You can take a look at the Prisma internal database migration table. It lists queued changes to the schema and the migration status.
Is the prisma container you’re using the only prisma container on the underlying database?
t
Prisma databases are touched only by prisma. There are more dbs on this db-server for other services, but they don't intersect
d
Can you please run a migrationStatus query against the /management GraphQL endpoint to see what is going on with the stuck deployment?
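A rough sketch of such a request with curl, reusing the PRISMA_HOST and PRISMA_PORT variables from the startup script. The `name` and `stage` arguments and the selected fields are assumptions about the management schema, the function names are hypothetical, and a server configured with a management secret would additionally need an authorization header.

```shell
# Build the JSON body for a migrationStatus query.
# $1 = service name, $2 = stage. Argument and field names are assumptions.
migration_status_payload() {
  printf '{"query":"{ migrationStatus(name: \\"%s\\", stage: \\"%s\\") { status applied } }"}' "$1" "$2"
}

# POST the query to the management endpoint and print the raw response.
query_migration_status() {
  curl --silent --request POST \
    --url "$PRISMA_HOST:$PRISMA_PORT/management" \
    --header 'content-type: application/json' \
    --data "$(migration_status_payload "$1" "$2")"
}

# Example call (service name is a placeholder):
# query_migration_status "my-service" "staging"
```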
t
I'm looking at the database when this error occurs: all migration statuses are SUCCESS
d
Then it definitely sounds like a bug. Do you think it’s possible to create a reproduction for us? We do have a kubernetes cluster where we can test it out, we just need the resource definitions around it and maybe the context how you do/trigger everything.
t
as a variant, you can pull an image from here: registry-local.dev.dosvit.org.ua/news:1.0.3-beta.5 and run it with this env:
PRISMA_HOST: http://prisma
PRISMA_PORT: 4466
PRISMA_STAGE: staging
PRISMA_SECRET: 
PRISMA_PROJECT: stub-news-1-0-3-beta-5
PORT: 3000
APP_KEY: bf22b7ce-01ae-41f9-9cae-c5aa85f7d52b
S3_KEY: 
S3_SECRET: 
S3_HOST: 
S3_BUCKET: 
MUNICIPALITY_NAME: stub
APPLICATION_NAME: news
APPLICATION_VERSION: 1.0.3-beta.5
APPLICATION_VERSION_SAFE: 1-0-3-beta-5
container start script contains deploy and migration commands:
#!/usr/bin/env bash
# Restrict envsubst to these variables so any other "$" in the templates stays untouched.
REPLACE='$MUNICIPALITY_NAME:$APPLICATION_NAME:$APPLICATION_VERSION_SAFE:$PRISMA_STAGE'
# Create the project.
cat .dosvit/prisma/mysql.project.request.data | envsubst "$REPLACE" | curl \
          --request POST --url "$PRISMA_HOST:$PRISMA_PORT/cluster" \
          --header 'accept: application/json' --header 'content-type: application/json' \
          --data @-
# Run the migration.
cat .dosvit/prisma/mysql.migrate.request.data | envsubst "$REPLACE" | curl \
          --request POST --url "$PRISMA_HOST:$PRISMA_PORT/cluster" \
          --header 'accept: application/json' --header 'content-type: application/json' \
          --data @-
I also tried adding a sleep between them, but this did not help. It's also worth mentioning that this bug does not fire every time. Sometimes the container starts OK
d
Thank you, as soon as I have some time I will try to reproduce it.
👍 1
t
I'm monitoring these problems and saw that deployments are indeed getting stuck with the status "in progress"
d
Interesting, then it’s likely that your schema is actually blowing up internally, leaving everything in an inconsistent state. Can you (or did you already?) share the schema via PM?
t
FYI, it seems to be fixed by splitting deploy, migrate and start into separate containers. Previously they all ran in one script; now I've separated these tasks and run them in separate processes using Kubernetes init containers, which run in sequence before the main container starts. Several deploys like this were successful
Hello. Init containers made this error rarer, but it still appears (
The problem is still there even on 1.12. Migrations have status SUCCESS, but nothing works until the container is recreated
@dpetrick Initial migrate command returns this:
{
  "data" : {
    "deploy" : {
      "clientMutationId" : null,
      "migration" : {
        "projectId" : "test-otg-enterprises-registry-0-0-8@staging",
        "status" : "PENDING",
        "applied" : 0
      },
      "errors" : [ ]
    }
  }
It looks like if the application tries to access the project before the migration succeeds, the server caches an empty schema and does not rebuild it after the migration
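If that guess is right, one possible workaround (not a confirmed fix) would be to hold back the application start until the migration reports SUCCESS. A minimal polling sketch follows; `wait_until_success` is a hypothetical helper, and its first argument is assumed to be the name of any command or function that prints the current migration status response.

```shell
# Poll a status command until its output reports "status": "SUCCESS".
# $1 = command (no arguments) that prints the migration status response
# $2 = maximum number of attempts (default 30)
# $3 = delay between attempts in seconds (default 2)
# Returns 0 once SUCCESS is seen, 1 if the attempts run out.
wait_until_success() {
  attempts="${2:-30}"
  delay="${3:-2}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$1" | grep -q '"status" *: *"SUCCESS"'; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

The startup script would call this between the migrate request and the application start, and exit with an error if it returns 1.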
d
Thanks, that helps
t
@dpetrick this issue is still present in 1.19
d
I will read up on the issue again in a bit
t
Moreover, it is happening more and more often
This error occurs after adding a project and then migrating it, one after the other, with a 15-second interval