# ask-community-for-troubleshooting
p
I followed https://docs.airbyte.io/deploying-airbyte/on-gcp-compute-engine and it looks like airbyte is not able to launch the stripe source image
u
can you share the full logs @Patric? you can get into the GCP instance, right?
p
yeah I also checked that my user is in the docker group
the first log is direct output from the connection logs, the second is the server logs from the admin
@[DEPRECATED] Marcos Marx should I try anything else or do you need anything else from me?
u
hmm
this should work, i started an airbyte instance and it worked fine with the http source
p
what source and destination did you use for testing?
u
http/local-json
p
I can run the http source but not the stripe source
I can generate you a stripe token to test so you don't have to create an account
u
i have one šŸ˜„
p
ok let me know if it works for you, since it obviously isn't a permission issue at this point
u
it's not working for me either 😤
p
good to know that the error doesn't sit in front of the pc this time šŸ˜„
u
or maybe it's sitting in front of two pcs
šŸ˜‚
p
true
hm this is also not working on stripe source 0.1.8 and 0.1.5
u
already asked the engineering team to take a look
p
thank you!
u
thanks for bringing this up, is this a big blocker for you using airbyte? i'll keep you updated about it šŸ˜„
p
well sales is breathing down my neck because they need the stripe data in bigquery
kind of urgent for us but I also realize that this is a free service so whatever works for you guys works for me
u
yep i'll ask @s here if it's possible to take a look at this.
s
@charles i’m held up in something atm — is it possible this is related to the changes you made around checkpointing? My only hints are:
1. running the stripe docker connector locally seems to be working for me. I ran
```
docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/sample_files:/sample_files airbyte/source-stripe:0.1.9 read --config /secrets/config.json --catalog /sample_files/configured_catalog.json
```
from the `source-stripe` directory
2. the log stops where it says something about forking and joining, which is what you worked on
c
i will take a look now
am i crazy in looking at this and saying that it looks like neither the source nor destination ever start?
for example the destination when it starts should log this line:
```java
LOGGER.info("Running integration: {}", integration.getClass().getName());
```
it's the first line in there.
i'm not seeing this in the provided logs.
u
oh god
c
same thing in the source:
```python
logger.info(f"Starting syncing {self.name}")
```
should be the first thing to be logged and i'm not seeing that.
u
@charles
i left my stripe connection on... and after 30 min the process started
(screenshot attached)
the sync transferred data, lol.
c
i guess the lines i mentioned aren't getting logged in that success case either?
u
checking, i printed the part where you can see the time gap between thread start and completion
neither of the 2 log lines you mentioned was printed.
full log here
c
that seems bad.
u
(more context) i made an http source connector and it worked like a charm 😃
c
okay. so it seems like there's a separate issue right now where we aren't logging anything from the sources or destinations.
this is probably obscuring the real issue here.
blargh. i'm going to try to figure out why this is happening.
logs from version 0.23.0
includes the print statements
just going to search through versions until i find the one where we lost them
still logs in 0.24.0
not in 0.24.2
not in 0.24.1, so this seems like the first bad version
which is the checkpointing commit, so that seems likely to be causing the logging problem
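The version hunt above is a linear first-bad-version search. A minimal sketch of the same idea in Python — the predicate here is hypothetical; the real check was running the connector on each Airbyte version and looking for the expected log line:

```python
def first_bad(versions, is_good):
    """Linear scan for the first version failing the check, mirroring
    the manual version hunt described above."""
    for version in versions:
        if not is_good(version):
            return version
    return None

# Hypothetical stand-in predicate: versions whose logs still appeared.
versions_with_logs = {"0.23.0", "0.24.0"}
checked = ["0.23.0", "0.24.0", "0.24.1", "0.24.2"]
print(first_bad(checked, lambda v: v in versions_with_logs))  # -> 0.24.1
```

With many versions, the same predicate could drive a binary search (`git bisect`-style) instead of a scan.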
huh. i think it might be how we are pulling the logs in the UI/API that is wrong. looking at what is logged in the docker-compose process everything is logged properly
(actually there is a bug where we are in fact logging too much)
s
we’ve recently gotten reports about logging every record or something like that
c
yeah. we are.
s
LGTM 🚢
p
Man what have I started šŸ˜„ ok so to report back: it actually did start, but it took an hour before it started logging again. that seems to have opened a new can of worms though
c
If you can jump up to the new patch version and try again that will solve the logging problem. And MAY solve the speed problem. We are still testing that second part.
@[DEPRECATED] Marcos Marx did you have any luck seeing if stripe behaves better after the logging change?
Also thanks for being patient with us Patric. We appreciate you helping us sort out this bug.
u
ran a fresh airbyte (docker-compose down -v) with the patch but the "waiting for source thread to join" problem continues
looking at the logs from source/dest, the data transfer is happening... but in the UI important logs aren't displayed
p
@[DEPRECATED] Marcos Marx is the sync finishing on your side?
u
yes, it finished after 30 min, ~12k records
this can give you an idea of how much time your first sync will take
p
hm mine is erroring out despite transferring data and then restarting again, let me try a fresh install
u
now there's another problem šŸ˜… a conflict between the `catalog.json` types and normalization
c
Is waiting for the source a bug @[DEPRECATED] Marcos Marx? Does it not just take 30 min to download the data?
u
yes!
c
If there is nothing being logged it is going to look like it is hanging. But if it succeeds then you are good.
Okay. So can you list the outstanding issues? I'm getting lost.
p
there seems to be a mapping issue when using basic normalization from stripe to bigquery
u
1. the log in the UI hangs until the data transfer completely finishes
2. normalization is breaking
i'm working with sherif to patch the connector
c
What does it mean the UI hangs?
Is it just there are no logs until the end? Or is something not loading properly?
p
the connection log logs
```
2021-05-28 12:51:10 INFO (/tmp/workspace/1/0) DefaultReplicationWorker(run):132 - Waiting for source thread to join.
```
and then does nothing for an hour, then it logs again normally I assume
since there is no visual indicator that something is still happening I always assumed it's broken after 2 min of waiting
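The silence between "Waiting for source thread to join." and the next real log line is what reads as a hang. A hedged sketch of the kind of heartbeat that would fix it — this is illustrative Python, not Airbyte's actual (Java) replication worker:

```python
import threading
import time

def join_with_heartbeat(worker, interval=1.0, log=print):
    """Join a worker thread, emitting a periodic heartbeat while
    waiting, so a long-running sync doesn't look frozen."""
    while worker.is_alive():
        worker.join(timeout=interval)  # wake up periodically instead of blocking forever
        if worker.is_alive():
            log("still waiting for source thread to join...")

# Demo: a worker that "syncs" for 0.3s while we heartbeat every 0.1s.
worker = threading.Thread(target=time.sleep, args=(0.3,))
worker.start()
join_with_heartbeat(worker, interval=0.1)
```

The same idea applies to any UI that tails these logs: any periodic line at all is enough of a visual indicator that the sync is still alive.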
c
Ok. Makes sense. That's something I can work on. But ultimately shouldn't be blocking.
Don't want to diminish it, but just want to make sure that we triage and fix blocking issues first.
@[DEPRECATED] Marcos Marx is the normalization issue stripe specific? Or is normalization broken for everything right now?
p
no i understand, as long as I get the data this has no priority for me
c
Sweet. Thanks! Appreciate the patience.
i'm working with sherif to patch the connector
@[DEPRECATED] Marcos Marx and what fix needs to be made to the connector?
u
we are trying to cast a float field to int in normalization
c
For stripe only or for everything?
u
for a specific stream of the stripe connector https://github.com/airbytehq/airbyte/pull/3728
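The failure mode is a float-typed value landing in a field the catalog declares as integer. A minimal Python illustration of the coercion the fix performs — names here are hypothetical, and this is not Airbyte's actual dbt-based normalization code:

```python
def coerce_to_schema(record, schema):
    """Cast float values to int where a (hypothetical) schema says
    "integer" — a sketch of the cast the normalization fix adds."""
    out = {}
    for key, value in record.items():
        if schema.get(key) == "integer" and isinstance(value, float):
            out[key] = int(value)  # truncating cast; real code may round instead
        else:
            out[key] = value
    return out

print(coerce_to_schema({"amount": 10.0, "id": "in_123"},
                       {"amount": "integer", "id": "string"}))
# -> {'amount': 10, 'id': 'in_123'}
```

Without the cast, a strict integer column in the destination (e.g. BigQuery) rejects the float and the normalization step breaks.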
s
@Patric fix has been published - can you upgrade to 0.1.10 and sync only the invoices and subscriptions stream and see if that works?
upgrade from the Admin screen in the UI
p
sure one sec
looks like that worked immediately
s
brilliant
you should be good to go then — enable the rest of the streams and move some data!
p
trying actually, but it seems like it doesn't allow me to save if I enable the rest of the streams
s
šŸ¤”
p
it warns me that I'll lose my data but the save button is greyed out
thanks for helping me get this sorted out guys I appreciate it!
u
ping us if anything comes up
p
So my stripe source has been "stuck" for 8h now, what's happening here in the background that takes 8h?
u
Was the last log displayed the "waiting for source thread to join" one?
p
yes
also, it looks like you use different timezones for the logs and the UI
I'm syncing 3 years of real data here but it still baffles me that this can take so long
u
update: the user cancelled the job because it was using an old version of the stripe connector. @Patric when you have an update about the sync please lmk
p
It took 12 hours for something to happen, and then it crashed because of a 404 error
it is still marked as running
u
@Patric this error happens when submitting an upload to BigQuery
p
yes, maybe the dataset is too big? I reduced the dataset from stripe and removed events and stuff, and after that every sync took around an hour, even syncs that cover 3 years of data
those settings seem to be usable at this point
of course it always seems to depend on the size of the data. maybe it would be good if you guys implemented some kind of stream-based sync; at the moment it seems you pull everything, then convert everything, and then upload everything
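The difference between the two approaches above can be sketched in a few lines. This is only an illustration of the suggestion, assuming hypothetical `fetch`/`upload` callables, not Airbyte's actual sync internals:

```python
def batch_sync(fetch, upload):
    """Pull everything, then upload everything: one upload whose size
    grows with the whole sync (the behaviour described above)."""
    upload(list(fetch()))

def stream_sync(fetch, upload, batch_size=1000):
    """Upload fixed-size batches as records arrive, so no single
    upload job grows with the total sync size."""
    batch = []
    for record in fetch():
        batch.append(record)
        if len(batch) >= batch_size:
            upload(batch)
            batch = []
    if batch:  # flush the final partial batch
        upload(batch)

# Demo: 25 records streamed in batches of 10.
uploads = []
stream_sync(lambda: iter(range(25)), uploads.append, batch_size=10)
print([len(b) for b in uploads])  # -> [10, 10, 5]
```

The streaming variant also bounds memory and lets the destination start loading before the source finishes extracting.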
u
I think the problem happens when we're trying to upload data to bigquery and it's a larger job. I opened this issue last week and asked the connector team to take a look at it. @Patric is not syncing those schemas a big problem for you now?
p
@[DEPRECATED] Marcos Marx no I don't think so I'll have to wait for feedback from our sales analytics but I think the data I got now is fine
u
Ok Patric, awaiting your update šŸ˜„