# help
s
This message was deleted.
f
Why are you calling JSON.parse and atob?
s
The response to a fetch isn't really JSON (it's a response object... slightly different) so there's an extra step required:
json_df4 = fetch(`${finance_repo}df4.json`, {
  headers: {
    authorization: `token ${Secret("GITHUB_ACCESS_TOKEN")}`
  }
}).then(res => res.json());
I have a feeling you can ditch `await`, too. I think Observable does some auto-async stuff for you.
f
@Sam Mead `(await fetch(...)).json()` is equivalent to `fetch(...).then(r => r.json())`.
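To make the equivalence concrete, here is a small sketch. `fakeFetch` is a hypothetical stand-in for `fetch` so the snippet runs without network access; it only mimics the `.json()` method of a real response.

```javascript
// Hypothetical stand-in for fetch(): resolves to an object that has a
// json() method, like a real Response does.
function fakeFetch(payload) {
  return Promise.resolve({
    json: () => Promise.resolve(payload)
  });
}

// Style 1: await the response, then call json() on it.
async function withAwait() {
  return (await fakeFetch({ ticker: "INCY" })).json();
}

// Style 2: chain .then() on the promise returned by the fetch call.
function withThen() {
  return fakeFetch({ ticker: "INCY" }).then(r => r.json());
}

// Both functions return a promise for the same parsed value.
const bothStyles = Promise.all([withAwait(), withThen()]);
bothStyles.then(([a, b]) => console.log(a, b));
```

Outside Observable, the `await` form has to live inside an `async` function (or a module with top-level await), which is why it is wrapped in `withAwait` here.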
s
Good point, Fabian. Does Observable allow `await` in that context (i.e. without declaring an `async` function)? I was wondering if this might affect promise handling
f
@Sam Mead Observable's Runtime resolves top-level promises, and any cell is wrapped in a function that receives the dependencies and is declared async or generator as needed. You can read more about it here: https://observablehq.com/@observablehq/introduction-to-promises?collection=@observablehq/javascript-in-observable
❤️ 1
s
Thanks for sharing! "Observable implicitly awaits promises" 😍
s
@Fabian Iwand If there is a better method to get a private JSON from a github repo, what would that be? It is a base64 encoded file, so I was told that was the way to do it. I'm working with datatables in python and converted to a list of dicts for use in JS.
f
@S Lee I'm still not clear about which part of the file is base64 encoded, so let's try to go through it step by step:
1. Are you already seeing an error for `json_df4`, or only for `table_df4`?
2. If `json_df4` is fine: Does its cell output show you an object with a `content` property?
3. If yes, does the value of that property look like a base64-encoded string?
4. If yes, what output do you get if you only call `atob` on the value (without `JSON.parse`)?
s
1. just for table_df4, as json_df4 is fetched fine. 2. Great observation. No, the content is empty! 3. probably need to fix 2 (4 also)
wonder why the fetch isn't getting the data?
f
you may want to just call .text() instead of .json() to see the output. Although json() should probably fail unless it's empty or a single digit/quoted string
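A rough sketch of that idea: read the body as text first, then attempt the parse separately so the raw text stays visible either way. `tryParse` is a hypothetical helper name, not part of any API. Note that in this sketch an empty body counts as a parse failure, because `JSON.parse("")` throws.

```javascript
// Try to parse a body that was read with res.text().
// Returns a result object instead of throwing, so the raw text can
// still be inspected when parsing fails.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text), text };
  } catch (err) {
    return { ok: false, error: err.message, text };
  }
}

console.log(tryParse('{"content": ""}').ok); // true
console.log(tryParse("").ok);                // false
```

It could be chained as `fetch(url, options).then(res => res.text()).then(tryParse)`.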
s
What is odd is that the token doesn't match. The metadata and addressing of the file is correct save for that
Ideally would like to be able just take the address of the file on github and use that
f
What is the hostname in your URL?
But this isn't working which is unexpected. Would be great to have some example notebook to be able to get this
Around how to work with github urls to files using authentication
f
oh, I see now. I think you might need to pass a proper accept header. let me check on my end
hm, nothing of that sort needed. you may want to verify the path again, and also inspect the response in the network tab?
s
What are the instructions regarding which URL to use? Copying the github raw file doesn't work obviously because I'm logged in and this is a private repo, so that entire part isn't documented
the url to the file while logged in is:
https://github.com/nyc-tinker/finance/blob/master/df4.json
f
This should work:
fetch(
  "https://api.github.com/repos/USER/REPO/contents/PATH/TO/FILE.json",
  { headers: { Authorization: `Bearer ${Secret("MY_TOKEN")}` } }
)
  .then(r => r.json())         // parse the API's JSON envelope
  .then(d => atob(d.content))  // decode the base64 "content" field
  .then(d => JSON.parse(d))    // parse the decoded file text
(I prefer doing it this way as it lets you comment out individual operations easily)
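For reference, the decode step can also be sketched as a standalone function. `decodeContent` and the sample object are hypothetical; the sketch assumes the response carries `encoding: "base64"` and a non-empty `content` string, and it re-reads the decoded bytes as UTF-8 because `atob` on its own returns one character per byte.

```javascript
// Decode the base64 "content" field of a contents response and parse it.
function decodeContent(d) {
  if (d.encoding !== "base64" || !d.content) {
    throw new Error(`Nothing to decode (encoding: ${d.encoding})`);
  }
  // Remove the line breaks that appear inside the base64 text.
  const b64 = d.content.replace(/\s/g, "");
  // atob() yields one character per byte; collect those bytes...
  const bytes = Uint8Array.from(atob(b64), c => c.charCodeAt(0));
  // ...and decode them as UTF-8 so non-ASCII characters survive.
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Hypothetical sample shaped like the API response.
const sample = {
  encoding: "base64",
  content: btoa('[{"ticker": "INCY", "price": 73.14}]') + "\n"
};
console.log(decodeContent(sample));
```

Throwing on an empty `content` makes the failure explicit instead of surfacing later as an "end of JSON" error.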
s
Gives me an end of json error, presumably because it's not successfully getting the file from the url via the authentication scheme.
f
Ha, I was right! 😄 You need to include
Accept: application/vnd.github.raw
for files > 1MB
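A minimal sketch of where that header goes. `rawContentsRequest` and its arguments are hypothetical placeholders. As far as I understand GitHub's REST docs, the raw media type makes the API return the file itself rather than the JSON envelope, so there would be no `content` property left to decode.

```javascript
// Build the URL and fetch options for the "contents" endpoint.
// owner, repo, path and token are placeholders supplied by the caller.
function rawContentsRequest(owner, repo, path, token) {
  return {
    url: `https://api.github.com/repos/${owner}/${repo}/contents/${path}`,
    options: {
      headers: {
        Authorization: `Bearer ${token}`,
        // Ask for the file itself rather than the JSON envelope.
        Accept: "application/vnd.github.raw"
      }
    }
  };
}

const req = rawContentsRequest("USER", "REPO", "PATH/TO/FILE.json", "MY_TOKEN");
console.log(req.url);
```

Under that assumption the call would be `fetch(req.url, req.options).then(r => r.json())`, with the `atob` and `JSON.parse` steps dropped.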
s
I assume this is in the headers area? When doing the code below, still no content is pulled (content is empty as in "")
json_df4 = (
  await fetch(`${finance_repo}df4.json`, {
    headers: {
      authorization: `token ${Secret("GITHUB_ACCESS_TOKEN")}`,
      accept: "application / vnd.github.v3.raw"
    }
  })
).json()
The url is correct because the response object provides html_url and that actually leads me to my file (and since I'm already logged in, it goes there successfully)
Maybe the better type is `*/*` rather than application
Well that didn't work either
f
why did you include whitespace around the slash?
s
Tried below as well with no success
json_df4 = (
  await fetch(`${finance_repo}df4.json`, {
    headers: {
      authorization: `token ${Secret("GITHUB_ACCESS_TOKEN")}`,
      accept: `application/vnd.github.v3.raw`
    }
  })
).json()
f
You may want to redact that token in the download URL
s
Well it doesn't work so it doesn't matter - the token isn't correct
Like the token isn't the same token as what I see when I actually can download the file successfully
seems like something about the authentication isn't working for large files?
I'm just grasping at straws here since this should be straightforward
f
can you try
accept: "application/vnd.github.raw"
as header? (without the v3)?
also, try it with a file from that repo which is smaller than 1MB (to establish a baseline for further debugging)
s
Yep, with files < 1 MB it works
I get content in the content response.
f
can you paste the header object again, as it looks in your fetch code right now?
alternatively you could try to follow this guide: https://hackernoon.com/how-to-fetch-large-data-files-through-github-api (which might be dated though)
s
json_df4 = Object {
  name: "df.json"
  path: "df.json"
  sha: "70a9605c0f6a6d7335d7c737680ea4827140ca28"
  size: 144997
  url: "https://api.github.com/repos/nyc-tinker/finance/contents/df.json?ref=master"
  html_url: "https://github.com/nyc-tinker/finance/blob/master/df.json"
  git_url: "https://api.github.com/repos/nyc-tinker/finance/git/blobs/70a9605c0f6a6d7335d7c737680ea4827140ca28"
  download_url: "https://raw.githubusercontent.com/nyc-tinker/finan…aster/df.json?token=SDLKFJLK9283487979J"
  type: "file"
  content: "W3siZGF0ZSI6ICIyMDIzLTA2LTE2IiwgInRpY2tlciI6ICJJTU…nRpY2tlciI6ICJJTkNZIiwgInByaWNlIjog\nNzMuMTR9XQ==\n"
  encoding: "base64"
  _links: Object {self: "https://api.github.com/repos/nyc-tinker/finance/contents/df.json?ref=master", git: "https://api.github.com/repos/nyc-tinker/finance/git/blobs/70a9605c0f6a6d7335d7c737680ea4827140ca28", html: "https://github.com/nyc-tinker/finance/blob/master/df.json"}
}
f
no, not the returned object. your `headers: { ... }` code 🙂
s
Ah, it's what I tried just previously:
json_df4 = (
  await fetch(`${finance_repo}df.json`, {
    headers: {
      authorization: `token ${Secret("GITHUB_ACCESS_TOKEN")}`,
      accept: `application/vnd.github.raw`
    }
  })
).json()
the df.json is a smaller file
f
hm. maybe
application/vnd.github.raw+json
?
s
Yeah I tried that previously. I guess this is a github issue? Seems important for anyone working with data stored on github
f
I can check if I can find a large file in one of our repos to test with later
s
Is there a best practice / other methods to store data? Github is super convenient since the code that I work with is on VCS and it's easier to just git push to the github repo and then have the data accessible via observable
the authentication works at least
f
did you try fetching the contents from the returned download_url?
s
thanks for trying to help! I'll keep playing with it. Yeah going to the download_url doesn't work for the larger file
👍 1
none of those links work
except for the html url
f
and you're not seeing any error response from the api when you're viewing the network tab? ("doesn't work" should normally imply an error code or message)
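One way to make such errors show up in the cell itself rather than only in the network tab is to check the status before parsing. This is a sketch; `ensureOk` is a hypothetical helper, and plain objects stand in for real responses so it runs without a network.

```javascript
// Turn a non-2xx response into an error that carries the status code,
// so a failure surfaces with a code instead of an empty result.
function ensureOk(res) {
  if (!res.ok) {
    throw new Error(`Request failed: ${res.status} ${res.statusText}`);
  }
  return res;
}

// Plain objects stand in for real Response objects here.
try {
  ensureOk({ ok: false, status: 404, statusText: "Not Found" });
} catch (err) {
  console.log(err.message); // Request failed: 404 Not Found
}
```

In a real chain it would sit between the fetch and the parse: `fetch(url, options).then(ensureOk).then(r => r.json())`.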
s
400 error (invalid request)
Git url returns "not found" in the json response
the html_url is successful as indicated earlier, navigating to the file in github
Ha. I found the issue - browser was using cached requests. I checked "disable cache" and boom it is working
🎉 1
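For anyone who would rather not rely on the devtools checkbox, `fetch` accepts a standard `cache` option. The sketch below adds `cache: "no-store"` to an existing options object; `withoutCache` is a hypothetical helper, and whether it would have avoided the stale response in this thread is an assumption.

```javascript
// Return a copy of the fetch options with the HTTP cache bypassed.
// All other options (headers etc.) are passed through unchanged.
function withoutCache(options = {}) {
  return { ...options, cache: "no-store" };
}

const options = withoutCache({
  headers: { Accept: "application/vnd.github.raw" }
});
console.log(options.cache); // no-store
```

Usage would be `fetch(url, withoutCache(options))`.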
That was a lot of brain damage for a client-side issue
😡
Thanks for the help
🙏 1
f
That was a lot of brain damage for a client-side issue
Call it "experience" 🙂