# analytics-engine-beta
  • m

    Miki Mokrysz | Analytics Engine

    05/09/2023, 7:18 PM
    Hahah, yes
  • a

    Advany

    05/12/2023, 8:48 AM
    @JPL | Data PM I don't mind you using this document for editing, but wanted to let you know it was a copy inside my Google Apps account. Someone from Cloudflare requested access and I gave it.
  • t

    Tim Fish

    05/18/2023, 10:35 AM
    I've only scanned all the above, so apologies if this has already been asked and answered. Having created a similar ABR solution for time series data years ago, I have some questions about querying. When you say query billing would be by "rows per query", how is this affected by ABR sampling? If I query 1 month's worth of 1Hz data, how many rows will this hit? I guess you'd hit the downsampled rows rather than all 2.6m rows! But how many rows would this be? The main downside I see to the current query API is that there's no control over read sampling, and if you did add that as a feature, it would be super helpful to have a way to find out roughly how many rows a query would hit. For us it's key for spikes to not get hidden by averaging. So to display downsampled data, we plot at least a pixel-width number of samples as min/max points with a filled area. This means that the plotted data more closely resembles what you'd see if you plotted every single sample.
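A rough sketch of the min/max plotting approach described in this message, assuming the raw samples are already in memory as a plain array of numbers. `downsampleMinMax` and its parameters are hypothetical names for illustration, not part of any Analytics Engine API. It also spells out the arithmetic behind the "2.6m rows" figure, taking a month as 30 days.

```javascript
// Min/max downsampling sketch: one { min, max } pair per horizontal pixel,
// so a spike inside a bucket still shows up instead of being averaged away.
// `samples` is an array of numbers in time order; `pixelWidth` is the chart width.
function downsampleMinMax(samples, pixelWidth) {
  if (pixelWidth <= 0 || samples.length === 0) return [];
  // Never create more buckets than samples, so every bucket is non-empty.
  const bucketCount = Math.min(pixelWidth, samples.length);
  const buckets = [];
  for (let i = 0; i < bucketCount; i++) {
    // Integer boundaries that cover every sample exactly once.
    const start = Math.floor((i * samples.length) / bucketCount);
    const end = Math.floor(((i + 1) * samples.length) / bucketCount);
    let min = Infinity;
    let max = -Infinity;
    for (let j = start; j < end; j++) {
      if (samples[j] < min) min = samples[j];
      if (samples[j] > max) max = samples[j];
    }
    buckets.push({ min, max });
  }
  return buckets;
}

// One month (30 days) of 1 Hz data, before any sampling: 2,592,000 rows.
const rowsPerMonth = 30 * 24 * 60 * 60;
```

Each bucket would then be drawn as a vertical span from `min` to `max`, which is what keeps the filled area close to a plot of every sample.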
  • s

    sdan

    05/19/2023, 8:38 AM
    hi everyone! curious if anyone has any samples of how they are using Analytics Engine in their Worker repo. here is what I'm doing:
    const dataPoint = {
        'indexes': [`${request.headers.get('CF-Connecting-IP')}`],
        'blobs': [
            [host, path, JSON.stringify(requestBodyJson)],
            [
                request.cf.colo,
                request.cf.country,
                request.cf.city,
                request.cf.region,
                request.cf.timezone,
                request.headers,
            ],
            [
                response.status,
                response.headers,
            ],
        ],
    };

    console.log('Data to be written:', JSON.stringify(dataPoint, null, 2));

    env.<dataset name>.writeDataPoint(dataPoint);
    i assumed you can't have multiple indexes but apparently you can... curious what others are doing
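For comparison, here is a minimal sketch of the same idea with a flat `blobs` array of strings, on the assumption (worth checking against the current Analytics Engine docs) that `writeDataPoint` expects flat arrays for `blobs`, `doubles`, and `indexes`, and that only a single index is supported. `buildDataPoint` is a hypothetical helper, and moving the status code into `doubles` is a design choice of this sketch, not something from the snippet above. Headers are left out here. They would need to be stringified first and may be large.

```javascript
// Sketch: build a data point with flat, string-only blobs.
// `buildDataPoint` is a hypothetical helper, not part of the Workers runtime.
function buildDataPoint(request, response, requestBodyJson) {
  const url = new URL(request.url);
  const cf = request.cf ?? {};
  // Coerce missing values to empty strings so every blob is a string.
  const asBlob = (value) => (value == null ? "" : String(value));
  return {
    // Assumption: one index per data point.
    indexes: [asBlob(request.headers.get("CF-Connecting-IP"))],
    blobs: [
      url.host,
      url.pathname,
      JSON.stringify(requestBodyJson),
      asBlob(cf.colo),
      asBlob(cf.country),
      asBlob(cf.city),
      asBlob(cf.region),
      asBlob(cf.timezone),
    ],
    // Numeric values go in doubles so they can be aggregated in SQL.
    doubles: [response.status],
  };
}

// Inside a Worker this might be used as (DATASET is a hypothetical binding name):
//   env.DATASET.writeDataPoint(buildDataPoint(request, response, body));
```

Because the blobs are positional (`blob1`, `blob2`, ...), keeping the order fixed matters more than the names used in the Worker code.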
  • s

    Skye

    05/19/2023, 9:05 PM
    Getting a "sorry, we were unable to evaluate this query", when using
    sql
    SELECT
        blob1 AS colo,
        SUM(_sample_interval) AS count1
    FROM 'colo-tracker'
    WHERE $timeFilter
    GROUP BY colo, count1
    ORDER BY count1
    Not sure if there's a particular reason why?
  • p

    PaganMuffin

    05/19/2023, 9:09 PM
    Isn't it the same case as this one?
  • d

    Dani Foldi

    05/19/2023, 9:10 PM
    basically you don't need to group by already aggregated columns
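As a sketch of what that suggestion looks like when applied to the query above, here it is with `count1` dropped from `GROUP BY`, wrapped in a JavaScript string so it can be pasted into a script. The dataset name and the Grafana `$timeFilter` macro are copied from the original message. Whether this resolves the error in every setup is an assumption.

```javascript
// The aggregate alias (count1) is removed from GROUP BY; only the
// non-aggregated column (colo) is grouped on.
// $timeFilter is the Grafana macro from the original query, left unexpanded.
const coloQuery = `
SELECT
    blob1 AS colo,
    SUM(_sample_interval) AS count1
FROM 'colo-tracker'
WHERE $timeFilter
GROUP BY colo
ORDER BY count1
`.trim();
```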
  • s

    Skye

    05/19/2023, 9:10 PM
    Ah, so I don't need the count in it, right
  • s

    Skye

    05/19/2023, 9:11 PM
    It was being fussy with some other fields not being included in the group by column 😅
  • c

    Cyb3r-Jok3

    05/20/2023, 2:33 PM
    Has anyone gotten a forbidden error with a challenge page when querying from Grafana? I know it isn't the token because the query works from a different Grafana Instance
  • l

    Leo

    05/20/2023, 3:58 PM
    Possibly the WAF blocking due to SQL statements?
  • c

    Cyb3r-Jok3

    05/20/2023, 5:31 PM
    I don't think so. It is the same query on both instances. I'm migrating over and copied the dashboard JSON.
  • i

    Iann

    05/21/2023, 11:58 AM
    Hi, any recent news on pricing?
  • j

    john.spurlock

    05/21/2023, 11:37 PM
    do any aggregate functions work on blob/string fields? I just want to take any url when grouping by hostname:
    select if(startsWith(blob3, 'www.'), substring(blob3, 5), blob3) hostname, min(blob2) sample_url, count() c from 'the-dataset' where index1 = 'the-index' group by hostname order by c desc format tsv
    but fails with:
    Input was invalid: cannot use the String type as argument 1 in min("blob2")
  • w

    Walshy | Pages

    05/21/2023, 11:40 PM
    without parsing it to an int, i doubt it
  • j

    john.spurlock

    05/21/2023, 11:41 PM
    maybe we'll get min/max(string) for free with https://discord.com/channels/595317990191398933/981314061268578304/1105481438184345600
  • u

    Unsmart | Tech debt

    05/22/2023, 12:18 AM
    What would you expect min("blob2") to do? Isn't the min function supposed to return the lowest number? How would that work with strings 🤔
  • u

    .Zero

    05/22/2023, 6:39 AM
    0abc < 1abc
  • j

    Jim | Data

    05/22/2023, 7:19 AM
    String <, >, <=, >= went live last week. Sorry, I forgot to post here. I'm afraid that doesn't help with min(string).
  • j

    Jim | Data

    05/22/2023, 7:27 AM
    It does look like all the aggregates expect a numeric expression. I'll have a think about if it needs to be that way.
  • j

    john.spurlock

    05/22/2023, 1:19 PM
    excellent, thanks!
  • j

    john.spurlock

    05/22/2023, 1:20 PM
    blobs can be compared byte by byte: https://clickhouse.com/docs/en/sql-reference/functions/comparison-functions
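Until an aggregate like `min(string)` is available, one possible workaround is to fetch the ungrouped rows and pick the sample URL client-side. The sketch below assumes each row has already been parsed into `{ host, url }` (corresponding to `blob3` and `blob2` in the query above). The function names are hypothetical. Note that JavaScript's `<` on strings compares UTF-16 code units rather than raw bytes, which gives the same order for plain ASCII such as `'0abc' < '1abc'`.

```javascript
// Mirrors if(startsWith(blob3, 'www.'), substring(blob3, 5), blob3).
function stripWww(host) {
  return host.startsWith("www.") ? host.slice(4) : host;
}

// Group rows by hostname, keeping the smallest URL string and a row count.
// `rows` is an array of { host, url } objects (hypothetical shape).
function sampleUrlPerHostname(rows) {
  const byHost = new Map();
  for (const { host, url } of rows) {
    const hostname = stripWww(host);
    const entry = byHost.get(hostname) ?? { hostname, sampleUrl: url, count: 0 };
    // Keep the smallest string seen so far, like a string min().
    if (url < entry.sampleUrl) entry.sampleUrl = url;
    entry.count += 1;
    byHost.set(hostname, entry);
  }
  // Equivalent of "order by c desc".
  return [...byHost.values()].sort((a, b) => b.count - a.count);
}
```

This trades query-side aggregation for reading more rows, so it only makes sense for small result sets.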
  • i

    Iann

    05/24/2023, 1:52 AM
    is this product a fit for me to log a counter every time a customer performs an action, potentially hundreds per second?
  • w

    Walshy | Pages

    05/24/2023, 2:02 AM
    yep!
  • i

    Iann

    05/24/2023, 2:21 AM
    I am now just worried about the post-beta pricing, which is gonna be $x per 1M writes, because that is gonna be a looot of $$$ while I am writing probably just a few bytes every time
  • w

    Walshy | Pages

    05/24/2023, 2:24 AM
    Pricing hasn't been confirmed at all yet, so I can't really give much help, but I will say we aren't known for expensive pricing. We're known for very cheap pricing, so I wouldn't worry too much just yet. Definitely let the team know your opinion though! (cc: @Jim | Data @JPL | Data PM ) I'm eagerly watching out for the price too, but I trust in the team 🙂
  • s

    sdan

    05/24/2023, 3:38 AM
    we have a lot of users -- curious if anyone has made a UI or script that easily allows me to parse the data?

    https://cdn.discordapp.com/attachments/981314061268578304/1110773894203310090/Screenshot_2023-05-23_at_8.37.38_PM.png

  • d

    dust

    05/24/2023, 4:47 AM
    I'm curious how the sampling works with really spiky data. Let's say I have prices logged as doubles, [.01, .5, .02, 1.5, .02, .01]. If I sum over that period, do I get the actual sum or an average? One of the examples was billing for SaaS, but it seemed like number of requests. If I wanted to bill by number of bytes in a request, would this be accurate if there are some outliers? Hopefully this question makes sense.
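One way to reason about this question is to assume each stored row stands for `_sample_interval` original rows, the same assumption behind `SUM(_sample_interval)` in the colo query earlier in the channel. Under that assumption a sum is estimated by weighting each value by its interval, which in SQL would be something like `SUM(_sample_interval * double1)`. The sketch below does the same arithmetic client-side. The row shape and function names are hypothetical.

```javascript
// `rows` is an array of { value, sampleInterval } objects (hypothetical shape),
// where sampleInterval plays the role of _sample_interval.

// Estimated total: each stored value counts sampleInterval times.
function estimateSum(rows) {
  return rows.reduce((total, row) => total + row.sampleInterval * row.value, 0);
}

// Estimated number of original rows.
function estimateCount(rows) {
  return rows.reduce((total, row) => total + row.sampleInterval, 0);
}

// With no sampling (every interval is 1) the estimate is just the plain sum.
const unsampled = [0.01, 0.5, 0.02, 1.5, 0.02, 0.01].map((value) => ({
  value,
  sampleInterval: 1,
}));
const exactTotal = estimateSum(unsampled); // approximately 2.06
```

Once rows are sampled, the result is an estimate rather than the exact sum, so whether an outlier such as the 1.5 is kept or dropped can move the total noticeably when few rows are stored.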
  • h

    HardAtWork

    05/24/2023, 5:54 AM
    Like Grafana?
  • c

    charl

    05/27/2023, 7:56 AM
    There’s also this if you wanna roll your own https://developers.grafana.com/ui/latest/index.html?path=/story/docs-overview-intro--page