# troubleshooting
x
You can put options like `--memory="1g" --cpus=".5"` in the `docker run` command.
d
@Xiang Fu - Even with those limits, and lower ones, Docker still uses a ton of CPU and memory. Maybe there's a weird interaction between Presto and the other containers while starting up? I'm going to see if I can get the basic Docker quick start instructions to work separately from my setup.
Even if I wait 10 minutes, the logs don't make additional progress
x
For presto? Are you using docker for desktop?
Maybe enlarge the system disk and memory
d
Yea, I'm using docker for desktop.
I'm going to restart docker
x
image.png
this is my set up
image.png
d
Yea, something is off. I'm going to give Docker some more resources.
Thanks!
Restarting helped me get further, but I got to a point where I'm killing something in Docker.
Cool, I can get to `com.facebook.presto.server.PrestoServer ======== SERVER STARTED ========` but any queries just get queued and don't run.
Hmm, Presto server died
No logs
I'll play with it some more.
Sweet! I got Presto in a docker container working. I still get the same error even with 0.234.3.
```
presto:default> SELECT DATE_TRUNC('DAYS', "timestamp") as mydate, SUM(cost_usd_micros) FROM pinot.default.events_testing GROUP BY mydate;
Query 20200501_195448_00002_878qh failed: line 1:8: Unexpected parameters (varchar(4), bigint) for function date_trunc. Expected: date_trunc(varchar(x), date) , date_trunc(varchar(x), time) , date_trunc(varchar(x), time with time zone) , date_trunc(varchar(x), timestamp) , date_trunc(varchar(x), timestamp with time zone)
```
x
hmm
could you describe the table ?
Is this field showing as type string or date/timestamp?
d
```
presto:default> DESCRIBE  pinot.default.events_testing;
     Column      |  Type  | Extra |  Comment
-----------------+--------+-------+-----------
 cost_usd_micros | bigint |       | METRIC
 insertions      | bigint |       | METRIC
 content_id      | bigint |       | DIMENSION
 platform_id     | bigint |       | DIMENSION
 clicks          | bigint |       | METRIC
 impressions     | bigint |       | METRIC
 customer_id     | bigint |       | DIMENSION
 ad_group_id     | bigint |       | DIMENSION
 campaign_id     | bigint |       | DIMENSION
 advertiser_id   | bigint |       | DIMENSION
 timestamp       | bigint |       | TIME
(11 rows)

Query 20200501_195558_00003_878qh, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
0:04 [11 rows, 926B] [2 rows/s, 214B/s]

presto:default>
```
x
it’s time field now
`SELECT date(timestamp) as mydate, SUM(cost_usd_micros) FROM pinot.default.events_testing group by date(timestamp)`
can you try this first
d
```
presto:default> SELECT DATE("timestamp") as mydate, SUM(cost_usd_micros) FROM pinot.default.events_testing GROUP BY mydate;
Query 20200501_195938_00006_878qh failed: line 1:8: Unexpected parameters (bigint) for function date. Expected: date(varchar(x)) , date(timestamp) , date(timestamp with time zone)
```
x
Do you have some sample values for timestamp?
Is it millis since epoch?
Oh
Question: does your Pinot schema define this column as a dimension or a time field?
If the type inference is working, it should show TIMESTAMP as the type, not bigint.
d
The simple queries are causing issues with Docker Presto
```
Query 20200501_200108_00013_878qh failed: java.net.UnknownHostException: pinot-broker-0.pinot-broker-headless.events-local.svc.cluster.local
```
All of my queries for data are timing out
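The `UnknownHostException` above suggests that the Presto container could not resolve the broker's hostname, which looks like a Kubernetes cluster-internal DNS name. A minimal sketch for checking name resolution from wherever Presto runs, written in Python; the helper name `can_resolve` is hypothetical and is not part of Presto or Pinot:

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if this machine can resolve hostname to at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        # Raised when the resolver cannot find an address for the name.
        return False


# Broker hostname copied from the error message above.
BROKER_HOST = "pinot-broker-0.pinot-broker-headless.events-local.svc.cluster.local"
```

Calling `can_resolve(BROKER_HOST)` inside the Presto container would show whether the name is visible there. If it returns `False`, the fix is probably on the networking side rather than in the query.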
Be back in 15 minutes.
I got it further along
```
presto:default> SELECT "timestamp", "impressions" FROM pinot.default.events_testing LIMIT 3;
   timestamp   | impressions
---------------+-------------
 1554366550435 |           1
 1565861533888 |           1
 1569063656996 |           1
(3 rows)
```
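The sample values look like milliseconds since the Unix epoch. A quick sanity check outside Presto, sketched here in Python, shows the UTC day each value falls on, which is roughly what a day-level truncation would group by once the column is typed as a timestamp. The helper name `day_bucket` is hypothetical, and UTC is an assumption about the intended time zone:

```python
from datetime import datetime, timezone


def day_bucket(epoch_millis: int) -> str:
    """Truncate an epoch-milliseconds value to its UTC calendar day (YYYY-MM-DD)."""
    ts = datetime.fromtimestamp(epoch_millis // 1000, tz=timezone.utc)
    return ts.date().isoformat()


# Sample values copied from the query output above.
samples = [1554366550435, 1565861533888, 1569063656996]
for value in samples:
    print(value, "->", day_bucket(value))
# 1554366550435 -> 2019-04-04
# 1565861533888 -> 2019-08-15
# 1569063656996 -> 2019-09-21
```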
x
I see.
Could you try updating the Pinot schema to define timestamp as either a dateTime field or a timeField?
```json
"timeFieldSpec": {
  "incomingGranularitySpec": {
    "dataType": "LONG",
    "timeType": "MILLISECONDS",
    "name": "timestamp"
  }
},
```
d
Yea, I'll try it in ten minutes.
```json
"timeFieldSpec": {
  "incomingGranularitySpec": {
    "name": "timestamp",
    "dataType": "LONG",
    "timeFormat": "EPOCH",
    "timeType": "MILLISECONDS"
  }
}
```
That's my current spec.
Do you want me to remove `timeFormat`?
x
this is good
hmm
Then why is the timestamp inference not working…
d
I'm happy to forward the configs and stuff if it'd help
x
sure
or maybe your schema
d
Here's the schema (in the Kubernetes yaml and the etc for Presto).
x
hmmm
image.png
seems like it works for me
```
presto:default> SELECT DATE_TRUNC('DAYS', "timestamp") as mydate, SUM(cost_usd_micros) FROM pinot.default.events GROUP BY DATE_TRUNC('DAYS', "timestamp");
 mydate | _col1
--------+-------
(0 rows)

Query 20200501_211538_00009_5qeb6, FINISHED, 1 node
Splits: 49 total, 49 done (100.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]
```
d
```
presto:default> DESCRIBE pinot.default.events;
     Column      |  Type  | Extra |  Comment
-----------------+--------+-------+-----------
 cost_usd_micros | bigint |       | METRIC
 insertions      | bigint |       | METRIC
 content_id      | bigint |       | DIMENSION
 platform_id     | bigint |       | DIMENSION
 clicks          | bigint |       | METRIC
 impressions     | bigint |       | METRIC
 customer_id     | bigint |       | DIMENSION
 ad_group_id     | bigint |       | DIMENSION
 campaign_id     | bigint |       | DIMENSION
 advertiser_id   | bigint |       | DIMENSION
 timestamp       | bigint |       | TIME
(11 rows)
```
x
Can you add
```properties
pinot.use-date-trunc=true
pinot.infer-date-type-in-schema=true
pinot.infer-timestamp-type-in-schema=true
```
into `pinot.properties` under `etc`?
image.png
this is my pinot config for presto catalog
d
Ah, yea, I am missing infer-timestamp-type-in-schema. That's probably it.
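A missing catalog property like this is easy to overlook. As a rough sketch, a small Python check could flag which of the three suggested flags are absent from the contents of `pinot.properties`. The helper names `parse_properties` and `missing_keys` are hypothetical, and the parser only handles plain `key=value` lines, not the full Java properties syntax:

```python
# The three flags suggested earlier in this thread.
REQUIRED_KEYS = (
    "pinot.use-date-trunc",
    "pinot.infer-date-type-in-schema",
    "pinot.infer-timestamp-type-in-schema",
)


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def missing_keys(text: str) -> list:
    """Return the required keys that are absent or not set to true."""
    props = parse_properties(text)
    return [k for k in REQUIRED_KEYS if props.get(k, "").lower() != "true"]


# Example: file contents that lack the timestamp inference flag.
example = """
pinot.use-date-trunc=true
pinot.infer-date-type-in-schema=true
"""
print(missing_keys(example))  # ['pinot.infer-timestamp-type-in-schema']
```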
x
cool, let me know
d
Yes! That's it. I hit issues iterating on it.
Thanks so much!