ayush sharma 03/12/2021, 7:18 PM
Presto cannot even run a query like this:
presto:default> select count(*) from pinot.default.pinot_table;
Segment query returned '50001' rows per split, maximum allowed is '50000' rows. with query "SELECT * FROM pinot_table LIMIT 50001"
Even if we increase the 50k limit in Presto's pinot.properties to 1 million, the Presto server crashes, stating that heap memory was exceeded. To work around it, we learned that we can make Pinot do the aggregations and feed the aggregated result to Presto, which in turn feeds Superset to visualize the charts, by writing the aggregation logic inside the Presto subquery, like this:
presto:default> select * from pinot.default."select count(*) from pinot_table"
This returns the expected result.
Problem # 3
We found that, though we can make Pinot do the aggregations, we cannot use the supported transformation functions of Pinot listed here inside the Presto subquery. The query
select datetrunc('day', epoch_ms_col, 'milliseconds') from pinot_table limit 10
works fine in Pinot, but when embedded in Presto as a subquery like below it does not work:
presto:default> select * from pinot.default."select datetrunc('day', epoch_ms_col, 'milliseconds') from pinot_table limit 10";
Query failed: Column datetrunc('day',epoch_ms_col,'milliseconds') not found in table default.select datetrunc('day', epoch_ms_col, 'milliseconds') from pinot_table limit 10
I do not know if we are doing something wrong while querying/implementing, or have missed some useful config setting that can solve our problem. The SQL Lab query that we want to run against Pinot, and whose result we eventually want to use to make a chart, is like
SELECT day_of_week(epoch_ms_col), count(*) from pinot_table group by day_of_week(epoch_ms_col)
Any help is really appreciated !!!
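For reference, the passthrough form used in this message (a Pinot query quoted as the Presto table name) can be built with a small helper. This is only a sketch: `pinot_passthrough` is a hypothetical name, the `pinot.default` catalog and schema are taken from the queries in this message, and doubling embedded double quotes assumes standard SQL identifier quoting. It only builds the string; it does not make functions such as datetrunc resolvable on the Presto side.

```python
def pinot_passthrough(pinot_query: str, catalog: str = "pinot", schema: str = "default") -> str:
    """Wrap a Pinot query as a quoted Presto table name so that Pinot runs it.

    Hypothetical helper: the catalog and schema defaults mirror the
    pinot.default names used in this thread.
    """
    # Drop surrounding whitespace and any trailing semicolon, then escape
    # embedded double quotes by doubling them (standard SQL quoted identifiers).
    quoted = pinot_query.strip().rstrip(";").replace('"', '""')
    return f'select * from {catalog}.{schema}."{quoted}"'


print(pinot_passthrough("select count(*) from pinot_table"))
# prints: select * from pinot.default."select count(*) from pinot_table"
```

Keeping the Pinot query as a plain string and wrapping it at the last step makes it easy to test the same query directly in Pinot first.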
Elon 03/12/2021, 9:21 PM
Ron Kitay 03/16/2021, 6:29 PM
to extract a large amount of data - what are the limitations? e.g., if I want to do something like:
SELECT * from table where creationTime >= x and creationTime < y
And save that output to a file (or files) - e.g. with the spark connector. What are the limits here? If the result is 2 TB of data, will that be supported?
Elon 03/16/2021, 6:31 PM