# general
i
Hi all, I'm new to this. Are there any good videos on how to gather data from S3 and query it? Another question: I currently use Athena for my queries. Is this better? Sorry for my ignorance 🙂
o
With Pinot, you can currently prepare the data with your favorite tool (Spark, Hive, etc.) and write it to S3. Then you can run a Pinot Spark batch ingestion job spec (https://docs.pinot.apache.org/basics/data-import/batch-ingestion/spark) to create Pinot segments and upload them to Pinot. After that you can query the data in the UI or over HTTP. I don't use Athena, but as I understand it, Athena runs SQL queries directly against data in S3, whereas Pinot is designed for real-time OLAP queries, along with some other features. So I think the use cases for Pinot and Athena are different. If I'm wrong, we can talk it through.
k
https://docs.pinot.apache.org/basics/data-import/pinot-file-system/amazon-s3 This should help. Let me know if you need more details.
i
@Kartik Khare thanks for the reply. I've read it, but I didn't understand how to use it when deploying via Docker.
I just want it to import the data that is already there.
k
You should just mention the file system spec in the job spec, e.g.:
pinotFSSpecs:
  - scheme: s3
    className: org.apache.pinot.plugin.filesystem.S3PinotFS
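For context, here is a rough sketch of how that `pinotFSSpecs` entry could sit inside a complete standalone batch ingestion job spec that reads files already in S3. The bucket name, paths, region, table name, CSV input format and controller address are all placeholder assumptions, not values from this thread, so adjust them to your setup and check the field names against the docs linked above.

```yaml
# Minimal sketch of a standalone batch ingestion job spec (placeholder values).
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
jobType: SegmentCreationAndTarPush

# Assumption: hypothetical bucket and prefixes; point these at your existing data.
inputDirURI: 's3://my-bucket/pinot/input/'
includeFileNamePattern: 'glob:**/*.csv'
outputDirURI: 's3://my-bucket/pinot/segments/'
overwriteOutput: true

pinotFSSpecs:
  - scheme: s3
    className: org.apache.pinot.plugin.filesystem.S3PinotFS
    configs:
      region: 'us-east-1'   # assumption: use your bucket's region

# Assumption: input files are CSV; swap the reader for other formats.
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'

tableSpec:
  tableName: 'myTable'      # assumption: table already created in Pinot

pinotClusterSpecs:
  - controllerURI: 'http://localhost:9000'   # assumption: controller address as seen from the job
```

With Docker, the usual approach would be to mount this file into the container and pass it to the `LaunchDataIngestionJob` command via `-jobSpecFile`. The container also needs AWS credentials for the bucket, for example through the standard AWS environment variables. The exact wiring depends on your deployment.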