I have 1M rows of CSV data. If I select 10 rows, will I be billed for 10 rows? What do "data returned" and "data scanned" mean in S3 Select?
There is little documentation on these terms for S3 Select.
To keep things simple, let's ignore for now that S3 can read data in a columnar way. Suppose you have the following data:
| City | Last Updated Date |
|------------|---------------------|
| London | 1st Jan |
| London | 2nd Jan |
| New Delhi | 2nd Jan |
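A query for fetching the latest update date has to scan all 3 rows to find it, even though it returns only a single value.
Similarly, a query that selects the city where the last updated date is 1st Jan has to check all 3 rows but returns only the 1 matching row.
Hence, depending on your query, S3 Select might scan more data (all 3 rows here) but return less data (a single row here). Both amounts are billed, and they are measured in bytes rather than rows.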
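The difference can be sketched locally without calling AWS at all. The snippet below is only a toy simulation in plain Python, not the S3 Select API: the `simulate_select` helper and the `CSV_DATA` string are hypothetical names made up for this illustration, and the data is the small table above.

```python
import csv
import io

# Toy stand-in for the CSV object stored in S3 (same rows as the table above).
CSV_DATA = (
    "City,Last Updated Date\n"
    "London,1st Jan\n"
    "London,2nd Jan\n"
    "New Delhi,2nd Jan\n"
)


def simulate_select(csv_text, wanted_date):
    """Mimic: SELECT City FROM object WHERE "Last Updated Date" = wanted_date.

    Returns (rows_scanned, rows_returned, bytes_scanned, bytes_returned).
    This is a local approximation, not how S3 Select is implemented.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    matches = []
    rows_scanned = 0
    for row in reader:
        rows_scanned += 1  # every row must be read to evaluate the filter
        if row["Last Updated Date"] == wanted_date:
            matches.append(row["City"])

    # Only the matching cities are sent back to the caller.
    returned_payload = "".join(city + "\n" for city in matches)
    bytes_scanned = len(csv_text.encode("utf-8"))
    bytes_returned = len(returned_payload.encode("utf-8"))
    return rows_scanned, len(matches), bytes_scanned, bytes_returned


if __name__ == "__main__":
    for date in ("1st Jan", "2nd Jan"):
        scanned, returned, b_scanned, b_returned = simulate_select(CSV_DATA, date)
        print(f"{date}: scanned {scanned} rows ({b_scanned} bytes), "
              f"returned {returned} rows ({b_returned} bytes)")
```

Whichever date is used in the filter, all 3 rows are scanned, while the returned data shrinks to just the matching rows. The real service reports these quantities in bytes (for example `BytesScanned` and `BytesReturned` in the `Stats` event of the `SelectObjectContent` response), so the row counts here are only for intuition.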
I hope you understand the difference between Data Scanned and Data Returned now.