![]() | |
#1
| |||
| |||
|
#2
| ||||
| ||||
|
|
Hi, I'm a programmer with only rudimentary knowledge of SQL, and i'm wondering about how to implement a sort of versioned, time-based data store. For the sake of argument, let's say i have a table like: CREATE TABLE weather_reports ( station_number INTEGER PRIMARY KEY, report_date DATE NOT NULL, temperature INTEGER ); And some data like: station_number report_date temperature -------------- ----------- ----------- 1 2010-05-02 10 1 2010-05-03 11 2 2010-05-01 27 2 2010-05-02 30 Firstly, how can i ask for: the latest temperature from each station Given that the reports involved may be from different dates? |
|
Secondly, how can i ask for: the numbers of all the stations where the latest temperature was over 20? If i was doing this in imperative code, the former would look like: for each station: sort all the reports by date take the latest But i don't know how to transpose this to SQL. I'm guessing that involves some combination of GROUP BY station_number, to give the aspect of working over all the stations, ORDER BY report_date, to give the aspect of looking at the data in terms of time, and some operator which takes the first row in each sorted group. I don't know what that last operator is, and i don't know how to put them all together. |
|
I suspect the second query could be written as a simple select over the results of the first: SELECT station_number FROM (the first query, projected over at least station_number and temperature) WHERE temperature > 20 |
|
Sorry to ask such a basic question; if there's any fine manual i could read online that would let me work this out myself, please don't hesitate to tell me to read it. |
#3
| ||||
| ||||
|
|
station_number report_date temperature -------------- ----------- ----------- 1 2010-05-02 10 1 2010-05-03 11 2 2010-05-01 27 2 2010-05-02 30 Firstly, how can i ask for: the latest temperature from each station |
|
Given that the reports involved may be from different dates? Secondly, how can i ask for: the numbers of all the stations where the latest temperature was over 20 Just add a WHERE temperature > 20 to above statement. |
|
and some operator which takes the first row in each sorted group. I don't know what that last operator is, and i don't know how to put them all together. |
|
Sorry to ask such a basic question; if there's any fine manual i could read online that would let me work this out myself, please don't hesitate to tell me to read it. |
#4
| ||||||
| ||||||
|
|
I'm a programmer with only rudimentary knowledge of SQL, and i'm wondering about how to implement a sort of versioned, time-based data store. |
|
For the sake of argument, let's say i have a table like: CREATE TABLE weather_reports ( station_number INTEGER PRIMARY KEY, report_date DATE NOT NULL, temperature INTEGER ); And some data like: station_number report_date temperature -------------- ----------- ----------- 1 2010-05-02 10 1 2010-05-03 11 2 2010-05-01 27 2 2010-05-02 30 |
|
Firstly, how can i ask for: the latest temperature from each station Given that the reports involved may be from different dates? |
|
Secondly, how can i ask for: the numbers of all the stations where the latest temperature was over 20 |
|
If i was doing this in imperative code […] But i don't know how to transpose this to SQL. |
|
Sorry to ask such a basic question; if there's any fine manual i could read online that would let me work this out myself, please don't hesitate to tell me to read it. |
#5
| |||
| |||
|
|
Hi, I'm a programmer with only rudimentary knowledge of SQL, and i'm wondering about how to implement a sort of versioned, time-based data store. |
|
For the sake of argument, let's say i have a table like: CREATE TABLE weather_reports ( station_number INTEGER PRIMARY KEY, report_date DATE NOT NULL, temperature INTEGER ); And some data like: station_number report_date temperature -------------- ----------- ----------- 1 2010-05-02 10 1 2010-05-03 11 2 2010-05-01 27 2 2010-05-02 30 Firstly, how can i ask for: the latest temperature from each station Given that the reports involved may be from different dates? |
|
Secondly, how can i ask for: the numbers of all the stations where the latest temperature was over 20 |
#6
| |||
| |||
|
|
select station_number from ( select distinct on station_number * from weather_reports order by station_number, report_date desc ) as latest where temperature> 20; The above is all AFAIK standard SQL99, which is a standard that different databases suppoert ot different extents. it works for me, but may not work for you. |
#7
| |||
| |||
|
|
On 2010-05-04 10:12, Jasen Betts wrote: The above is all AFAIK standard SQL99, which is a standard that different databases suppoert ot different extents. it works for me, but may not work for you. I've never seen that syntax before, what does: select distinct on station_number * from weather_reports mean? |
#8
| |||
| |||
|
|
The expression on which to distinguish should be enclosed in parentheses: SELECT DISTINCT ON (station_number) * FROM weather_reports; According to the PostgreSQL docs, this is not standard SQL: The clause DISTINCT ON is not defined in the SQL standard. URL:http://www.postgresql.org/docs/8.4/s...l#SQL-DISTINCT |
#9
| |||
| |||
|
|
On 2010-05-03, Tom Anderson <twic (AT) urchin (DOT) earth.li> wrote: I'm a programmer with only rudimentary knowledge of SQL, and i'm wondering about how to implement a sort of versioned, time-based data store. which database? they're not all the same. |
|
the latest temperature from each station Given that the reports involved may be from different dates? select distinct on station_number * from weather_reports order by station_number,report_date desc; |
#10
| |||
| |||
|
|
Tom Anderson <twic (AT) urchin (DOT) earth.li> writes: CREATE TABLE weather_reports ( station_number INTEGER PRIMARY KEY, report_date DATE NOT NULL, temperature INTEGER ); And some data like: station_number report_date temperature -------------- ----------- ----------- 1 2010-05-02 10 1 2010-05-03 11 2 2010-05-01 27 2 2010-05-02 30 That data doesn't match the table definition, and will be rejected by a conformant RDBMS: |
|
Firstly, how can i ask for: the latest temperature from each station Given that the reports involved may be from different dates? You would order the reports by timestamp, and ask for the latest value: => SELECT station_code, MAX(read_time) AS read_time FROM reading GROUP BY station_code ; You can then use that relation to join with the reading table and get the resulting temperature: => SELECT station_code, temperature FROM reading NATURAL JOIN ( SELECT station_code, MAX(read_time) AS read_time FROM reading GROUP BY station_code ) AS station_latest ; |
|
If i was doing this in imperative code [?] But i don't know how to transpose this to SQL. Thinking in imperative terms won't help you much. You don't have iteration operations available. With a relational database, you're encouraged to think in terms of operations on whole relations: i.e. set operations like restrict, join, project, difference, and so on. |
![]() |
| Thread Tools | |
| Display Modes | |
| |