Hi,
Yes, as you’re discovering, we didn’t exactly have direct querying in mind when we built the tables, though the flip side of that is that the query interface in the software doesn’t currently let you do things like this easily. That said, you might find it easier to do this in scripting, by querying the raw data, and doing the processing yourself. This would let you ignore the partitions, and the table structure in general.
Going forward with SQL queries, you’ve identified the main parts. Yes, the partitions table maps tables to time ranges. The sqlth_te table maps tags to ids, and the data just uses the id. I don’t think it’s possible to dynamically figure out which tables need to be queried, and then join them together, though you might be able to do this in a stored procedure, or maybe by using a view that unioned them all together (I’m not sure about the performance of doing this, I’m afraid it wouldn’t be good).
Anyhow, the partition size is arbitrary, and can be made whatever you want. You could just change it to be large (5 years) and not worry about partitions. In fact, I suppose there should be an option to just turn them off.
Sticking with what you currently have, I want to offer 2 suggestions in regard to query performance:
- Remove the sub query, these are generally less efficient than joins.
- Change the way you limit the time, to optimize index usage.
In other words…
SELECT MAX(d.intValue),FROM_UNIXTIME(d.t_stamp/1000) FROM sqlt_data_1_2011_11 d, sqlth_te t
WHERE d.tagid=t.id and t.tagpath='YOUR_TAG_PATH_HERE'
AND d.t_stamp between unix_timestamp(date_add(date(now()), interval 8 hour) and unix_timestamp(date_add(date(now()),interval 15 hour) GROUP BY FROM_UNIXTIME(d.t_stamp/1000)
See, by leaving t_stamp alone, the database doesn’t need to do a full table scan (at least, it doesn’t need to run those functions against each value in the table). Instead, it runs the functions once, and then just compares the raw db value to them, which more efficient.
Now that I look at it, I don’t think the group by is correct/necessary, and for what you really want to do you’ll have to adjust the where clause a bit, but I hope this gets you a bit closer to what you want to do. Maybe some friendly script ninja will show up with an example of how to do it in scripting…
Regards,