I have a system which is saving timestamp for events (turning on/off a motor) in a database. I have two queries to get the timestamp of the events (one for 'start' event and other for 'stop' event). Now I want to make a report showing 'start' and 'stop' event timestamps in a table.
Below picture might explain what I am trying to achieve:
I have managed to write a script to create a new dataset from two queries. But it is not giving the result I am expecting. Can anyone help please?
Below is the picture of report I am getting:
Below is my script:
def updateData(data, sample):
start = system.dataset.toPyDataSet(data['Start'].getCoreResults())
stop = system.dataset.toPyDataSet(data['Stop'].getCoreResults())
headers = ['start', 'stop']
dataOut = []
for row in start:
dataOut.append([row[0],None])
for row in stop:
dataOut.append([None,row[0]])
combinedDataSet = system.dataset.toDataSet(headers, dataOut)
data['combinedData'] = system.dataset.sort(combinedDataSet, 0)
This is something normally done in the database with lead() or lag() expressions. Show your Query1 and Query2 and we can help you do this the "right" way.
First, I agree with Paul this is marginal approach.
You need to pair down your pydataset to just call the column with the timestamps and then it may work. try just using [row[0]] and and not row[0], none. Append by nature just adds it to the row of data.
What are you going to do if for some reason your db misses a start or stop event. Once that happens your dataset will be completely confused and rubbish. It will happen it is just a matter of time.
If you want to use the method that you are currently using, I would recommend you iterate over your list of start times and then add to the stop column the first time in your next list that is greater than your start time. If your dataset is really large this may be extremely inefficient and you may want to consider something else. One easy way is to drop the entries that are all before the start time so the tool only has to look through a smaller list every time.
The sql method is to sort your dataset and filter just the columns with start times and retrieve the next entry where the db says stop. This method of querying is also inefficient (can take a lot of memory and cpu usage) since you are iterating over the entire dataset to come up with your list. Not too bad for small data sets but it could cause performance issues.
I bet you if you post your queries someone on this forum may post the code you need to solve this.
Will it be possible to use system.dataset.addColumn() in this case? I am not sure if it is possible to use this function to add column with timestamp values.
SELECT started, stopped
FROM (SELECT ts AS started, start, stop, lead(ts, 1, NULL) OVER (ORDER BY ts) AS stopped
FROM tbl_pumpevent_babu
ORDER BY ts) subq
WHERE start = 1
Consider the table:
The subquery's lead function grabs the time stamp from the following row.
Your solution worked. For some reason, "lead(ts,1,NULL)" is giving me error but "lead(ts)" is working.
The working script is as below:
SELECT started, stopped
FROM (SELECT ts AS started, start, stop, lead(ts) OVER (ORDER BY ts) AS stopped
FROM tbl_pumpevent_babu
ORDER BY ts) subq
WHERE start = 1