Calculating Total Time in Seconds

JordanCClark · December 3, 2021, 7:14pm

Three columns. Machine, Fault Code, Fault Description

zacharyw.larson · December 3, 2021, 7:25pm

Yah, I see.
I forgot about solution and location as well.

I think I need to make a table for all the machines with columns:
machine, fault code, fault description, fault solution, fault location

Then I join this table. In this case I use only description. In other spots, I will need to show the solutions and locations.

Lol I was like why did nobody comment on my where clause?
I guess when I used the right parenthesis it works

select Line, Code as Fault_Code,count(Code) as Stop_Occurences, 
	sum(datediff(second,
	(CASE WHEN  t_stamp< :beginTS THEN :beginTS Else t_stamp END),
	(CASE WHEN lead> :endLead or lead is null  THEN :endLead else lead END)))
	as Total_Seconds

FROM (
	SELECT  Code, t_stamp, Line,
	coalesce(lead(t_stamp) OVER (ORDER BY t_stamp),
			CASE WHEN current_timestamp < :endLead 
		THEN current_timestamp ELSE :endLead END) AS lead
	
	from    myTable 
	WHERE t_stamp >= coalesce((
		SELECT TOP 1 t_stamp
		FROM myTable 
		WHERE t_stamp < :beginTS
		ORDER BY t_stamp DESC), '1900-01-01')
	AND t_stamp <= coalesce((
		SELECT TOP 1 t_stamp
		FROM myTable 
		WHERE t_stamp > :endLead
		ORDER BY t_stamp), '2100-01-01')
	) subq
	where lead >= :beginTS and ( t_stamp  < :endLead ) and 
		Code is not null and ( :machine=-1 or 
		Line=Case 	when :machine=1 then 'blueMachine' 
				When :machine=2 then 'redMachine'
                When :machine=60 then 'sixtiethMachine'
		else null end )

group by Line, Code
order by sum(datediff(second,t_stamp,lead)) desc

zacharyw.larson · December 17, 2021, 8:47pm

Now I have made my static table that describes the fault code.

I believe I will need to add a new select column, then add a join clause just before the group clause.

Getting an error by “Left” it says

I had expected maybe:

select Line, Code as Fault_Code,count(Code) as Stop_Occurences, 
	sum(datediff(second,
	(CASE WHEN  t_stamp< :beginTS THEN :beginTS Else t_stamp END),
	(CASE WHEN lead> :endLead or lead is null  THEN :endLead else lead END)))
	as Total_Seconds, newTable.Description as Description

FROM (
	SELECT  Code, t_stamp, Line,
	coalesce(lead(t_stamp) OVER (ORDER BY t_stamp),
			CASE WHEN current_timestamp < :endLead 
		THEN current_timestamp ELSE :endLead END) AS lead
	
	from    myTable 
	WHERE t_stamp >= coalesce((
		SELECT TOP 1 t_stamp
		FROM myTable 
		WHERE t_stamp < :beginTS
		ORDER BY t_stamp DESC), '1900-01-01')
	AND t_stamp <= coalesce((
		SELECT TOP 1 t_stamp
		FROM myTable 
		WHERE t_stamp > :endLead
		ORDER BY t_stamp), '2100-01-01')
	) subq
	where lead >= :beginTS and ( t_stamp  < :endLead ) and 
		Code is not null and ( :machine=-1 or 
		Line=Case 	when :machine=1 then 'blueMachine' 
				When :machine=2 then 'redMachine'
                When :machine=60 then 'sixtiethMachine'
		else null end )

Left Join myTable on myTable.Line=newTable.Line and myTable.Code = newTable.Code
group by Line, Code
order by sum(datediff(second,t_stamp,lead)) desc

also attempting variations of :

Left Join newTable on subq.Line= newTable.Line and subq.Code=newTable.Code

Matthew.gaitan · December 17, 2021, 9:40pm

Joins need to go between the FROM clause and the WHERE clause

I think you’re trying to put the fault code in the final select, so you’d simply need to move your left join between the ) subq line and the where lead... line

you’d need to replace your references after the ON statement though. one would need to use subq.Code (i think) and the other the table name (or alias) for your joined table

zacharyw.larson · December 20, 2021, 1:33pm

select Line, Code as Fault_Code,count(Code) as Stop_Occurences, 
	sum(datediff(second,
	(CASE WHEN  t_stamp< :beginTS THEN :beginTS Else t_stamp END),
	(CASE WHEN lead> :endLead or lead is null  THEN :endLead else lead END)))
	as Total_Seconds, newTable.Description as Description

FROM (
	SELECT  Code, t_stamp, Line,
	coalesce(lead(t_stamp) OVER (ORDER BY t_stamp),
			CASE WHEN current_timestamp < :endLead 
		THEN current_timestamp ELSE :endLead END) AS lead
	
	from    myTable 
	WHERE t_stamp >= coalesce((
		SELECT TOP 1 t_stamp
		FROM myTable 
		WHERE t_stamp < :beginTS
		ORDER BY t_stamp DESC), '1900-01-01')
	AND t_stamp <= coalesce((
		SELECT TOP 1 t_stamp
		FROM myTable 
		WHERE t_stamp > :endLead
		ORDER BY t_stamp), '2100-01-01')
	) subq
Left Join newTable on subq.Line= newTable.Line and subq.Code=newTable.Code
	where lead >= :beginTS and ( t_stamp  < :endLead ) and 
		Code is not null and ( :machine=-1 or 
		Line=Case 	when :machine=1 then 'blueMachine' 
				When :machine=2 then 'redMachine'
                When :machine=60 then 'sixtiethMachine'
		else null end )


group by Line, Code
order by sum(datediff(second,t_stamp,lead)) desc

says ambiguous column name Code
It seems to be upset with Code in the last Where clause.
If I write myTable.Code it isn’t happy with that either.
Says multipart identifier could not be bound.

am I using the correct join?
I just want to add the description from the second table.

jlandwerlen · December 20, 2021, 1:48pm

Ambiguous means it doesn’t know which table/alias to use. Add the proper alias.

zacharyw.larson · December 20, 2021, 2:43pm

Says multipart identifier could not be bound.
or
Says ambiguous column name code

I don’t seem able to escape these, not sure what the proper alias is.
I tried the table name and the subq.

pturmel · December 20, 2021, 2:45pm

myTable isn’t “visible” at the outermost level. It is part of subq. Pick subq.Code or newTable.Code. Everywhere in the outer section where you use Code by itself. I see three places.

zacharyw.larson · December 20, 2021, 3:02pm

had trouble finding all of them to alias them

select subq.Line, subq.Code as Code,count(subq.Code) as Occurences, sum(datediff(second,
	(CASE WHEN  t_stamp< :beginTS THEN :beginTS Else t_stamp END),
	(CASE WHEN lead> :endLead or lead is null  THEN :endLead else lead END)))as Seconds, newTable.Description as Description

FROM (
	SELECT  myTable.Code,
		t_stamp, myTable.Line,
	coalesce(lead(t_stamp) OVER (ORDER BY t_stamp),
			CASE WHEN current_timestamp < :endLead THEN current_timestamp ELSE :endLead END) AS lead
	
	from    myTable 
	WHERE t_stamp >= coalesce((
		SELECT TOP 1 t_stamp
		FROM myTable 
		WHERE t_stamp < :beginTS
		ORDER BY t_stamp DESC), '1900-01-01')
	AND t_stamp <= coalesce((
		SELECT TOP 1 t_stamp
		FROM myTable 
		WHERE t_stamp > :endLead
		ORDER BY t_stamp), '2200-01-01')
	) subq
	
	Inner Join newTable on subq.Line=newTable.Line and subq.Code=newTable.Code
	where lead >= :beginTS and ( t_stamp  < :endLead ) and subq.Code is not null and (subq.Code<1100) and ( :machine=-1 or 
		subq.Line=Case 	when :machine=1 then 'b1' 
				When :machine=2 then 'b2'
				When :machine=3 then 'b3'
				When :machine=4 then 'b4'
				When :machine=5 then 'b5'
				When :machine=6 then 'b6'
				When :machine=11 then 'c1'
				When :machine=12 then 'c2'
				When :machine=13 then 'c3'
				When :machine=14 then 'c7'
				When :machine=15 then 'c5'
				When :machine=16 then 'c6'
		else null end )

group by subq.Line, subq.Code, newTable.Description
order by sum(datediff(second,t_stamp,lead)) desc

zacharyw.larson · April 27, 2022, 1:16pm

I need a way to only get the lead when the code is under 1500

update:
I got it.

pturmel’s model works great if it is one machine like in the initial ask.
I broke it when I added more machines.
The codes from multiple machines time stamps were messing up the lead t_stamp.
So I added a bunch of filtering.

Works great now.

Basically if it says “where”, I also added filtering for a code < 1500 and a line= case when :machine…

zacharyw.larson · May 10, 2022, 8:42pm

If I group this query by

format(dateadd(hh,-8, t_stamp), 'yyyy-MMM-dd')

how do I join it to to another query that will be grouped by

format(dateadd(hh,-8, t_stamp), 'yyyy-MMM-dd')

both queries have subqueries

still join on mytable from that inner most subquery?
and if I join, I think my t_stamps would be messed up?

it turned out that I could join a subquery on a subquery

left join ( the entire subquery) sub1
on on format(dateadd(hh,-8, t.t_stamp), 'yyyy-MMM-dd') = sub1.day

I apparently need someone to post or I can’t post on this thread because 3 self replies is the limit.
I think I will have more things to do with this duration query though.

sept 2022

I am using a union per machine to get more than one machine in the query because my table has all machines in one table.
I haven’t figured out a way to get the t_stamps of one machine, then the next, and then the next.
I think it must be some kind of partition by over combination.

I think thought these were the same, with isnull being faster.
Seems to be throwing index errors.

,isnull(lead(t_stamp) OVER (ORDER BY t_stamp),current_timestamp) as test2

,coalesce(lead(t_stamp) OVER (ORDER BY t_stamp),	CASE WHEN current_timestamp < :endLead THEN current_timestamp ELSE :endLead END) AS lead,t_stamp

Zach_Larson · February 8, 2024, 4:26pm

I'm a little rusty.
Been trying to get fault durations across multiple lines in one query.

Select top 1 will not work when I have more lines because t_stamp could be any of 12 production lines, and I still need the other 11 production line codes prior to the shift.

So I attempted to have a where clause to obtain:

when my t_stamp is between shift start and end
when concat(line,t_stamp) IN (subquery for concatenated Line and max(t_stamp) grouped by line)

However, this seems to be so slow that it times out.

Use a union instead of a complex where clause.
I am sure it is a common problem though, so I would appreciate input, thanks.

Maybe I solved it using the union. Getting them in under 2/10ths a second.

ryan.white · May 14, 2024, 7:21pm

Would you be able to provide some additional insight on your changes? I'm also trying to get duration of machine/asset states from a single table. I have something that works but the execution time seems a bit long.

I have the following query, which feels like a bastardization of the original query from pturmel but it seems to work as expected. (It grabs and trims events that cross the timestamp boundaries.) The execution time is around 2.5 seconds on a 4.5 million row table to fetch a result set of ~12,000 rows.

State Duration By Asset SQL

SELECT
	AssetId,
	StateId,
	StateTimestamp as EventStart,
	EventEnd,
	dateDiff(SECOND, StateTimestamp,EventEnd) As duration 
FROM (
	Select
		AssetId,
		StateId,
		CASE WHEN StateTimestamp < DATEADD(HOUR, -168, DATEADD(HOUR, DATEDIFF(hour, 0, CURRENT_TIMESTAMP), 0)) 
		THEN DATEADD(HOUR, -168, DATEADD(HOUR, DATEDIFF(hour, 0, CURRENT_TIMESTAMP), 0)) ELSE StateTimestamp END AS StateTimestamp,
		COALESCE(lead(StateTimestamp) Over (Partition By AssetId Order By StateTimestamp), CURRENT_TIMESTAMP) As EventEnd
	FROM(
		Select
			t1.AssetId,
			t1.StateId,
			StateTimestamp,
			lag(t1.stateId) OVER (PARTITION BY T1.AssetId ORDER BY StateTimestamp) as PriorValue
		From dbo.OEE_Asset_State t1
		Left Join (
			Select AssetId, max(StateTimestamp) As PriorStateTs
			FROM dbo.OEE_Asset_State
			Where StateTimestamp < DATEADD(HOUR, -168, DATEADD(HOUR, DATEDIFF(hour, 0, CURRENT_TIMESTAMP), 0))
			Group By AssetId
		) sub1 On t1.AssetId = sub1.AssetId
		Left Join (
			Select AssetId, min(StateTimestamp) As NextStateTs
			FROM dbo.OEE_Asset_State
			Where StateTimestamp >= CURRENT_TIMESTAMP
			group by AssetId
		) sub2 On t1.AssetId = sub2.AssetId
		Where StateTimestamp >= coalesce(sub1.PriorStateTs, DATEADD(HOUR, -168, DATEADD(HOUR, DATEDIFF(hour, 0, CURRENT_TIMESTAMP), 0)))
			And StateTimestamp < coalesce(sub2.NextStateTs, CURRENT_TIMESTAMP)
	) Inner1
	WHERE StateId != PriorValue or COALESCE(PriorValue, -100) = -100
)Outer1
ORDER BY AssetId ASC, EventStart DESC

My table columns are EntryId, AssetId, StateId, StateTimestamp
I have the default clustered index on EntryId and a compound unclustered index on AssetId, StateTimestamp including StateId.

According to the execution plan, my query seems to be fetching the full 4.5 million rows twice, once for for the base data to aggregate and once for finding the first StateId value change outside the lower timestamp range specified. That does eventually get filtered down a much smaller number but I feel like it shouldn't be grabbing all the rows from the table each time.

I'm using Microsoft SQL server 2014.

Execution Plan

Is 2.5 seconds on a 4.5 million row table an acceptable time for this query?

pturmel · May 14, 2024, 8:14pm

No, your problem is that you aren't performing any pre-filtering by timestamp in the innermost subqueries. If you are you are regularly getting data, simply expanding the boundaries by a fixed amount in the innermost comparison will often suffice.

The outer WHERE clauses that prune the result to the true output boundaries would remain in place.

ryan.white · May 14, 2024, 9:00pm

So add a WHERE statement to the first innermost query, and change the other two innermost queries to define a set range in their WHERE statements. Something like this then:

SELECT
	AssetId,
	StateId,
	StateTimestamp as EventStart,
	EventEnd,
	dateDiff(SECOND, StateTimestamp,EventEnd) As duration 
FROM (
	Select
		AssetId,
		StateId,
		StateTimestamp,
		COALESCE(lead(StateTimestamp) Over (Partition By AssetId Order By StateTimestamp), '2024-03-01') As EventEnd
	FROM(
		SELECT
			t1.AssetId,
			StateId,
			StateTimestamp,
			PriorValue
		FROM(
			Select
				AssetId,
				StateId,
				StateTimestamp,
				lag(stateId) OVER (PARTITION BY AssetId ORDER BY StateTimestamp) as PriorValue
			From dbo.OEE_Asset_State
			WHERE StateTimestamp BETWEEN DATEADD(HOUR, -336, DATEADD(HOUR, DATEDIFF(hour, 0, '2024-03-01'), 0)) AND DATEADD(HOUR, 168, DATEADD(HOUR, DATEDIFF(hour, 0, '2024-03-01'), 0))
		) t1
		Left Join (
			Select AssetId, max(StateTimestamp) As PriorStateTs
			FROM dbo.OEE_Asset_State
			Where StateTimestamp BETWEEN DATEADD(HOUR, -336, DATEADD(HOUR, DATEDIFF(hour, 0, '2024-03-01'), 0)) AND DATEADD(HOUR, -168, DATEADD(HOUR, DATEDIFF(hour, 0, '2024-03-01'), 0))
			Group By AssetId
		) sub1 On t1.AssetId = sub1.AssetId
		Left Join (
			Select AssetId, min(StateTimestamp) As NextStateTs
			FROM dbo.OEE_Asset_State
			Where StateTimestamp BETWEEN '2024-03-01' AND DATEADD(HOUR, 168, DATEADD(HOUR, DATEDIFF(hour, 0, '2024-03-01'), 0))
			group by AssetId
		) sub2 On t1.AssetId = sub2.AssetId
		Where StateTimestamp >= coalesce(sub1.PriorStateTs, DATEADD(HOUR, -168, DATEADD(HOUR, DATEDIFF(hour, 0, '2024-03-01'), 0)))
			And StateTimestamp < coalesce(sub2.NextStateTs, '2024-03-01')
	) Inner1
	WHERE StateId != PriorValue or COALESCE(PriorValue, -100) = -100
)Outer1
ORDER BY AssetId ASC, EventStart DESC

For the set time offset around the given time window, what should I base that offset on? The longest time period we see between states? In my example here I'm using a week for testing, but we have had machines down for up to 3 months while doing upgrades and such.

pturmel · May 14, 2024, 9:12pm

Ew!

OK, you may need another scalar select for each endpoint that finds the last TS prior to the period (use Top 1 Order by StateTimestamp Desc) and another that finds the first TS after the period.

(Are you not parameterizing the begin and end points for a reason?)

ryan.white · May 14, 2024, 9:22pm

I was building this for a query tag and didn't want to use DECLARE as I was under the understanding that the JDBC driver technically doesn't support that. Didn't think to parameterize for building and testing.

Possible other solution to this, what if I set up something to grab the last written state from each AssetId present and insert a new row with the same state but current timestamp? Run it at a set rate of 1 day or similar, something to basically make sure I will always have at least 1 state value per asset in the timestamp offset range I'm selecting.

pturmel · May 14, 2024, 9:24pm

If you develop/test in the named query editor, you can use colon-delimited parameters in multiple places.

Not an unreasonable approach. Certainly would simplify the query.

Zach_Larson · May 21, 2024, 6:51pm

I think what I did:

I used "with" to create my subqueries for the faults by line.

Then I think used a union of sub queries from the "with" and filtered in where.

The table is indexed.

The fetch uses between to default to a range scan if it fails to get the index if I remember right.

I think if I had a machine down 3 months, it would not show in this list, so I would coalesce or case a default value to get he row.

And used things that Pturmel recommended me.