Three columns. Machine, Fault Code, Fault Description
Yah, I see.
I forgot about solution and location as well.
I think I need to make a table for all the machines with columns:
machine, fault code, fault description, fault solution, fault location
Then I join this table. In this case I use only description. In other spots, I will need to show the solutions and locations.
Lol I was like why did nobody comment on my where clause?
I guess when I used the right parenthesis it works
select Line, Code as Fault_Code,count(Code) as Stop_Occurences,
sum(datediff(second,
(CASE WHEN t_stamp< :beginTS THEN :beginTS Else t_stamp END),
(CASE WHEN lead> :endLead or lead is null THEN :endLead else lead END)))
as Total_Seconds
FROM (
SELECT Code, t_stamp, Line,
coalesce(lead(t_stamp) OVER (ORDER BY t_stamp),
CASE WHEN current_timestamp < :endLead
THEN current_timestamp ELSE :endLead END) AS lead
from myTable
WHERE t_stamp >= coalesce((
SELECT TOP 1 t_stamp
FROM myTable
WHERE t_stamp < :beginTS
ORDER BY t_stamp DESC), '1900-01-01')
AND t_stamp <= coalesce((
SELECT TOP 1 t_stamp
FROM myTable
WHERE t_stamp > :endLead
ORDER BY t_stamp), '2100-01-01')
) subq
where lead >= :beginTS and ( t_stamp < :endLead ) and
Code is not null and ( :machine=-1 or
Line=Case when :machine=1 then 'blueMachine'
When :machine=2 then 'redMachine'
When :machine=60 then 'sixtiethMachine'
else null end )
group by Line, Code
order by sum(datediff(second,t_stamp,lead)) desc
Now I have made my static table that describes the fault code.
I believe I will need to add a new select column, then add a join clause just before the group clause.
Getting an error by “Left” it says
I had expected maybe:
select Line, Code as Fault_Code,count(Code) as Stop_Occurences,
sum(datediff(second,
(CASE WHEN t_stamp< :beginTS THEN :beginTS Else t_stamp END),
(CASE WHEN lead> :endLead or lead is null THEN :endLead else lead END)))
as Total_Seconds, newTable.Description as Description
FROM (
SELECT Code, t_stamp, Line,
coalesce(lead(t_stamp) OVER (ORDER BY t_stamp),
CASE WHEN current_timestamp < :endLead
THEN current_timestamp ELSE :endLead END) AS lead
from myTable
WHERE t_stamp >= coalesce((
SELECT TOP 1 t_stamp
FROM myTable
WHERE t_stamp < :beginTS
ORDER BY t_stamp DESC), '1900-01-01')
AND t_stamp <= coalesce((
SELECT TOP 1 t_stamp
FROM myTable
WHERE t_stamp > :endLead
ORDER BY t_stamp), '2100-01-01')
) subq
where lead >= :beginTS and ( t_stamp < :endLead ) and
Code is not null and ( :machine=-1 or
Line=Case when :machine=1 then 'blueMachine'
When :machine=2 then 'redMachine'
When :machine=60 then 'sixtiethMachine'
else null end )
Left Join myTable on myTable.Line=newTable.Line and myTable.Code = newTable.Code
group by Line, Code
order by sum(datediff(second,t_stamp,lead)) desc
also attempting variations of :
Left Join newTable on subq.Line= newTable.Line and subq.Code=newTable.Code
Joins need to go between the FROM
clause and the WHERE
clause
I think you’re trying to put the fault code in the final select, so you’d simply need to move your left join between the ) subq
line and the where lead...
line
you’d need to replace your references after the ON
statement though. one would need to use subq.Code
(i think) and the other the table name (or alias) for your joined table
select Line, Code as Fault_Code,count(Code) as Stop_Occurences,
sum(datediff(second,
(CASE WHEN t_stamp< :beginTS THEN :beginTS Else t_stamp END),
(CASE WHEN lead> :endLead or lead is null THEN :endLead else lead END)))
as Total_Seconds, newTable.Description as Description
FROM (
SELECT Code, t_stamp, Line,
coalesce(lead(t_stamp) OVER (ORDER BY t_stamp),
CASE WHEN current_timestamp < :endLead
THEN current_timestamp ELSE :endLead END) AS lead
from myTable
WHERE t_stamp >= coalesce((
SELECT TOP 1 t_stamp
FROM myTable
WHERE t_stamp < :beginTS
ORDER BY t_stamp DESC), '1900-01-01')
AND t_stamp <= coalesce((
SELECT TOP 1 t_stamp
FROM myTable
WHERE t_stamp > :endLead
ORDER BY t_stamp), '2100-01-01')
) subq
Left Join newTable on subq.Line= newTable.Line and subq.Code=newTable.Code
where lead >= :beginTS and ( t_stamp < :endLead ) and
Code is not null and ( :machine=-1 or
Line=Case when :machine=1 then 'blueMachine'
When :machine=2 then 'redMachine'
When :machine=60 then 'sixtiethMachine'
else null end )
group by Line, Code
order by sum(datediff(second,t_stamp,lead)) desc
says ambiguous column name Code
It seems to be upset with Code in the last Where clause.
If I write myTable.Code it isn’t happy with that either.
Says multipart identifier could not be bound.
am I using the correct join?
I just want to add the description from the second table.
Ambiguous means it doesn’t know which table/alias to use. Add the proper alias.
Says multipart identifier could not be bound.
or
Says ambiguous column name code
I don’t seem able to escape these, not sure what the proper alias is.
I tried the table name and the subq.
myTable
isn’t “visible” at the outermost level. It is part of subq
. Pick subq.Code
or newTable.Code
. Everywhere in the outer section where you use Code
by itself. I see three places.
had trouble finding all of them to alias them
select subq.Line, subq.Code as Code,count(subq.Code) as Occurences, sum(datediff(second,
(CASE WHEN t_stamp< :beginTS THEN :beginTS Else t_stamp END),
(CASE WHEN lead> :endLead or lead is null THEN :endLead else lead END)))as Seconds, newTable.Description as Description
FROM (
SELECT myTable.Code,
t_stamp, myTable.Line,
coalesce(lead(t_stamp) OVER (ORDER BY t_stamp),
CASE WHEN current_timestamp < :endLead THEN current_timestamp ELSE :endLead END) AS lead
from myTable
WHERE t_stamp >= coalesce((
SELECT TOP 1 t_stamp
FROM myTable
WHERE t_stamp < :beginTS
ORDER BY t_stamp DESC), '1900-01-01')
AND t_stamp <= coalesce((
SELECT TOP 1 t_stamp
FROM myTable
WHERE t_stamp > :endLead
ORDER BY t_stamp), '2200-01-01')
) subq
Inner Join newTable on subq.Line=newTable.Line and subq.Code=newTable.Code
where lead >= :beginTS and ( t_stamp < :endLead ) and subq.Code is not null and (subq.Code<1100) and ( :machine=-1 or
subq.Line=Case when :machine=1 then 'b1'
When :machine=2 then 'b2'
When :machine=3 then 'b3'
When :machine=4 then 'b4'
When :machine=5 then 'b5'
When :machine=6 then 'b6'
When :machine=11 then 'c1'
When :machine=12 then 'c2'
When :machine=13 then 'c3'
When :machine=14 then 'c7'
When :machine=15 then 'c5'
When :machine=16 then 'c6'
else null end )
group by subq.Line, subq.Code, newTable.Description
order by sum(datediff(second,t_stamp,lead)) desc
I need a way to only get the lead when the code is under 1500
update:
I got it.
pturmel’s model works great if it is one machine like in the initial ask.
I broke it when I added more machines.
The codes from multiple machines time stamps were messing up the lead t_stamp.
So I added a bunch of filtering.
Works great now.
Basically if it says “where”, I also added filtering for a code < 1500 and a line= case when :machine…
If I group this query by
format(dateadd(hh,-8, t_stamp), 'yyyy-MMM-dd')
how do I join it to to another query that will be grouped by
format(dateadd(hh,-8, t_stamp), 'yyyy-MMM-dd')
both queries have subqueries
still join on mytable from that inner most subquery?
and if I join, I think my t_stamps would be messed up?
it turned out that I could join a subquery on a subquery
left join ( the entire subquery) sub1
on on format(dateadd(hh,-8, t.t_stamp), 'yyyy-MMM-dd') = sub1.day
I apparently need someone to post or I can’t post on this thread because 3 self replies is the limit.
I think I will have more things to do with this duration query though.
sept 2022
I am using a union per machine to get more than one machine in the query because my table has all machines in one table.
I haven’t figured out a way to get the t_stamps of one machine, then the next, and then the next.
I think it must be some kind of partition by over combination.
I think thought these were the same, with isnull being faster.
Seems to be throwing index errors.
,isnull(lead(t_stamp) OVER (ORDER BY t_stamp),current_timestamp) as test2
,coalesce(lead(t_stamp) OVER (ORDER BY t_stamp), CASE WHEN current_timestamp < :endLead THEN current_timestamp ELSE :endLead END) AS lead,t_stamp
I'm a little rusty.
Been trying to get fault durations across multiple lines in one query.
Select top 1
will not work when I have more lines because t_stamp could be any of 12 production lines, and I still need the other 11 production line codes prior to the shift.
So I attempted to have a where clause to obtain:
- when my t_stamp is between shift start and end
- when concat(line,t_stamp) IN (subquery for concatenated Line and max(t_stamp) grouped by line)
However, this seems to be so slow that it times out.
Use a union instead of a complex where clause.
I am sure it is a common problem though, so I would appreciate input, thanks.
Maybe I solved it using the union. Getting them in under 2/10ths a second.
Would you be able to provide some additional insight on your changes? I'm also trying to get duration of machine/asset states from a single table. I have something that works but the execution time seems a bit long.
I have the following query, which feels like a bastardization of the original query from pturmel but it seems to work as expected. (It grabs and trims events that cross the timestamp boundaries.) The execution time is around 2.5 seconds on a 4.5 million row table to fetch a result set of ~12,000 rows.
State Duration By Asset SQL
SELECT
AssetId,
StateId,
StateTimestamp as EventStart,
EventEnd,
dateDiff(SECOND, StateTimestamp,EventEnd) As duration
FROM (
Select
AssetId,
StateId,
CASE WHEN StateTimestamp < DATEADD(HOUR, -168, DATEADD(HOUR, DATEDIFF(hour, 0, CURRENT_TIMESTAMP), 0))
THEN DATEADD(HOUR, -168, DATEADD(HOUR, DATEDIFF(hour, 0, CURRENT_TIMESTAMP), 0)) ELSE StateTimestamp END AS StateTimestamp,
COALESCE(lead(StateTimestamp) Over (Partition By AssetId Order By StateTimestamp), CURRENT_TIMESTAMP) As EventEnd
FROM(
Select
t1.AssetId,
t1.StateId,
StateTimestamp,
lag(t1.stateId) OVER (PARTITION BY T1.AssetId ORDER BY StateTimestamp) as PriorValue
From dbo.OEE_Asset_State t1
Left Join (
Select AssetId, max(StateTimestamp) As PriorStateTs
FROM dbo.OEE_Asset_State
Where StateTimestamp < DATEADD(HOUR, -168, DATEADD(HOUR, DATEDIFF(hour, 0, CURRENT_TIMESTAMP), 0))
Group By AssetId
) sub1 On t1.AssetId = sub1.AssetId
Left Join (
Select AssetId, min(StateTimestamp) As NextStateTs
FROM dbo.OEE_Asset_State
Where StateTimestamp >= CURRENT_TIMESTAMP
group by AssetId
) sub2 On t1.AssetId = sub2.AssetId
Where StateTimestamp >= coalesce(sub1.PriorStateTs, DATEADD(HOUR, -168, DATEADD(HOUR, DATEDIFF(hour, 0, CURRENT_TIMESTAMP), 0)))
And StateTimestamp < coalesce(sub2.NextStateTs, CURRENT_TIMESTAMP)
) Inner1
WHERE StateId != PriorValue or COALESCE(PriorValue, -100) = -100
)Outer1
ORDER BY AssetId ASC, EventStart DESC
My table columns are EntryId, AssetId, StateId, StateTimestamp
I have the default clustered index on EntryId
and a compound unclustered index on AssetId, StateTimestamp
including StateId
.
According to the execution plan, my query seems to be fetching the full 4.5 million rows twice, once for for the base data to aggregate and once for finding the first StateId
value change outside the lower timestamp range specified. That does eventually get filtered down a much smaller number but I feel like it shouldn't be grabbing all the rows from the table each time.
I'm using Microsoft SQL server 2014.
Is 2.5 seconds on a 4.5 million row table an acceptable time for this query?
No, your problem is that you aren't performing any pre-filtering by timestamp in the innermost subqueries. If you are you are regularly getting data, simply expanding the boundaries by a fixed amount in the innermost comparison will often suffice.
The outer WHERE clauses that prune the result to the true output boundaries would remain in place.
So add a WHERE
statement to the first innermost query, and change the other two innermost queries to define a set range in their WHERE
statements. Something like this then:
SELECT
AssetId,
StateId,
StateTimestamp as EventStart,
EventEnd,
dateDiff(SECOND, StateTimestamp,EventEnd) As duration
FROM (
Select
AssetId,
StateId,
StateTimestamp,
COALESCE(lead(StateTimestamp) Over (Partition By AssetId Order By StateTimestamp), '2024-03-01') As EventEnd
FROM(
SELECT
t1.AssetId,
StateId,
StateTimestamp,
PriorValue
FROM(
Select
AssetId,
StateId,
StateTimestamp,
lag(stateId) OVER (PARTITION BY AssetId ORDER BY StateTimestamp) as PriorValue
From dbo.OEE_Asset_State
WHERE StateTimestamp BETWEEN DATEADD(HOUR, -336, DATEADD(HOUR, DATEDIFF(hour, 0, '2024-03-01'), 0)) AND DATEADD(HOUR, 168, DATEADD(HOUR, DATEDIFF(hour, 0, '2024-03-01'), 0))
) t1
Left Join (
Select AssetId, max(StateTimestamp) As PriorStateTs
FROM dbo.OEE_Asset_State
Where StateTimestamp BETWEEN DATEADD(HOUR, -336, DATEADD(HOUR, DATEDIFF(hour, 0, '2024-03-01'), 0)) AND DATEADD(HOUR, -168, DATEADD(HOUR, DATEDIFF(hour, 0, '2024-03-01'), 0))
Group By AssetId
) sub1 On t1.AssetId = sub1.AssetId
Left Join (
Select AssetId, min(StateTimestamp) As NextStateTs
FROM dbo.OEE_Asset_State
Where StateTimestamp BETWEEN '2024-03-01' AND DATEADD(HOUR, 168, DATEADD(HOUR, DATEDIFF(hour, 0, '2024-03-01'), 0))
group by AssetId
) sub2 On t1.AssetId = sub2.AssetId
Where StateTimestamp >= coalesce(sub1.PriorStateTs, DATEADD(HOUR, -168, DATEADD(HOUR, DATEDIFF(hour, 0, '2024-03-01'), 0)))
And StateTimestamp < coalesce(sub2.NextStateTs, '2024-03-01')
) Inner1
WHERE StateId != PriorValue or COALESCE(PriorValue, -100) = -100
)Outer1
ORDER BY AssetId ASC, EventStart DESC
For the set time offset around the given time window, what should I base that offset on? The longest time period we see between states? In my example here I'm using a week for testing, but we have had machines down for up to 3 months while doing upgrades and such.
Ew!
OK, you may need another scalar select for each endpoint that finds the last TS prior to the period (use Top 1 Order by StateTimestamp Desc) and another that finds the first TS after the period.
(Are you not parameterizing the begin and end points for a reason?)
I was building this for a query tag and didn't want to use DECLARE
as I was under the understanding that the JDBC driver technically doesn't support that. Didn't think to parameterize for building and testing.
Possible other solution to this, what if I set up something to grab the last written state from each AssetId
present and insert a new row with the same state but current timestamp? Run it at a set rate of 1 day or similar, something to basically make sure I will always have at least 1 state value per asset in the timestamp offset range I'm selecting.
If you develop/test in the named query editor, you can use colon-delimited parameters in multiple places.
Not an unreasonable approach. Certainly would simplify the query.
I think what I did:
I used "with" to create my subqueries for the faults by line.
Then I think used a union of sub queries from the "with" and filtered in where.
The table is indexed.
The fetch uses between to default to a range scan if it fails to get the index if I remember right.
I think if I had a machine down 3 months, it would not show in this list, so I would coalesce or case a default value to get he row.
And used things that Pturmel recommended me.