SQL query for a table datasets's top 10 defect value

Don_t_shoot · January 19, 2015, 8:05am

I want to make a report which need a dataset to always show top 10 defect quality in ‘defect_log’ database table , and the report’s form like the picture in attachment , and I want to write the Sql query binding in the table datasets of a analysis controller , how can I write the sql to get the table data source ? (the key point is “always top 10 defect quality to show”) Thanks a lot .

JordanCClark · January 19, 2015, 7:46pm

Doing a ‘top N within a group’ is not is simple as it sounds, especially if the data is in a tall format. Things get wonky quickly and can be a pain in the backside to deal with.

My opinion is to use a script to grab, sort, and write the data to the dataset for the report.

If you can supply with a sample of data, then we can work on a further solution for you.

Also: What platform are you using (MySQL, SQL Server, etc)?

Don_t_shoot · January 20, 2015, 2:55am

Thanks for the help Jordan, I use MYsql . And the attachment is a simple table data, for your reference , I think this file can help . Don’t care about the column ‘yield’ ,you can just make a data source table ongly have “machine”,“line”,“date”,“defect 1 quantity” ,“defect 2 quantity”,“defect 3 quantity”…“defect 5 quantity”, Thanks a lot .
Simple table data.csv (106 Bytes)

JordanCClark · January 20, 2015, 1:35pm

Just took look at the file you posted. There’s no defect data here.

Is there anything in the defect_log data? If so, make an export of that table (csv or sql is fine) and post it here.

Don_t_shoot · January 20, 2015, 3:14pm

Sorry ,I think I forgot save
Simple defect_log table.csv (3.36 KB)

JordanCClark · January 20, 2015, 6:51pm

That looks better!

Okay, just so I’m on the same page:

What are the relationships between the three tables?
In the defect_log, what is the runID column?
In the defect_log, what is the quantity column?
What is the hierarchy (priority) of sorting? Currently I’m thinking:
[ul]a) date
b) line
c) machine
d) defect types and quantities in descending order[/ul]
are the quantities summed up, or are we just counting the number of occurrences (example: MMS happened 4 times on 2014-12-06)? Or both (MMS occurred 4 times for a total of 279)

Don_t_shoot · January 21, 2015, 2:19am

Thanks for the answer Jordan.

1.the relationship between these three tables as below:
Machine table.machineID=Line table.parent ID
Line table.Line Name=defect_log table .line

2.Don’t care about the ‘runID’ in defect_log table

3.The priority you thought is OK.

4.The quantities summed up ,I think it’s OK.

JordanCClark · January 23, 2015, 12:19pm

Sorry, finally able to get back to this.

I imported you data into a db table, so that a query can do the first round of sorting:

select date_format(entryTimestamp,'%Y-%m-%d') as t_stamp, line, productCode, defectCode, sum(quantity) as qty from test.defect_log group by t_stamp, line, productCode, defectCode order by t_stamp, line, productCode, qty desc
gives a result set like this:

Now, we can use a script to format a dataset:

[code]query=""“SELECT date_format(entryTimestamp,’%Y-%m-%d’) as t_stamp, line, productCode, defectCode, sum(quantity) as qty
FROM test.defect_log
GROUP BY t_stamp, line, productCode, defectCode
ORDER BY t_stamp, line, productCode, qty desc”""

dataIn=system.db.runQuery(query)

lastDate=’’
lastLine=’’
lastProd=’’

headers=[“Date”, “line”, “productCode”, “defect1”, “defect2”, “defect3”, “defect4”, “defect5”, “defect6”, “defect7”, “defect8”, “defect9”, “defect10”]
newData=
addRowFlag=0
for row in dataIn:
if row[‘t_stamp’] != lastDate or row[‘line’] != lastLine or row[‘productCode’] != lastProd:
if addRowFlag == 1:
newData.append(newRow)
newRow=[row[‘t_stamp’], row[‘line’], row[‘productCode’], str(row[‘defectCode’]) + ': ’ +str(int(row[‘qty’])), ‘-’, ‘-’, ‘-’, ‘-’, ‘-’, ‘-’, ‘-’, ‘-’, ‘-’]
defectIndex = 4
addRowFlag = 1
lastDate = row[‘t_stamp’]
lastLine= row[‘line’]
lastProd = row[‘productCode’]
else:
if defectIndex <= 12:
newRow[defectIndex] = str(row[‘defectCode’]) + ': ’ +str(int(row[‘qty’]))
defectIndex += 1

event.source.parent.getComponent(‘Table’).data=system.dataset.toDataSet(headers, newData)

[/code]

I have some things to take care of this morning, but I will edit/add to this post to document the script… especially since it’s not documented at all…

Don_t_shoot · January 25, 2015, 8:24am

Thank you so much Jordan, I do some alter to the script for my real project , no error ,but no data in the DATA PROPERTY of ”Table“component , is there something wrong as below？Thanks again.

query=""“SELECT A.Name as Area,L.Name as Line ,W.WORKORDER AS WORKORDER, W.ProductCode AS PartNumber,P.Description AS Description,S.FinishDateTime AS Date,T.tooling1 AS ToolNumber ,line.productioncount AS TotalOutputQuantity,(100*{Root Container.RunQty}/(SUM(D.quantity)+{Root Container.RunQty})) AS Yield,sum(100 * (D.quantity/{Root Container.RunQty})) AS DefectPercentage
FROM ign_runtime.area A INNER JOIN ign_runtime.line L INNER JOIN ign_runtime.schedule S INNER JOIN ign_runtime.workorder W INNER JOIN ign_runtime.productcode P INNER JOIN ign_mes.tooling T INNER JOIN ign_runtime.linepropertyhistory LINE INNER JOIN ign_mes.defect_log D ON W.ProductCode=P.ProductCode and S.WorkOrderID=W.ID and A.ID=L.ParentID and L.ID=S.LineID and T.workOrder=W.WorkOrder and Line.productcode=W.productcode and D.productcode=W.productcode”""

dataIn=system.db.runQuery(query)

headers=[“PartNum”,“Description”,“Machine”,“TotalOutputQty”,“YieldRate”,“Date”, “defect1”, “defect2”, “defect3”, “defect4”, “defect5”]
newData=[]
addRowFlag=0
for row in dataIn:
if row[‘Date’] != lastDate or row[‘Line’] != lastLine or row[‘PartNumber’] != lastProd:
if addRowFlag == 1:
newData.append(newRow)
newRow[defectIndex]=[row[‘PartNumber’], row[‘Description’], row[‘Machine’], row[‘TotalOutputQuantity’], row[‘YieldRate’], str(row[‘DefectPercentage’]), ‘-’, ‘-’, ‘-’, ‘-’]
7<=defectIndex <=10
addRowFlag = 1
defectIndex += 1

event.source.parent.getComponent(‘Table’).data=system.dataset.toDataSet(headers, newData)

JordanCClark · January 26, 2015, 12:27pm

Well, at first look, I’d say it’s because you’re missing some things I had in there. As my brother used to say about a go-cart we had as kids, “We kept fixing it until it didn’t work any more…”

This should get you closer:

[code]query=""“SELECT A.Name as Area,L.Name as Line ,W.WORKORDER AS WORKORDER, W.ProductCode AS PartNumber,P.Description AS Description,S.FinishDateTime AS Date,T.tooling1 AS ToolNumber ,line.productioncount AS TotalOutputQuantity,(100*{Root Container.RunQty}/(SUM(D.quantity)+{Root Container.RunQty})) AS Yield,sum(100 * (D.quantity/{Root Container.RunQty})) AS DefectPercentage
FROM ign_runtime.area A INNER JOIN ign_runtime.line L INNER JOIN ign_runtime.schedule S INNER JOIN ign_runtime.workorder W INNER JOIN ign_runtime.productcode P INNER JOIN ign_mes.tooling T INNER JOIN ign_runtime.linepropertyhistory LINE INNER JOIN ign_mes.defect_log D ON W.ProductCode=P.ProductCode and S.WorkOrderID=W.ID and A.ID=L.ParentID and L.ID=S.LineID and T.workOrder=W.WorkOrder and Line.productcode=W.productcode and D.productcode=W.productcode”""

dataIn=system.db.runQuery(query)

Initialize variables for comparing current Date, Line, and Product

lastDate=’’
lastLine=’’
lastPartNum=’’

Initialize headers and data lists for our processed data

headers=[“PartNum”,“Description”,“Machine”,“TotalOutputQty”,“YieldRate”,“Date”, “defect1”, “defect2”, “defect3”, “defect4”, “defect5”]
newData=[]

This flag is so that we ignore adding a new row to newData at the very beginning

(at start, lastDate, lastLine, and lastPartNum are always different.)

addRowFlag=0

for row in dataIn:
if row[‘Date’] != lastDate or row[‘Line’] != lastLine or row[‘PartNumber’] != lastPartNum: #Look for a change
if addRowFlag == 1:
newData.append(newRow) # add row data to newData
# initialize the data for the next row of data coming in
newRow[]=[row[‘PartNumber’], row[‘Description’], row[‘Machine’], row[‘TotalOutputQuantity’], row[‘YieldRate’], row[‘Date’], str(row[‘DefectPercentage’]), ‘-’, ‘-’, ‘-’, ‘-’]
defectIndex=7 # this points to where defect2 will go (defect1 was already put in when we started the new row)
addRowFlag = 1 # After this point, any differences in PartNum, Date, or Line will result in a new row of data.
lastDate = row[‘Date’] # Set comparison data
lastLine= row[‘Line’]
lastPartNum = row[‘PartNumber’]
else:
if defectIndex <= 10: # As long as we’re not at the end of the row…
newRow[defectIndex] = str(row[‘DefectPercentage’])
defectIndex += 1
event.source.parent.getComponent(‘Table’).data=system.dataset.toDataSet(headers, newData)
[/code]

All that being said, there is no sorting going on in your query. If you look ay my example query:

select date_format(entryTimestamp,'%Y-%m-%d') as t_stamp, line, productCode, defectCode, sum(quantity) as qty from test.defect_log group by t_stamp, line, productCode, defectCode order by t_stamp, line, productCode, qty desc
The GROUP BY and ORDER BY Clauses are where the magic happens. The GROUP BY separates them out by Date, Line and defectCode. The ORDER BY sorts them into the order we want. You’ll want to do something similar to get the script working correctly.

One more thing I notice is that no defect codes are showing up in the query. If you are going with a ‘Top N’ sorting, it will be helpful tou your users if they can see what the defects are.

Take a closer look at the example I posted and see if you can tell how I added the defect codes to the end result.

Don_t_shoot · January 27, 2015, 12:51am

Thanks Jordan, as a rookie ,I am your loyal fans now .