Sql pareto for column values per row help request

zacharyw.larson · May 2, 2022, 2:51pm

This table has about 200 columns of counters that are recorded at the end of each shift.
It has about 4 other columns for some meta data.

How would a top 5 Pareto be made from such a table?

pascal.fragnoud · May 2, 2022, 3:05pm

What’s a perato ? Did you mean pareto ? What do you want the output to look like ?

bkarabinchak.psi · May 2, 2022, 3:06pm

SELECT TOP 5 * and then an appropriate ORDER BY that is measuring the metric’s that determine the top 5.

Without more information as to how you are ranking each row, it’s hard to give any more information.

zacharyw.larson · May 2, 2022, 3:13pm

each row is like


col1        col2          ......col200      t_stamp   color   area
   3          34                     6     (tstamp)    blue   zoneA

I want the output to show the top 5 column values and the column name like:


column     value
col134      71
col22       70
col23       65
col156      62
col111      60

to begin with

bkarabinchak.psi · May 2, 2022, 3:28pm

So for every column you want the max value, and then the order of the columns based on each max value is that right?

zacharyw.larson · May 2, 2022, 3:32pm

to begin with, I need the top 5 columns from row 1 based on max value in those columns for that row

Eventually, I need to get top 5 columns from row 1
then top 5 columns from row 2
then top 5 columns rom row x

jlandwerlen · May 2, 2022, 3:35pm

What have you tried so far? There are many examples on Mr. Internet.

zacharyw.larson · May 2, 2022, 3:35pm

If you see any example of this, let me know.

I don’t see any.

jlandwerlen · May 2, 2022, 3:38pm

That took 20 seconds.

zacharyw.larson · May 2, 2022, 3:40pm

I might be able to

`DECLARE @TB TABLE(Column1 NVARCHAR(50),…)
INSERT @TB
SELECT * FROM student
Then use it:

SELECT Column1 FROM @TB WHERE Column1=‘aa’`

and set some max out of that

zacharyw.larson · May 2, 2022, 3:41pm

@jlandwerlen

why do you post if just posting google search and not actually helping?

You want me to look at all the ways it will not work?

I will be back after lunch, going to spend 2 hours looking at these ways that nobody knows if they work

bkarabinchak.psi · May 2, 2022, 3:42pm

I don’t think these links are quite what he wants though based on the example he gave before. This is like getting the max between column1,2,3 and returning that to the name of the row.

He wants the name of the column and the max value of the column, and then sorting them.

I haven’t used them in forever but I think this might be the time for a PIVOT table.

Personally I would probably just hard code the query once

SELECT max(column1) as 'column1', max(column2) as 'column2', etc....

Which would be annoying but you could probably use NotePad++ to make it easier and then you’d have half of your work done. Then its just a matter of sorting. Though then again, this will be in

column1     column2          column3    etc
111             14            57        etc

but if this is in scripting you can use jython to do what you need to it from here I think

jlandwerlen · May 2, 2022, 3:45pm

My approach at helping is maybe a bit different compared to others. I’m in the camp that believes you should make an effort at something before asking for the answer.

I didn’t go through each link, just showing how easy it is to search for help on SQL. I’ll leave it at that.

pturmel · May 2, 2022, 3:47pm

This requires an “UNPIVOT”, then a rank() window aggregate, finally filtering and ordering by that rank. I have to admit not knowing how to unpivot short of a script. But that is the search term I would put into google.

jlandwerlen · May 2, 2022, 3:50pm

The first link above had an unpivot example

pturmel · May 2, 2022, 3:53pm

Hmmm, yes. However, it doesn't retain the identity of the origin column. Can it?

bkarabinchak.psi · May 2, 2022, 3:56pm

I think its fair of you to ask to see an attempt/effort but also fair to point out that your links didn’t match what he had given as his desired outputs. Posting links for links sakes doesn’t really prove anything or help anyone.

JordanCClark · May 2, 2022, 4:29pm

Is there an aversion to scripting this?

pturmel · May 2, 2022, 4:33pm

A stack overflow post suggests it can....

JordanCClark · May 2, 2022, 4:45pm

### Sample Dataset to simulate query result
import random
headers = ['col1', 'col2', 'col3', 'col4', 'col5', 'col6', 'col7', 'col8', 'col9', 'col10', 't_stamp', 'color', 'area']
data = []
t_stamp = system.date.now()
for i in range(10):
	newRow = [random.randrange(100) for j in range(10)]
	t_stamp = system.date.addMinutes(t_stamp, 1)
	newRow.extend([t_stamp, 'blue', 'zoneX'])
	data.append(newRow)
dataIn = system.dataset.toPyDataSet(system.dataset.toDataSet(headers, data))
#util.printDataSet(dataIn)
print '---'
### Sorting

# List of exluded columns
excluded_cols = ['t_stamp', 'color', 'area']

# Get columns to use in sorting
filteredHeaders = [colName for colName in dataIn.getColumnNames() if colName not in excluded_cols]

for row in dataIn:
	data = [row[colName] for colName in filteredHeaders]
	
	#Sorting a zip sorts all lists by the first one.
	pareto = [[x, y] for y, x in reversed(sorted(zip(data, filteredHeaders)))][:5]

	print row['t_stamp'], pareto

row | col1 | col2 | col3 | col4 | col5 | col6 | col7 | col8 | col9 | col10 | t_stamp                      | color | area 
-------------------------------------------------------------------------------------------------------------------------
0   | 99   | 40   | 49   | 0    | 70   | 32   | 76   | 69   | 70   | 33    | Mon May 02 12:39:36 EDT 2022 | blue  | zoneX
1   | 75   | 81   | 77   | 51   | 49   | 93   | 62   | 15   | 59   | 53    | Mon May 02 12:40:36 EDT 2022 | blue  | zoneX
2   | 45   | 7    | 39   | 63   | 74   | 31   | 29   | 8    | 82   | 44    | Mon May 02 12:41:36 EDT 2022 | blue  | zoneX
3   | 31   | 39   | 57   | 72   | 79   | 16   | 17   | 57   | 48   | 9     | Mon May 02 12:42:36 EDT 2022 | blue  | zoneX
4   | 1    | 24   | 68   | 57   | 66   | 92   | 54   | 72   | 46   | 67    | Mon May 02 12:43:36 EDT 2022 | blue  | zoneX
5   | 53   | 64   | 5    | 60   | 55   | 93   | 64   | 96   | 66   | 63    | Mon May 02 12:44:36 EDT 2022 | blue  | zoneX
6   | 41   | 24   | 23   | 82   | 50   | 89   | 77   | 74   | 99   | 27    | Mon May 02 12:45:36 EDT 2022 | blue  | zoneX
7   | 4    | 40   | 24   | 56   | 3    | 17   | 85   | 74   | 68   | 53    | Mon May 02 12:46:36 EDT 2022 | blue  | zoneX
8   | 33   | 16   | 64   | 55   | 76   | 64   | 21   | 44   | 13   | 14    | Mon May 02 12:47:36 EDT 2022 | blue  | zoneX
9   | 79   | 31   | 46   | 53   | 1    | 8    | 35   | 63   | 69   | 14    | Mon May 02 12:48:36 EDT 2022 | blue  | zoneX
---
Mon May 02 12:39:36 EDT 2022 [[u'col1', 99], [u'col7', 76], [u'col9', 70], [u'col5', 70], [u'col8', 69]]
Mon May 02 12:40:36 EDT 2022 [[u'col6', 93], [u'col2', 81], [u'col3', 77], [u'col1', 75], [u'col7', 62]]
Mon May 02 12:41:36 EDT 2022 [[u'col9', 82], [u'col5', 74], [u'col4', 63], [u'col1', 45], [u'col10', 44]]
Mon May 02 12:42:36 EDT 2022 [[u'col5', 79], [u'col4', 72], [u'col8', 57], [u'col3', 57], [u'col9', 48]]
Mon May 02 12:43:36 EDT 2022 [[u'col6', 92], [u'col8', 72], [u'col3', 68], [u'col10', 67], [u'col5', 66]]
Mon May 02 12:44:36 EDT 2022 [[u'col8', 96], [u'col6', 93], [u'col9', 66], [u'col7', 64], [u'col2', 64]]
Mon May 02 12:45:36 EDT 2022 [[u'col9', 99], [u'col6', 89], [u'col4', 82], [u'col7', 77], [u'col8', 74]]
Mon May 02 12:46:36 EDT 2022 [[u'col7', 85], [u'col8', 74], [u'col9', 68], [u'col4', 56], [u'col10', 53]]
Mon May 02 12:47:36 EDT 2022 [[u'col5', 76], [u'col6', 64], [u'col3', 64], [u'col4', 55], [u'col8', 44]]
Mon May 02 12:48:36 EDT 2022 [[u'col1', 79], [u'col9', 69], [u'col8', 63], [u'col4', 53], [u'col3', 46]]