[SOLVED] How to get a unique list from a bigger list of lists based on one criterion

bmeyers · August 22, 2020, 12:25am

Good evening,

I have an m by n table with a column called p. How can I get one row per unique p value into a new i by j table?

Example:

  a       b      c      p      d
  1       2      1      1      'aab'
  1       1      1      2      'aac'
  3       3      1      2      'aad'
  3       2      2      3      'aag'

would become:

  a       b      c      p      d
  1       2      1      1      'aab'
  1       1      1      2      'aac'
  3       2      2      3      'aag'

I’ve broken down the first table into a list of lists but I can’t come up with a method for deleting the 3rd row in the table to get the new table.

Any help is much appreciated.

JordanCClark · August 22, 2020, 3:02am

listIn = [[1, 2, 1, 1, 'aab'],
          [1, 1, 1, 2, 'aac'],
          [3, 3, 1, 2, 'aad'],
          [3, 2, 2, 3, 'aag']]

# p values already encountered
existsList = []

# first row found for each p value
listOut = []

for row in listIn:
	# p is the fourth column (index 3)
	if row[3] not in existsList:
		existsList.append(row[3])
		listOut.append(row)

print(listOut)

pturmel · August 22, 2020, 1:29pm

If you are running with very large datasets, use a set or dict instead of a list for tracking existence.
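As a rough sketch of that suggestion, assuming the sample rows from the first post with p at index 3 (the names `rows`, `P_INDEX`, and `unique_by_column` are made up for illustration):

```python
# Sample rows from the question; column p is at index 3.
rows = [
    [1, 2, 1, 1, 'aab'],
    [1, 1, 1, 2, 'aac'],
    [3, 3, 1, 2, 'aad'],
    [3, 2, 2, 3, 'aag'],
]

P_INDEX = 3  # assumed position of the p column


def unique_by_column(rows, index):
    """Keep the first row seen for each distinct value at `index`."""
    seen = set()  # set lookups are O(1) on average; list lookups scan every item
    unique_rows = []
    for row in rows:
        key = row[index]
        if key in seen:
            continue  # a row with this value was already kept
        seen.add(key)
        unique_rows.append(row)
    return unique_rows


unique_rows = unique_by_column(rows, P_INDEX)
```

With a list, each `in` check walks the whole list, so the loop as a whole grows quadratically with the number of distinct values. A set keeps each check at roughly constant time, which only starts to matter once the table gets large.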

bmeyers · August 22, 2020, 4:17pm

output = []
seen = set()
for keyID, line, verName, modelDescrip, menGrp, staNo, seq, workCell, t_Sec, mCode, description, key_Point, quality_Chk, partNo, partName, scan, scan_req, pic, videoFileName, userAck in tbl:
    if seq in seen:
        continue
    output.append([keyID, line, verName, modelDescrip, menGrp, staNo, seq, workCell, t_Sec, mCode, description, key_Point, quality_Chk, partNo, partName, scan, scan_req, pic, videoFileName, userAck])
    seen.add(seq)

A user on Stack Overflow came up with the above code, but I like yours better and will probably implement it at some point this weekend.

bmeyers · August 22, 2020, 4:19pm

At the end of the day I don’t think it will ever go over 100 rows, but there is no rule stating they can’t have 10k rows, I guess.

pturmel · August 22, 2020, 7:37pm

The code from Stack Overflow uses tuple unpacking to place the data into named variables. While handy if you have complex expressions to evaluate, it is pointless in your case. I do like the use of continue to short-circuit the loop, and the use of the set. I would blend the two implementations.
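To illustrate the difference between unpacking and plain indexing, here is a small sketch using the five-column sample from the first post. The variable names are invented for this example, and p is assumed to be at index 3:

```python
rows = [
    [1, 2, 1, 1, 'aab'],
    [1, 1, 1, 2, 'aac'],
    [3, 3, 1, 2, 'aad'],
    [3, 2, 2, 3, 'aag'],
]

# Tuple unpacking: every column gets a name, although only p is inspected,
# and the row then has to be rebuilt by hand before it is stored.
seen = set()
unpacked = []
for a, b, c, p, d in rows:
    if p in seen:
        continue
    seen.add(p)
    unpacked.append([a, b, c, p, d])

# Indexing: only the key column is touched and the row is stored as-is.
seen = set()
indexed = []
for row in rows:
    if row[3] in seen:
        continue
    seen.add(row[3])
    indexed.append(row)
```

Both loops produce the same three rows. The unpacking version also has to be edited whenever a column is added or removed, while the indexed version only depends on where the key column sits.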

bmeyers · August 24, 2020, 3:06pm

It took me a while to understand what you meant about unpacking tuples, but after reading about Python collections (arrays), Ignition Datasets, and Ignition PyDataSets, I have a better understanding of how all the pieces fit together. Thanks for the insight.

output = []
seen = set()

for row in tbl:
    if row['seq'] in seen:
        continue
    output.append(list(row))
    seen.add(row['seq'])