Add elements to a JSON file?

Lalillo1 · November 2, 2022, 4:40pm

I am working on a transform, but I don't know the Pythonic approach to solve this.
If I have this list:

list = [{'NestID': '1', 'Ch2': '68'}, {'NestID': '2', 'Ch2': '133'}, {'NestID': '3', 'Ch2': '65'}, {'NestID': '4', 'Ch1': '2'}]

How do I modify it so I can change the 2nd and 4th elements of this list to be like this:

{'NestID': '2', 'Ch2': '133', 'Ch3': '20'}
{'NestID': '4', 'Ch1': '2', 'Ch2': '7', 'Ch4': '9'}

So that my entire list changes to:

[{'NestID': '1', 'Ch2': '68'}, {'NestID': '2', 'Ch2': '133', 'Ch3': '20'}, {'NestID': '3', 'Ch2': '65'}, {'NestID': '4', 'Ch1': '2','Ch2': '7','Ch4': '9'}]

PGriffith · November 2, 2022, 4:48pm

This is a somewhat underspecified problem.
Are the NestID values always guaranteed to be in order? Are they really strings? Is the actual operation you need to do always going to be adding additional keys, or is it a replacement of existing keys? Or do you want to merge new keys into the existing dictionary? Is your JSON structure guaranteed to be a list of flat objects? Can you modify the existing list (don't name variables list, by the way), or do you need to return a deep-copied new list?

Depending on the answers to those questions, the approach may change significantly.

One simple answer would be something like this:

inputList = [
    {'NestID': '1', 'Ch2': '68'},
    {'NestID': '2', 'Ch2': '133'}, 
    {'NestID': '3', 'Ch2': '65'}, 
    {'NestID': '4', 'Ch1': '2'}
]

def update(json, nestId, newValue):
    index = -1
    for i, obj in enumerate(json):
        if obj['NestID'] == nestId:
            index = i

    if index >= 0:
        json[index] = newValue

update(inputList, '2', {'NestID': '2', 'Ch2': '133', 'Ch3': '20'})
update(inputList, '4', {'NestID': '4', 'Ch1': '2','Ch2': '7','Ch4': '9'})

# output
# [
#     {'NestID': '1', 'Ch2': '68'}, 
#     {'NestID': '2', 'Ch2': '133', 'Ch3': '20'}, 
#     {'NestID': '3', 'Ch2': '65'}, 
#     {'NestID': '4', 'Ch1': '2','Ch2': '7','Ch4': '9'}
# ]
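
Note that update above changes inputList in place. If the caller needs the original list left untouched (the deep-copy question raised earlier), a minimal sketch of a non-mutating variant could look like this. The name updated_copy is hypothetical, and the sketch assumes every element has a 'NestID' key and that NestIDs are unique.

```python
import copy

def updated_copy(records, nest_id, new_value):
    """Return a deep copy of records with the matching entry replaced."""
    result = copy.deepcopy(records)
    for i, obj in enumerate(result):
        if obj['NestID'] == nest_id:
            result[i] = new_value
            break  # NestIDs are assumed unique, so stop at the first match
    return result

original = [{'NestID': '1', 'Ch2': '68'}, {'NestID': '2', 'Ch2': '133'}]
changed = updated_copy(original, '2', {'NestID': '2', 'Ch2': '133', 'Ch3': '20'})
```

Here original keeps its two-key entry for NestID '2', while changed holds the replaced entry.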

Lalillo1 · November 2, 2022, 4:59pm

Many Thanks:

Are the NestID values always guaranteed to be in order? Yes
Are they really strings? Only strings (and maybe integers in the future)
Is the actual operation you need to do always going to be adding additional keys? Yes
or is it a replacement of existing keys? No
Or do you want to merge new keys into the existing dictionary? Yes

PGriffith · November 2, 2022, 5:02pm

If you're always merging in new keys, then one option for improvement:

def update(json, nestId, **newKeys):
    index = -1
    for i, obj in enumerate(json):
        if obj['NestID'] == nestId:
            index = i

    if index >= 0:
        currentValue = json[index]
        currentValue.update(newKeys)

update(inputList, '2', Ch3='20')
update(inputList, '4', Ch2='7', Ch4='9')
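
Each call to update scans the list from the start. If there are many merges to apply, one alternative is to build a lookup keyed by NestID once and update through it. This is only a sketch: index_by_nest_id is a hypothetical helper, and it assumes NestIDs are unique.

```python
def index_by_nest_id(records):
    """Map each NestID to its dictionary (the same objects, not copies)."""
    return dict((obj['NestID'], obj) for obj in records)

records = [
    {'NestID': '1', 'Ch2': '68'},
    {'NestID': '2', 'Ch2': '133'},
    {'NestID': '3', 'Ch2': '65'},
    {'NestID': '4', 'Ch1': '2'},
]

by_id = index_by_nest_id(records)
# The lookup holds references, so these updates show up in records too.
by_id['2'].update({'Ch3': '20'})
by_id['4'].update({'Ch2': '7', 'Ch4': '9'})
```

Because the lookup stores references to the same dictionaries, records itself ends up with the merged keys.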

Lalillo1 · November 2, 2022, 5:26pm

Thank you so much for your help! I really appreciate it. Many, many thanks

Lalillo1 · November 2, 2022, 7:22pm

I am not sure why it is not working:

#// This example creates a three-column dataset.
headers = ['col1', 'col2', 'col3']
list1 = [
    [1, 30, 68],
    [2, 23, 133],
    [3, 25, 65],
    [4, 23, 145],
    [5, 22, 2],
    [5, 31, 8],
    [5, 32, 30],
    [6, 11, 3],
    [6, 20, 241],
    [7, 22, 2],
    [7, 19, 202],
    [8, 19, 2],
    [8, 24, 241],
    [8, 13, 30],
    [9, 18, 19],
    [10, 13, 4]]
mylist = system.dataset.toPyDataSet(system.dataset.toDataSet(headers, list1))
listOfRowValues = []
for row in mylist:
    listOfRowValues.append(row[0])
    print (row[0],row[1],row[2])
#// create a dictionary
failCodeDesc = {10: "Ch1",
    11: "Ch2",
    12: "Ch3",
    13: "Ch4",
    18: "Ch1",
    19: "Ch2",
    20: "Ch3",
    21: "Ch4",
    22: "Ch1",
    23: "Ch2",
    24: "Ch3",
    25: "Ch4",
    30: "Ch1",
    31: "Ch2",
    32: "Ch3",
    33: "Ch4"}
newData = []
#------------------------------------------------------
def update(json, nestId, newValue):
    index = -1
    for i, obj in enumerate(json):
        if obj['NestID'] == nestId:
            index = i
    if index >= 0:
        json[index] = newValue
#------------------------------------------------------
nest1 = mylist.getColumnAsList(0) #NestID list
failCode1 = mylist.getColumnAsList(1) #failcode list
counts1 = mylist.getColumnAsList(2) #amount of failcodes list
#// get list of values using map
channels1 = list(map(failCodeDesc.get, failCode1))
#------------------------------------------------------
for data in mylist:
    for i, (w,x,y,z) in enumerate(zip(nest1, failCode1,counts1,channels1)):
        #checking for duplicates in the first column (duplicate NestIDs)
        if ((i != 0) and (nest1[i]== nest1[i-1])): #check if the value in the first column (NestID) is the same as the previous one
            update(newData, nest1[i], {channels1[i]:str(counts1[i])})
        else: #else, this is the first row for this NestID
            newData.append({'NestID': str(nest1[i]), channels1[i]:str(counts1[i])})
    break
#------------------------------------------------------
for i in range(0, len(newData)):
    print(newData[i])

I get this:

{'NestID': '1', 'Ch1': '68'}
{'NestID': '2', 'Ch2': '133'}
{'NestID': '3', 'Ch4': '65'}
{'NestID': '4', 'Ch2': '145'}
{'NestID': '5', 'Ch1': '2'}
{'NestID': '6', 'Ch2': '3'}
{'NestID': '7', 'Ch1': '2'}
{'NestID': '8', 'Ch2': '2'}
{'NestID': '9', 'Ch1': '19'}
{'NestID': '10', 'Ch4': '4'}

But, I am expecting this:

{'NestID': '1', 'Ch1': '68'}
{'NestID': '2', 'Ch2': '133'}
{'NestID': '3', 'Ch4': '65'}
{'NestID': '4', 'Ch2': '145'}
{'NestID': '5', 'Ch1': '2','Ch2': '8','Ch3': '30'}
{'NestID': '6', 'Ch2': '3','Ch3': '241'}
{'NestID': '7', 'Ch1': '2','Ch2': '202'}
{'NestID': '8', 'Ch2': '2','Ch3': '241','Ch4': '30'}
{'NestID': '9', 'Ch1': '19'}
{'NestID': '10', 'Ch4': '4'}

PGriffith · November 2, 2022, 8:52pm

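The update function was being called, but it never found a match: nest1[i] is an integer, while the 'NestID' values stored in newData are strings, so the == comparison was always False. Types are very important. The script below also switches to the merging version of update, since the replacing version would overwrite the whole entry (including its NestID) with the single new key.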
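
As a small standalone illustration of that mismatch (not taken from the script itself, and using a made-up one-entry list), comparing a string to an integer just evaluates to False without raising an error:

```python
records = [{'NestID': '5', 'Ch1': '2'}]

nest_id = 5  # an integer, like the values read from the dataset column

# '5' == 5 is False, so nothing matches and no error is raised.
found_raw = [obj for obj in records if obj['NestID'] == nest_id]

# Converting to a string first makes the comparison succeed.
found_str = [obj for obj in records if obj['NestID'] == str(nest_id)]
```

found_raw comes back empty, while found_str contains the one entry.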

# This example creates a three-column dataset.
headers = ['col1', 'col2', 'col3']
list1= [
[1, 30, 68],
[2, 23, 133],
[3, 25, 65],
[4, 23, 145],
[5, 22, 2],
[5, 31, 8],
[5, 32, 30],
[6, 11, 3],
[6, 20, 241],
[7, 22, 2],
[7, 19, 202],
[8, 19, 2],
[8, 24, 241],
[8, 13, 30],
[9, 18, 19],
[10, 13, 4]]
mylist = system.dataset.toPyDataSet(system.dataset.toDataSet(headers, list1))
listOfRowValues=[]
for row in mylist:
      listOfRowValues.append(row[0])
      print (row[0],row[1],row[2])
# create a dictionary
failCodeDesc = {10: "Ch1", 
				11: "Ch2", 
				12: "Ch3",
				13: "Ch4",
				18: "Ch1",
				19: "Ch2",
				20: "Ch3",
				21: "Ch4",
				22: "Ch1",			
				23: "Ch2",
				24: "Ch3",
				25: "Ch4",				
				30: "Ch1",
				31: "Ch2",		
				32: "Ch3",											
				33: "Ch4"}      
newData = []
#------------------------------------------------------	
def update(json, nestId, **newKeys):
    index = -1
    for i, obj in enumerate(json):
        if obj['NestID'] == nestId:
            index = i

    if index >= 0:
        currentValue = json[index]
        currentValue.update(newKeys)
#------------------------------------------------------	
nest1 = mylist.getColumnAsList(0)  #NestID list
failCode1 = mylist.getColumnAsList(1) #failcode list
counts1 = mylist.getColumnAsList(2)  #amount of failcodes list
# get list of values using map
channels1 = list(map(failCodeDesc.get, failCode1)) 
#------------------------------------------------------		
for i, (nestId, count, channel) in enumerate(zip(nest1, counts1, channels1)):
	# a NestID equal to the previous row's means this row adds a channel to an existing entry
	if i != 0 and nestId == nest1[i-1]:
		update(newData, str(nestId), **{channel: str(count)})
	else: # first row for this NestID, so start a new entry
		newData.append({'NestID': str(nestId), channel: str(count)})
#------------------------------------------------------		
for i in range(0, len(newData)):    
    print(newData[i])

>>> 
(1, 30, 68)
(2, 23, 133)
(3, 25, 65)
(4, 23, 145)
(5, 22, 2)
(5, 31, 8)
(5, 32, 30)
(6, 11, 3)
(6, 20, 241)
(7, 22, 2)
(7, 19, 202)
(8, 19, 2)
(8, 24, 241)
(8, 13, 30)
(9, 18, 19)
(10, 13, 4)
{'NestID': '1', 'Ch1': '68'}
{'NestID': '2', 'Ch2': '133'}
{'NestID': '3', 'Ch4': '65'}
{'NestID': '4', 'Ch2': '145'}
{'NestID': '5', 'Ch2': '8', 'Ch1': '2', 'Ch3': '30'}
{'NestID': '6', 'Ch2': '3', 'Ch3': '241'}
{'NestID': '7', 'Ch2': '202', 'Ch1': '2'}
{'NestID': '8', 'Ch2': '2', 'Ch4': '30', 'Ch3': '241'}
{'NestID': '9', 'Ch1': '19'}
{'NestID': '10', 'Ch4': '4'}
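
The keys in some of the rows above print in a different order than in the expected output. That is normal for plain dictionaries in Python 2.7, which do not guarantee key order. If the order matters (for display, say), one option is collections.OrderedDict, which remembers insertion order. A minimal sketch with hand-entered values:

```python
from collections import OrderedDict

row = OrderedDict()
row['NestID'] = '5'
row['Ch1'] = '2'
row['Ch2'] = '8'
row['Ch3'] = '30'

# Keys come back in the order they were inserted.
keys_in_order = list(row.keys())
```

Be aware that printing an OrderedDict shows it as OrderedDict([...]) rather than with curly braces.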

PGriffith · November 2, 2022, 8:53pm

There is likely a much more ergonomic, efficient way to gather your data together, but I'll let @JordanCClark or someone else tag in

Lalillo1 · November 2, 2022, 9:10pm

Thank you so much!!

JordanCClark · November 2, 2022, 9:55pm

Ack! I've been sniped!

# Sample Dataset
headers = ['col1', 'col2', 'col3']
list1= [
		[1, 30, 68],
		[2, 23, 133],
		[3, 25, 65],
		[4, 23, 145],
		[5, 22, 2],
		[5, 31, 8],
		[5, 32, 30],
		[6, 11, 3],
		[6, 20, 241],
		[7, 22, 2],
		[7, 19, 202],
		[8, 19, 2],
		[8, 24, 241],
		[8, 13, 30],
		[9, 18, 19],
		[10, 13, 4]
	   ]
mylist = system.dataset.toPyDataSet(system.dataset.toDataSet(headers, list1))

failCodeDesc = {10: "Ch1", 
				11: "Ch2", 
				12: "Ch3",
				13: "Ch4",
				18: "Ch1",
				19: "Ch2",
				20: "Ch3",
				21: "Ch4",
				22: "Ch1",			
				23: "Ch2",
				24: "Ch3",
				25: "Ch4",				
				30: "Ch1",
				31: "Ch2",		
				32: "Ch3",											
				33: "Ch4"}      

dictOut = {}	
for row in mylist:
	nestID, channelID, value = list(row)
	# Check if nestID already exists
	if nestID not in dictOut.keys():
		dictOut[nestID] = {}
	# add channel info
	dictOut[nestID][failCodeDesc[channelID]] = str(value)

# Create json Output	
jsonOut = []
for nestID in sorted(dictOut.keys()):
	newRow = {'NestID':str(nestID)}
	newRow.update(dictOut[nestID])
	jsonOut.append(newRow)
	print newRow
print jsonOut
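
For anyone who wants to try the grouping idea outside Ignition, here is a rough plain-Python sketch. It assumes the rows are ordinary tuples instead of a PyDataSet, uses only a subset of the sample data, and swaps the explicit key check for collections.defaultdict. The variable names are mine.

```python
from collections import defaultdict

# A subset of the sample rows: (NestID, fail code, count)
rows = [
    (5, 22, 2),
    (5, 31, 8),
    (5, 32, 30),
    (6, 11, 3),
    (6, 20, 241),
]
fail_code_desc = {11: "Ch2", 20: "Ch3", 22: "Ch1", 31: "Ch2", 32: "Ch3"}

# defaultdict(dict) creates an empty dict the first time a NestID is seen
grouped = defaultdict(dict)
for nest_id, fail_code, count in rows:
    grouped[nest_id][fail_code_desc[fail_code]] = str(count)

# Second pass: build one output row per NestID, in sorted order
json_out = []
for nest_id in sorted(grouped):
    new_row = {'NestID': str(nest_id)}
    new_row.update(grouped[nest_id])
    json_out.append(new_row)
```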

Lalillo1 · November 2, 2022, 10:00pm

Thank you very much!!

JordanCClark · November 3, 2022, 11:36am

Adding some nanoTime info in there, taking out the print statements.

Paul's script
0.8152 ms
0.5495 ms
0.6138 ms
0.5476 ms
0.5409 ms
0.5716 ms
0.5813 ms
0.6102 ms
0.587 ms
0.5272 ms

My first nerd-sniped script:

0.4069 ms
0.2438 ms
0.2432 ms
0.2385 ms
0.47 ms
0.4183 ms
0.2496 ms
0.2366 ms
0.2565 ms
0.4105 ms

My second sniped script (thanks, Paul, for making my brain interrupt my first cup of coffee):
This one puts the NestID key in the nested dictionary up front, avoiding the dict.update() in the second pass.

dictOut = {}	
for row in mylist:
	nestID, channelID, value = list(row)
	# Check if nestID already exists
	if nestID not in dictOut.keys():
		dictOut[nestID] = {'NestID':str(nestID)}
	# add channel info
	dictOut[nestID][failCodeDesc[channelID]] = str(value)

# Create json Output	
jsonOut = []
for nestID in sorted(dictOut.keys()):
	jsonOut.append(dictOut[nestID])

Timing:

0.3332 ms
0.2807 ms
0.2031 ms
0.204 ms
0.2443 ms
0.3418 ms
0.2246 ms
0.2184 ms
0.218 ms
0.2696 ms

In the grand scheme of things, a few tenths of a millisecond may not matter. It depends on how many rows in your dataset you need to process.
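
To see how the cost scales with row count, one way to measure it is the standard-library timeit module. This is not how the numbers above were taken (those used nanoTime); it is a sketch with made-up data and a hypothetical group_rows function, and results will vary by machine.

```python
import timeit

# Made-up data: 1000 rows spread over 10 NestIDs, all with the same fail code
rows = [(n % 10, 22, n) for n in range(1000)]

def group_rows():
    """Group rows by NestID, similar in shape to the single-pass script."""
    grouped = {}
    for nest_id, _code, count in rows:
        if nest_id not in grouped:
            grouped[nest_id] = {'NestID': str(nest_id)}
        grouped[nest_id]['Ch1'] = str(count)
    return [grouped[k] for k in sorted(grouped)]

runs = 100
total_seconds = timeit.timeit(group_rows, number=runs)
ms_per_run = total_seconds / runs * 1000.0
print("%.4f ms per run" % ms_per_run)
```

Changing the range(1000) to the size of your real dataset gives a rough idea of whether the difference is worth caring about.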