Multiple datasets join

pascal · October 16, 2017, 2:45pm

Hello,
Is it possible to “concatenate” or join 2 or more datasets?
I have a list of datetimes and i wouldl like to retrieve the data for each month and merge it in a dataset.

for date in test:
   results=system.db.runPrepQuery("SELECT t_stamp as Date, value as Valeur FROM outsideairtemp_snapshot WHERE YEAR(t_stamp)=YEAR(?) AND MONTH(t_stamp)=MONTH(?) ORDER BY t_stamp",[date,date])

I tried using the append function but it works only with lists and not datasets.
Any idea how to make it work ?

lrose · October 16, 2017, 2:58pm

Datasets and PyDatasets are normally immutable, meaning that you can not change them.

You would need to loop through both data sets, creating a new list and then use that to create a new data set containing the result of the operation.

There is almost certainly a better way to go. Depending on how many date values are in test, you would need to loop through the same data multiple times and basically accomplish nothing.

nmudge · October 16, 2017, 3:29pm

What you are asking can be done with some Python code that loops over the datasets to create another dataset.

But the easiest way to do it is by using MutablePyDataSets from the Power Scripting module.

Here is an example of doing it with MutablePyDataSets:

result1 = system.db.runPrepQuery("SELECT ...whatever")
result2 = system.db.runPrepQuery("SELECT ...whatever")

result3 = pa.dataset.toData(result1) + pa.dataset.toData(result2)

#print out result3 to see what it looks like in the console
print result3

#Convert to a regular dataset
result3 = dataset3.toDataSet()

Here is another example that shows some output of printing a MutablePyDataSet:

#Create the first dataset
columnNames = ["name","age","rank"]
rows = [["Bob",32,"Private"],
        ["Bill",28,"Major"]]
dataset1 = system.dataset.toDataSet(columnNames, rows)

#Create the second dataset
rows = [["Mike",33,"Cadet"], 
        ["Jean",67,"Second Officer"],
        ["Bean",77,"First Officer"]]
dataset2 = system.dataset.toDataSet(columnNames, rows)

dataset3 = pa.dataset.toData(dataset1) + pa.dataset.toData(dataset2)

print dataset3

#Convert to a regular dataset
dataset3 = dataset3.toDataSet()

#Here is the output from printing dataset3
"""
row | name age rank          
-----------------------------
 0  | Bob  32  Private       
 1  | Bill 28  Major         
 2  | Mike 33  Cadet         
 3  | Jean 67  Second Officer
 4  | Bean 77  First Officer 
"""

cmisztur2 · October 16, 2017, 6:03pm

Probably need some sanity checks in here, but this should do....

ds1 = system.db.runQuery("select * from table1","IGNITION")
ds2 = system.db.runQuery("select * from table1","IGNITION")

for i in range(ds1.getUnderlyingDataset().getColumnCount()):
	ds1.getUnderlyingDataset().getData()[i] += ds2.getUnderlyingDataset().getData()[i]
	
print ds1.getUnderlyingDataset().getData()

EDIT: probably needs to be modified using SetDataDirectly() instead of just concat.

Kevin.Collins · October 16, 2017, 7:08pm

The method below will accept multiple datasets (at least two) and return a unified dataset if the datasets have the same column specifications:

def appendDataset(self, ignoreMismatch, *dss):
	# Validate at least 2 datasets have been supplied
	if len(dss) < 2:
		raise KeyError("Must provide at least 2 datasets for appending")
		
	# Validate Column Alignment	
	if not ignoreMismatch and len([x for x in range(len(dss))[1:] if (dss[x].getColumnNames() != dss[0].getColumnNames() or dss[x].getColumnTypes() != dss[0].getColumnTypes()) and dss[x].rowCount > 0]) > 0:
		raise KeyError("Column Name/Type Mismatch!")

	# Prepare Final Dataset
	headers = map(str,dss[0].getColumnNames())
	data = []
	
	# Gather Data from supplied datasets
	for ds in dss:
		for i in range(ds.rowCount):
			data.append([x[i] for x in ds.getData()])

	# Return newly compiled dataset
	return system.dataset.toDataSet(headers, data)

Hope this helps!

jespinmartin1 · February 25, 2021, 6:18pm

I think he wanted “horizontal” concatenation,
what you did was “vertical”, instead you could have used

system.dataset.appendDataset()

in a loop

lrose · February 25, 2021, 6:22pm

Not at the time, system.dataset.appendDataset() did not exist

andrewk · November 21, 2022, 3:43pm

Hey @nmudge,

Interested in this module for Ignition 8.0. Do you know if there's an existing version somewhere (that has an up-to-date certificate for 8.0?

pturmel · November 21, 2022, 5:15pm

Nick has not done any work in Ignition in several years (look at his profile activity), so don't hold your breath. And it isn't just a certificate question--almost all modules from v7.9 and before require some rework to load in v8.x.

mojoeswa · March 9, 2023, 4:38pm

I created this code, but maybe there is a better way...

#****************************************************************
# Function to join two datasets (horizontal)
#****************************************************************

def joinDatasets(data,dataset1, dataset2, datasetAlias):
		
	"""
	Arguments:
	data: 		This is a map, whose keys are report data keys (must NOT start
				with a number) and values should be sequences, maps,  scalar values, or
				datasets to provide information to the report.
	
	dataset1: 	First dataset 
	
	dataset2: 	Second dataset
	"""
	#	Get datasets and parse to pydatasets
	pyDataset1 = system.dataset.toPyDataSet(dataset1)		
	pyDataset2 = system.dataset.toPyDataSet(dataset2)

	# Build header and code string with unicode
	headerDataset1 = pyDataset1.getColumnNames()
	headerDataset2 = pyDataset2.getColumnNames()
	header = []
	for item in headerDataset1:
		header.append( u"{0}".format(str(item)))

	for item in (headerDataset2):
		header.append( u"{0}".format(str(item)))
	
	# Convert from PyRow to list
	newListDataset1  = [list(item) for item in pyDataset1]
	newListDataset2  = [list(item) for item in pyDataset2]

	# Zip lists	
	rowzip = zip(newListDataset1,newListDataset2)
	newRows = [item[0]+item[1] for item in rowzip]
				
	# convert the pydataset to a standard dataset
	rows = system.dataset.toDataSet(header, newRows)
 
	# create a new data source 
	data[datasetAlias] = rows