Getting an XML file into an Ignition dataset using the File Upload component

justinedwards.jle · December 10, 2023, 2:26pm

The error you are getting is caused by non-XML information at the top and bottom of the file. Here is an edited version of your test file without the extraneous content:
sample_CustomersOrders.xml (15.1 KB)
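
If editing the file by hand is not practical, the extra content could also be trimmed in the script before parsing. This is only a minimal sketch: it assumes the non-XML text sits entirely before the first '<' and after the last '>', and strip_non_xml is a hypothetical helper name, not part of Ignition or the standard library.

```python
from xml.etree import ElementTree

def strip_non_xml(raw):
    """Return the part of raw between the first '<' and the last '>'."""
    start = raw.find('<')
    end = raw.rfind('>')
    if start == -1 or end < start:
        raise ValueError('no XML markup found')
    return raw[start:end + 1]

# Made-up input with junk lines around a tiny XML document
raw = 'upload header line\n<Root><Item>1</Item></Root>\ntrailing footer'
root = ElementTree.fromstring(strip_non_xml(raw))
```

If the extraneous lines themselves contain angle brackets, this approach will not work and the trimming has to be more specific to the file.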

Another problem I see is that there seem to be two datasets in that file: one for customers and one for orders. After experimenting with it, I was able to parse the file into separate datasets in two different ways.

Here is a Python version of the script that follows what you have above:

Python XML Parser

from xml.etree import ElementTree

# filePath = ...  # Placeholder: set this to the path of your XML file

# Parse the XML file
root = ElementTree.parse(filePath).getroot()

# Create a list for the datasets that will be derived from each node
datasets = []

for node in root:

	# Use a set to keep track of all unique tags (to be used as headers)
	tags = set()
	
	# Find all tags in the node to use as headers
	for subnode in node:
		for child in subnode.iter():
			tags.add(child.tag)
	
	# Get the headers and sort them
	headers = sorted(list(tags))
	
	# Create a List for the rows
	data = []
	
	# Process each sub-node as a row, and add the row to the data list
	for subnode in node:
		entry = {}
		for child in subnode.iter():
			if '\n' not in (child.text or ''):
				entry[child.tag] = child.text or ''
		row = [entry.get(header, '') for header in headers]
		data.append(row)
	
	# Convert the headers and data to a dataset, and add them to the datasets list
	datasets.append(system.dataset.toDataSet(headers, data))

Here is the result with the customer dataset on top, and the orders dataset on bottom:
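
To try the header and row logic outside of Ignition, here is a minimal sketch on a small made-up XML string. It is a variation, not the script above: it keeps only leaf elements (those with no child elements), so container tags such as the row element or FullAddress do not turn into empty columns. The sample data and the node_to_table name are assumptions for illustration; in Ignition, the returned headers and rows could be passed to system.dataset.toDataSet.

```python
from xml.etree import ElementTree

# Made-up sample; only the FullAddress tag name comes from the thread
SAMPLE = """<Root>
  <Customers>
    <Customer>
      <Name>Acme</Name>
      <FullAddress><City>Reno</City></FullAddress>
    </Customer>
    <Customer>
      <Name>Globex</Name>
      <Phone>555-0100</Phone>
    </Customer>
  </Customers>
</Root>"""

def node_to_table(node):
    """Return (headers, rows) built from the leaf elements under each child of node."""
    entries = []
    for subnode in node:
        # len(element) is the number of child elements, so 0 means a leaf
        entry = dict((child.tag, child.text or '')
                     for child in subnode.iter() if len(child) == 0)
        entries.append(entry)
    # Headers are the sorted union of every tag seen in any row
    headers = sorted(set(tag for entry in entries for tag in entry))
    rows = [[entry.get(header, '') for header in headers] for entry in entries]
    return headers, rows

root = ElementTree.fromstring(SAMPLE)
headers, rows = node_to_table(root[0])
# headers -> ['City', 'Name', 'Phone']
# rows    -> [['Reno', 'Acme', ''], ['', 'Globex', '555-0100']]
```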

Here is the other way I put together that works similarly:

Jython XML Parser

from javax.xml.parsers import DocumentBuilderFactory

#xmlFile = # Get your file

document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xmlFile)

# Get the root element
root = document.documentElement

# Create a list for the datasets that will be derived from each node
datasets = []

# Iterate over each child node of the root element
for rootIndex in range(root.childNodes.length):
	node = root.childNodes.item(rootIndex)
	if node.nodeType == node.ELEMENT_NODE:
        
		# Collect all unique tags (for headers) from the children of this node
		headers = set()
		nodeList = node.childNodes
		for index in range(nodeList.length):
			childNode = nodeList.item(index)
			if childNode.nodeType == childNode.ELEMENT_NODE:
				childNodeList = childNode.childNodes
				for subIndex in range(childNodeList.length):
					subChildNode = childNodeList.item(subIndex)
					if subChildNode.nodeType == subChildNode.ELEMENT_NODE:
						headers.add(subChildNode.nodeName)

		headers = sorted(list(headers))

		# Create a list to hold all rows of data
		data = []
		
		# Process each child node to produce a list of rows for each dataset
		for index in range(nodeList.length):
			childNode = nodeList.item(index)
			if childNode.nodeType == childNode.ELEMENT_NODE:
				entry = {}
				childNodeList = childNode.childNodes
				for subIndex in range(childNodeList.length):
					subChildNode = childNodeList.item(subIndex)
					if subChildNode.nodeType == subChildNode.ELEMENT_NODE:
						entry[subChildNode.nodeName] = subChildNode.textContent
				# Build the row once per child element, after all of its fields are collected
				row = [entry.get(header, '') for header in headers]
				data.append(row)
		
		# Convert the headers and data to a dataset, and add them to the datasets list
		datasets.append(system.dataset.toDataSet(headers, data))

Here is the result with the customer dataset on top and the orders dataset on bottom:

In this version, there are fewer columns because textContent returns the combined text of an element and all of its descendants. As a result, the FullAddress information ends up in a single column in the customer dataset, and the ShipInfo fields end up in a single column in the orders dataset.
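
The difference between the two results can be reproduced in plain Python. This sketch uses ElementTree's itertext() as a rough stand-in for the DOM textContent property (an analogy, not the same API), and the XML fragment is made up for illustration.

```python
from xml.etree import ElementTree

# Made-up fragment; FullAddress stands in for any element with nested children
customer = ElementTree.fromstring(
    '<Customer><FullAddress><City>Reno</City><Region>NV</Region></FullAddress></Customer>'
)
full_address = customer.find('FullAddress')

# Text of the container and all of its descendants, joined in document order
combined = ''.join(full_address.itertext())

# One value per nested field, which is what the ElementTree version turns into columns
fields = dict((child.tag, child.text or '') for child in full_address)
# combined -> 'RenoNV'
# fields   -> {'City': 'Reno', 'Region': 'NV'}
```

If separate columns are wanted from the second script, the nested elements would need to be walked one level deeper instead of reading textContent on the container.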