Getting an XML file into an Ignition dataset using the File Upload component

chandrasekar_s · November 30, 2023, 10:34am

Hi,

I'm trying to convert an XML file into a dataset using the File Upload component, but it is not working as I expected.

The manual page linked below does this without the File Upload component: it uses a Button component, and it works when the XML string is copied and pasted directly into the script.

https://docs.inductiveautomation.com/display/DOC81/Parsing+XML+with+the+Etree+Library

So is it possible to get the XML data into a dataset using the File Upload component in Perspective?

Hayden_Watson · November 30, 2023, 10:45am

Put the script in the File Upload component's onFileReceived event.

The call event.file.getString() returns the file's contents as a string. You should then be able to pass this into the rest of your script.

chandrasekar_s · November 30, 2023, 11:27am

Hi Hayden,

Thanks for sharing.
I tried the code below, but it is not working.
def runAction(self, event):
import xml.etree.ElementTree as ET

copy = event.file.getString("C:\Testing\ test")
path = "C:\Testing\ test"
 
# We can then parse the string into useable elements.
root = ET.fromstring(document)
 
# This creates empty header and row lists that we will add to later.
# These are used to create the dataset that will go into the Table.
# We could fill in the names of the headers beforehand, since we know what each will be.
# However, this allows us to add or remove children keys, and the script will automatically adjust.
headers = []
rows = []
 
# Now we can loop through each child of the root.
# Since the root is catalog, each child element is an individual book.
# We also create a single row empty list. We can add all of the data for a single book to this list.
for child in root:
    oneRow = []
 
    # Check if the book has any attributes.
    if child.attrib != {}:
 
        # If it does contain attributes, we want to loop through all of them.
        for key in child.attrib:
 
            # Since we only want to add the attributes to our header list once, first check if it is there.
            # If it isn't add it.
            if key not in headers:
                headers.append(key)
 
            # Add the attribute value to the oneRow list.
            oneRow.append(child.attrib[key])
 
    # Loop through the children of the book.
    for child2 in child:
 
        # Similar to above, we check if the tag is present in the header list before adding it.
        if child2.tag not in headers:
            headers.append(child2.tag)
 
        # We can then add the text of the Element to the oneRow list.
        oneRow.append(child2.text)
 
    # Finally, we want to add the oneRow list to our list of rows.
    rows.append(oneRow)
data = system.dataset.toDataSet(headers, rows)
self.getSibling("Table").props.data = data

Hayden_Watson · November 30, 2023, 11:47am

No. If you have the line event.file.getString() in the correct event, you don't need the path. The component passes the data in, and this call just reads it out for you.

If you're using the file upload component, in the File received event your code should be something like:

import xml.etree.ElementTree as ET

document = event.file.getString()
 
# We can then parse the string into useable elements.
root = ET.fromstring(document)
 
headers = []
rows = []
....
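
Putting the pieces of this reply together, a minimal sketch of the parsing step might look like the following. It assumes the uploaded text is well-formed XML shaped like the manual's catalog example (a root element whose children are the rows). `xml_to_table` is a hypothetical helper name, and the Ignition-specific calls are left as comments because they only run inside a Perspective event script.

```python
import xml.etree.ElementTree as ET


def xml_to_table(document):
    """Return (headers, rows) for XML whose root children are the rows.

    Hypothetical helper for illustration; not part of Ignition or the manual.
    """
    root = ET.fromstring(document)
    headers = []
    records = []
    for child in root:
        record = {}
        # Attributes and direct child elements both become columns.
        pairs = list(child.attrib.items()) + [(f.tag, f.text) for f in child]
        for name, value in pairs:
            if name not in headers:
                headers.append(name)
            record[name] = value or ''
        records.append(record)
    # Build each row by header name so a missing field cannot shift columns.
    rows = [[record.get(h, '') for h in headers] for record in records]
    return headers, rows


# Inside the File Upload component's file received script, usage would be:
#   headers, rows = xml_to_table(event.file.getString())
#   self.getSibling("Table").props.data = system.dataset.toDataSet(headers, rows)
```

Filling rows by header name, rather than appending values in document order, keeps the columns aligned when one record lacks a field that another has.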

chandrasekar_s · November 30, 2023, 12:22pm

No, it is not working.

chandrasekar_s · November 30, 2023, 12:24pm

I'm trying to upload the XML file, and the file's data should then be shown in the dataset table.

lrose · November 30, 2023, 12:40pm

Can you post an example of the XML file that you are using?

chandrasekar_s · November 30, 2023, 12:43pm

sample_CustomersOrders.xml (15.1 KB)

chandrasekar_s · November 30, 2023, 12:46pm

Is there any format I should follow to pass the XML?

justinedwards.jle · December 10, 2023, 2:26pm

The error you are getting is caused by non-XML information at the top and bottom of the file. Here is an edited version of your test file without the extraneous content:
sample_CustomersOrders.xml (15.1 KB)
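
If the extra text cannot be removed from the file by hand, one possible workaround is to trim it in the script before parsing. The sketch below rests on an assumption about what the extraneous content looks like: it supposes the stray text sits entirely before the first `<` and after the last `>` and contains no angle brackets of its own. `strip_non_xml` is a hypothetical helper name, and the sample string is made up.

```python
import xml.etree.ElementTree as ET


def strip_non_xml(text):
    """Keep only the span from the first '<' to the last '>'.

    Hypothetical helper; assumes the stray text has no angle brackets.
    """
    start = text.find('<')
    end = text.rfind('>')
    if start == -1 or end < start:
        raise ValueError('no XML markup found in the uploaded text')
    return text[start:end + 1]


# Made-up example of a file with stray lines around the XML.
raw = 'exported report\n<Root><Customers/></Root>\nend of file'
root = ET.fromstring(strip_non_xml(raw))
```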

Another problem I see is that there seem to be two datasets in that file: one for customers and one for orders. Experimenting with this, I was able to parse the file into separate datasets in two different ways.

Here is a Python version of the script that follows what you have above:

Python XML Parser

from xml.etree import ElementTree
#filePath = #Get your file

# Parse the XML file
root = ElementTree.parse(filePath).getroot()

# Create a list for the datasets that will be derived from each node
datasets = []

for node in root:

	# Use a set to keep track of all unique tags (to be used as headers)
	tags = set()
	
	# Find all tags in the node to use as headers
	for subnode in node:
		for child in subnode.iter():
			tags.add(child.tag)
	
	# Get the headers and sort them
	headers = sorted(list(tags))
	
	# Create a List for the rows
	data = []
	
	# Process each sub-node as a row, and add the row to the data list
	for subnode in node:
		entry = {}
		for child in subnode.iter():
			if '\n' not in (child.text or ''):
				entry[child.tag] = child.text or ''
		row = [entry.get(header, '') for header in headers]
		data.append(row)
	
	# Convert the headers and data to a dataset, and add them to the datasets list
	datasets.append(system.dataset.toDataSet(headers, data))

Here is the result with the customer dataset on top, and the orders dataset on bottom:

Here is the other way I put together that works similarly:

Jython XML Parser

from javax.xml.parsers import DocumentBuilderFactory

#xmlFile = # Get your file

document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xmlFile)

# Get the root element
root = document.documentElement

# Create a list for the datasets that will be derived from each node
datasets = []

# Iterate over each child node of the root element
for rootIndex in range(root.childNodes.length):
	node = root.childNodes.item(rootIndex)
	if node.nodeType == node.ELEMENT_NODE:
        
		# Collect all unique tags (for headers) from the children of this node
		headers = set()
		nodeList = node.childNodes
		for index in range(nodeList.length):
			childNode = nodeList.item(index)
			if childNode.nodeType == childNode.ELEMENT_NODE:
				childNodeList = childNode.childNodes
				for subIndex in range(childNodeList.length):
					subChildNode = childNodeList.item(subIndex)
					if subChildNode.nodeType == subChildNode.ELEMENT_NODE:
						headers.add(subChildNode.nodeName)

		headers = sorted(list(headers))

		# Create a list to hold all rows of data
		data = []
		
		# Process each child node to produce a list of rows for each dataset
		for index in range(nodeList.length):
			childNode = nodeList.item(index)
			if childNode.nodeType == childNode.ELEMENT_NODE:
				entry = {}
				childNodeList = childNode.childNodes
				for subIndex in range(childNodeList.length):
					subChildNode = childNodeList.item(subIndex)
					if subChildNode.nodeType == subChildNode.ELEMENT_NODE:
						entry[subChildNode.nodeName] = subChildNode.textContent
				row = [entry.get(header, '') for header in headers]
				data.append(row)
		
		# Convert the headers and data to a dataset, and add them to the datasets list
		datasets.append(system.dataset.toDataSet(headers, data))

Here is the result with the customer dataset on top and the orders dataset on bottom:

In this version, there are fewer columns because the FullAddress information is all combined into a single column in the customer dataset, and the ShipInfo fields are all combined into a single column in the orders dataset.
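
The difference comes from how each script reads a nested element. Here is a rough illustration in plain ElementTree, using a made-up customer record whose `FullAddress` element has child elements. The field names other than `FullAddress` and all the values are invented for the example, not taken from the attached file, and the first loop is a simplified version of the first script's idea rather than a copy of it.

```python
import xml.etree.ElementTree as ET

customer = ET.fromstring(
    '<Customer>'
    '<CompanyName>Example Co</CompanyName>'
    '<FullAddress><City>Springfield</City><Region>OR</Region></FullAddress>'
    '</Customer>'
)

# First approach (simplified): walk every descendant and give each
# leaf element its own column.
flattened = {}
for elem in customer.iter():
    if len(elem) == 0:  # no child elements, so it carries a value
        flattened[elem.tag] = elem.text or ''

# Second approach: one column per direct child. DOM textContent joins all
# descendant text; ''.join(itertext()) is the ElementTree analogue.
combined = {}
for elem in customer:
    combined[elem.tag] = ''.join(elem.itertext())
```

Here `flattened` ends up with separate `City` and `Region` entries, while `combined` has a single `FullAddress` entry holding the concatenated text.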