I'm trying to convert an XML file into a dataset using the File Upload component, but it is not working as I expected.
The manual linked below shows how to do this with a Button component instead of the File Upload component, and that approach works when I copy and paste the XML string directly into the script.
Thanks for sharing.
I tried the code below, but it is not working.
def runAction(self, event):
    import xml.etree.ElementTree as ET
    copy = event.file.getString("C:\Testing\ test")
    path = "C:\Testing\ test"
    # We can then parse the string into useable elements.
    root = ET.fromstring(document)
    # This creates empty header and row lists that we will add to later.
    # These are used to create the dataset that will go into the Table.
    # We could fill in the names of the headers beforehand, since we know what each will be.
    # However, this allows us to add or remove children keys, and the script will automatically adjust.
    headers = []
    rows = []
    # Now we can loop through each child of the root.
    # Since the root is catalog, each child element is an individual book.
    # We also create a single row empty list. We can add all of the data for a single book to this list.
    for child in root:
        oneRow = []
        # Check if the book has any attributes.
        if child.attrib != {}:
            # If it does contain attributes, we want to loop through all of them.
            for key in child.attrib:
                # Since we only want to add the attributes to our header list once, first check if it is there.
                # If it isn't add it.
                if key not in headers:
                    headers.append(key)
                # Add the attribute value to the oneRow list.
                oneRow.append(child.attrib[key])
        # Loop through the children of the book.
        for child2 in child:
            # Similar to above, we check if the tag is present in the header list before adding it.
            if child2.tag not in headers:
                headers.append(child2.tag)
            # We can then add the text of the Element to the oneRow list.
            oneRow.append(child2.text)
        # Finally, we want to add the oneRow list to our list of rows.
        rows.append(oneRow)
    data = system.dataset.toDataSet(headers, rows)
    self.getSibling("Table").props.data = data
No, if you have the line event.file.getString() in the correct event, you don't need the path. The component passes the file data to the event, and this call just reads it out for you as a string.
If you're using the file upload component, in the File received event your code should be something like:
import xml.etree.ElementTree as ET
document = event.file.getString()
# We can then parse the string into useable elements.
root = ET.fromstring(document)
headers = []
rows = []
....
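As a rough sketch of what the rest of that handler does, here is a plain-Python version that parses an XML string and builds the header and row lists. The xml_to_table name and the sample catalog string are made up for illustration; inside Ignition you would pass the result to system.dataset.toDataSet(headers, rows) as in the script above.

```python
import xml.etree.ElementTree as ET

def xml_to_table(document):
    """Hypothetical helper: turn a flat XML string into (headers, rows)."""
    root = ET.fromstring(document)
    headers, entries = [], []
    for child in root:
        # Attributes first, then child elements, as (name, value) pairs.
        pairs = list(child.attrib.items()) + [(sub.tag, sub.text) for sub in child]
        for name, _ in pairs:
            if name not in headers:
                headers.append(name)
        entries.append(dict(pairs))
    # Look values up by header name so a missing field leaves an empty cell.
    rows = [[entry.get(h, "") for h in headers] for entry in entries]
    return headers, rows

# Illustrative sample only; the second book has no author element.
sample = """<catalog>
  <book id="bk101"><author>A. Writer</author><title>First</title></book>
  <book id="bk102"><title>Second</title></book>
</catalog>"""

headers, rows = xml_to_table(sample)
```

Unlike the positional oneRow.append() approach, this version fills each row by header name, so an element that is missing a field does not shift the remaining values into the wrong columns.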
The error you are getting is caused by non-XML information at the top and bottom of the file. Here is an edited version of your test file without the extraneous content: sample_CustomersOrders.xml (15.1 KB)
Another problem I see is that there seem to be two datasets in that file. One is for customers and the other is for orders. Messing around with this, I was able to parse the file into separate datasets two different ways.
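If you would rather strip that extra text in the script than edit the file by hand, something like the following might work. It is only a sketch: strip_non_xml is a hypothetical helper, the raw string is invented, and it assumes the extra text contains no angle brackets of its own.

```python
import xml.etree.ElementTree as ET

def strip_non_xml(text):
    """Hypothetical helper: keep only the span from the first '<' to the last '>'."""
    start = text.find("<")
    end = text.rfind(">")
    if start == -1 or end < start:
        raise ValueError("no XML markup found")
    return text[start:end + 1]

# Invented example of a file with extra text before and after the XML.
raw = "Exported by some tool\n<Root><Item>1</Item></Root>\nEnd of export"
cleaned = strip_non_xml(raw)
root = ET.fromstring(cleaned)
```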
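A quick way to check how many datasets a file holds is to list the children of the root element and count the entries under each. The XML below is a cut-down, made-up stand-in for the real file, so treat the tag names as assumptions.

```python
import xml.etree.ElementTree as ET

# Made-up miniature of a file holding two groups of records.
sample = """<Root>
  <Customers>
    <Customer CustomerID="A1"><CompanyName>Acme</CompanyName></Customer>
    <Customer CustomerID="B2"><CompanyName>Bolt</CompanyName></Customer>
  </Customers>
  <Orders>
    <Order><CustomerID>A1</CustomerID></Order>
  </Orders>
</Root>"""

root = ET.fromstring(sample)
# len(node) is the number of direct child elements under that node.
summary = [(node.tag, len(node)) for node in root]
```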
Here is a Python version of the script that follows what you have above:
Python XML Parser
from xml.etree import ElementTree
#filePath = #Get your file
# Parse the XML file
root = ElementTree.parse(filePath).getroot()
# Create a list for the datasets that will be derived from each node
datasets = []
for node in root:
    # Use a set to keep track of all unique tags (to be used as headers)
    tags = set()
    # Find all tags in the node to use as headers
    for subnode in node:
        for child in subnode.iter():
            tags.add(child.tag)
    # Get the headers and sort them
    headers = sorted(list(tags))
    # Create a List for the rows
    data = []
    # Process each sub-node as a row, and add the row to the data list
    for subnode in node:
        entry = {}
        for child in subnode.iter():
            if '\n' not in (child.text or ''):
                entry[child.tag] = child.text or ''
        row = [entry.get(header, '') for header in headers]
        data.append(row)
    # Convert the headers and data to a dataset, and add them to the datasets list
    datasets.append(system.dataset.toDataSet(headers, data))
Here is the result with the customer dataset on top, and the orders dataset on bottom:
Here is the other way I put together that works similarly:
Jython XML Parser
from javax.xml.parsers import DocumentBuilderFactory
#xmlFile = # Get your file
document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xmlFile)
# Get the root element
root = document.documentElement
# Create a list for the datasets that will be derived from each node
datasets = []
# Iterate over each child node of the root element
for rootIndex in range(root.childNodes.length):
    node = root.childNodes.item(rootIndex)
    if node.nodeType == node.ELEMENT_NODE:
        # Collect all unique tags (for headers) from the children of this node
        headers = set()
        nodeList = node.childNodes
        for index in range(nodeList.length):
            childNode = nodeList.item(index)
            if childNode.nodeType == childNode.ELEMENT_NODE:
                childNodeList = childNode.childNodes
                for subIndex in range(childNodeList.length):
                    subChildNode = childNodeList.item(subIndex)
                    if subChildNode.nodeType == subChildNode.ELEMENT_NODE:
                        headers.add(subChildNode.nodeName)
        headers = sorted(list(headers))
        # Create a list to hold all rows of data
        data = []
        # Process each child node to produce a list of rows for each dataset
        for index in range(nodeList.length):
            childNode = nodeList.item(index)
            if childNode.nodeType == childNode.ELEMENT_NODE:
                entry = {}
                childNodeList = childNode.childNodes
                for subIndex in range(childNodeList.length):
                    subChildNode = childNodeList.item(subIndex)
                    if subChildNode.nodeType == subChildNode.ELEMENT_NODE:
                        entry[subChildNode.nodeName] = subChildNode.textContent
                row = [entry.get(header, '') for header in headers]
                data.append(row)
        # Convert the headers and data to a dataset, and add them to the datasets list
        datasets.append(system.dataset.toDataSet(headers, data))
Here is the result with the customer dataset on top and the orders dataset on bottom:
In this version, there are fewer columns because the FullAddress information is all combined into a single column in the customer dataset, and the ShipInfo fields are all combined into a single column in the orders dataset.
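The difference comes from how each script treats nested elements: the first collects every tag found by iter() as its own column, while the second only looks at direct children and reads textContent, which concatenates the text of all descendants. Here is a rough stdlib-only illustration of the two behaviours, using itertext() as a stand-in for textContent and an invented customer record:

```python
import xml.etree.ElementTree as ET

# Invented record with one nested element (FullAddress).
customer = ET.fromstring(
    "<Customer><CompanyName>Acme</CompanyName>"
    "<FullAddress><City>Seattle</City><Region>WA</Region></FullAddress>"
    "</Customer>"
)

# First approach: iter() visits the element itself and every descendant,
# so each tag it finds becomes a column header.
flattened = sorted(el.tag for el in customer.iter())

# Second approach: only direct children become columns, and the text of
# nested elements is joined together into one value.
combined = {child.tag: "".join(child.itertext()) for child in customer}
```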