Copying an Excel file to a table's dataset

How much does it cost for 1 report?

$700 off top of my head, full module is $3.5K again IIRC

For ones like Jordan that has solid coding knowledge, the amount of engineering for both method is same. So it is better to save some money here. Specially when you want to compete to Chinese HMI panel.

I was getting the error, “TypeError: Unable to convert row x, column y to type class java.lang.Integer” on the line “return system.dataset.toDataSet(headers, data)”, which was traced to

value = cell.getNumericCellValue()
if value == int(value):
value = int(value)

The value it was referring to was a single float within a column of mostly integers

I fixed this error following the logic in "Cannot convert row x column y to type java.lang.Long" by adding

try:
value = float(value)
except:
pass

just before rowOut.append(value). I will update this if I run into problems downstream due to this change.

In my project I’m able to use openpyxl for working with excel files. I’m not sure if one of the previous project devs had to do anything weird to set it up but it’s really easy to use.
Ignition v7.9.18
openpyxl v1.7.0 (It’s old, but it works)

I just tried the openpyxl library and it works in Ignition 8.1.
Version 2.6.4 is the last one to support Python 2.7.
If anyone wants to try…
openpyxl-2.6.4.zip (1.1 MB)

Just copy everything in the .zip file into the C:\Program Files\Inductive Automation\Ignition\user-lib\pylib folder under Windows.

https://openpyxl.readthedocs.io/en/stable/

4 Likes

Heh i had tried openpyxl too a while ago, but i was missing a file of jdcal, when i check your zip tho it seems its in there :o
Maybe i pulled it from git tho instead of that site and forgot to copy that file

Edit: jup seems i pulled it from git

Thank you for this very useful snippet of code.
Now, I am using the fileUploader component in Perspective, but it seem to return just the filename, rather than the full path when I use event.file.name to get said path on the OnFileReceived event.

Is it something that I am doing wrong?
Without the full path, I cannot use your code snippet.

Thanks in advance,

The full path is deliberately obscured because it’s an implementation detail and subject to change if necessary. You need to either read from the byte array directly or copy the file to a known location.

Yeah,
Later I realized that all the examples I’ve found copy the file to another location and then it got me thinking that the code is running from the gateway rather than my computer, so it would be the gateway trying to locate the file, even if I provided the path to it. It would never work. It has to be available in a location that the gateway can reach, right?

Thanks for answering …

Dear Jordran,

first of all I like your great work. I am a chemist using Ignition as bloody beginner for data evaluation in our lab.
I’m using your code to import huge excel-files (13 MB) .
When I call your method, which is working quite well, my internal RAM is increasing dramatically from 1.3GB to 4.3GB. Normally I thought there is some Java Garbage Collector releasing RAM afterwards but that seams not to be the case. Do you have any idea to overcome this issue?
Thank you in advance.

Greatings,
Andi

Hi Andi,

First, welcome!

If you have min/max values for your heap size, you can see changes like that.

Garbage collection is based on a percentage of the heap size. You can see the used memory in the Java VM .
In the gateway:

  • Status → Performance

In the client:

  • Help → Diagnostics -or- Ctrl+Shift+F7

After the GC happens, if the free memory ratio is above 70% (I think that’s the default), it should shrink the heap size. @pturmel might know better than I, but I’m not sure I’ve ever seen the heap shrink.

The heap itself doesn’t shrink. Maybe there’s some condition that will trigger it, but I’ve never seen it. When the JVM wants to allocate memory from the heap, and it’s near full, and not yet at the maximum allowed, it will claim more memory from the OS, and never give it back. Once at that max, allocations when near full will trigger aggressive GC.

Because a gateway is long-lived, it should be configured to claim its entire allowed memory at startup (min == max) to avoid starvation when competing with other services and the OS’s own disk cache.

Love this function, thanks for sharing!

Anyone else having issues where the workbook seems to be marked as read-only after running the function though??

EDIT 1: Fixed the read-only thing by adding the following just before the fileStream.close() section:

		wb.close()	## 24/04/2023 - rlee - added, otherwise seems that wb is in readonly mode.
		fileStream.close()

Also added those two lines in the except clause of a try/except that covers most of the heavy duty logic to ensure things close nicely when it goes wrong.

EDIT 2: Encountered a bug when checking for cell type. Apparently a cell can be None if the cell is blank? Not really sure why it isn't handled nicely, as i see there is something to deal with cellType == 'BLANK' later..

Added the following change to get it to work nicely:

if i == firstRow and hasHeaders:
						pass
					else:
						cell = row.getCell(j)
						if cell == None:	# 24/04/2023 - rlee - added.. not sure why cell is returning None... 
							value = None	# 24/04/2023 - rlee - added.. not sure why cell is returning None... 
						else:
							cellType = cell.getCellType().toString()
							if cellType == 'NUMERIC':
								if DateUtil.isCellDateFormatted(cell):
									value =  cell.dateCellValue
								else:
									value = cell.getNumericCellValue()
									if value == int(value):
										value = int(value)
							elif cellType == 'STRING':
								value = cell.getStringCellValue()
							elif cellType == 'BOOLEAN':
								value = cell.getBooleanCellValue()
							elif cellType == 'BLANK':
								value = None	
                            ##...... ETC.....

EDIT 3: read-only issue seems to NOT be resolved still.. reading up about java iostream module and appache aoi functions doesn't seem to indicate why this may be.. Everyone says that closing the filestream should do the trick,..

@JordanCClark, I also want to thank you for sharing this!

For future visitors to this thread, I think I found a bug in this line:

headers = ['Col'+str(i) for i in range(firstCol, lastCol)]

The variable "i" should be changed to something else. For example:

headers = ['Col'+str(x) for x in range(firstCol, lastCol)]

"i" is used later in the code to check if the row is a header, but the previous for loop changes the value. I don't think this causes any big issues, but I noticed it when printing the value of "i" while testing.

2 Likes

Good catch!. Updated.

2 Likes

I was testing this approach today. I was able to write values to some cells but when I wanted to do something else it "bugs out". I think the reason for it is that it doesn't close the file afterwards. If you try to delete or make a change (an save) from Windows/Excel, it says that Ignition is using the file.

Also, I was unable to get calculated values from Excel formulas. Maybe newer versions of openpyxl can solve those issues.

1 Like

Yea i still couldn't get that part to work.. I somehow got it going where it wouldn't hold it as read only, as i said in an earlier edit, but it didn't seem to stick after more testing.. :frowning:

One approach to solving this problem could be to create an independent instance of the file by copying it into some temporary directory, perform the file stream from there, and then, simply deleted it once the dataset import is complete. This could all be done via scripting. I sometimes use the ignition cache directory for temporary files.

Maybe it's possible to do it that way but I dropped openpyxl 2.6.4 because I wasn't able to get Excel formulas evaluated.

So I started testing Apache POI libraries. I thought I had to buy a module for it but I saw this post:

I'm using Ignition 8.1 so if it's already there and I don't need to get a new module for it (although it's not expensive), then I better use it. Plus, I should be able to evaluate Excel formulas based on my inputs. So' I'm trying that out today.

So far it has been easier to play around with Apache POI libraries than with openpyxl 2.6.4. @JordanCClark makes it some much easier with the functions he shared.

4 Likes