Copying an Excel file to a table's dataset

simmone.mago · July 5, 2022, 5:36pm

Yeah,
Later I realized that all the examples I’ve found copy the file to another location and then it got me thinking that the code is running from the gateway rather than my computer, so it would be the gateway trying to locate the file, even if I provided the path to it. It would never work. It has to be available in a location that the gateway can reach, right?

Thanks for answering …

Andreas_Klein · July 22, 2022, 12:42pm

Dear Jordran,

first of all I like your great work. I am a chemist using Ignition as bloody beginner for data evaluation in our lab.
I’m using your code to import huge excel-files (13 MB) .
When I call your method, which is working quite well, my internal RAM is increasing dramatically from 1.3GB to 4.3GB. Normally I thought there is some Java Garbage Collector releasing RAM afterwards but that seams not to be the case. Do you have any idea to overcome this issue?
Thank you in advance.

Greatings,
Andi

JordanCClark · July 22, 2022, 2:29pm

Hi Andi,

First, welcome!

If you have min/max values for your heap size, you can see changes like that.

Garbage collection is based on a percentage of the heap size. You can see the used memory in the Java VM .
In the gateway:

Status → Performance

In the client:

Help → Diagnostics -or- Ctrl+Shift+F7

After the GC happens, if the free memory ratio is above 70% (I think that’s the default), it should shrink the heap size. @pturmel might know better than I, but I’m not sure I’ve ever seen the heap shrink.

pturmel · July 22, 2022, 2:47pm

The heap itself doesn’t shrink. Maybe there’s some condition that will trigger it, but I’ve never seen it. When the JVM wants to allocate memory from the heap, and it’s near full, and not yet at the maximum allowed, it will claim more memory from the OS, and never give it back. Once at that max, allocations when near full will trigger aggressive GC.

Because a gateway is long-lived, it should be configured to claim its entire allowed memory at startup (min == max) to avoid starvation when competing with other services and the OS’s own disk cache.

richardNZ15 · April 24, 2023, 6:15am

Love this function, thanks for sharing!

Anyone else having issues where the workbook seems to be marked as read-only after running the function though??

EDIT 1: Fixed the read-only thing by adding the following just before the fileStream.close() section:

		wb.close()	## 24/04/2023 - rlee - added, otherwise seems that wb is in readonly mode.
		fileStream.close()

Also added those two lines in the except clause of a try/except that covers most of the heavy duty logic to ensure things close nicely when it goes wrong.

EDIT 2: Encountered a bug when checking for cell type. Apparently a cell can be None if the cell is blank? Not really sure why it isn't handled nicely, as i see there is something to deal with cellType == 'BLANK' later..

Added the following change to get it to work nicely:

if i == firstRow and hasHeaders:
						pass
					else:
						cell = row.getCell(j)
						if cell == None:	# 24/04/2023 - rlee - added.. not sure why cell is returning None... 
							value = None	# 24/04/2023 - rlee - added.. not sure why cell is returning None... 
						else:
							cellType = cell.getCellType().toString()
							if cellType == 'NUMERIC':
								if DateUtil.isCellDateFormatted(cell):
									value =  cell.dateCellValue
								else:
									value = cell.getNumericCellValue()
									if value == int(value):
										value = int(value)
							elif cellType == 'STRING':
								value = cell.getStringCellValue()
							elif cellType == 'BOOLEAN':
								value = cell.getBooleanCellValue()
							elif cellType == 'BLANK':
								value = None	
                            ##...... ETC.....

EDIT 3: read-only issue seems to NOT be resolved still.. reading up about java iostream module and appache aoi functions doesn't seem to indicate why this may be.. Everyone says that closing the filestream should do the trick,..

rberghammer · May 1, 2023, 6:11pm

@JordanCClark, I also want to thank you for sharing this!

For future visitors to this thread, I think I found a bug in this line:

headers = ['Col'+str(i) for i in range(firstCol, lastCol)]

The variable "i" should be changed to something else. For example:

headers = ['Col'+str(x) for x in range(firstCol, lastCol)]

"i" is used later in the code to check if the row is a header, but the previous for loop changes the value. I don't think this causes any big issues, but I noticed it when printing the value of "i" while testing.

JordanCClark · May 1, 2023, 6:23pm

Good catch!. Updated.

tupanaster · May 4, 2023, 2:08pm

I was testing this approach today. I was able to write values to some cells but when I wanted to do something else it "bugs out". I think the reason for it is that it doesn't close the file afterwards. If you try to delete or make a change (an save) from Windows/Excel, it says that Ignition is using the file.

Also, I was unable to get calculated values from Excel formulas. Maybe newer versions of openpyxl can solve those issues.

richardNZ15 · May 5, 2023, 3:55am

Yea i still couldn't get that part to work.. I somehow got it going where it wouldn't hold it as read only, as i said in an earlier edit, but it didn't seem to stick after more testing..

justinedwards.jle · May 5, 2023, 4:03am

One approach to solving this problem could be to create an independent instance of the file by copying it into some temporary directory, perform the file stream from there, and then, simply deleted it once the dataset import is complete. This could all be done via scripting. I sometimes use the ignition cache directory for temporary files.

tupanaster · May 5, 2023, 8:23am

Maybe it's possible to do it that way but I dropped openpyxl 2.6.4 because I wasn't able to get Excel formulas evaluated.

So I started testing Apache POI libraries. I thought I had to buy a module for it but I saw this post:

I'm using Ignition 8.1 so if it's already there and I don't need to get a new module for it (although it's not expensive), then I better use it. Plus, I should be able to evaluate Excel formulas based on my inputs. So' I'm trying that out today.

So far it has been easier to play around with Apache POI libraries than with openpyxl 2.6.4. @JordanCClark makes it some much easier with the functions he shared.

nminchin · May 6, 2023, 5:09am

What about the function from Paul's ignition extensions module?

tupanaster · May 9, 2023, 8:22am

I remembered reading somewhere that Paul didn't finish his module

But now I can see the date differences between posts

It would be a good idea to test it because Apache POI has a small issue; I couldn't recognize CONCAT as CONCATENATE. Not a deal breaker but just something to look out for.

@JordanCClark
Is there an idea behind making all numbers int? Although many things can be integers, I would argue that most numbers are floats/doubles.

if value == int(value):
    value = int(value)

nminchin · May 9, 2023, 8:30am

Well I can confirm that Paul's Extensions module does use org.apache.poi as the library. It looks like you might just need to update to the latest library
https://poi.apache.org/apidocs/dev/org/apache/poi/ss/formula/functions/TextFunction.html#CONCATENATE

JordanCClark · May 9, 2023, 8:38am

It's only a test for integer. It's also only a proof of concept, although an admittedly really good one.

Feel free to modify it to your needs.

tupanaster · May 9, 2023, 8:50am

I'm using the built-in libraries in Ignition 8.1.
If there is a way to update just Apache POI I'll skip it for now to leave everything as default.

Mihir_Bhandarkar · June 9, 2023, 6:19pm

Hey, you might have already answered this but does this work in perspective without installing any other libraries.

JordanCClark · June 9, 2023, 8:30pm

From earlier in this thread.My response is in the next post

Keyur_Parikh · July 5, 2023, 7:52pm

I am seeing the following error running this script on Vision. I have installed the POI JAR files in Ignition Common, Client, Gateway and Designer Folders.
Ignition version: 7.9.21

JordanCClark · July 5, 2023, 11:00pm

For 7.9 I used xlrd.