Copying an Excel file to a table's dataset

pturmel · July 5, 2023, 11:02pm

But v7.9 has jython 2.5, not 2.7?

JordanCClark · July 5, 2023, 11:07pm

I'm pretty sure I started using it in 7.9. I'm also coming off an 18 hour work stint, so my memory may not be accurate. There were a lot of projects being juggled.

kssp · November 22, 2023, 4:15pm

Hey, Jordan. I am trying to update the data of a perspective table with the values that are returned, rather than printing it to an external excel sheet. I'm not quite understanding how I can return these values to the table from the script. I tried setting the data property equal to the system.dataset.toDataset value returned, but it made no change. Am I missing something here? Or is it not possible with the current layout? Thanks!!

Transistor · November 22, 2023, 4:20pm

Show your script! Paste as code and format it with the </> button.

kssp · November 22, 2023, 4:43pm

It's the same as above, with the data property of the blank table set to the originally returned value.

def runAction(self, event):
	def excelBytesToDataSet(bytesIn, hasHeaders = True):
		import org.apache.poi.ss.usermodel.WorkbookFactory as WorkbookFactory
		import org.apache.poi.ss.usermodel.DateUtil as DateUtil
		from java.io import ByteArrayInputStream
	
		fileStream = ByteArrayInputStream(bytesIn)
		wb = WorkbookFactory.create(fileStream)
		
		sheet = wb.getSheetAt(sheetNum)
	
		if firstRow is None:
			firstRow = sheet.getFirstRowNum()
		if lastRow is None:
			lastRow = sheet.getLastRowNum()
	
		data = []
		for i in range(firstRow , lastRow + 1):
			row = sheet.getRow(i)
			print str(i).zfill(3), list(row)
			if i == firstRow:
				if firstCol is None:
					firstCol = row.getFirstCellNum()
	
				if lastCol is None:
					lastCol  = row.getLastCellNum()
				else:
					# if lastCol is specified add 1 to it.
					lastCol += 1
				if hasHeaders:
					headers = list(row)[firstCol:lastCol]
				else:
					headers = ['Col'+str(cNum) for cNum in range(firstCol, lastCol)] 
			rowOut = []
			for j in range(firstCol, lastCol):
				if i == firstRow and hasHeaders:
					pass
				else:
					cell = row.getCell(j)
					cellType = cell.getCellType().toString()
					if cellType == 'NUMERIC':
						if DateUtil.isCellDateFormatted(cell):
							value =  cell.dateCellValue
						else:
							value = cell.getNumericCellValue()
							if value == int(value):
								value = int(value)
					elif cellType == 'STRING':
						value = cell.getStringCellValue()
					elif cellType == 'BOOLEAN':
						value = cell.getBooleanCellValue()
					elif cellType == 'BLANK':
						value = None	
					elif cellType == 'FORMULA':
						formulatype=str(cell.getCachedFormulaResultType())
						if formulatype == 'NUMERIC':
							if DateUtil.isCellDateFormatted(cell):
								value =  cell.dateCellValue
							else:
								value = cell.getNumericCellValue()
								if value == int(value):
									value = int(value)
						elif formulatype == 'STRING':
							value = cell.getStringCellValue()
						elif formulatype == 'BOOLEAN':
							value = cell.getBooleanCellValue()
						elif formulatype == 'BLANK':
							value = None
					else:
						value = None	
					rowOut.append(value)
			if len(rowOut) > 0:
				data.append(rowOut)
	
		fileStream.close()
		
		self.getSibling("Table").props.data = system.dataset.toDataSet(headers, data)

bkarabinchak.psi · November 22, 2023, 6:19pm

The only thing your runAction is doing here is defining a function but you never actually call excelBytesToDataSet with parameters (bytesIn, hasHeaders=True). I would recommend putting this in a project library script anyways, have it return the data, and on the runAction, get the bytesIn from wherever makes sense in your application, feed it your function, get the data returned from the function, then assign that to your table.props.data.

kssp · November 22, 2023, 6:46pm

Thank you, will work on implementing this soon!

joel_houck · December 12, 2023, 10:35pm

Great work. I can successfully store my excel books in SQL and access them just fine. I am running into "TypeError: Unable to convert row 2, column 0 to type class java.lang.String" Still on empty cells even with the filtering. As long as all columns are equal it doesn't seem to matter though. I'm digging through the POI trying tons of things and keep coming up with the error.

Error running action 'component.onActionPerformed' on EAMDropBox@D/root/Button: Traceback (most recent call last): File "<function:runAction>", line 19, in runAction File "<module:func>", line 61, in excelToDataSet AttributeError: 'NoneType' object has no attribute 'getCellType'

joel_houck · December 12, 2023, 10:38pm

I should mention this is with byte data stream in perspective grabbing the bytes for only the indicated book. Works flawlessly with equal columns. Download is much simpler but showing a preview of the data in a table is much more challenging apparently.

PGriffith · December 12, 2023, 10:54pm

Without seeing your exact code no one can do anything but guess. Reading the error message tells me that on line 61 something is null (NoneType) that the code doesn't expect to be.

joel_houck · December 13, 2023, 2:40am

I apologize, it is largely similar to the code in this thread. I included two attempts of filtering that don't get the job done.

def excelToDataSet(fileName, hasHeaders = False, sheetNum = 0, firstRow = None, lastRow = None, firstCol = None, lastCol = None):
	import org.apache.poi.ss.usermodel.WorkbookFactory as WorkbookFactory
	import org.apache.poi.ss.usermodel.DateUtil as DateUtil
	#from java.io import FileInputStream
	from java.io import ByteArrayInputStream
	from java.util import Date
	

	fileStream = ByteArrayInputStream(fileName)

	wb = WorkbookFactory.create(fileStream)
	
	sheet = wb.getSheetAt(sheetNum)

	if firstRow is None:
		firstRow = sheet.getFirstRowNum()
	if lastRow is None:
		lastRow = sheet.getLastRowNum()

	data = []
	for i in range(firstRow , lastRow + 1):
		row = sheet.getRow(i)
		print str(i).zfill(3), list(row)
		if i == firstRow:
			if firstCol is None:
				firstCol = row.getFirstCellNum()

			if lastCol is None:
				lastCol  = row.getLastCellNum()
			else:
				# if lastCol is specified add 1 to it.
				lastCol += 1
			if hasHeaders:
				headers = list(row)[firstCol:lastCol]
			else:
				headers = ['Col'+str(cNum) for cNum in range(firstCol, lastCol)] 
		rowOut = []
		for j in range(firstCol, lastCol):
			if i == firstRow and hasHeaders:
				pass
			else:				
				cell = row.getCell(j)
				if cell == None:
					pass
				if cell == 'Blank':
					pass
				else:
					
					cellType = cell.getCellType().toString()
					
					if cellType == 'NUMERIC':
						if DateUtil.isCellDateFormatted(cell):
							value =  cell.dateCellValue
						else:
							value = cell.getNumericCellValue()
							if value == int(value):
								value = int(value)
					elif cellType == 'STRING':
						value = cell.getStringCellValue()
					elif cellType == 'BOOLEAN':
						value = cell.getBooleanCellValue()
					elif cellType == 'BLANK':
						value = None	
					elif cellType == 'FORMULA':
						formulatype=str(cell.getCachedFormulaResultType())
						if formulatype == 'NUMERIC':
							if DateUtil.isCellDateFormatted(cell):
								value =  cell.dateCellValue
							else:
								value = cell.getNumericCellValue()
								if value == int(value):
									value = int(value)
						elif formulatype == 'STRING':
							value = cell.getStringCellValue()
						elif formulatype == 'BOOLEAN':
							value = cell.getBooleanCellValue()
						elif formulatype == 'BLANK':
							value = None
					else:
						value = None
					rowOut.append(value)
		if len(rowOut) > 0:
			data.append(rowOut)

	fileStream.close()
	
	return system.dataset.toDataSet(headers, data)

lrose · December 13, 2023, 4:06am

I believe if I am reading the documentation correctly, if a cell has never had content in it then it may be returned as NULL (depending on the Excel document type and version), and in that case the getCell() function returns null.

There is an override that lets you specify what happens if the cell is not defined. You could then use that as your filter. Or, since there is already a case for if the cellType is blank then you wouldn't need to filter for it to begin with.

Also, I would recommend using the Enum to do the comparisons (perhaps that has changed since Jordan wrote this originally?)

something like:

cell = row.getCell(j,Row.CREATE_NULL_AS_BLANK)

if not cell.cellType.equals(CellType.BLANK):
    cellType = cell.cellType
    if cellType.equals(CellType.NUMERIC):
        etc....

NOTE: This is untested.

You might find this interesting:

https://poi.apache.org/components/spreadsheet/quick-guide.html#Iterator

JordanCClark · December 13, 2023, 11:36am

A sample of the excel file would help. I may be able to look at this later today.

joel_houck · December 13, 2023, 2:15pm

Vat_Options.xlsx (9.6 KB)
Here is one such option that fails at the first empty cell

joel_houck · December 13, 2023, 2:19pm

I see it in your link but however it doesn't appear to recognize using it in that exact manner.

lrose · December 13, 2023, 4:04pm

Like I said, the code was untested. You have to traverse through the MissingCellPolicy Enum, apparently?

Anyway, I ran this modification of the function on the work book you provided and it works as expected.

First a few notes:

Make sure that your work book columns do not have mixed data types, (i.e. String, Double). In the work book you provided the column for peel thickness row 6 was the value 0.191., row 7 was 0.210. and row 8 was "Test". Those are strings and the conversion to dataset will fail because columns can only contain one type. If it needs to be mixed, then they should all be strings.
I changed from java.io.ByteArrayInputStream() to java.io.FileInputStream() because with the former I was getting an IllegalArgumentException claiming Your InputStream was neither an OLE2 stream, nor an OOXML stream, using a FileInputStream resolved that issue.
I modified the cell filter section "significantly" to combine flow control paths, and make it a bit more readable, imo. The logic is exactly the same, just does things in a different order that allows for some syntactic sugar to combine paths.
I also went through and used Jython's optimized property access where possible. The result will be better performance, but unless the file is very, very large probably doesn't really make a difference. I just like to do things the same way, with the best practices (assuming that I know about them).

from org.apache.poi.ss.usermodel import WorkbookFactory as WorkbookFactory, DateUtil as DateUtil, Row, CellType
from java.io import FileInputStream
from java.util import Date

def excelToDataSet(fileName, hasHeaders = False, sheetNum = 0, firstRow = None, lastRow = None, firstCol = None, lastCol = None):
	fileStream = FileInputStream(fileName)

	wb = WorkbookFactory.create(fileStream)
	
	sheet = wb.getSheetAt(sheetNum)

	if firstRow is None:
		firstRow = sheet.firstRowNum
	if lastRow is None:
		lastRow = sheet.lastRowNum

	data = []
	for i in range(firstRow , lastRow + 1):
		row = sheet.getRow(i)
		if i == firstRow:
			if firstCol is None:
				firstCol = row.firstCellNum

			if lastCol is None:
				lastCol  = row.lastCellNum
			else:
				# if lastCol is specified add 1 to it.
				lastCol += 1
			if hasHeaders:
				headers = list(row)[firstCol:lastCol]
			else:
				headers = ['Col'+str(cNum) for cNum in range(firstCol, lastCol)] 
		rowOut = []
		for j in range(firstCol, lastCol):
			if not (i == firstRow and hasHeaders):
				cell = row.getCell(j,Row.MissingCellPolicy.CREATE_NULL_AS_BLANK)
				
				cellTypes = (cell.cellType,cell.cachedFormulaResultType if cell.cellType == CellType.FORMULA else None)				
				if CellType.NUMERIC in cellTypes:
					if DateUtil.isCellDateFormatted(cell):
						value =  cell.dateCellValue
					else:
						value = cell.numericCellValue
						if value == int(value):
							value = int(value)
				elif CellType.STRING in cellTypes:
					value = cell.stringCellValue
				elif CellType.BOOLEAN in cellTypes:
					value = cell.booleanCellValue
				elif CellType.BLANK in cellTypes:
					value = None
				else:
					value = None
				rowOut.append(value)
		if len(rowOut) > 0:
			data.append(rowOut)

	fileStream.close()
	
	return system.dataset.toDataSet(headers, data)

joel_houck · December 13, 2023, 5:49pm

Thank you. I'll see if I can just do it this way but I was originally using just the bytes portion which is why I had it set as such I did manage to filter out empty cells by using:

else:
					if cell is not None:
						cellType = cell.getCellType().toString()

Tamilvanan_S · January 31, 2024, 4:45pm

Hi I Need one help, How find last sheet num in Script?
Thanks in Advance

Tamilvanan_S · January 31, 2024, 4:47pm

How to Read multiple sheets in one Dataset?
Thanks in Advance

JordanCClark · January 31, 2024, 5:01pm

wb.sheetIterator() should let you iterate over the all the sheets in the workbook.