Is it possible to parse PDF files with a Python script in Ignition using a third-party library such as pdfminer, or with Jython + pdfbox?
It is not trivial, but it is possible.
You could create a module with the Ignition SDK, add the pdfbox jars to it, and then call its functions from Jython code.
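As a rough idea of what the Jython side could look like once the pdfbox classes are visible to the scripting environment, here is a minimal sketch. It assumes PDFBox 2.x package names (PDFTextStripper lives in a different package in 1.8), and extract_pdf_text is just a name chosen for illustration. I have not run this inside Ignition, and whether scripts can see the jars depends on how the module exposes them.

```python
def extract_pdf_text(path):
    """Return the text of the PDF at `path` using PDFBox (Jython only)."""
    # The imports sit inside the function so they are resolved only when
    # the function is called, i.e. when the PDFBox jars are on the
    # classpath. Package names assume PDFBox 2.x.
    from java.io import File
    from org.apache.pdfbox.pdmodel import PDDocument
    from org.apache.pdfbox.text import PDFTextStripper

    doc = PDDocument.load(File(path))
    try:
        # PDFTextStripper walks every page and returns one string.
        return PDFTextStripper().getText(doc)
    finally:
        # Always release the file handle, even if extraction fails.
        doc.close()
```

From a button script you would then call something like text = extract_pdf_text(r'C:\PyhtonScripts\HelloWorld.pdf').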
From a quick look (I have used pdfbox, not pdfminer), pdfminer appears to be a pure Python 2.4 library rather than a wrapper around native CPython extensions, which is what makes other third-party libraries incompatible with Ignition. So, if you have Ignition 7.7.x, you could add it as a third-party Python library. More info in inductiveautomation.com/forum/vi … 12153&f=50
Regards,
I have just tried exactly what you described, with pdfminer.
I added the pdfminer folder to pylib (…\Inductive Automation\Ignition\user-lib\pylib).
Ignition recognizes the new Python library (I get a message informing me that Ignition has found a new library in pylib).
I added the following sample script to a button without any problem:
from pdfminer.pdfparser import PDFParser
from pdfminer.pdfdocument import PDFDocument
from pdfminer.pdfpage import PDFPage
from pdfminer.pdfpage import PDFTextExtractionNotAllowed
from pdfminer.pdfinterp import PDFResourceManager
from pdfminer.pdfinterp import PDFPageInterpreter
from pdfminer.pdfdevice import PDFDevice

# Open a PDF file.
fp = open(r'C:\PyhtonScripts\HelloWorld.pdf', 'rb')
# Create a PDF parser object associated with the file object.
parser = PDFParser(fp)
# Create a PDF document object that stores the document structure.
# Supply the password for initialization (empty for an unencrypted PDF).
password = ''
document = PDFDocument(parser, password)
# Check if the document allows text extraction. If not, abort.
if not document.is_extractable:
    raise PDFTextExtractionNotAllowed
# Create a PDF resource manager object that stores shared resources.
rsrcmgr = PDFResourceManager()
# Create a PDF device object.
device = PDFDevice(rsrcmgr)
# Create a PDF interpreter object.
interpreter = PDFPageInterpreter(rsrcmgr, device)
# Process each page contained in the document.
for page in PDFPage.create_pages(document):
    interpreter.process_page(page)
Unfortunately, I get a Java runtime error:
ERROR [ActionAdapter-MainThread] Error executing script for event: actionPerformed
on component: Button 3.
Traceback (most recent call last):
  File "event:actionPerformed", line 5, in <module>
  File "C:\Users\fabrice.chaverot\.ignition\cache\gwlocalhost_8088_8043_main\C0\pylib\pdfminer\pdfinterp.py", line 8, in <module>
    from cmapdb import CMapDB, CMap
  File "C:\Users\fabrice.chaverot\.ignition\cache\gwlocalhost_8088_8043_main\C0\pylib\pdfminer\cmapdb.py", line 24, in <module>
    from encodingdb import name2unicode
  File "C:\Users\fabrice.chaverot\.ignition\cache\gwlocalhost_8088_8043_main\C0\pylib\pdfminer\encodingdb.py", line 5, in <module>
    from glyphlist import glyphname2unicode
java.lang.ClassFormatError: Invalid method Code length 85551 in class file pdfminer/glyphlist$py
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(Unknown Source)
at org.python.core.BytecodeLoader$Loader.loadClassFromBytes(BytecodeLoader.java:119)
at org.python.core.BytecodeLoader.makeClass(BytecodeLoader.java:37)
at org.python.core.BytecodeLoader.makeCode(BytecodeLoader.java:67)
at org.python.core.imp.createFromSource(imp.java:353)
at org.python.core.imp.loadFromSource(imp.java:578)
…
Hmm, pretty strange. I'm not an expert, but the JVM specification limits the bytecode of a single method to 64 KB (65535 bytes), and the error reports a code length of 85551. Jython compiles each Python module to a Java class, so a module with a lot of top-level code, as pdfminer's glyphlist.py seems to have, probably ends up as one method that is too big to load.
I can't think of a quick workaround for this. Maybe more expert Jython users or someone from IA could help you.
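One idea that might be worth trying, though I have not tested it in Ignition: keep the big glyph table out of the compiled code by moving it into a plain text file and building the dictionary at import time. The sketch below assumes a file with one "name;HEX [HEX ...]" entry per line, in the style of Adobe's glyphlist.txt. The names parse_glyph_lines and load_glyph_table are made up for illustration, and you would still have to edit pdfminer's glyphlist.py so that glyphname2unicode is built by a call like this instead of by a huge literal.

```python
try:
    unichr            # Python 2 / Jython
except NameError:
    unichr = chr      # Python 3 has no unichr


def parse_glyph_lines(lines):
    """Build a glyph-name -> unicode-string dict from 'name;HEX [HEX ...]' lines."""
    table = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith('#'):
            continue
        name, codes = line.split(';', 1)
        # A glyph can map to several code points, separated by spaces.
        table[name] = u''.join([unichr(int(code, 16)) for code in codes.split()])
    return table


def load_glyph_table(path):
    """Read the glyph file at `path` (hypothetical location) into a dict."""
    fp = open(path, 'r')
    try:
        return parse_glyph_lines(fp)
    finally:
        fp.close()
```

Because the table is now data read in a loop, the compiled module only contains a few small methods, which should stay well under the 64 KB limit.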
You still have the pdfbox approach…
Regards,
Yes, I am going to look at the pdfbox approach. I have just downloaded the SDK. Thanks for your advice. Regards