History Import using system.tag.storeTagHistory

Hi all,
We are working on importing data from a different system into our Enterprise Historian.
That particular system can generate CSV files for us in the following format:
DateTime,Value
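For example (values invented here purely for illustration; the timestamps follow the MM-dd-yyyy HH:mm:ss pattern the script parses):

01-15-2020 08:30:00,42.7
01-15-2020 08:31:00,42.9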

I have code written and debugged; I am just wondering if there is a more efficient way of doing this.
The CSV files can have up to 150,000 lines. The code I am using parses and inserts a file in under 5 seconds.
But it just seems like a clunky way of doing it.
Our Historians partition by month, which is why I am going the scripting route instead of inserting directly into the database.

import csv
path = system.file.openFile("csv")
reader = csv.reader(open(path, 'rb'))

def gen_chunks(reader, chunksize=1000):
    # Yield the reader's rows in lists of up to `chunksize` rows.
    # The same list object is reused between yields, which is safe
    # here because each chunk is fully consumed before the next yield.
    chunk = []
    for i, line in enumerate(reader):
        if i % chunksize == 0 and i > 0:
            yield chunk
            del chunk[:]
        chunk.append(line)
    yield chunk

prov = 'EntHistory'
selectedPath = event.source.parent.getComponent('tagPath').text.split(']')[1]
x = 0  # chunk counter, used to skip the header row once
for chunk in gen_chunks(reader, chunksize=1000):
    vals = []
    paths = []
    quals = []
    ts = []
    # Drop the first row of the CSV, as it is just headers
    if x < 1:
        del chunk[0]

    for row in chunk:
        paths.append(selectedPath)
        d = system.date.parse(row[0], "MM-dd-yyyy HH:mm:ss")
        gd = system.date.format(d, "yyyy-MM-dd HH:mm:ss")
        vals.append(float(row[1]))
        quals.append(192)
        ts.append(gd)
    x += 1
    # storeTagHistory(historyProvider, tagProvider, paths, values, qualities, timestamps)
    system.tag.storeTagHistory(prov, 'default', paths, vals, quals, ts)
# Put a DS in a table on the screen, just for debugging and confirmation
# of values (the lists are reset each iteration, so only the final chunk
# appears here)
hdr = ['tag', 'val', 'qual', 'date']
fullDS = zip(paths, vals, quals, ts)
event.source.parent.getComponent('Power Table 3').data = system.dataset.toDataSet(hdr, fullDS)

The code looks pretty good to me. You might get some improvement by using a more efficient generator function to read the file and then splitting that generator into chunks:

import csv
import itertools

def readFile(filename):
    with open(filename, "rb") as csvfile:
        datareader = csv.reader(csvfile)
        next(datareader)  # skip the header row
        for row in datareader:
            yield row

def chunks(iterable, size=1000):
    iterator = iter(iterable)
    for first in iterator:
        yield itertools.chain([first], itertools.islice(iterator, size - 1))

prov = 'EntHistory'
path = system.file.openFile("csv")

selectedPath = event.source.parent.getComponent('tagPath').text.split(']')[1]
fullDS = []
for chunk in chunks(readFile(path)):
    vals = []
    paths = []
    quals = []
    ts = []

    for row in chunk:
        paths.append(selectedPath)
        d = system.date.parse(row[0], "MM-dd-yyyy HH:mm:ss")
        gd = system.date.format(d, "yyyy-MM-dd HH:mm:ss")
        vals.append(float(row[1]))
        quals.append(192)
        ts.append(gd)

    system.tag.storeTagHistory(prov, 'default', paths, vals, quals, ts)
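
In case the itertools idiom is unfamiliar: each yielded chunk is a lazy iterator over the next size rows, so nothing is buffered into an intermediate list along the way. A quick standalone demonstration (plain Python, nothing Ignition-specific):

import itertools

def chunks(iterable, size=1000):
    # Same helper as above: each chunk lazily covers the next `size` items
    iterator = iter(iterable)
    for first in iterator:
        yield itertools.chain([first], itertools.islice(iterator, size - 1))

for chunk in chunks(range(10), size=4):
    print(list(chunk))
# prints [0, 1, 2, 3], then [4, 5, 6, 7], then [8, 9]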

Thanks!

Just for anyone else testing the code above: add

from __future__ import with_statement

at the top of Paul's code.
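
That is, the first lines of the script become:

from __future__ import with_statement  # enables the with statement on Jython 2.5
import csv
import itertools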


Good catch. I think it’s not required in 8.0, since we’re up to Jython 2.7, but I haven’t actually tested.
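
For anyone who wants to verify, the running Jython version can be printed from the script console:

import sys
print(sys.version)  # e.g. a 2.7.x string on Ignition 8.0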


How can I substitute the 'size' in def chunks(iterable, size=1000) with the CSV row count?
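
One way, as an untested sketch: count the data rows in a first pass (via a hypothetical countDataRows helper), then pass that count as size so the whole file lands in one chunk. Note that a single chunk also means a single storeTagHistory call, so you give up the batching:

def countDataRows(filename):
    # Hypothetical helper: number of lines, minus the header row.
    # (Assumes no quoted fields containing embedded newlines.)
    with open(filename, "rb") as csvfile:
        return sum(1 for line in csvfile) - 1

rowCount = countDataRows(path)
for chunk in chunks(readFile(path), size=rowCount):
    for row in chunk:
        pass  # same per-row processing as in the loops above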