Using 7.9.4.
I have a task where I am supposed to send two CSV files to two similar but different locations. In one file the timestamps are in a normal human-readable format, and in the other they are in ISO format.
I convert the existing file with normal timestamps to ISO format with the following function. I had an issue where this was not actually writing the ISO-style format (hence all the debug prints), and I realized csv.writer was autoformatting the datetimes until I put a str() around them.
import csv
import datetime

# TIMESTAMP_COLUMNS (the column indexes to convert) is defined elsewhere, not shown.

def convertFileToIsoFormat(read_filePath, write_filePath):
    """
    This function takes an already created data text file, reads through it and
    converts the time stamp columns to isoformat.
    Args:
        read_filePath: str, file to read
        write_filePath: str, where to put the new file
    Returns:
        None, or raises error
    """
    with open(read_filePath) as csvfile:
        with open(write_filePath, 'wb') as newcsv:
            reader = csv.reader(csvfile)
            writer = csv.writer(newcsv)#lineterminator='\n'
            for row in reader:
                modifiedRow = row
                for column in TIMESTAMP_COLUMNS:
                    print column
                    print row[column]
                    # Note: strptime only applies %p (AM/PM) when the hour is parsed with %I, not %H.
                    modifiedRow[column] = str(datetime.datetime.strptime(modifiedRow[column],'%m/%d/%Y %H:%M:%S %p').isoformat())
                    print modifiedRow[column]
                print 'writing row %s'%(str(modifiedRow))
                writer.writerow(modifiedRow)
The above works for converting the entire file.
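For reference, here is a minimal sketch of the per-value conversion on its own, using a made-up sample value. The helper name to_iso, the SOURCE_FORMAT constant, and the sample timestamp are hypothetical and not taken from my real data. It assumes the source hours are on a 12-hour clock, which is why it parses them with %I: strptime only uses the AM/PM marker (%p) to adjust the hour when the hour is read with %I.

```python
import datetime

# Assumption: the source timestamps use a 12-hour clock with an AM/PM marker.
SOURCE_FORMAT = '%m/%d/%Y %I:%M:%S %p'


def to_iso(timestamp_text, source_format=SOURCE_FORMAT):
    """Parse one human-readable timestamp and return it as an ISO 8601 string."""
    parsed = datetime.datetime.strptime(timestamp_text, source_format)
    # isoformat() already returns a plain string, e.g. '2021-03-15T14:30:45'.
    return parsed.isoformat()
```

With this sketch, to_iso('03/15/2021 02:30:45 PM') returns '2021-03-15T14:30:45', while passing the %H-based format from my function returns '2021-03-15T02:30:45' for the same input, because the PM marker is ignored.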
Now, I have to split the file into 3 MB chunks. This is where my ISO-formatted datetimes are getting converted back into human-readable timestamps, but I just want to keep them in ISO format. Here is my current function:
import os

def splitFile(fileToProcess, folderToStoreTo):
    """
    Takes a data file and splits it into <3MB chunks
    Args:
        fileToProcess: str, file path to split into chunks
        folderToStoreTo: str, destination directory to put split files into
    Return:
        filePaths: list of strings of filepaths of texts that were made
    """
    maxLinesPerFile=23000
    LOGGER.info("fileToProcess %s folderToStore %s"%(str(fileToProcess), str(folderToStoreTo)))
    currentLineCount = 0
    readFileName = fileToProcess.split('\\')[-1]
    filesMade = 1
    filePaths = []
    with open(fileToProcess) as f:
        line = f.readline()
        while line:
            filename = readFileName.split('.')[0]+'-%i'%(filesMade)+'.txt'
            fullPath = folderToStoreTo + "\\" + filename
            filePaths.append(fullPath)
            for l in range(0,maxLinesPerFile):
                # print line
                # readline() keeps the trailing newline, so this appends a second line ending
                writeData = "%s%s"%(line, os.linesep)
                system.file.writeFile(fullPath, writeData, 1)
                line = f.readline()
                currentLineCount += 1
                if not line:
                    break
            filesMade += 1
            print type(line), line, str(line)
    return filePaths
I don't know why this is reading the values as ISO and reformatting them back to a regular timestamp. You can probably recreate this if you use my second function and feed it a CSV file that has ISO-formatted timestamps. I have no idea what is going wrong here. The only thing I can think of is that my with open(fileToProcess) is doing something behind the scenes to format the date, or system.file.writeFile is doing something when processing the dates. I've tried changing my with open(fileToProcess) as f: and I tried changing writeData = "%s%s"%(line, os.linesep), since a well-placed str() fixed my previous issue, but neither worked.
I am out of ideas. Something is obviously going on in the background, but I don't know where, so I am having trouble figuring out how to stop it. Any insight is appreciated.
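One way I could narrow this down, sketched under assumptions: do the same split with plain file objects only, so that system.file.writeFile is out of the picture. The function name split_file_plain and its parameters are hypothetical, and it splits by line count like my function above rather than by measured size. Every line is copied verbatim, so nothing in this version should parse or reformat a timestamp.

```python
import os


def split_file_plain(file_to_process, folder_to_store_to, max_lines_per_file=23000):
    """Split a text file into numbered chunks using only built-in file objects.

    Lines are copied unchanged. Returns the list of chunk paths written.
    """
    base_name = os.path.splitext(os.path.basename(file_to_process))[0]
    file_paths = []
    chunk = None
    try:
        with open(file_to_process, 'r') as source:
            for line_number, line in enumerate(source):
                # Start a new chunk file every max_lines_per_file lines.
                if line_number % max_lines_per_file == 0:
                    if chunk is not None:
                        chunk.close()
                    chunk_name = '%s-%i.txt' % (base_name, len(file_paths) + 1)
                    chunk_path = os.path.join(folder_to_store_to, chunk_name)
                    file_paths.append(chunk_path)
                    chunk = open(chunk_path, 'w')
                # The line already ends with its newline, so write it as-is.
                chunk.write(line)
    finally:
        # Close the last open chunk even if reading or writing failed.
        if chunk is not None:
            chunk.close()
    return file_paths
```

If chunks written this way still appear to have reformatted timestamps, checking a raw line with repr() from both the source file and a chunk would show whether the stored text really differs or whether only the program used to view the file is displaying it differently.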