Friday, July 22, 2011

Python for non programmers, part 4

In the last lesson, we learned how to filter out rows with invalid input. However, deleting an entire row is not always desirable. The following variation intercepts problems on a per-value basis and writes the word Invalid in place of any result that cannot be computed:


# Import libraries
import csv
import operator

# Open files and csv readers/writers
infile = open('data.txt', 'rb')
in_csv = csv.reader(infile, delimiter=',')
outfile = open('data2.txt', 'wb')
out_csv = csv.writer(outfile, delimiter=',')

# Process values from a row and intercept errors
def process(row, indices, converter, function):
    try:
        values = [converter(row[i]) for i in indices]
        return reduce(function, values)
    except (ValueError, IndexError):
        return 'Invalid'

# Go through the rows
is_header = True
for row in in_csv:
    if is_header:
        row = row + ['Sum', 'Max', 'Description']
        is_header = False
    else:
        row.append(process(row, [1, 2], float, operator.add))
        row.append(process(row, [1, 2], int, max))
        row.append(process(row, [1, 2], str,
                           lambda x, y: 'Value pair of %s and %s' % (x, y)))
    out_csv.writerow(row)
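
To see how the error interception behaves without touching any files, here is a small sketch that calls process directly on a few in-memory rows. The sample rows (good_row, bad_value_row, short_row) are made up for illustration and are not part of data.txt. The sketch also imports reduce from functools, which is an assumption on my part so that the snippet runs on Python 3 as well as on Python 2.6 and later.

```python
import operator
from functools import reduce  # built in on Python 2; must be imported on Python 3

# Same function as in the program above
def process(row, indices, converter, function):
    try:
        values = [converter(row[i]) for i in indices]
        return reduce(function, values)
    except (ValueError, IndexError):
        return 'Invalid'

# Hypothetical sample rows, invented for this illustration
good_row = ['a', '1', '2']
bad_value_row = ['b', '1.5', 'x']   # 'x' is not a number
short_row = ['c', '3']              # third column is missing

# Both values convert cleanly, so the function is applied to them
print(process(good_row, [1, 2], float, operator.add))       # 3.0
print(process(good_row, [1, 2], int, max))                  # 2

# float('x') raises ValueError, which is intercepted
print(process(bad_value_row, [1, 2], float, operator.add))  # Invalid

# row[2] does not exist, so IndexError is intercepted
print(process(short_row, [1, 2], float, operator.add))      # Invalid

# str() accepts any text, so the description still works for 'x'
describe = lambda x, y: 'Value pair of %s and %s' % (x, y)
print(process(bad_value_row, [1, 2], str, describe))        # Value pair of 1.5 and x
```

Note that each call is judged on its own: the same row can produce Invalid in one new column and a normal value in another.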


Your mission, should you choose to accept it:


  • Try out the program. Is the output as you would have expected?

  • Add two more columns to each row. The first should show the difference between the first and second column (a-b); the second should show the maximum of a/10 and b/11=5.

  • Modify the program to filter out any incoming row that has fewer than 3 columns. Hint: use len(row) to get the number of columns.
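
As a small nudge for the last task, the sketch below only shows what len(row) reports for rows of different lengths; the filtering itself is left to you. The list named lines is hypothetical sample data standing in for data.txt. It relies on the fact that csv.reader accepts any sequence of text lines, not just an open file.

```python
import csv

# Hypothetical sample data, standing in for the contents of data.txt
lines = ['name,a,b', 'x,1,2', 'y,3', 'z']

# csv.reader splits each line into a list of column values;
# len(row) is the number of columns in that row
counts = [len(row) for row in csv.reader(lines, delimiter=',')]
print(counts)  # [3, 3, 2, 1]
```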
