Tuesday, July 12, 2011

Python for non-programmers, part 1

A good friend of mine (not an engineer) recently faced a problem. Customers sent in data samples (usually as comma-separated files) that needed to be processed and/or validated for conformance to a specification. My friend spent hours importing the files into spreadsheets and going through them manually. The data was large for a human (a few thousand rows), but small for a computer. My friend had some basic programming skills from college and early jobs, but over the years, those had become a bit rusty. I wondered if I could help, so here goes...

The objectives:
  • Give my friend all the information needed with as little theory as possible
  • Focus on the specific use case
  • Keep it simple (no list comprehensions or any of that stuff)
  • Work on Windows machines with minimal system modifications
The final result will hopefully be a chain of simple exercises to get this person up to speed.

Lesson 1: Install Python, read and write a CSV file:

  • Install Python, version 2, from http://python.org/download/. Currently, the latest installer can be found here. Remember the folder where Python was installed.
  • Optionally, install an editor like Notepad++ (otherwise, use regular notepad)
  • Create a folder, for example c:\example
  • Add a file called data.txt with the following three lines of content:
Row A,Row B,Row C
Hello,1,2
World,3,4
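  • Add a file called csvcopy.py with the following content (the leading whitespace before out_csv.writerow is important, because it marks that line as part of the loop):
import csv
# Open the input file and read it as comma-separated values
infile = open('data.txt', 'rb')
in_csv = csv.reader(infile, delimiter=',')
# Open the output file and prepare to write comma-separated values
outfile = open('data2.txt', 'wb')
out_csv = csv.writer(outfile, delimiter=',')
# Copy every row from the input to the output
for row in in_csv:
    out_csv.writerow(row)
# Close both files so everything is saved to disk
infile.close()
outfile.close()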
  • Open a command prompt and switch to the folder that contains data.txt and csvcopy.py.
  • Run "C:\Python\python.exe csvcopy.py" (adjust the path to match the folder where you installed Python).
  • Open the newly created file "data2.txt" and look at its content.
  • Exercise: rename data.txt to data.csv and modify the program to read data.csv and write the result into data2.csv
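
To see what the loop in csvcopy.py actually works with, it can help to print each row instead of writing it to a file. The following is only a small sketch and not part of the lesson's program: the variable names lines and rows are made up for illustration, and the three lines of data.txt are typed directly into the script so that it runs without any files. It should behave the same under Python 2 and Python 3.

```python
import csv

# The same three lines as data.txt, kept in a list so that this
# sketch needs no files (csv.reader accepts any sequence of lines).
lines = ['Row A,Row B,Row C', 'Hello,1,2', 'World,3,4']

# Collect the rows so they can be inspected after the loop.
rows = []
for row in csv.reader(lines, delimiter=','):
    rows.append(row)
    print(row)
```

Each row arrives as a list of strings, one string per column. Note that the numbers are strings too ('1', not 1), which will matter once the values have to be checked against a specification.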
