Comments on Convert

From Michael Boldin (WRDS):

The dictionary in a dictionary was a clever trick. I did change

if re.compile("-?\d+(\.\d+)?").match(values[3]):

to be more general, using ret instead of values[3].

Also, in CRSP extracts with a long enough date range, tickers are missing for a few PERMNOs (read in as NULL, I think), so I used

if ticker and re.compile("-?\d+(\.\d+)?").match(ret):
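
As a rough sketch of the test above (not the original script): the function names is_usable and is_usable_strict and the sample values are made up for illustration, and ret is assumed to be a string. The pattern is written as a raw string and compiled once rather than on every row. Note that match() only anchors at the start of the string, so fullmatch() is the stricter choice if trailing characters should be rejected.

```python
import re

# Same pattern as in the comment, as a raw string, compiled once.
NUMBER = re.compile(r"-?\d+(\.\d+)?")

def is_usable(ticker, ret):
    """True when ticker is non-empty and ret (a string) starts with a number."""
    return bool(ticker) and NUMBER.match(ret) is not None

def is_usable_strict(ticker, ret):
    """Like is_usable, but the whole of ret must be a number."""
    return bool(ticker) and NUMBER.fullmatch(ret) is not None

# Hypothetical rows: a normal return, a missing ticker, a letter code,
# and a number followed by a stray character.
samples = [("ABC", "-0.0123"), (None, "0.0100"), ("ABC", "C"), ("ABC", "0.05x")]
for ticker, ret in samples:
    print(ticker, ret, is_usable(ticker, ret), is_usable_strict(ticker, ret))
```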

Another option is to use

isinstance(ret,(int,float))

to test for a true number, although this only works once ret has been converted from text to a numeric type.
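
A small illustration of that caveat, with made-up values: a field read from a text file is still a string, so isinstance() reports False until it is converted. The helper to_number is hypothetical and swaps in a float() conversion, which is a different technique from either test in the comments. Be aware that float() also accepts strings such as "nan" and "inf".

```python
def to_number(text):
    """Return float(text), or None when text cannot be converted."""
    try:
        return float(text)
    except (TypeError, ValueError):
        # TypeError covers None; ValueError covers strings like "C".
        return None

raw = "0.0123"  # hypothetical field as read from a text file

print(isinstance(raw, (int, float)))             # string, so not a number yet
print(isinstance(to_number(raw), (int, float)))  # converted value is a float
print(to_number("C"), to_number(None))           # both fail to convert
```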

Finally, I am fairly sure that opening a file as 'rb' avoids any need to worry about \n line breaks. It is probably not worth using for this case, but Python has a nice csv module for reading delimited text files. I thought it could either read in large chunks or iterate row by row, but looking at the documentation I am not sure about reading blocks of rows. I have used the writerows() method of a csv writer object as a very fast way to write blocks of rows.
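
A minimal sketch of how this could look with the csv module, assuming Python 3 (where the csv documentation recommends opening files with newline='' rather than 'rb'). csv.reader only iterates row by row, so the hypothetical helper read_blocks groups rows with itertools.islice. The column names and data are invented, and in-memory StringIO objects stand in for real files.

```python
import csv
import io
from itertools import islice

def read_blocks(reader, size):
    """Yield lists of up to `size` rows from a csv.reader."""
    while True:
        block = list(islice(reader, size))
        if not block:
            return
        yield block

# Hypothetical extract; a real file would be open(path, newline="").
text = (
    "permno,date,ticker,ret\n"
    "10001,20200131,ABC,0.0123\n"
    "10001,20200228,ABC,-0.0045\n"
    "10002,20200131,,C\n"
)
source = io.StringIO(text, newline="")
reader = csv.reader(source)
header = next(reader)  # first row holds the column names

out = io.StringIO(newline="")
writer = csv.writer(out, lineterminator="\n")
writer.writerow(header)
for block in read_blocks(reader, 2):
    writer.writerows(block)  # write a whole block of rows in one call

print(out.getvalue())
```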

