A friend of mine had a problem today (with Python) which boils down to the following piece of code:
- f = open('input.txt')
- c = f.readlines()
- # here were different operations
- # on c, but that's not really important
- print len(c)
Don't ask why readlines() is there in the first place; that's not the problem.
The problem was that c had the wrong number of lines.
The answer, however, lies in Python's documentation:
Read until EOF using readline() and return a list containing the lines thus read. If the optional sizehint argument…
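Unfortunately, the file contained some random bytes, and among them was a character with the hex value 0x1A. On Windows, a file opened in text mode is treated as finished when that byte is reached, so readlines() never saw the rest. Now let me quote everyone's favourite source of information - Wikipedia: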
In the MS-DOS operating system, this character is used to indicate the end of a file or the end of user input in an interactive command line window. This behavior was borrowed from the earlier CP/M operating system.
Some operating systems such as the pre-VMS DEC operating systems, along with CP/M, tracked file length only in units of disk blocks and used Control-Z (SUB) to mark the end of the actual text in the file. For this reason, EOF, or end-of-file, was used colloquially and conventionally as a TLA for Control-Z instead of SUBstitute.
There is a simple "solution" to this problem: open the file in binary mode, so that 0x1A is read as an ordinary byte:
- f = open('input.txt', 'rb')
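To see the difference without the original file, here is a minimal sketch. The sample bytes and the helper names (read_lines_binary, lines_before_sub) are made up for illustration. The truncation at 0x1A is emulated by hand, on the assumption that a text-mode reader stops at the first such byte, because Python 3 does not treat 0x1A specially when reading.

```python
import os
import tempfile

SUB = b"\x1a"  # Control-Z, the CP/M / MS-DOS end-of-text marker


def read_lines_binary(path):
    """Read all lines in binary mode, so 0x1A is just another byte."""
    with open(path, "rb") as f:
        return f.readlines()


def lines_before_sub(raw):
    """Emulate (assumption) a reader that stops at the first 0x1A byte."""
    return raw.split(SUB, 1)[0].splitlines(True)


# Hypothetical sample data: three lines, with 0x1A inside the second one.
sample = b"first\nsecond\x1athird\nfourth\n"

fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(sample)
    all_lines = read_lines_binary(path)
finally:
    os.remove(path)

truncated = lines_before_sub(sample)
print("binary mode: %d lines, stopping at 0x1A: %d lines"
      % (len(all_lines), len(truncated)))
```

In binary mode all three lines come back. The emulated reader sees only "first" and the fragment "second", which is the kind of wrong count my friend was getting.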
Since CP/M was mentioned twice here, I can't resist quoting the famous words of Gary Kildall, a man I've always regarded as a brilliant visionary (but unfortunately a rather poor businessman). The quote relates to int 21h/AH=09h:
"Ask Bill [Gates] why the string in function 9 is terminated by a dollar sign. Ask him, because he can't answer, only I know that."