
Bio::EUtils::ReseekFile Namespace Reference

Detailed Description

Wrap a file handle to allow seeks back to the beginning

Sometimes data coming from a socket or other input file handle isn't
what it was supposed to be.  For example, suppose you are reading from
a buggy server which is supposed to return an XML stream but can also
return an unformatted error message.  (This often happens because the
server doesn't handle incorrect input very well.)

A ReseekFile helps solve this problem.  It is a wrapper to the
original input stream but provides a buffer.  Read requests to the
ReseekFile get forwarded to the input stream, appended to a buffer,
then returned to the caller.  The buffer contains all the data read so
far.

The ReseekFile can be told to reseek to the start position.  The next
read request will come from the buffer, until the buffer has been
read, in which case it gets the data from the input stream.  This
newly read data is also appended to the buffer.
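
The buffering scheme described above can be sketched roughly as
follows.  This is a minimal illustration written for Python 3, not the
ReseekFile source: the class name MiniReseek and its attributes are
hypothetical, and only read() and a seek back to the start are modelled.

```python
import io


class MiniReseek:
    """Hypothetical sketch: remember everything read so the caller can rewind."""

    def __init__(self, stream):
        self.stream = stream      # wrapped input stream (may not be seekable)
        self.buffer = b""         # all data read from the stream so far
        self.pos = 0              # current read position within the buffer

    def seek(self, offset):
        # Only rewinding to the start is modelled in this sketch.
        if offset != 0:
            raise ValueError("this sketch can only seek back to the start")
        self.pos = 0

    def read(self, size):
        # Serve as much as possible from the buffer first.
        data = self.buffer[self.pos:self.pos + size]
        missing = size - len(data)
        if missing > 0:
            # Buffer exhausted: fetch the rest from the stream and remember it.
            fresh = self.stream.read(missing)
            self.buffer += fresh
            data += fresh
        self.pos += len(data)
        return data


wrapped = MiniReseek(io.BytesIO(b"<?xml version='1.0'?><doc/>"))
head = wrapped.read(5)        # comes from the stream, appended to the buffer
wrapped.seek(0)               # rewind to the start position
replayed = wrapped.read(10)   # 5 bytes from the buffer, 5 more from the stream
```

Keeping a single growing buffer is the simplest way to make a rewind
possible on a stream that cannot seek, at the cost of holding every
byte read so far in memory.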

When buffering is no longer needed, use the 'nobuffer()' method.  This
tells the ReseekFile that once it has read from the buffer it should
throw the buffer away.  After nobuffer is called, the behaviour of
'seek' is no longer defined.
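
One way to picture the state after nobuffer() is a reader that hands
out the remaining buffered bytes once and then passes reads straight
through.  The sketch below is an assumption about the general
technique, not the ReseekFile implementation; the class DrainingReader
and its attribute names are hypothetical.

```python
import io


class DrainingReader:
    """Hypothetical sketch of the state after a nobuffer()-style call."""

    def __init__(self, leftover, stream):
        self.leftover = leftover  # buffered bytes not yet re-read
        self.stream = stream      # wrapped input stream

    def read(self, size):
        # Hand out buffered bytes first, dropping each piece once returned.
        data = self.leftover[:size]
        self.leftover = self.leftover[size:]
        if len(data) < size:
            # Buffer is gone: read straight from the stream, keeping no copy.
            data += self.stream.read(size - len(data))
        return data


stream = io.BytesIO(b"<?xml version='1.0'?><doc/>")
peeked = stream.read(5)                   # bytes consumed while peeking
reader = DrainingReader(peeked, stream)   # replay them once, then pass through
first = reader.read(8)
rest = reader.read(100)
```

The sketch offers no seek at all: once bytes have been dropped there is
nothing left to rewind to, which is consistent with seek being
undefined after nobuffer.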

For example, suppose you have the server as above which gives either
an error message of the form:

  ERROR: cannot do that

or an XML data stream, starting with "<?xml".

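  import urllib2
  from Bio.EUtils import ReseekFile

  infile = urllib2.urlopen("http://somewhere/")
  infile = ReseekFile.ReseekFile(infile)   # wrap so the stream can be rewound
  s = infile.readline()                    # peek at the first line
  if s.startswith("ERROR:"):
      raise Exception(s[:-1])              # strip the trailing newline
  infile.seek(0)      # rewind so the XML parser sees the first line again
  infile.nobuffer()   # Don't buffer the data
  # ... process the XML from infile ...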

This module also implements 'prepare_input_source(source)' modeled on
xml.sax.saxutils.prepare_input_source.  This opens a URL and if the
input stream is not already seekable, wraps it in a ReseekFile.

  Don't use bound methods for the ReseekFile.  When the buffer is
empty, the ReseekFile reassigns the input file's read/readlines/etc.
methods as instance variables.  This gives slightly better performance
at the cost of not allowing an infrequently used idiom.
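
The idiom in question is presumably saving a bound method, such as
read = infile.read, and calling it later.  The following generic
Python sketch shows why a saved bound method does not see a later
reassignment on the instance.  The class Reassigning is hypothetical
and is not part of ReseekFile.

```python
class Reassigning:
    """Hypothetical class that swaps in a faster method on the instance."""

    def read(self):
        return "slow path"

    def switch(self):
        # Shadow the class-level method with an instance attribute.
        self.read = lambda: "fast path"


obj = Reassigning()
saved = obj.read             # bound method captured before the switch
obj.switch()
via_attribute = obj.read()   # attribute lookup finds the new instance variable
via_saved = saved()          # still calls the original class-level method
```

Looking the method up on the object each time, as in obj.read(), always
picks up the current implementation.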

  Use tell() to get the beginning byte location.  ReseekFile will
attempt to get the real position from the wrapped file and use that as
the beginning location.  If the wrapped file does not support tell(),
ReseekFile.tell() will return 0.
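
The fallback described here can be sketched as a small helper.  This is
an illustration under stated assumptions rather than the module's code:
start_position and NoTell are hypothetical names, and the set of
exceptions caught is a guess at what a wrapped object might raise.

```python
import io


def start_position(stream):
    """Hypothetical helper: position of `stream` at wrap time, or 0 if unknown."""
    try:
        return stream.tell()
    except (AttributeError, OSError):
        # No tell() at all, or the stream refuses to report a position.
        return 0


class NoTell:
    """Stand-in for a socket-like object without tell()."""

    def read(self, size=-1):
        return b""


partly_read = io.BytesIO(b"HEADER<?xml ...")
partly_read.read(6)                         # consume the 6-byte header first
offset_known = start_position(partly_read)  # real position of the wrapped file
offset_unknown = start_position(NoTell())   # falls back to 0
```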

  readlines does not yet support a sizehint.  Want to
contribute an implementation?

The latest version of this code can be found at


Classes

class  ReseekFile

Functions

def prepare_input_source
def test
def test_reads
