org.biojava.bio.seq.io
Class ChunkedSymbolListFactory

java.lang.Object
  |
  +--org.biojava.bio.seq.io.ChunkedSymbolListFactory

public class ChunkedSymbolListFactory
extends java.lang.Object

A class that makes ChunkedSymbolLists whose chunks are themselves implemented as SymbolLists.

The advantage is that those SymbolLists can be packed implementations.

You can build a SequenceBuilderFactory that creates a packed, chunked sequence from an input file without making an intermediate symbol list, as follows:

 public class PackedChunkedListFactory implements SequenceBuilderFactory
 {
   public SequenceBuilder makeSequenceBuilder()
   {
     return new SequenceBuilderBase() {
       private ChunkedSymbolListFactory chunker = new ChunkedSymbolListFactory(new PackedSymbolListFactory(true));

       // deal with symbols
       public void addSymbols(Alphabet alpha, Symbol[] syms, int pos, int len)
         throws IllegalAlphabetException
       {
         chunker.addSymbols(alpha, syms, pos, len);
       }

       // make the sequence
       public Sequence makeSequence()
       {
         try {
           // make the SymbolList
           SymbolList symbols = chunker.makeSymbolList();
           seq = new SimpleSequence(symbols, uri, name, annotation);

           // call superclass method
           return super.makeSequence();
         }
         catch (IllegalAlphabetException iae) {
           throw new BioError("couldn't create symbol list");
         }
       }
     };
   }
 }
 

FASTA files can then be read with something like:

 SequenceIterator seqI = new StreamReader(br, new FastaFormat(),
     DNATools.getDNA().getTokenization("token"),
     new PackedChunkedListFactory() );
 

Blend to suit taste.

Alternatively, you can feed Symbols to the factory directly with addSymbols and, once all of them have been added, build the sequence with makeSymbolList.
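
The accumulate-then-build pattern described above can be illustrated without BioJava on the classpath. The following is a minimal sketch under stated assumptions: ChunkedCharAccumulator, its chunk size, its reset-on-build behaviour, and its use of char in place of Symbol are all hypothetical stand-ins, not the library's implementation. It also assumes that pos is the offset into the supplied array and len the number of symbols to copy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the accumulate-then-build pattern; not BioJava code.
class ChunkedCharAccumulator {
    private final int chunkSize;
    private final List<char[]> fullChunks = new ArrayList<>();
    private char[] head;       // chunk currently being filled
    private int headUsed = 0;  // number of slots already used in head

    ChunkedCharAccumulator(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
        this.head = new char[chunkSize];
    }

    // Appends syms[pos .. pos+len-1]. Symbols are always appended
    // contiguously, so the result can never contain gaps.
    void addSymbols(char[] syms, int pos, int len) {
        if (pos < 0 || len < 0 || len > syms.length - pos) {
            throw new IllegalArgumentException("bad range");
        }
        int copied = 0;
        while (copied < len) {
            if (headUsed == chunkSize) {
                // Current chunk is full: store it and start a new one.
                fullChunks.add(head);
                head = new char[chunkSize];
                headUsed = 0;
            }
            int n = Math.min(len - copied, chunkSize - headUsed);
            System.arraycopy(syms, pos + copied, head, headUsed, n);
            headUsed += n;
            copied += n;
        }
    }

    int length() {
        return fullChunks.size() * chunkSize + headUsed;
    }

    // Builds the final sequence, then resets the accumulator so that
    // another sequence can be assembled (a choice made for this sketch).
    String makeSymbolList() {
        StringBuilder sb = new StringBuilder(length());
        for (char[] chunk : fullChunks) {
            sb.append(chunk);
        }
        sb.append(head, 0, headUsed);
        fullChunks.clear();
        head = new char[chunkSize];
        headUsed = 0;
        return sb.toString();
    }
}
```

In this sketch a caller adds symbols in as many calls as it likes and then calls makeSymbolList once, which mirrors the order of calls described for the factory.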

NOTE: An improvement has been introduced where an internal default SymbolList factory is used for small sequences. This implementation allows for faster SymbolList creation and access for small sequences while allowing a more space-efficient implementation to be selected for large sequences.
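
The trade-off in this note can be sketched as follows. This is an illustration only: ThresholdDnaStore, the 2-bit packing, and the switch-over logic are assumptions made for the example, and the library's actual internals may differ. Small inputs stay in a plain char array for fast access; once the count would exceed the threshold, the contents are repacked at four bases per byte.

```java
import java.util.Arrays;

// Hypothetical illustration of a size threshold; not BioJava code.
class ThresholdDnaStore {
    private static final String BASES = "acgt"; // index in this string = 2-bit code
    private final int threshold;
    private char[] plain = new char[16]; // fast representation for small inputs
    private byte[] packed;               // compact representation, 4 bases per byte
    private int size = 0;

    ThresholdDnaStore(int threshold) {
        if (threshold < 0) {
            throw new IllegalArgumentException("threshold must not be negative");
        }
        this.threshold = threshold;
    }

    boolean isPacked() { return packed != null; }

    int size() { return size; }

    void add(char base) {
        if (BASES.indexOf(base) < 0) {
            throw new IllegalArgumentException("not a DNA base: " + base);
        }
        // Switch representation just before the size exceeds the threshold.
        if (!isPacked() && size == threshold) {
            switchToPacked();
        }
        if (isPacked()) {
            if (size / 4 == packed.length) {
                packed = Arrays.copyOf(packed, packed.length * 2);
            }
            setPacked(size, base);
        } else {
            if (size == plain.length) {
                plain = Arrays.copyOf(plain, plain.length * 2);
            }
            plain[size] = base;
        }
        size++;
    }

    char get(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        if (!isPacked()) {
            return plain[i];
        }
        int code = (packed[i / 4] >> ((i % 4) * 2)) & 3;
        return BASES.charAt(code);
    }

    // Writes the 2-bit code for base into slot i of the packed array.
    private void setPacked(int i, char base) {
        int shift = (i % 4) * 2;
        int cleared = packed[i / 4] & ~(3 << shift);
        packed[i / 4] = (byte) (cleared | (BASES.indexOf(base) << shift));
    }

    // Copies everything stored so far into the packed representation.
    private void switchToPacked() {
        packed = new byte[(size + 3) / 4 + 1];
        for (int i = 0; i < size; i++) {
            setPacked(i, plain[i]);
        }
        plain = null;
    }
}
```

The packed form uses a quarter of a byte per base but pays for bit shifting on every access, which is why a sketch like this defers it until the sequence is large enough for the space saving to matter.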

Author:
David Huen

Constructor Summary
ChunkedSymbolListFactory(SymbolListFactory symListFactory)
           
ChunkedSymbolListFactory(SymbolListFactory userSymListFactory, int threshold)
           
 
Method Summary
 void addSymbols(Alphabet alfa, Symbol[] syms, int pos, int len)
          Constructs the SymbolList by adding Symbols.
 SymbolList make(SymbolReader sr)
          Creates a SymbolList from a SymbolReader.
 SymbolList makeSymbolList()
          Converts accumulated Symbols to a SymbolList
 void useSuppliedSymListFactory()
          Call this to convert from default SymbolList implementation to user-supplied implementation.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ChunkedSymbolListFactory

public ChunkedSymbolListFactory(SymbolListFactory symListFactory)
Parameters:
symListFactory - class which produces the SymbolLists that are used to store the chunked symbols.

ChunkedSymbolListFactory

public ChunkedSymbolListFactory(SymbolListFactory userSymListFactory,
                                int threshold)
Parameters:
userSymListFactory - user-supplied class which produces the SymbolLists that are used to store the chunked symbols (only used when the chunked list to be created is larger than threshold).
threshold - the size of the SymbolList beyond which the userSymListFactory is used. Below that, the internal default SymbolList factory is used.
Method Detail

addSymbols

public void addSymbols(Alphabet alfa,
                       Symbol[] syms,
                       int pos,
                       int len)
                throws java.lang.IllegalArgumentException,
                       IllegalAlphabetException
Constructs the SymbolList by adding Symbols. Note that this class is not thread-safe and can assemble only one SymbolList at a time. The Symbols must be added contiguously: the composite they form must not contain gaps.

Throws:
java.lang.IllegalArgumentException
IllegalAlphabetException

useSuppliedSymListFactory

public void useSuppliedSymListFactory()
Call this to convert from default SymbolList implementation to user-supplied implementation.


makeSymbolList

public SymbolList makeSymbolList()
                          throws IllegalAlphabetException
Converts accumulated Symbols to a SymbolList

Throws:
IllegalAlphabetException

make

public SymbolList make(SymbolReader sr)
                throws java.io.IOException,
                       IllegalSymbolException,
                       IllegalAlphabetException,
                       BioException
Creates a SymbolList from a SymbolReader. (Does anyone use this?)

Throws:
java.io.IOException
IllegalSymbolException
IllegalAlphabetException
BioException