def Bio::NeuralNetwork::Gene::Schema::DifferentialSchemaFitness::calculate_fitness(self, genome)

Calculate the fitness for a given schema.

Fitness is based on the number of occurrences of the schema in the
positive sequences minus the number of occurrences in the negative
sequences.

This difference is then scaled up by the length of the schema (times a
constant factor of 4) and divided by a penalty that grows exponentially
with the number of ambiguous characters in the schema (2 raised to that
number, plus 1). This favors schemas that are longer and less ambiguous.
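
Read against the source listing for this method, the weighting can be written out explicitly. The symbols below are introduced here for illustration only and do not appear in the source: N_pos and N_neg are the match counts in the positive and negative sequences, L is the schema length, and a is the number of ambiguous characters.

```latex
\[
  \text{fitness}
  = \frac{(N_{\text{pos}} - N_{\text{neg}}) \cdot 4L}{2^{a} + 1}
\]
% Worked example with made-up counts:
% N_pos = 5, N_neg = 1, L = 6, a = 2
\[
  \frac{(5 - 1) \cdot 4 \cdot 6}{2^{2} + 1} = \frac{96}{5} = 19.2
\]
```

Because the divisor grows as a power of two, each additional ambiguous character costs far more than one extra position of length gains.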

Definition at line 281 of file Schema.py.

    def calculate_fitness(self, genome):
        """Calculate the fitness for a given schema.

        Fitness is based on the number of occurrences of the schema in
        the positive sequences minus the number of occurrences in the
        negative sequences.

        This difference is then scaled up by the length of the schema and
        divided by a penalty that grows exponentially with the number of
        ambiguous characters in the schema. This favors schemas that are
        longer and less ambiguous.
        """
        # convert the genome into a string
        seq_motif = genome.toseq()
        motif = seq_motif.data
        
        # get the counts in the positive examples
        num_pos = 0
        for seq_record in self._pos_seqs:
            cur_counts = self._schema_eval.num_matches(motif,
                                                      seq_record.seq.data)
            num_pos += cur_counts

        # get the counts in the negative examples
        num_neg = 0
        for seq_record in self._neg_seqs:
            cur_counts = self._schema_eval.num_matches(motif,
                                                      seq_record.seq.data)
            num_neg += cur_counts

        num_ambiguous = self._schema_eval.num_ambiguous(motif)
        # weight ambiguity exponentially: each ambiguous character
        # doubles this term
        num_ambiguous = pow(2.0, num_ambiguous)
        # increment num ambiguous to prevent division by zero errors.
        num_ambiguous += 1

        motif_size = len(motif)
        motif_size = motif_size * 4.0

        discerning_power = num_pos - num_neg
        
        diff = (discerning_power * motif_size) / float(num_ambiguous)
        return diff
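
The calculation above can be tried in isolation with a small sketch. Everything here is a stand-in rather than the Biopython API: `StubSchemaEvaluator` and `differential_fitness` are hypothetical names, plain strings replace the sequence records, `*` is assumed to be the only ambiguity symbol, and matches are counted as non-overlapping regular-expression hits, which may differ from how the real schema evaluator counts.

```python
import re


class StubSchemaEvaluator:
    """Hypothetical stand-in for the object stored in self._schema_eval."""

    def __init__(self, ambiguity_map):
        # Maps an ambiguity symbol to the regex fragment it stands for.
        self._ambiguity_map = ambiguity_map

    def num_ambiguous(self, motif):
        # Count the characters of the motif that are ambiguity symbols.
        return sum(1 for char in motif if char in self._ambiguity_map)

    def num_matches(self, motif, sequence):
        # Expand ambiguity symbols, then count non-overlapping matches.
        pattern = "".join(
            self._ambiguity_map.get(char, re.escape(char)) for char in motif
        )
        return len(re.findall(pattern, sequence))


def differential_fitness(motif, pos_seqs, neg_seqs, evaluator):
    """Mirror the arithmetic of the listing above on plain strings."""
    num_pos = sum(evaluator.num_matches(motif, seq) for seq in pos_seqs)
    num_neg = sum(evaluator.num_matches(motif, seq) for seq in neg_seqs)

    # 2 ** (ambiguous characters), plus 1, as in the listing.
    penalty = 2.0 ** evaluator.num_ambiguous(motif) + 1
    motif_size = len(motif) * 4.0

    return (num_pos - num_neg) * motif_size / penalty


evaluator = StubSchemaEvaluator({"*": "[ACGT]"})
score = differential_fitness(
    "GA*A",
    pos_seqs=["GATAGACA", "TTGAAAGAGA"],  # 2 + 2 matches
    neg_seqs=["CCCC", "GAGA"],            # 0 + 1 matches
    evaluator=evaluator,
)
print(score)  # (4 - 1) * 16.0 / 3.0 = 16.0
```

A schema that matches the negative set more often than the positive set gets a negative score under this arithmetic, so it ranks below any schema with no matches at all.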

Generated by Doxygen 1.6.0