"""read: parse the output of the cd-hit software, i.e. the size and the cluster head of every cluster larger than min_clust_size, and store it in an object of type ClusterCollection. It also collects all the sequences and their affiliation to a particular cluster and stores them in the member sequence_collection."""
# initialize the object where the cluster information will be stored
cluster_collection=ClusterCollection()
# open the cluster file
clust_file=open(self.filename,'r')
# and read it line by line
clust_line=clust_file.readline()
while clust_line:
# cluster information is preceded by '>Cluster' + the cluster number