Walking and averaging values in Python


I have to process .txt files present in subfolders inside a folder, like:
New Folder > Folder 1 to 6 > xx.txt & yy.txt (files present in each folder)
Each file contains 2 columns, as:

    arg asp
    gln glu

and

    arg glu
    arg arg
    glu asp

Now what I have to do is:
1) Count the number of occurrences of each word in each file, and average the total count by dividing it by the total number of lines in that file.
2) Then, with the values obtained after completing the 1st step, divide the values by the total number of files present in the folder for averaging (i.e. 2 in this case).
I have tried with my code as follows. I have succeeded in the 1st case, but I'm not getting the 2nd case.

    import glob
    import os

    for root, dirs, files in os.walk(path):
        asp_count = 0
        glu_count = 0
        lys_count = 0
        arg_count = 0
        his_count = 0
        acid_count = 0
        base_count = 0
        count = 0  # number of .txt files seen in this folder
        listoffile = glob.iglob(os.path.join(root, '*.txt'))
        for filename in listoffile:
            linecount = 0
            asp_count_col1 = 0
            asp_count_col2 = 0
            glu_count_col1 = 0
            glu_count_col2 = 0
            lys_count_col1 = 0
            lys_count_col2 = 0
            arg_count_col1 = 0
            arg_count_col2 = 0
            his_count_col1 = 0
            his_count_col2 = 0
            count += 1
            # assumed: the posted snippet never defines inp, so open the file here
            inp = open(filename)
            for line in map(str.split, inp):
                linecount += 1
                k = line[4]
                m = line[6]
                if k == 'asp':
                    asp_count_col1 += 1
                elif m == 'asp':
                    asp_count_col2 += 1
                if k == 'glu':
                    glu_count_col1 += 1
                elif m == 'glu':
                    glu_count_col2 += 1
                if k == 'lys':
                    lys_count_col1 += 1
                elif m == 'lys':
                    lys_count_col2 += 1
                if k == 'arg':
                    arg_count_col1 += 1
                elif m == 'arg':
                    arg_count_col2 += 1
                if k == 'his':
                    his_count_col1 += 1
                elif m == 'his':
                    his_count_col2 += 1
            inp.close()
            # per-file average: occurrences divided by lines in this file
            asp_count = (float(asp_count_col1 + asp_count_col2))/linecount
            glu_count = (float(glu_count_col1 + glu_count_col2))/linecount
            lys_count = (float(lys_count_col1 + lys_count_col2))/linecount
            arg_count = (float(arg_count_col1 + arg_count_col2))/linecount
            his_count = (float(his_count_col1 + his_count_col2))/linecount

Up to this point I am able to get the average value per file. How can I get the average per subfolder (i.e. by dividing by count, the total number of files)? The problem is the 2nd part; the 1st part is done. The code provided averages the values for each file. I want to add these averages and make a new average by dividing by the total number of files present in the sub-folder.

Your use of os.walk together with glob.iglob is bogus: os.walk already hands you the file names in each directory. Use one or the other, not both. Here's how I would do it:

    import os, os.path, re, pprint, sys
    #...
    for root, dirs, files in os.walk(path):
        counts = {}
        nlines = 0
        for f in filter(lambda n: re.search(r'\.txt$', n), files):
            # os.walk yields bare file names, so join them with root
            for l in open(os.path.join(root, f), 'rt'):
                nlines += 1
                for k in l.split():
                    counts[k] = counts[k]+1 if k in counts else 1
        # turn raw counts into occurrences per line for this directory
        for k, v in counts.items():
            counts[k] = float(v)/nlines
        sys.stdout.write('frequencies for directory %s:\n' % root)
        pprint.pprint(counts)
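Note that the loop above pools every line in a directory before dividing, which is not quite the same as averaging the per-file averages when the files differ in length. If you want the second step exactly as the question states it (sum the per-file averages, then divide by the number of files in the sub-folder), something along these lines might do. This is only a sketch: the helper names file_averages and folder_averages are made up for illustration, it counts every whitespace-separated word rather than only the residues and columns in the question's code, and it assumes blank lines should not count as lines.

```python
import os
from collections import Counter


def file_averages(filepath):
    """Return {word: occurrences / number of lines} for a single file."""
    counts = Counter()
    nlines = 0
    with open(filepath) as handle:
        for line in handle:
            words = line.split()
            if not words:
                # assumption: blank lines are not counted as lines
                continue
            nlines += 1
            counts.update(words)
    if nlines == 0:
        return {}
    return {word: float(n) / nlines for word, n in counts.items()}


def folder_averages(path):
    """Map each directory under path to the mean of its per-file averages."""
    result = {}
    for root, dirs, files in os.walk(path):
        txt_files = [name for name in files if name.endswith('.txt')]
        if not txt_files:
            continue  # nothing to average in this directory
        totals = Counter()
        for name in txt_files:
            per_file = file_averages(os.path.join(root, name))
            for word, avg in per_file.items():
                totals[word] += avg
        # a word missing from a file contributes 0 to that file's average
        result[root] = {word: total / len(txt_files)
                        for word, total in totals.items()}
    return result
```

With the two sample files from the question (assuming each row holds two words), arg averages 0.5 in the first file and 1.0 in the second, so folder_averages reports (0.5 + 1.0) / 2 = 0.75 for that sub-folder.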
