Trimming - find the index of a word from a list of words in a text file in Python
I have a list of words in a text file, and I have a list of text files, each containing text.
One word-list file holds starting words: once a line contains one of these words, only the lines below it should be considered.
Similarly, a second word-list file holds ending words: only the lines above a line containing one of these words should be considered.
In short, I want to trim the text files using these words.
Here is my code:
    import os

    # load the text files
    textfile = []
    lines = []
    path = "images summarization _# ctscan/"
    for text in sorted(os.listdir(path + 'text/'), key=lambda x: os.path.splitext(x)[0]):
        textfile.append(open(path + "text/" + text, 'r').read().lower())
        lines.append(open(path + "text/" + text, 'r').read().lower().splitlines())

    ## trimming part
    trimed_words_top = open(path + "words trimming above.txt", 'r').read()
    trimed_words_below = open(path + "words trimming below.txt", 'r').read()
    trimed_words_top = trimed_words_top.lower().splitlines()
    trimed_words_below = trimed_words_below.lower().splitlines()

    word_index_top = []
    data = []
    trimmed_text = []
    """
    for line in lines:
        for word in trimed_words_top:
            if word in line:
                data = lines[word_index_top[0]+1:]
                trimmed_text = ' '.join(word for word in data)
    """
    # this works for 1 single file, but I need it for every file
    singlesfile = lines[0]
    word_index_top = [i for i, s in enumerate(singlesfile) if 'ncct' in s]

    ## here is the logic for word trimming
    cnt = 0
    word_index_top = []
    for line in lines:
        # one list of matching line indexes per file
        word_index_top.append([i for i, s in enumerate(line) for word in trimed_words_top if word in s])
        cnt += 1
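For reference, here is a minimal sketch of one way the trimming could be applied to every file. It assumes the intended behaviour is to keep the lines after the first line that contains a top word and before the first later line that contains a bottom word. The helper names `first_match_index` and `trim_lines` and the sample data are made up for illustration; they are not part of the original code.

```python
def first_match_index(lines, words):
    """Return the index of the first line containing any of the words, or None."""
    # Skip empty entries (e.g. blank lines in the word file), since '' is in every string.
    words = [w for w in words if w]
    for i, line in enumerate(lines):
        if any(word in line for word in words):
            return i
    return None


def trim_lines(lines, top_words, bottom_words):
    """Keep the lines after the first top-word line and before the first
    bottom-word line that follows it (assumed behaviour)."""
    start = first_match_index(lines, top_words)
    start = 0 if start is None else start + 1      # no top word: keep from the beginning
    end = first_match_index(lines[start:], bottom_words)
    end = len(lines) if end is None else start + end  # no bottom word: keep to the end
    return lines[start:end]


# Made-up sample data standing in for the lowercased, line-split files.
sample_files = [
    ["patient details", "ncct head", "no bleed seen", "impression", "normal study"],
    ["header", "findings here", "more findings"],
]
top_words = ["ncct"]
bottom_words = ["impression"]

trimmed = [trim_lines(file_lines, top_words, bottom_words) for file_lines in sample_files]
print(trimmed)
```

Using `any(...)` gives at most one match per line, whereas the nested comprehension in the question appends the same line index once for every word that matches it. With the real data, `sample_files`, `top_words` and `bottom_words` would presumably be replaced by `lines`, `trimed_words_top` and `trimed_words_below`.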