C# - Searching String or StringBuilder with a pattern range /start/ /end/
I want to create a function (with a set of helper functions if needed) in C# that does something similar to awk '/start/,/end/' file, except that it includes all of the trailing lines that match the end pattern rather than terminating on the first one.
Let's say we have:
# cat text
"13:08:30:5276604 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
"13:08:30:5736962 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
"13:08:30:6227343 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
"13:08:30:6757752 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
"13:08:30:7208103 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
"13:08:30:7668739 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
"13:08:30:8129079 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
Expected:
"13:08:30:6227343 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
"13:08:30:6757752 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
"13:08:30:7208103 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
"13:08:30:7668739 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
awk output:
# awk '/13:08:30:62/,/13:08:30:7/' text
"13:08:30:6227343 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
"13:08:30:6757752 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
"13:08:30:7208103 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m"
I thought I might use a regex that matches either of the two conditions, pattern_1|pattern_2, but that will not work if there are values in between the matching values.
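To make the problem with the alternation idea concrete, here is a minimal sketch. The class and the helper FilterByAlternation are hypothetical names used only for illustration, and the sample lines are shortened versions of the data above. It keeps only lines that match one of the two patterns, so a line sitting between the start and end matches is lost.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class AlternationDemo
{
    // Hypothetical helper: keeps only the lines that match either pattern.
    public static string[] FilterByAlternation(string[] lines, string first, string second)
    {
        var either = new Regex("(?:" + first + ")|(?:" + second + ")");
        return lines.Where(l => either.IsMatch(l)).ToArray();
    }

    static void Main()
    {
        var lines = new[]
        {
            "13:08:30:5736962 main",
            "13:08:30:6227343 main",
            "13:08:30:6757752 main", // matches neither pattern, so it is dropped
            "13:08:30:7208103 main",
        };

        foreach (var line in FilterByAlternation(lines, "13:08:30:62", "13:08:30:7"))
        {
            Console.WriteLine(line);
        }
    }
}
```

Only the 6227343 and 7208103 lines survive here, which is why a range needs some notion of state rather than a single per-line pattern.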
I discovered that the C# StringBuilder class does not have .IndexOf() or .LastIndexOf() methods (I have a bit more experience in Java and was thinking of using these until I saw that C# does not have them). Since I don't have these methods and would possibly need to implement them myself, I wanted to ask whether this is the right approach to take. There is a section on MSDN that suggests using String if extensive searching is needed, so I can use that as well. I chose StringBuilder because string concatenation is performed constantly. Should I use the StringBuilder type when building the string (a lot of concatenation) and then convert to the String type when searching?
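The build-then-convert idea could look roughly like the sketch below. SliceRange is a hypothetical helper, not an existing API, and it works on character positions rather than lines: it assumes the start pattern appears at the beginning of a line and simply cuts from the first occurrence of the start text to the end of the line that holds the last occurrence of the end text.

```csharp
using System;
using System.Text;

static class BuilderSearchDemo
{
    // Hypothetical helper: returns the text from the first occurrence of 'start'
    // through the end of the line containing the last occurrence of 'end'.
    // Returns an empty string when no such range exists.
    public static string SliceRange(string text, string start, string end)
    {
        int from = text.IndexOf(start, StringComparison.Ordinal);
        if (from < 0) return string.Empty;

        int last = text.LastIndexOf(end, StringComparison.Ordinal);
        if (last < from) return string.Empty;

        int lineEnd = text.IndexOf('\n', last);
        if (lineEnd < 0) lineEnd = text.Length;

        return text.Substring(from, lineEnd - from);
    }

    static void Main()
    {
        // Build with StringBuilder while concatenating...
        var sb = new StringBuilder();
        sb.Append("13:08:30:5736962 main\n");
        sb.Append("13:08:30:6227343 main\n");
        sb.Append("13:08:30:6757752 main\n");
        sb.Append("13:08:30:7668739 main\n");
        sb.Append("13:08:30:8129079 main\n");

        // ...then convert once and search on the resulting string.
        string text = sb.ToString();
        Console.WriteLine(SliceRange(text, "13:08:30:62", "13:08:30:7"));
    }
}
```

Note that this takes the last occurrence of the end text anywhere in the input, which is not quite the same as "the consecutive end matches following the first one"; it also requires the whole text to be in memory.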
I would like this to be performant, and it would be awesome to hear suggestions on how to make it so. General guidance and implementation details are appreciated.
If you need to process a potentially large file, it is better to use a StreamReader and process it line by line with the ReadLine method. This prevents you from ending up with the complete file in memory, as you would when using a StringBuilder. By using the abstract TextReader in the implementation, you can feed it both a string and a (file) stream.
To check for the begin and end matches you can use the Regex class. Its Match method returns a Match instance whose Success property is true when a match is found.
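As a small illustration of that check, the sketch below wraps it in a hypothetical helper called LineMatches; the patterns and lines are taken from the question's sample data.

```csharp
using System;
using System.Text.RegularExpressions;

static class MatchDemo
{
    // Hypothetical helper: true when the pattern is found anywhere in the line.
    public static bool LineMatches(Regex pattern, string line)
    {
        Match m = pattern.Match(line);
        return m.Success;
    }

    static void Main()
    {
        var start = new Regex("13:08:30:62");

        Console.WriteLine(LineMatches(start, "13:08:30:6227343 main: 41044")); // True
        Console.WriteLine(LineMatches(start, "13:08:30:5736962 main: 41044")); // False
    }
}
```

When only the yes/no answer is needed, Regex.IsMatch gives the same result without the Match object.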
To achieve the logic you're after, I reckoned there are three states: before we have found the begin match, before we have found the end match, and while we still find the end match. I opted to implement this as an iterator, because the yield keyword gives me a state machine for free.
Here is the implementation:
    // Written script-style; in a regular console project, put these
    // methods in a class and make them static.
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Text.RegularExpressions;

    void Main()
    {
        // Use a StreamReader to read characters from a file;
        // the constructor accepts an encoding as second parameter.
        using (var sr = new StreamReader(@"sample.txt"))
        {
            ReadFromBeginToEnd("13:08:30:62", "13:08:30:7", sr);
        }

        var text = @"
    13:08:30:6227343 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m
    13:08:30:6757752 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m
    13:08:30:7208103 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m
    13:08:30:7668739 main: 41044 - 48.7617 m-- other pids 2 - 79.1016 m";

        // The same code works on an in-memory string via a StringReader.
        using (var sr = new StringReader(text))
        {
            ReadFromBeginToEnd("13:08:30:62", "13:08:30:7", sr);
        }
    }

    // Enumerate the lines of a TextReader,
    // accepting two regexes, start and end.
    IEnumerable<string> FromBeginToEnd(TextReader rdr, Regex start, Regex end)
    {
        // 1st state
        var line = rdr.ReadLine(); // initial read, null means we're done
        // read lines until we hit our start match
        while (line != null && !start.Match(line).Success)
        {
            // don't return these lines
            line = rdr.ReadLine();
        }

        // 2nd state
        // read lines while we didn't hit our end match
        while (line != null && !end.Match(line).Success)
        {
            // return this line to the caller
            yield return line;
            line = rdr.ReadLine();
        }

        // 3rd state
        // read lines while we still find our end match
        while (line != null && end.Match(line).Success)
        {
            // return this line to the caller
            yield return line;
            line = rdr.ReadLine();
        }

        // iterator is done
        yield break;
    }

    // Takes start and end strings that can be compiled to a Regex,
    // and the reader to take the lines from.
    void ReadFromBeginToEnd(string start, string end, TextReader reader)
    {
        // loop over the lines that match our criteria;
        // FromBeginToEnd is our custom enumerator
        foreach (var line in FromBeginToEnd(reader, new Regex(start), new Regex(end)))
        {
            // write to standard out;
            // this could be a StreamWriter.WriteLine as well
            Console.WriteLine(line);
        }
    }
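For lines that are already in memory, the same three phases can also be sketched with LINQ instead of a hand-written iterator. This is a swapped-in alternative, not the implementation above: the class and method names are hypothetical, and because it enumerates the source more than once it suits an array or list rather than a reader.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class RangeLinqDemo
{
    // Hypothetical helper mirroring the three states: skip until start,
    // take until end, then take while end keeps matching.
    public static List<string> FromBeginToEnd(IReadOnlyList<string> lines, Regex start, Regex end)
    {
        var tail = lines.SkipWhile(l => !start.IsMatch(l)).ToList();
        var beforeEnd = tail.TakeWhile(l => !end.IsMatch(l));
        var whileEnd = tail.SkipWhile(l => !end.IsMatch(l)).TakeWhile(l => end.IsMatch(l));
        return beforeEnd.Concat(whileEnd).ToList();
    }

    static void Main()
    {
        var lines = new[]
        {
            "13:08:30:5276604 main", "13:08:30:5736962 main",
            "13:08:30:6227343 main", "13:08:30:6757752 main",
            "13:08:30:7208103 main", "13:08:30:7668739 main",
            "13:08:30:8129079 main",
        };

        var result = FromBeginToEnd(lines, new Regex("13:08:30:62"), new Regex("13:08:30:7"));
        foreach (var line in result)
        {
            Console.WriteLine(line);
        }
    }
}
```

With the shortened sample lines this prints the 6227343, 6757752, 7208103 and 7668739 lines, which is the range the question asks for.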