15. Regexps

Make a 15_Regexp subdirectory. All code should reside there.

  1. Review the lecture and try the following:

    • <!> to the teacher: select an appropriate subset of examples

    • examples of searching/matching — with grep

    • examples of search/replace — with sed 's/regexp/replacement/'

      • convert them to sed -E

  2. Search in the output of fortune -l

    • "You" and "you" (use grep)

    • Any pronoun from the "you, me, he, she" list (with egrep --color; you will need alternation)

      • (!) try to match the exact word (not a part of another word)

        • hint1: "[^[:alpha:]]" matches any non-letter character

        • hint2: "(^|a)word" matches either word at the beginning of the line or aword anywhere in the line.

  3. (optional if there's time) Search/replace practice TODO

Using C regex

(!) This part is mandatory

taken from here

  1. readlines.c acting as a simple cat; using getline is mandatory

    • ./readlines file prints all lines of file

    • note that line=NULL and len=0 are set again before every getline() call, so that each call allocates a fresh buffer

    • do not forget to free() each line

  2. relines.c acting as a simple grep

    • ./relines "regex" file shows only lines containing regex

    • see regcomp and regexec (with the nmatch=0, pmatch=NULL parameters, regexec only reports whether the regexp matches)

  3. reshow.c, which takes into account the grouping operators in the pattern (fewer than #define MAXGR 10 of them)

    • See more on regexec:

      • each pmatch[] element holds the start and end offsets of the substring matched by a particular part of the regexp

      • pmatch[0] is for whole regexp

      • pmatch[1] is for 1st group

      • pmatch[2] is for 2nd group

    • Use this code:
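         if (!regexec(&r, line, MAXGR, pm, 0)) {   /* 0 means the line matched */
                 fputs(line, stdout);
                 /* pm[0] is the whole match, pm[1..] are the groups; unused entries have rm_so == -1 */
                 for (i = 0; i < MAXGR && pm[i].rm_so >= 0; i++)
                         printf("\t%d/%d\n", (int)pm[i].rm_so, (int)pm[i].rm_eo);
         }
         free(line);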
      
    • Suppose file textfile is

      ab
      abba
      00aaaEEEbbb11

      ./reshow "\(aa*\).*\(bb*\)" textfile must show

         $ ./reshow "\(aa*\).*\(bb*\)" textfile
         ab
                 0/2
                 0/1
                 1/2
         abba
                 0/3
                 0/1
                 2/3
         00aaaEEEbbb11
                 2/11
                 2/5
                 10/11
      
  4. replone.c

    • ./replone regexp replacement file prints the lines of file with the first match of regexp replaced by replacement.

    • No grouping operators in regexp (phew! 😅)

    • Only the first match is replaced (phew again!)

    • All possible errors must be handled properly (i.e. check the functions' return values)

H/W

Finish all tasks

HSE/ProgrammingOS/Lab_15_Regexp (last edited by user FrBrGeorge 2020-03-20 10:14:43)