regular-expressions – Regular expression for selecting alphabetic sequences without gaps


I need to write a terminal command using grep or sed. It should print only the matching fragments of a text file (whether they come out on one line or one per line does not matter). Perl cannot be used.

At the moment I have a regular expression like this:


Oddly enough, it also matches sequences such as "ace" or "bpxz". How can I make the expression match only sequences with no skipped letters, such as "abcd", "opqr", or "xy"?

UPD: I forgot to add that spaces are ignored (that is what the \s* is for). The regular expression must find an alphabetic sequence anywhere in the text. For example, the phrase translated here as "roll call of duty officers" should yield "kl" and "dej" in transliteration (it was easier to come up with an example in Russian).


Unfortunately, you did not specify which regular expression dialect can be used or what this is for. There may be simpler solutions that rely on dialect-specific features, or simpler tools that do not use regular expressions at all.

For a PCRE-compatible dialect, the expression looks something like this (written out up to the letter d; continue by analogy and insert spaces to taste):


Test on regex101.com
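As a rough illustration of the shape such an expression can take for the letters a to d, here is a sketch that uses nested optional groups with \s* between the letters. The pattern and the find_runs helper name are assumptions made for this example, not necessarily the exact expression meant above, and GNU grep built with PCRE support (-P) is assumed:

```shell
#!/bin/sh
# Illustrative pattern for the letters a..d only (an assumption, not the
# original expression). There is one alternative per possible starting
# letter; each further letter is optional, so a run stops at the first gap.
pattern='a\s*b(?:\s*c(?:\s*d)?)?|b\s*c(?:\s*d)?|c\s*d'

# Hypothetical helper: print every match on its own line.
find_runs() {
  grep -Po "$pattern"
}

# Prints "a bc" and "cd"; "ad" is skipped because b and c are missing.
echo 'a bc x ad cd' | find_runs
```

Every alternative needs at least two letters, so single characters are never printed. Writing this out by hand for the whole alphabet quickly becomes tedious.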

Among the "special features" of regular expressions, Perl for example lets you check for arbitrary consecutive characters like this:

echo "abpade fg xyz" | perl -npe 's/.*?((?:([a-z])\s*(?=(??{chr(ord($2)+1)})))+.)/$1\n/g'

de fg

On most Unix systems, Perl can be used in place of grep by writing the required command as a one-liner.

UPD: For the command line, using only grep and sed, a short version is:

echo "a bcefgkmoxyz" |\
grep -Po `echo -n 'bcdefghijklmnopqrstuvwxyz' |\
sed 's/./\0\0/g;s/^/a/;s/\(.\)\(.\)/\\\\s*(?:\1(?=\\\\s*\2))?/g;s/.$/./'` |\
sed -n '/../p'

a bc

The command is split across several lines for readability; it can be put on one line by removing the trailing \ at the end of each line. I was too lazy to write out the long regular expression by hand, so grep receives the output of the echo | sed pipeline, which builds the required expression on the fly from the letters of the alphabet. Unfortunately, the expression is not ideal: grep also outputs single characters, so the final sed -n '/../p' is used to suppress them.
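The generation step described above can be taken apart into separate stages to see what each sed substitution contributes. This is a sketch of the same idea, not the exact command from the answer: it uses $(...) instead of backticks (so only half as many backslashes are needed), & instead of the GNU-specific \0, and hypothetical variable names:

```shell
#!/bin/sh
# Build the PCRE pattern from the alphabet, one sed step at a time.
# The variable names (alphabet, doubled, pattern) are illustrative only.

alphabet='bcdefghijklmnopqrstuvwxyz'

# Step 1: double every letter and put "a" in front, giving
#   abbccdd...yyzz
# which reads as the pairs ab, bc, cd, ..., yz plus one stray z.
doubled=$(printf '%s\n' "$alphabet" | sed 's/./&&/g; s/^/a/')

# Step 2: turn each pair XY into  \s*(?:X(?=\s*Y))?
#         (consume X only if Y follows, possibly after spaces).
# Step 3: replace the stray last letter with "." so that the final
#         character of a run is always consumed.
pattern=$(printf '%s\n' "$doubled" |
  sed 's/\(.\)\(.\)/\\s*(?:\1(?=\\s*\2))?/g; s/.$/./')

# Show the first two units of the pattern:
#   \s*(?:a(?=\s*b))?\s*(?:b(?=\s*c))?
printf '%s\n' "$pattern" | cut -c1-34
```

The resulting value can be passed to grep as grep -Po "$pattern", followed by the same sed -n '/../p' filter to drop single characters.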

The grep pattern that these commands generate from the alphabet looks like this:
