Question:
I would like some help with my problem. My goal is to capture the list below, split into blocks that each run from one LOREM to the next LOREM, but I don't want to capture the text that follows the end of the list, as follows:
LOREM : 10505050
IPSUM : 1050051051084
DOLOR : 2620620620652
AMETI : 54084840540540
LOREM : 10505050
IPSUM : 1050051051084
DOLOR : 2620620620652
AMETI : 54084840540540
LOREM : 10505050
IPSUM : 1050051051084
DOLOR : 2620620620652
AMETI : 54084840540540
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
I'm using this regex: /(?=LOREM :).+?(?:(?=LOREM :))/s
It matches every block except the last one, which I can't get it to match.
For a better understanding, follow this example: https://regex101.com/r/gM2fF1/1
Answer:
I suggest two approaches:
#1: Step by step
I would first isolate the "interesting part" of the text and throw away the rest, using for example ([^\.]+[\d]+), which matches everything up to the last digit before the first period.
What remains is just the pattern key : value, so a simpler match can return an array with each line. Something like this:
// Group 1 captures the key, group 2 captures the value.
$regex = '/(\w+) : (\d+)/';
preg_match_all($regex, $string, $matches);
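
As a minimal sketch of the steps above: the sample text is shortened, and the final grouping loop is my own assumption (it is not part of the steps described), namely that a new block starts whenever the key is LOREM.

```php
<?php
// Shortened sample input: two blocks from the question, then ordinary prose.
$string = <<<TXT
LOREM : 10505050
IPSUM : 1050051051084
DOLOR : 2620620620652
AMETI : 54084840540540
LOREM : 10505050
IPSUM : 1050051051084
DOLOR : 2620620620652
AMETI : 54084840540540
Lorem ipsum dolor sit amet, consectetur adipisicing elit.
TXT;

// Step 1: keep everything up to the last digit before the first period.
preg_match('/[^.]+\d+/', $string, $m);
$list = $m[0];

// Step 2: capture every "key : value" pair in the remaining text.
preg_match_all('/(\w+) : (\d+)/', $list, $pairs, PREG_SET_ORDER);

// Step 3 (assumption): start a new block each time the key is LOREM.
$blocks = [];
foreach ($pairs as [, $key, $value]) {
    if ($key === 'LOREM' || !$blocks) {
        $blocks[] = [];
    }
    $blocks[count($blocks) - 1][$key] = $value;
}

print_r($blocks);
```

Values are kept as strings here, which avoids any surprises with very long digit sequences.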
#2: Capturing the blocks directly
A single regex can also capture each block directly; this assumes that every block has a consistent shape (here, four key : value lines). A suggestion is to do this:
// Each full match in $matches[0] is one block of four "key : value" lines.
$regex = '/(?:\w+ : \d+\s){4}/';
preg_match_all($regex, $string, $matches);
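
A minimal sketch of this second approach on a shortened sample. It uses a non-capturing group, (?:...), as a small variation: a repeated capturing group would only retain its last repetition, so the complete blocks are read from the full matches in $matches[0]. It assumes every block has exactly four lines and that each line, including the last, is followed by whitespace.

```php
<?php
// Shortened sample input: two blocks from the question, then ordinary prose.
$string = <<<TXT
LOREM : 10505050
IPSUM : 1050051051084
DOLOR : 2620620620652
AMETI : 54084840540540
LOREM : 10505050
IPSUM : 1050051051084
DOLOR : 2620620620652
AMETI : 54084840540540
Lorem ipsum dolor sit amet, consectetur adipisicing elit.
TXT;

// Each match is a run of exactly four "key : value" lines.
preg_match_all('/(?:\w+ : \d+\s){4}/', $string, $matches);

// Drop the trailing newline that the final \s consumed in each block.
$blocks = array_map('trim', $matches[0]);

print_r($blocks);
```

Because the prose that follows the list contains no key : value lines, it is never matched, so nothing after the last block is captured.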