Question:
I'm implementing a function that checks whether a word contains two identical letters in a row and, if so, adds an 'x' between the two letters to separate them.
Example:
- Input:
pizza
- Output:
pizxza
def checagem_letras(self):
    i = 0
    for letra in self.__frase:
        if self.__frase[i] == self.__frase[i+1]:
            ?????????
I don't know which method to implement to make the code work.
Answer:
One solution is to go through the characters of the string, and only add the "x" if the next character is the same as the current one:
class Test:
    def __init__(self, frase):
        self.__frase = frase

    def checagem_letras(self):
        result = ''
        for i, letra in enumerate(self.__frase):
            result += letra
            if i < len(self.__frase) - 1 and letra == self.__frase[i + 1]:
                result += 'x'
        return result

print(Test("pizza").checagem_letras()) # pizxza
print(Test("pizzaiollo").checagem_letras()) # pizxzaiolxlo
print(Test("aaa bbbb").checagem_letras()) # axaxa bxbxbxb
I use enumerate to loop through the characters of the string while also getting the index of each one. That lets me check the next character, taking care that I'm not on the last character (the condition i < len(self.__frase) - 1), because in that case there is no next character to check (otherwise I would try to access an index that does not exist and an IndexError would occur).
If the next character is the same as the current one, I add the "x". At the end I return the changed string.
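To make the need for the bounds check concrete, here is a minimal sketch of what happens without it. The names frase and falhou are only illustrative and are not part of the class above:

```python
frase = "pizza"

# Without the bounds check, the last iteration reads frase[len(frase)],
# which does not exist, so Python raises IndexError.
try:
    for i, letra in enumerate(frase):
        if letra == frase[i + 1]:
            pass  # this is where the "x" would be added
except IndexError:
    falhou = True
else:
    falhou = False

print(falhou)  # True
```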
The detail is that I needed to build another string, since strings are immutable in Python (see the documentation), so it is not possible to change the character at a given index of an existing string. That is, this code:
s = 'abc'
s[1] = 'x'
raises a TypeError, because it tries to change an index of the string. So the only way to do what you want is to create another string.
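As a small sketch of the difference, the snippet below catches the error and then builds a new string from slices of the old one. The variable names t and error_message are just for illustration:

```python
s = 'abc'

# Item assignment on a str raises TypeError because strings are immutable.
try:
    s[1] = 'x'
except TypeError as exc:
    error_message = str(exc)

# Building a new string from pieces of the old one works fine.
t = s[:1] + 'x' + s[2:]

print(s)  # abc (the original string is unchanged)
print(t)  # axc
```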
It is also worth noting that for 3 repetitions (aaa) I assumed that an "x" must be inserted between each pair of repeated letters, and therefore the result must be axaxa (it was not clear whether this case can happen, nor what should happen if it does).
It was also unclear whether the method should return the changed string or just modify the current phrase. If you just want to modify the current phrase, do:
def checagem_letras(self):
    result = ''
    for i, letra in enumerate(self.__frase):
        result += letra
        if i < len(self.__frase) - 1 and letra == self.__frase[i + 1]:
            result += 'x'
    # modify the phrase instead of returning it
    self.__frase = result
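Since this version returns nothing, you need some way to read the phrase afterwards. A possible sketch, assuming you add a read-only property (the frase property below is hypothetical and was not in the question):

```python
class Test:
    def __init__(self, frase):
        self.__frase = frase

    @property
    def frase(self):
        # Hypothetical read-only accessor, added only for this illustration.
        return self.__frase

    def checagem_letras(self):
        result = ''
        for i, letra in enumerate(self.__frase):
            result += letra
            if i < len(self.__frase) - 1 and letra == self.__frase[i + 1]:
                result += 'x'
        # modify the phrase instead of returning it
        self.__frase = result


t = Test("pizza")
t.checagem_letras()
print(t.frase)  # pizxza
```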
Also, the algorithm above inserts the "x" for any character that repeats (not just letters). If you want to restrict it to letters, you can change the condition of the if. For example:
if i < len(self.__frase) - 1 and letra.isalpha() and letra == self.__frase[i + 1]:
I used isalpha(), which checks whether the character is a letter (so other characters are ignored even if they are repeated). Change the condition to whatever you need.
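To see the effect of the isalpha() condition in isolation, here is a sketch written as a standalone function. The name separa_letras is made up for this example; the logic is the same loop as in the method:

```python
def separa_letras(frase):
    # Hypothetical standalone helper: same loop as the method,
    # but only letters that repeat get an "x" between them.
    result = ''
    for i, letra in enumerate(frase):
        result += letra
        if i < len(frase) - 1 and letra.isalpha() and letra == frase[i + 1]:
            result += 'x'
    return result


print(separa_letras("pizza"))    # pizxza
print(separa_letras("1100 aa"))  # 1100 axa
```

Note that isalpha() also returns True for letters outside the a-z range (accented letters, for example), so it is broader than a check for ASCII letters only.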
Another way to do it is to use regular expressions (regex). It is a little more complicated, and I admit that for this case it is somewhat overkill, since the solution above is much simpler:
import re

class Test:
    def __init__(self, frase):
        self.__frase = frase

    def checagem_letras(self):
        return re.sub(r'([a-zA-Z])(?=\1)', r'\1x', self.__frase)

print(Test("pizza").checagem_letras()) # pizxza
print(Test("pizzaiollo").checagem_letras()) # pizxzaiolxlo
print(Test("aaa bbbb").checagem_letras()) # axaxa bxbxbxb
The regex uses the character class [a-zA-Z], which matches one letter from a to z (lowercase or uppercase), and since it's in parentheses, this forms a capturing group.
Then I use a lookahead (the stretch between (?= and )), which checks whether something exists ahead. And that something is \1, which is a backreference and means "the same thing that was captured by capture group 1". In this case, group 1 is the first pair of parentheses, which is the one that contains the letter from a to z. That is, the regex checks whether there is a repeated letter.
Then in the replacement I use \1x, i.e. the letter that the regex detected as repeated (the backreference \1), followed by an "x".
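To see why the lookahead matters, the sketch below lists where the regex matches in "aaa" and compares it with a variant that consumes both letters. That second pattern is only there for contrast and is not part of the solution:

```python
import re

pattern = re.compile(r'([a-zA-Z])(?=\1)')

# Each match is a single letter that is immediately followed by the same letter.
# The lookahead does not consume the second letter, so it can start the next match.
for m in pattern.finditer("aaa"):
    print(m.start(), m.group(1))
# 0 a
# 1 a

print(pattern.sub(r'\1x', "aaa"))  # axaxa

# For contrast: without the lookahead, the pair "aa" is consumed as a whole,
# so the third "a" has no partner left and one "x" is missing.
print(re.sub(r'([a-zA-Z])\1', r'\1x\1', "aaa"))  # axaa
```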
In this case I am being very restrictive and I only insert the "x" when it is a repeated letter. But if you want to be more generic like the first option above and consider any character (not just letters), you can change the character class to a period:
def checagem_letras(self):
    return re.sub(r'(.)(?=\1)', r'\1x', self.__frase)
This works because, in regex, the period matches any character (except line breaks, by default).
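If repeated line breaks should also count, one option is the re.DOTALL flag, which makes the period match line breaks too. A small sketch comparing both behaviors (the sample text is made up, and repr is used so the line breaks are visible):

```python
import re

text = "aa\n\nbb"

# Default: the period does not match "\n", so the two line breaks are left alone.
sem_dotall = re.sub(r'(.)(?=\1)', r'\1x', text)
print(repr(sem_dotall))  # 'axa\n\nbxb'

# With re.DOTALL the period also matches "\n", so an "x" goes between them.
com_dotall = re.sub(r'(.)(?=\1)', r'\1x', text, flags=re.DOTALL)
print(repr(com_dotall))  # 'axa\nx\nbxb'
```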