# How to check similarity between strings?

## Question:

It's very common to compare `strings` , which is usually done by comparing for equality. However, today I have a need to compare the similarity between two `strings` so that I can compare how similar they are.

For example:

"City" is different from "city", but is similar (consider `c` != `C` ).

"City" is different from "city" but is similar (assuming the user typed it wrong).

"city" is different from "city" but is similar (same thing).

My question is: how do I check the similarity between `strings` ? Is there a ready-made algorithm to do this?

I'm looking for a way to do this:

``````if(checkSimilarity("cidade","cdade") > 0.8)
{
// as duas strings são muito parecidas
// sendo que checkSimilarity() retornando 1 para strings iguais
}
``````

It doesn't matter what programming language is used in the answer. My question is more related to the algorithm.

I believe that Levenshtein's algorithm is a solution that works well for what you are intending to do, since with it it is possible to know the number of modifications that a word needs to go through before it equals the other.

A possible implementation of it (in C++) would be:

``````#include <iostream.h>
#include <string.h>
#include <math.h>
#include <algorithm>

using namespace std;

{
int tam1 = palavra1.size();
int tam2 = palavra2.size();
int verif[tam1 + 1][tam2 + 1];

// Se uma das palavras tiver coprimento igual a zero, é necessária uma modificação do tamanho da outra:
if (tam1 == 0)
return tam2;

if (tam2 == 0)
return tam1;

// Atribuir ordem numérica resultante das palavras ao vetor de modo, respectivamente, "vertical" e "horizontal":
int i = 0;
while(i <= tam1)
verif[i][0] = i++;

int j = 0;
while(j <= tam2)
verif[0][j] = j++;

// Verificação:
for (int i = 1; i <= tam1; i++)
{
for (int j = 1; j <= tam2; j++)
{
// Definindo custos de modificação (deleção, inserção e substituição):
int custo = (palavra2[j - 1] == palavra1[i - 1]) ? 0 : 1;

verif[i][j] = min(
min(verif[i - 1][j] + 1, verif[i][j - 1] + 1),
verif[i - 1][j - 1] + custo);
}
}

return verif[tam1][tam2];
}

int main()
{
string pala1, pala2;

cout << "Informe a primeira palavra: " << endl;
cin >> pala1;
cout << "Informe a segunda palavra: " << endl;
cin >> pala2;

cout << "O numero de modificacoes necessarias para que as duas palavras se igualem e: "