I've been reading the Underhanded C Contest website, where the goal is to write subtly malicious code that looks normal at first glance. One of the common techniques mentioned was the use of not a number, or nan, a constant that has some special properties; notably, any sort of comparison with nan results in False.
Thinking about a proof of concept in Python, I arrived at the following:
    def maior_que_10():
        entrada = input('Digite um número: ')
        try:
            entrada_float = float(entrada)
        except ValueError:
            print('Erro!')
            return
        if entrada_float > 10:
            print('Maior que 10!')
            return
        elif entrada_float <= 10:
            print('Não é maior que 10!')
            return
        print('Inesperado!')


    while True:
        maior_que_10()
The function correctly handles invalid numeric input by printing an error, and at a careless glance it never seems to reach print('Inesperado!'), because it checks both > 10 and <= 10. Entering "nan", however, executes the last line:
    Digite um número: 11
    Maior que 10!
    Digite um número: 9
    Não é maior que 10!
    Digite um número: 10
    Não é maior que 10!
    Digite um número: foobar
    Erro!
    Digite um número: nan
    Inesperado!
Theoretically, in less trivial code, malicious code could be hidden after the two if branches. This, however, depends on having user input passed to float. Is there any operation between variables that generates a nan? I thought of division by zero or the square root of a negative number, but both result in exceptions, not nan:
    >>> math.sqrt(-1)
    ValueError: math domain error
    >>> 1/0
    ZeroDivisionError: division by zero
(On rereading the whole question, I see that I wrote an extensive answer on how to validate decimal-number input, which doesn't quite answer your specific question. Sorry. I'll keep the answer because it might help newcomers who land here because of the question's title.)
In newer versions of Python it is possible to do from math import nan, which puts into the namespace a variable nan that contains a NaN value.
In older versions (prior to Python 3.5), the recommendation was to put this in your code:

    nan = float('nan')

(or to use the expression float('nan') directly).
Furthermore, when dealing with NaNs it is important to keep in mind that one NaN value is never equal to another when compared with == (nor is it equal to itself). The best way to tell whether a value is a NaN is to use the isnan function from the math module:
    from math import nan, isnan
    isnan(nan)  # True
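As a quick illustration of the points above, here is a minimal sketch of how NaN behaves under comparison, and of how an explicit isnan check closes the gap in the question's function. The helper name classifica and its return strings are hypothetical, introduced only for this example.

```python
from math import isnan

nan = float('nan')  # works in any Python version; math.nan needs 3.5+

# Ordering and equality comparisons involving NaN are all False...
print(nan > 10, nan <= 10, nan == nan)   # False False False
# ...while != is True, even when NaN is compared with itself.
print(nan != nan)                        # True
print(isnan(nan))                        # True


def classifica(valor):
    """Hypothetical helper: classify a float, handling NaN explicitly."""
    if isnan(valor):
        return 'nan'
    if valor > 10:
        return 'maior'
    return 'nao maior'


print(classifica(float('nan')))  # nan
print(classifica(11.0))          # maior
```

Checking isnan first means the two comparison branches really do cover every remaining value, so nothing can fall through unnoticed.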
That said about NaNs, there is more to consider about calling float directly on a string the user types. In particular, infinite values can be expressed with float('inf') (and negative infinity with "-inf"), and numbers in scientific notation are also accepted, where a power-of-ten exponent can be appended to the number after the letter "e":
    >>> float("1e3")
    1000.0
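Infinities also bear on the original question about operations that produce a NaN without ever calling float('nan'). Under IEEE 754 arithmetic, which Python floats follow on common platforms, some operations on infinities yield NaN instead of raising an exception, and infinity itself can appear through silent overflow. A minimal sketch (the variable names are illustrative only):

```python
import math

inf = float('inf')

# "Infinity minus infinity" and "infinity times zero" have no defined value,
# so the result is NaN rather than an exception.
print(math.isnan(inf - inf))   # True
print(math.isnan(inf * 0))     # True

# Infinity can arise from ordinary-looking values: float multiplication
# overflows to inf without raising.
grande = 1e308 * 10
print(grande)                  # inf

resultado = grande - grande
print(math.isnan(resultado))   # True
```

So a NaN can reach a comparison through plain arithmetic on variables, provided some intermediate value overflows to infinity first.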
So if you really want to limit the input to positive or negative numbers with a decimal point, it's better to parse it more carefully than by simply calling float.
In general, when we talk about parsing, many people think of regular expressions first. I consider regular expressions difficult to read and maintain, and people tend to write overly simple expressions that don't cover all possible inputs.
Checking Input with Regular Expressions
Python is a good language for regular expressions because, luckily, nobody decided to mix them into the language's syntax: you call normal functions and pass a string with the regular expression you want to compare against your text. The re module provides several such functions, for example to find all occurrences (re.findall) or to replace (re.sub). In this case, we simply want to see whether an expression matches the user input.
In a hurry, one might think "I want to see if the user has typed one or more digits, followed by an optional period, followed by one or more digits". This expression can be written as "[0-9]+\.?[0-9]+", and just by looking at it you can see that it's not good: what if the user types a "-" sign? What if there is only one digit? (The second part expects at least one more digit, even though the dot is optional.) As a result, while this expression can match "11", "23.2" and "0.1", it will not match "1", "-1", ".23", etc.
To make a long story short, a regular expression that checks for a decimal number with an optional sign, with at least one digit before the decimal point (or none at all, if there is a decimal point), and with at least one digit after the decimal point when there is one, is:

    c = r"-?(?:[0-9]+|(?=\.))(?:\.[0-9]+)?$"
(The Python regular expression documentation is at https://docs.python.org/3/library/re.html)
And in your code you could do:

    import re


    def maior_que_10():
        entrada = input('Digite um número: ')
        if not re.match(r"-?(?:[0-9]+|(?=\.))(?:\.[0-9]+)?$", entrada):
            print('Erro!')
            return
        entrada_float = float(entrada)
        ...
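To see the difference between the hurried expression and the fuller one, the sketch below runs both against a handful of sample inputs. The names INGENUA, COMPLETA and casa are hypothetical, and using re.fullmatch instead of re.match is a choice made for this sketch, not something the code above does.

```python
import re

INGENUA = r"[0-9]+\.?[0-9]+"                       # the hurried expression
COMPLETA = r"-?(?:[0-9]+|(?=\.))(?:\.[0-9]+)?$"    # the fuller expression

amostras = ["11", "23.2", "1", "-1", ".23", "1.", "1e3", "nan", "inf", "+1", ""]


def casa(padrao, texto):
    """Hypothetical helper: True if the whole string matches the pattern."""
    return re.fullmatch(padrao, texto) is not None


# Map each sample to (matches naive expression, matches fuller expression).
tabela = {s: (casa(INGENUA, s), casa(COMPLETA, s)) for s in amostras}

for texto, (ingenua, completa) in tabela.items():
    print(f"{texto!r:8} naive={ingenua!s:5} full={completa}")
```

The fuller expression accepts "1", "-1" and ".23", which the hurried one misses, and both reject "1e3", "nan" and "inf". Note that it also rejects "+1", which float would accept. One further detail: with re.match, the trailing $ also matches just before a final newline, so "1\n" would pass. input() never returns a trailing newline, but for strings from other sources re.fullmatch is the stricter choice.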
Checking Input with Python Code
So, in the name of readability and of knowing what you're doing, it might be worth using Python's string methods (split, find, count, isdigit) to write a function that checks whether a string is a well-formed decimal before trying to convert it to float.
You can do something like:
    def verifica_decimal(text):
        if not text:
            # empty string
            return False
        filtered = text.replace('-', '').replace('.', '')
        # isascii() (Python 3.7+) excludes characters such as '²', which
        # isdigit() accepts but float() rejects
        if not (filtered.isascii() and filtered.isdigit()):
            # there are characters that are neither a digit nor '-' nor '.'
            return False
        if '-' in text[1:]:
            # stray sign in the middle of the number
            return False
        if text.count('.') > 1 or text[-1] == '.':
            # more than one '.', or '.' in the last position
            return False
        return True


    def maior_que_10():
        entrada = input('Digite um número: ')
        if not verifica_decimal(entrada):
            print('Erro!')
            return
        entrada_float = float(entrada)
        ...
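One subtlety of str.isdigit is worth knowing when writing a checker like this: it returns True for some non-ASCII characters, such as superscript digits, that float does not accept. The sketch below shows the mismatch and one possible guard. The helper name so_digitos_ascii is hypothetical, and str.isascii assumes Python 3.7 or later.

```python
def so_digitos_ascii(texto):
    """Hypothetical helper: True only for non-empty strings made of 0-9."""
    return texto.isascii() and texto.isdigit()


print('²'.isdigit())            # True: superscript two counts as a digit
print(so_digitos_ascii('²'))    # False
print(so_digitos_ascii('123'))  # True

# float() does not accept the superscript digit, so a check based on
# isdigit() alone would let a ValueError through.
try:
    float('²')
    conversao_ok = True
except ValueError:
    conversao_ok = False
print(conversao_ok)             # False
```

Restricting the check to ASCII also keeps the hand-written validator in line with the [0-9] character class used in the regular expression version.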