Accepting User Input

Posted by admin

When accepting form submissions or even query strings from the URL, you, as the developer, should know what type of data to expect. The best approach is to write a small function that checks it. Have this function accept the data value, the expected data type, and the maximum length of the data.

If the data is supposed to be a number, make sure it is. If it's an email address, make sure it is. If it's free text, make sure you properly escape or cleanse the data before inserting it into your database.

Almost every language has functions to check whether variables are integers or valid dates, and there are plenty of regular expressions to check for email addresses, phone numbers, date formatting, and other predictably formatted data.
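
As an illustration of those built-in checks, here is a minimal sketch in Python. The function names is_integer and is_valid_date are my own, and the ##/##/#### date format is an assumption; swap in whatever your application expects.

```python
from datetime import datetime


def is_integer(value: str) -> bool:
    """Return True if value parses as a base-10 integer.

    Note: int() also tolerates surrounding whitespace, a leading sign,
    and underscores between digits, so it is more lenient than a strict
    digits-only check.
    """
    try:
        int(value, 10)
        return True
    except ValueError:
        return False


def is_valid_date(value: str, fmt: str = "%m/%d/%Y") -> bool:
    """Return True if value is a real calendar date in the given format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False
```

Unlike a pattern match, the date check rejects impossible dates such as 02/30/2024, because the string is actually parsed into a calendar date.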

Also, look for strings or characters that could be used maliciously, such as HTML angle brackets, ampersands, single quotes, and the strings "UNION" and "SELECT", which are commonly used in SQL injection attacks.
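
A scan like that might be sketched as follows in Python. The name looks_malicious and the exact character and keyword lists are assumptions for illustration, built only from the examples mentioned above; a real list would likely be longer.

```python
import re

# Characters named above: HTML angle brackets, ampersand, single quote.
SUSPICIOUS_CHARS = set("<>&'")

# Keywords named above, matched as whole words regardless of case.
SUSPICIOUS_WORDS = re.compile(r"\b(UNION|SELECT)\b", re.IGNORECASE)


def looks_malicious(value: str) -> bool:
    """Return True if value contains a flagged character or keyword."""
    if any(ch in SUSPICIOUS_CHARS for ch in value):
        return True
    return SUSPICIOUS_WORDS.search(value) is not None
```

Keep in mind that a scan like this is a blunt instrument: it also flags innocent input such as the name O'Brien, so it fits best on fields where those characters are never expected.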

If your function picks up something that breaks the rules, stop processing the page right there.
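
Here is one way the checking function could look in Python: it takes the value, the expected type, and the maximum length, and raises an exception so the caller can stop processing. Everything named here (InvalidInput, check_input, the "integer" and "token" type names, and their patterns) is hypothetical and only meant as a sketch.

```python
import re


class InvalidInput(ValueError):
    """Raised when a submitted value breaks the rules."""


# Hypothetical type names mapped to pattern templates.
# {n} is filled in with the maximum length at check time.
PATTERNS = {
    "integer": r"-?[0-9]{{1,{n}}}",
    "token": r"[A-Za-z0-9_-]{{1,{n}}}",
}


def check_input(value: str, expected_type: str, max_length: int) -> str:
    """Return value unchanged if it passes, otherwise raise InvalidInput."""
    if expected_type not in PATTERNS:
        raise InvalidInput(f"unknown type: {expected_type}")
    if len(value) > max_length:
        raise InvalidInput("value too long")
    pattern = PATTERNS[expected_type].format(n=max_length)
    # fullmatch requires the entire string to fit the pattern.
    if re.fullmatch(pattern, value) is None:
        raise InvalidInput(f"not a valid {expected_type}")
    return value
```

The page handler would call check_input on each field and, on InvalidInput, abort the request with an error response instead of continuing.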

Regular expressions are great for testing data input. They are versatile and powerful. On the downside, I have always had a very hard time committing the syntax to memory; to me it's cryptic and complex. I have figured out enough to get some jobs done, or found what I needed thanks to the internet. I will share a few basics with you here. Wherever you see X, replace it with the maximum length of input you will accept.

  • To check if some input is an integer, check it against this:
    "^-?[0-9]{1,X}$"
    this allows an optional leading minus sign followed by digits only.
  • To check for a string with no spaces, such as a querystring value:
    "^[A-Z0-9_-]{1,X}$"
    this allows numbers, letters, underscores and dashes (match case-insensitively if lowercase letters should be accepted).
  • Very simple email validation:
    "^[A-Z][\w\.-]*[A-Z0-9]@[A-Z][\w\.-]*[A-Z0-9]\.[A-Z][A-Z\.]*[A-Z]$"
  • Simple ###-###-#### phone number validation:
    "^[0-9]{3}-[0-9]{3}-[0-9]{4}$"
  • Simple ##/##/#### date validation allowing for one or two digits on month and date:
    "^[0-9]{1,2}\/[0-9]{1,2}\/[0-9]{4}$"

These should work for you, and perhaps show you enough regular expression syntax to adapt them to suit your needs.
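
To show the patterns in use, here is a small Python sketch with X filled in from a max_len argument. The names build_patterns and is_valid are my own. Two assumptions to note: matching is case-insensitive so lowercase letters pass, and the integer and no-spaces patterns are anchored at both ends with the minus sign allowed only at the start of an integer.

```python
import re


def build_patterns(max_len: int) -> dict:
    """Compile the patterns from the list, with X replaced by max_len."""
    flags = re.IGNORECASE  # let A-Z also match lowercase letters
    return {
        "integer": re.compile(rf"^-?[0-9]{{1,{max_len}}}$"),
        "token": re.compile(rf"^[A-Z0-9_-]{{1,{max_len}}}$", flags),
        "email": re.compile(
            r"^[A-Z][\w\.-]*[A-Z0-9]@[A-Z][\w\.-]*[A-Z0-9]\.[A-Z][A-Z\.]*[A-Z]$",
            flags,
        ),
        "phone": re.compile(r"^[0-9]{3}-[0-9]{3}-[0-9]{4}$"),
        "date": re.compile(r"^[0-9]{1,2}\/[0-9]{1,2}\/[0-9]{4}$"),
    }


def is_valid(kind: str, value: str, max_len: int = 50) -> bool:
    """Return True if value fits the pattern registered under kind."""
    pattern = build_patterns(max_len)[kind]
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    return pattern.fullmatch(value) is not None
```

Note that the date pattern only checks the shape of the input, so a value like 99/99/9999 still passes; parse the date afterwards if you need it to be a real one.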

As a general rule, if an input doesn't fall within a predictable format like the examples above, just check it against the maximum allowed length and then pass the string through a function that encodes, escapes, or otherwise renders any possibly malicious characters harmless to your application.
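
For free text, that rule might look like this in Python. The name clean_free_text is hypothetical, and I am assuming the text will end up displayed in an HTML page, which is why the sketch uses HTML encoding; other destinations need their own escaping.

```python
import html


def clean_free_text(value: str, max_length: int) -> str:
    """Enforce the length limit, then HTML-encode risky characters."""
    if len(value) > max_length:
        raise ValueError("input exceeds maximum length")
    # Encodes &, <, > and, with quote=True, both quote characters.
    return html.escape(value, quote=True)
```

For example, clean_free_text("<b>O'Brien & co</b>", 100) returns "&lt;b&gt;O&#x27;Brien &amp; co&lt;/b&gt;", which a browser shows as plain text instead of treating it as markup.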