Regex Patterns for Personal Data…
It seems that I am always searching for good regular expressions, but accurate, concise ones are nearly impossible to find. Most handle corner cases poorly, or they are simply wrong and useless. For example, I recently discovered a regex designed to match an MD5 hash, which is a 32-digit hexadecimal number. The regex author had specified “^([a-z0-9]{32})$”, which accepts digits and every letter of the alphabet, when by definition it should have been “\b[a-f0-9]{32}\b”, which accepts only digits and the letters A through F.
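To make the difference concrete, here is a minimal Python sketch. Python, the variable names, and the two sample strings are my own choices for illustration; the first sample is the well-known MD5 digest of empty input, the second is 32 characters that are not hexadecimal at all.

```python
import re

# The overly broad pattern quoted above, and the corrected one.
LOOSE = re.compile(r"^([a-z0-9]{32})$", re.IGNORECASE)
STRICT = re.compile(r"\b[a-f0-9]{32}\b", re.IGNORECASE)

real_hash = "d41d8cd98f00b204e9800998ecf8427e"  # MD5 of empty input
not_a_hash = "z" * 32                           # right length, wrong alphabet

# Both patterns accept the genuine digest.
print(bool(LOOSE.search(real_hash)), bool(STRICT.search(real_hash)))    # True True
# Only the loose pattern accepts the non-hex string.
print(bool(LOOSE.search(not_a_hash)), bool(STRICT.search(not_a_hash)))  # True False
```

Note that the two patterns also differ in anchoring: the first only matches a string that is nothing but the hash, while the second finds a hash anywhere in a longer text.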
So, as usual, I set out to create my own expressions. The expressions below were developed and tested using official formats, definitions, and sample data valid as of December 2007. Be aware that all of the following assume case-insensitive matching, so they will grab both upper- and lower-case letters. And, assuming you’re going to cut and paste, don’t forget to remove the line break from those expressions that wrap onto a second line.
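If you are pasting into Python, both caveats can be handled in one place. The helper below is only a sketch: `compile_pasted` is a name I made up, and it assumes that no expression has a meaningful space at the very start or end of a wrapped line (the ones in this post write literal spaces as `[ ]`, so stripping is safe here).

```python
import re

def compile_pasted(pattern_text):
    """Join an expression that wrapped across lines, then compile it case-insensitively."""
    joined = "".join(line.strip() for line in pattern_text.splitlines())
    return re.compile(joined, re.IGNORECASE)

# The Romanian expression from this post, pasted exactly as it wraps on the page.
romanian = compile_pasted(r"""\b[1-8]\d{2}(0[1-9]|1[0-2])(0[1-9]|[12]\d|3[01])(0[1-9]|[1-4]\d|
5[0-2]|99)\d{4}\b""")
print(bool(romanian.search("1800101221144")))  # True

# Case insensitivity matters: the Taiwanese expression is written with [a-z],
# but real numbers start with an upper-case letter.
taiwanese_text = r"\b[a-z][12]\d{8}\b"
print(bool(re.search(taiwanese_text, "A123456789")))              # False
print(bool(compile_pasted(taiwanese_text).search("A123456789")))  # True
```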
Regular Expressions
Almost all major credit cards (and most debit cards) (example=4111 1111 1111 1111):
\b(1800|2131|30[0-5]\d|3[4-7]\d{2}|4\d{3}|5[0-5]\d{2}|6011|6[2357]
\d{2})[- ]?(\d{4}[- ]?\d{4}[- ]?\d{4}|\d{6}[- ]?\d{5})\b
Austrian Social Security Number (de=Sozialversicherungsnummer) (example=1788011550):
\b\d{4}(0[1-9]|[12]\d|3[01])(0[1-9]|1[0-5])\d{2}\b
Bulgarian Uniform Civil Number (bg=Единен граждански номер) (example=7523169263):
\b\d{2}([024][1-9]|[135][0-2])(0[1-9]|[12]\d|3[01])[-+]?\d{4}\b
Canadian Social Insurance Number:
\b[1-9]\d{2}[- ]?\d{3}[- ]?\d{3}\b
Chinese National Identification Card Number (cn=身份证):
\b\d{6}(19|20)\d{2}(0[1-9]|1[0-2])(0[1-9]|[12]\d|3[01])\d{3}[\dx]\b
Croatian Master Citizen Number (hr=Matični broj građana):
\b(0[1-9]|[12]\d|3[01])(0[1-9]|1[0-2])(9\d{2}|0[01]\d)\d{6}\b
Danish Civil Registration Number (dk=Personnummer, CPR Nummer):
\b(0[1-9]|[12]\d|3[01])(0[1-9]|1[0-2])\d{2}[-+]?\d{4}\b
Finnish Social Security Number (fi=Henkilötunnus) (example=311280-999J):
\b(0[1-9]|[12]\d|3[01])(0[1-9]|1[0-2])\d{2}[-+a]\d{3}\w\b
Indian Permanent Account Number:
\b[a-z]{3}[abcfghjlpt][a-z]\d{4}[a-z]\b
Indian Vehicle License Plate Number (example=DL 11 C AA 1111):
\b([a-z]{2}[ ]\d{1,2}|dl[ ][1-9]?\d[ ][cprstvy])[ ][a-z]{0,2}[ ]\d{1,4}\b
Italian Fiscal Code (it=Codice fiscale) (example=HDDFTH63H28Z352V):
\b([bcdfghj-np-tv-z][a-z]{2}){2}\d{2}[a-ehlmprst]([04][1-9]|
[1256]\d|[37][01])[a-z]\d{3}[a-z]\b
Norwegian Social Security Number (no=Personnummer, Fødselsnummer, SSNR):
\b(0[1-9]|[12]\d|3[01])([04][1-9]|[15][0-2])\d{7}\b
Romanian Personal Numeric Code (ro=Cod Numeric Personal) (example=1800101221144):
\b[1-8]\d{2}(0[1-9]|1[0-2])(0[1-9]|[12]\d|3[01])(0[1-9]|[1-4]\d|
5[0-2]|99)\d{4}\b
South Korean Resident Registration Number (ko=주민등록번호):
\b\d{2}(0[1-9]|1[0-2])(0[1-9]|[12]\d|3[01])\-[0-49]\d{6}\b
Swedish Personal Identification Number (se=Personnummer):
\b(19\d{2}|20\d{2}|\d{2})(0[1-9]|1[0-2])(0[1-9]|[12]\d|3[01])
[-+]?\d{4}\b
Taiwanese National Identification Card Number:
\b[a-z][12]\d{8}\b
United Kingdom National Insurance Number (example=AA 01 23 44 B):
\b[abceghj-prstw-z][abceghj-nprstw-z][ ]?\d{2}[ ]?\d{2}[ ]?\d{2}[ ]?[a-dfm]?\b
United States Social Security Number (example=078-05-1120):
\b(?!000)(?!666)([0-6]\d{2}|7([0-356]\d|7[012]))[- ]?(?!00)\d{2}
[- ]?(?!0000)\d{4}\b
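Since you should check these before relying on them, here is one way to run a few of the expressions against the examples given in this post. It is a rough Python sketch rather than a test suite: the names `SAMPLES` and `is_match` are mine, the expressions are copied from the list above (with the wrapped ones rejoined), and the rejected Social Security Numbers are strings I made up to exercise the lookaheads.

```python
import re

# label -> (expression from the list above, example from this post)
SAMPLES = {
    "credit card": (
        r"\b(1800|2131|30[0-5]\d|3[4-7]\d{2}|4\d{3}|5[0-5]\d{2}|6011|6[2357]"
        r"\d{2})[- ]?(\d{4}[- ]?\d{4}[- ]?\d{4}|\d{6}[- ]?\d{5})\b",
        "4111 1111 1111 1111",
    ),
    "Austrian SSN": (
        r"\b\d{4}(0[1-9]|[12]\d|3[01])(0[1-9]|1[0-5])\d{2}\b",
        "1788011550",
    ),
    "Bulgarian UCN": (
        r"\b\d{2}([024][1-9]|[135][0-2])(0[1-9]|[12]\d|3[01])[-+]?\d{4}\b",
        "7523169263",
    ),
    "Finnish SSN": (
        r"\b(0[1-9]|[12]\d|3[01])(0[1-9]|1[0-2])\d{2}[-+a]\d{3}\w\b",
        "311280-999J",
    ),
    "Indian plate": (
        r"\b([a-z]{2}[ ]\d{1,2}|dl[ ][1-9]?\d[ ][cprstvy])[ ][a-z]{0,2}[ ]\d{1,4}\b",
        "DL 11 C AA 1111",
    ),
    "UK NINO": (
        r"\b[abceghj-prstw-z][abceghj-nprstw-z][ ]?\d{2}[ ]?\d{2}[ ]?\d{2}[ ]?[a-dfm]?\b",
        "AA 01 23 44 B",
    ),
    "US SSN": (
        r"\b(?!000)(?!666)([0-6]\d{2}|7([0-356]\d|7[012]))[- ]?(?!00)\d{2}"
        r"[- ]?(?!0000)\d{4}\b",
        "078-05-1120",
    ),
}

def is_match(pattern, text):
    """True if the whole text matches the pattern, ignoring case."""
    return re.fullmatch(pattern, text, re.IGNORECASE) is not None

# Every example from the post should match its own expression.
results = {label: is_match(p, example) for label, (p, example) in SAMPLES.items()}
for label, ok in results.items():
    print(f"{label}: {'ok' if ok else 'NO MATCH'}")

# The negative lookaheads in the US expression reject reserved values.
US_SSN = SAMPLES["US SSN"][0]
made_up = ["000-12-3456", "666-12-3456", "078-00-1120", "078-05-0000"]
rejected = [s for s in made_up if not is_match(US_SSN, s)]
print(rejected == made_up)  # True
```

Keep in mind that a match only means the text has the right shape. None of these expressions verify check digits, so a matching string is not necessarily a number that was ever issued.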
Feel free to use these anywhere for any reason. You do not need to credit me at all, but if you do use them or plan to use them, I’d appreciate you leaving a comment that tells me how they’re being used so I can have some bragging rights! Or buy me a cup of coffee.
Disclaimer
Your use of this website and any information offered here is at your own risk. I make no representations about the suitability of the information for any purpose. In no event shall I be liable for any special, indirect, or consequential damages or any damages whatsoever resulting from loss of use, data or profits, whether in an action of contract, negligence or other tortious action, arising out of or in connection with the use or performance of this website or any information offered here. Basically, if you use any of these regular expressions and they don’t work, it’s not my fault because you should have checked them yourself before relying on them. Consider yourself warned.