Question : Parsing out name string
Can anyone point me to any routines that parse out the components of a name string? The strings I have may or may not start with a title such as Dr., Dr, Mr, Mrs., etc. They may or may not end with one or more credentials such as MD, M.D., Phd, etc. They may include a middle name or initial. The parts I need are: title, first name, middle initial (if any), last name (including Jr., III, etc.), and credentials. Some example name strings: John Q Public Esq. Dr. Jane O Smith M.D. John Doe Dave Smith Jr. PT Normal Rae Jean Joel Vergel de Dios
Answer : Parsing out name string
I might be a little more devious and drop all periods and commas before parsing. This simplifies your solution space and lets you reformat the results however you like. Then replace every run of multiple spaces with a single space and split the string into an array on the space character.
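A minimal Python sketch of that cleanup step, assuming plain ASCII punctuation; the function name `tokenize_name` is made up for illustration:

```python
import re


def tokenize_name(raw: str) -> list[str]:
    """Drop periods and commas, collapse runs of whitespace, split on spaces."""
    # "M.D." becomes "MD" and "Jr.," becomes "Jr", so later lookups only
    # need one spelling per title or credential.
    cleaned = re.sub(r"[.,]", "", raw)
    # Collapse any run of whitespace to a single space and trim the ends.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return cleaned.split(" ") if cleaned else []


print(tokenize_name("Dr.  Jane O  Smith, M.D."))
```

One caveat: a comma with no space after it (for example "Smith,Jr.") would fuse two fields into one token, so you may prefer to replace commas with a space instead of deleting them.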
The way I see it, that leaves three common name formats to deal with: last name only; first and last name; and first name, initial, and last name.
Check each element, starting from the front, to see if it is a valid prefix; if it is not, treat it as a name field. Then check the n+2, n+3 and n+4 fields for suffixes. The position of the first suffix tells you how many name fields come before it, and therefore which name format you're dealing with, so you can parse the name appropriately.
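Here is a rough Python sketch of that idea, with some assumptions. Instead of probing fixed offsets, it peels known prefixes off the front and known suffixes off the end, then classifies whatever is left by how many fields remain. The lookup sets and the names `parse_name` and `_pop_trailing` are illustrative, not taken from any library, and the word lists would need extending for real data.

```python
import re

# Hypothetical lookup tables (not from the thread); extend them for your data.
TITLES = {"dr", "mr", "mrs", "ms", "miss", "prof"}
GENERATIONAL = {"jr", "sr", "ii", "iii", "iv"}
CREDENTIALS = {"md", "phd", "esq", "pt", "rn", "dds"}


def _pop_trailing(tokens, vocabulary):
    """Remove trailing tokens found in vocabulary, always leaving one token."""
    found = []
    while len(tokens) > 1 and tokens[-1].lower() in vocabulary:
        found.insert(0, tokens.pop())
    return found


def parse_name(raw):
    # Drop periods and commas, then split on any run of whitespace.
    tokens = re.sub(r"[.,]", "", raw).split()
    titles = []
    while len(tokens) > 1 and tokens[0].lower() in TITLES:
        titles.append(tokens.pop(0))
    credentials = _pop_trailing(tokens, CREDENTIALS)
    generational = _pop_trailing(tokens, GENERATIONAL)
    # What remains is one of: last / first last / first middle last...
    first = middle = ""
    if len(tokens) >= 3:
        first, middle, last_parts = tokens[0], tokens[1], tokens[2:]
    elif len(tokens) == 2:
        first, last_parts = tokens[0], tokens[1:]
    else:
        last_parts = tokens
    return {
        "title": " ".join(titles),
        "first": first,
        "middle": middle,
        # Jr., III, etc. stay attached to the last name, as the question asks.
        "last": " ".join(last_parts + generational),
        "credentials": " ".join(credentials),
    }


print(parse_name("Dr. Jane O Smith M.D."))
print(parse_name("Dave Smith Jr. PT"))
```

A name like "Joel Vergel de Dios" shows the limit of this approach: the sketch treats "Vergel" as a middle name and "de Dios" as the last name, which may not be what the person intends. Multi-word first names and surnames need either extra word lists or a manual review step.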
I notice that there is no mention of formal titles such as "The Right Honorable" for judges, mayors, etc. Military ranks might also cause problems, e.g. BGen (ret'd).