Regular Expression - Fuzzy Regular Expressions

Variants of regular expressions can be used to work with natural-language text when possible typos and spelling variants must be taken into account. For example, the text "Julius Caesar" might be a fuzzy match for:

  • Gaius Julius Caesar
  • Yulius Cesar
  • G. Juliy Caezar

In such cases, the matching mechanism implements a fuzzy (approximate) string matching algorithm, and possibly also an algorithm that measures the similarity between a text fragment and the pattern.
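As an illustration only, the following Python sketch shows one common way to do this: an edit-distance (Levenshtein) search that finds the substring of a text closest to a literal pattern, plus a simple similarity score derived from that distance. The function names `best_fuzzy_match` and `similarity` are hypothetical, and the scoring formula is an assumption chosen for simplicity; none of the libraries listed in this article is claimed to work this way, and a real fuzzy regular expression engine handles full patterns rather than a literal string.

```python
def best_fuzzy_match(pattern: str, text: str) -> tuple[int, int]:
    """Return (distance, end) for the substring of `text` closest to `pattern`.

    `distance` is the Levenshtein edit distance to the best substring and
    `end` is the exclusive end index of that substring in `text`.
    """
    m = len(pattern)
    # prev[i]: cost of matching pattern[:i] against a substring of the text
    # ending at the previous position. With no text consumed, pattern[:i]
    # costs i deletions.
    prev = list(range(m + 1))
    best_dist, best_end = prev[m], 0
    for j, ch in enumerate(text, start=1):
        # cur[0] stays 0 because a match may start at any text position.
        cur = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in the text
                cur[i - 1] + 1,      # pattern character missing from the text
            )
        if cur[m] < best_dist:
            best_dist, best_end = cur[m], j
        prev = cur
    return best_dist, best_end


def similarity(pattern: str, text: str) -> float:
    """Hypothetical score in [0, 1]: 1.0 means the pattern occurs verbatim."""
    if not pattern:
        return 1.0
    distance, _ = best_fuzzy_match(pattern, text)
    return 1.0 - distance / len(pattern)


if __name__ == "__main__":
    for candidate in ("Gaius Julius Caesar", "Yulius Cesar", "G. Juliy Caezar"):
        dist, end = best_fuzzy_match("Julius Caesar", candidate)
        print(candidate, dist, end, round(similarity("Julius Caesar", candidate), 2))
```

The first candidate contains the pattern verbatim, so its distance is 0; the other two need a small number of edits, and an application would accept or reject them by comparing the distance or the score against a threshold of its choosing.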

This task is closely related to both full-text search and named entity recognition.

Software that works with fuzzy regular expressions includes:

  • TRE - well-developed, portable, free project written in C, which uses a syntax similar to POSIX.
  • FREJ - open source Java project with a non-standard syntax (prefix, Lisp-like notation), designed to make it easy to substitute inner matched fragments in outer blocks; it lacks many features of standard regular expressions.
  • agrep - command-line utility (proprietary, but free for non-commercial use).
