About

This website generates regular expressions (regexes) from positive example input strings. The generated regexes are descriptive, non-trivial, optimal or near-optimal, executable, and matching or exactly matching.

This has not been done before -- it is a breakthrough in Computer Science and Machine Learning.

This solves the Regular Expression Induction (REI) problem for the first time, in a significant and practical way.

Up to 21 regexes are learned for each input set of strings -- providing a choice between Optimality, Readability and Abstractions.

Definitions

Descriptive means that the original input strings can be reconstructed by examining the learned regex.

Optimal means that the learned regex is the shortest regex describing the input string set.

Executable means that most standard regular expression engines can execute the learned regex.

Matching means that the learned regex matches all the input strings.

Exact Matching means that the learned regex matches all and only the input strings.
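
The difference between Matching and Exact Matching can be illustrated with a short Python sketch. It is not the website's implementation: the helper names is_matching and is_exact_on are hypothetical, and it assumes "matches" means a full-string match. Because a regex can accept infinitely many strings, the "and only" part is checked here only against a finite list of probe strings supplied by the caller.

```python
import re


def is_matching(pattern, examples):
    """Return True if the pattern fully matches every example string."""
    compiled = re.compile(pattern)
    return all(compiled.fullmatch(s) for s in examples)


def is_exact_on(pattern, examples, probes):
    """Return True if the pattern matches all examples and none of the
    probe strings that are not themselves examples.

    Assumption: exactness is only tested against the given probes,
    not against every possible string.
    """
    compiled = re.compile(pattern)
    wanted = set(examples)
    rejects_others = not any(
        compiled.fullmatch(s) for s in probes if s not in wanted
    )
    return is_matching(pattern, examples) and rejects_others


examples = ["cat", "cot"]
probes = ["cut", "ct", "cat"]

print(is_matching(r"c\wt", examples))             # True: matches both inputs
print(is_exact_on(r"c\wt", examples, probes))     # False: also matches "cut"
print(is_exact_on(r"c[ao]t", examples, probes))   # True on these probes
```

In this sketch c\wt is Matching but not Exact Matching, while c[ao]t is both for the probes tried.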

Abstractions use Character Classes (\d and \w), as well as computed Character Ranges (e.g. [3-5bg-j]).
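
As an illustration of how a Character Range such as [3-5bg-j] could be computed from a set of characters, here is a minimal Python sketch. The function name char_range is hypothetical, and the approach (sorting by code point and collapsing consecutive runs) is an assumption, not necessarily the method this website uses. It also assumes the input contains only letters and digits, so no escaping is done.

```python
def char_range(chars):
    """Collapse single characters into a bracket expression.

    Assumption: chars contains only letters and digits, so characters
    such as ']' or '^' never need escaping.
    """
    codes = sorted({ord(c) for c in chars})

    # Group consecutive code points into [start, end] runs.
    runs = []
    for code in codes:
        if runs and code == runs[-1][1] + 1:
            runs[-1][1] = code
        else:
            runs.append([code, code])

    # Write runs of three or more characters as "a-c"; list shorter runs.
    parts = []
    for start, end in runs:
        if end - start >= 2:
            parts.append(f"{chr(start)}-{chr(end)}")
        else:
            parts.extend(chr(c) for c in range(start, end + 1))

    return "[" + "".join(parts) + "]"


print(char_range("345bghij"))  # [3-5bg-j]
```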

Notes

The purpose of descriptive regexes is to be both executable and readable (and modifiable) by humans. Furthermore, they allow humans to analyze the input strings. This is a new form of explainable machine learning.

