Gallery

Actual Example 1: Simple example with 2 input strings and a choice of 2 learned regexes (The Optimal Exact Match and 1 Match):


MLREGEX results 2025-03-07 14:16:16 UTC

Your Learn Event ID:

6c34a2f2-0e25-4a65-bbf6-519bfde9238b

Your Learn Event Name/Description:

Coffee and Tea

Your set of input strings (2)

coffee

tea

Learned Regexes (2)


1. MOST OPTIMAL EXACT MATCH
ABSTRACTION TYPE: NONE
EXPANSION FACTOR: 1.0X
SIGNIFICANT LENGTH: 7

cof{2}e{2}|tea


2. MATCH
ABSTRACTION TYPE: Structural
EXPANSION FACTOR: 4.0X
SIGNIFICANT LENGTH: 6

(cof{2}|t)e{1,2}a?



In Example 1, two Regular Expressions are learned for the two input strings, coffee and tea. The first Regular Expression is the Optimal Exact Match, meaning it is the shortest Regular Expression that matches all and only the two input strings.

The second Regular Expression is a structural abstraction: it matches the two input strings, but not only those. For example, it also matches the string “coffea”, which is not one of the input strings. Note that although the second Regular Expression has a shorter significant length than the first, it matches eight strings in total (Expansion Factor × number of input strings = 4.0 × 2), not just the two inputs. Which Regular Expression you want depends on your use case.
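The counts in Example 1 can be checked by brute force. The following Python sketch is not part of MLREGEX. It assumes the learned regexes behave the same way under Python's `re` module, and it takes the Expansion Factor to be the number of matched strings divided by the number of input strings, as the explanation above implies. The helper name `language` is hypothetical.

```python
import itertools
import re

INPUTS = ["coffee", "tea"]
EXACT = r"cof{2}e{2}|tea"
STRUCTURAL = r"(cof{2}|t)e{1,2}a?"


def language(pattern, alphabet, max_len):
    """Return every string over `alphabet`, up to `max_len` long, that fully matches."""
    compiled = re.compile(pattern)
    matched = []
    for length in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            candidate = "".join(chars)
            if compiled.fullmatch(candidate):
                matched.append(candidate)
    return matched


# Both regexes use only these six letters, and neither can match more than
# seven characters ("coff" + "ee" + "a"), so the search below is exhaustive.
alphabet = sorted(set("coffeetea"))
exact_matches = language(EXACT, alphabet, 7)
structural_matches = language(STRUCTURAL, alphabet, 7)

print(sorted(exact_matches))
print(sorted(structural_matches))
print(len(structural_matches) / len(INPUTS))
```

Under these assumptions, the exact-match regex accepts only coffee and tea, while the structural regex accepts eight strings (coffe, coffea, coffee, coffeea, te, tea, tee, teea), which gives 8 / 2 = 4.0.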


Actual Example 2 (Text)

Input Strings:

“A regular expression (shortened as regex or regexp;[1] sometimes referred to as rational expression[2][3]) is a sequence of characters that specifies a match pattern in text. Usually such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation. Regular expression techniques are developed in theoretical computer science and formal language theory.” - Wikipedia

Learned regex:

((s(ometim|pecifi)|techniqu)e|a|i)s|(refer{2}|\(shorten|develop|us)ed|string(-searching|s,)|(validation|text)\.|theor(etical|y\.)|s(equ|ci)ence|(Usual{2}|b)y|(languag|ar)e|(r|R)egular|algorithms|characters|operations|(mat|su)ch|computer|rational|replace\"|o(r|f|n)|t(hat|o)|A|expres{2}ion(\[2]\[3]\))?|regex(p;\[1])?|pat{2}erns?|for(mal)?|in(put)?|\"find\"?|a(nd)?
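One way to sanity-check the learned regex for Example 2 is to test it against each word of the passage. The sketch below is an illustration only: it assumes the input strings are the whitespace-separated tokens of the quoted text (the example does not say how the text was split) and that the regex behaves the same under Python's `re` module.

```python
import re

# Assumption: each whitespace-separated token of the passage is one input string.
TEXT = (
    'A regular expression (shortened as regex or regexp;[1] sometimes '
    'referred to as rational expression[2][3]) is a sequence of characters '
    'that specifies a match pattern in text. Usually such patterns are used '
    'by string-searching algorithms for "find" or "find and replace" '
    'operations on strings, or for input validation. Regular expression '
    'techniques are developed in theoretical computer science and formal '
    'language theory.'
)

# The learned regex from above, split across lines only for readability.
LEARNED = (
    r'((s(ometim|pecifi)|techniqu)e|a|i)s|(refer{2}|\(shorten|develop|us)ed|'
    r'string(-searching|s,)|(validation|text)\.|theor(etical|y\.)|s(equ|ci)ence|'
    r'(Usual{2}|b)y|(languag|ar)e|(r|R)egular|algorithms|characters|operations|'
    r'(mat|su)ch|computer|rational|replace\"|o(r|f|n)|t(hat|o)|A|'
    r'expres{2}ion(\[2]\[3]\))?|regex(p;\[1])?|pat{2}erns?|for(mal)?|in(put)?|'
    r'\"find\"?|a(nd)?'
)

pattern = re.compile(LEARNED)
tokens = sorted(set(TEXT.split()))
unmatched = [token for token in tokens if not pattern.fullmatch(token)]

print(f"{len(tokens)} distinct tokens, unmatched: {unmatched}")
```

If the tokenization assumption holds, the list of unmatched tokens is empty, and a word that does not occur in the passage (for example "regexes") is rejected.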



Actual Example 3 (Optimality, Readability and Abstraction: URLs)

Input Strings (13):

http://1.alpha.com

http://2.alpha.com

http://3.alpha.com

http://4.beta.com

http://5.beta.com

http://6.beta.org

http://7.beta.org

https://1.alpha.com

https://2.alpha.com

https://3.alpha.com

https://4.beta.com

https://5.beta.com

https://6.alpha.org


Learned Regexes (6)


1. MOST OPTIMAL EXACT MATCH
ABSTRACTION TYPE: Digit Ranges 1
EXPANSION FACTOR: 1.0X
SIGNIFICANT LENGTH: 46

ht{2}p(s?:/{2}([1-3]\.alph|[4-5]\.bet)a\.com|(s:/{2}6\.alph|:/{2}[6-7]\.bet)a\.org)


2. EXACT MATCH
ABSTRACTION TYPE: NONE
EXPANSION FACTOR: 1.0X
SIGNIFICANT LENGTH: 47

ht{2}p(s?:/{2}((1|2|3)\.alph|(4|5)\.bet)a\.com|(s:/{2}6\.alph|:/{2}(6|7)\.bet)a\.org)


3. MATCH
ABSTRACTION TYPE: Structural
EXPANSION FACTOR: 8.6X
SIGNIFICANT LENGTH: 28

ht{2}ps?:/{2}(1|2|3|4|5|6|7)\.(alph|bet)a\.c?o(rg|m)


4. MATCH
ABSTRACTION TYPE: Digit Class 1
EXPANSION FACTOR: 3.9X
SIGNIFICANT LENGTH: 40

ht{2}p(s?:/{2}(\d\.alph|\d\.bet)a\.com|(s:/{2}6\.alph|:/{2}\d\.bet)a\.org)


5. MATCH
ABSTRACTION TYPE: Word Class 1
EXPANSION FACTOR: 24.3X
SIGNIFICANT LENGTH: 40

ht{2}p(s?:/{2}(\w\.alph|\w\.bet)a\.com|(s:/{2}6\.alph|:/{2}\w\.bet)a\.org)


6. MATCH
ABSTRACTION TYPE: Character Ranges 2
EXPANSION FACTOR: 76923.1X
SIGNIFICANT LENGTH: 19

ht{2}p[.-/1-7:a-ceg-hl-mo-pr-t]{13,15}
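The Expansion Factors reported for regexes 1 to 5 can be reproduced with a small counting sketch. It is not MLREGEX code: it assumes Python's `re` semantics with `\d` and `\w` restricted to ASCII, takes the Expansion Factor to be matched strings divided by input strings (as described for Example 1), and counts matches over a hand-built candidate set that is a superset of what these five regexes can match. The name `expansion_factor` is hypothetical. Regex 6 is left out because its language is far too large to enumerate this way.

```python
import itertools
import re
import string

INPUTS = [
    "http://1.alpha.com", "http://2.alpha.com", "http://3.alpha.com",
    "http://4.beta.com", "http://5.beta.com",
    "http://6.beta.org", "http://7.beta.org",
    "https://1.alpha.com", "https://2.alpha.com", "https://3.alpha.com",
    "https://4.beta.com", "https://5.beta.com",
    "https://6.alpha.org",
]

LEARNED = {
    1: r"ht{2}p(s?:/{2}([1-3]\.alph|[4-5]\.bet)a\.com|(s:/{2}6\.alph|:/{2}[6-7]\.bet)a\.org)",
    2: r"ht{2}p(s?:/{2}((1|2|3)\.alph|(4|5)\.bet)a\.com|(s:/{2}6\.alph|:/{2}(6|7)\.bet)a\.org)",
    3: r"ht{2}ps?:/{2}(1|2|3|4|5|6|7)\.(alph|bet)a\.c?o(rg|m)",
    4: r"ht{2}p(s?:/{2}(\d\.alph|\d\.bet)a\.com|(s:/{2}6\.alph|:/{2}\d\.bet)a\.org)",
    5: r"ht{2}p(s?:/{2}(\w\.alph|\w\.bet)a\.com|(s:/{2}6\.alph|:/{2}\w\.bet)a\.org)",
}

# Candidate strings: every combination of scheme, one-character first label,
# host name, and ending that regexes 1-5 could possibly produce.
SCHEMES = ["http", "https"]
FIRST_LABELS = string.ascii_letters + string.digits + string.punctuation
HOSTS = ["alpha", "beta"]
ENDINGS = ["com", "org", "om", "corg"]  # "om" and "corg" arise from c?o(rg|m)
CANDIDATES = [
    f"{scheme}://{label}.{host}.{ending}"
    for scheme, label, host, ending in itertools.product(
        SCHEMES, FIRST_LABELS, HOSTS, ENDINGS
    )
]


def expansion_factor(pattern):
    """Matched candidates divided by the number of input strings."""
    compiled = re.compile(pattern, re.ASCII)
    # Every learned regex must at least match all of the inputs.
    assert all(compiled.fullmatch(s) for s in INPUTS)
    matched = sum(1 for c in CANDIDATES if compiled.fullmatch(c))
    return matched / len(INPUTS)


factors = {number: round(expansion_factor(p), 1) for number, p in LEARNED.items()}
print(factors)
```

Under these assumptions the sketch yields 1.0, 1.0, 8.6, 3.9 and 24.3 for regexes 1 to 5 (13, 13, 112, 51 and 316 matched strings over 13 inputs), in line with the values listed above.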




Actual Example 4 (Nested Repeating Substrings)

Input Strings (4):

waabbccddaabbccddr

waabbcffggvcffggvcffggvddaabbccddaabbccddr

waabbcffggffggvcffggffggvcffggffggvddaabbccddaabbccddr

waabbcffgeegeevcffgeegeevcffgeegeevddaabbccddaabbccddr

Learned regex:

w(a{2}b{2}((c(f{2}g{2}){2}v){3}|(cf{2}(ge{2}){2}v){3}|(cf{2}g{2}v){3})d{2})?(a{2}b{2}c{2}d{2}){2}r
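To make the nesting in this regex easier to see, the following Python sketch rebuilds the four input strings from their repeating blocks and checks them against the learned regex. It is an illustration under the assumption that Python's `re` module interprets the regex the same way; the variable names are hypothetical.

```python
import re

INPUTS = [
    "waabbccddaabbccddr",
    "waabbcffggvcffggvcffggvddaabbccddaabbccddr",
    "waabbcffggffggvcffggffggvcffggffggvddaabbccddaabbccddr",
    "waabbcffgeegeevcffgeegeevcffgeegeevddaabbccddaabbccddr",
]

LEARNED = (
    r"w(a{2}b{2}((c(f{2}g{2}){2}v){3}|(cf{2}(ge{2}){2}v){3}|(cf{2}g{2}v){3})d{2})?"
    r"(a{2}b{2}c{2}d{2}){2}r"
)

# Mandatory tail: (a{2}b{2}c{2}d{2}){2}r
tail = "aabbccdd" * 2 + "r"

# The three alternatives inside the optional group, each repeated three times:
#   c(f{2}g{2}){2}v  -> "cffggffggv"
#   cf{2}(ge{2}){2}v -> "cffgeegeev"
#   cf{2}g{2}v       -> "cffggv"
inner_blocks = ["cffggffggv", "cffgeegeev", "cffggv"]

rebuilt = {"w" + tail}
for block in inner_blocks:
    rebuilt.add("w" + "aabb" + block * 3 + "dd" + tail)

pattern = re.compile(LEARNED)
all_match = all(pattern.fullmatch(s) for s in INPUTS)

# A near miss: repeating the inner block twice instead of three times.
near_miss = "w" + "aabb" + "cffggv" * 2 + "dd" + tail

print(rebuilt == set(INPUTS), all_match, pattern.fullmatch(near_miss) is None)
```

The optional group and its three alternatives account for exactly the four inputs: one string without the group and one string per alternative.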



Actual Example 5 (Scalability: 50 Random Strings, with lengths between 1 and 100)

Input Strings (50):

Learned regex:


1. MOST OPTIMAL EXACT MATCH
ABSTRACTION TYPE: NONE
EXPANSION FACTOR: 1.0X
SIGNIFICANT LENGTH: 1708

(dc){2}b{3}e{2}bd{2}cdcebce{2}b{2}c{2}ecbecebe{3}b{2}dc{2}bc{2}d{2}e{2}becedebe{2}cebd{2}bc{2}b(ed){2}c{2}bebdcd{4}bcbc{4}de{2}dcdedb{2}|ce{2}ceb{2}dcecbedcbdebd{2}bcbecd{2}becdbdc{2}bc{6}bc{3}e{2}bdc(bc){2}d{2}c{2}d{2}cd{3}bc{2}b{3}dec{2}dbedede{2}becdcedce{2}d|bdecde{2}dbe{2}dceb(ec){2}bcd{2}b{2}dbede{2}dcb{2}cd{2}e{2}dcb{2}(de){2}c{2}b{2}ec{2}dbecedb{2}cb{3}(cde){2}d{2}bcd{2}cecdbe|edecbdb{2}cdebcedbd{2}b{4}e{2}c{3}db{2}(e{2}d){2}bdeb{3}d{2}ebcde{2}ce{4}d{2}(eb){2}db{2}decdeb{2}edebedcb{3}edbebe{2}b{4}c|bcd{2}cbec(ece){2}dbe{3}bc{2}b{2}dcedeb{2}ec{2}be(edb){2}decd(bc){2}cdbdcebce{2}b{2}cdcb(cbde){2}(dc){2}e{2}bc(dc){2}e{3}|edcdbede{3}becb{3}dc{2}edcbec{4}dcecbdbcdc{2}becd{2}bc(dc){3}bdc{2}d{2}cdc{2}(bd){2}edbec(be){3}d{2}bebdcec{4}de{2}bdedc|cde{2}cedbedcbcdbd{2}b{2}db(edb){2}cb{2}de{2}cbecebd{2}c{2}bedeb(dc){3}ecbe{3}dbe{2}bcbedc{2}(e{2}d){2}cbcebe{2}c{3}d{5}|bc{2}ed{4}b{2}cde(dc){2}edebcecd(c{2}e){2}dc{2}eb{2}ed{2}ce{3}c{3}ecbd(ce){2}d{2}b{2}c{2}ebecdbcecdc{2}edbde{2}cecdced|cbdc{2}e{3}debced{3}ecdeb{3}dcdb{2}deced{2}bde{2}cb{2}e{3}c{3}ed{2}cde{2}dceb{2}dedce(c{2}e){2}db{2}cdc{2}b{3}c{3}|b{2}d{3}bed{3}cbedce{2}bcedecdbcd{2}ed{5}cedced{2}bce{2}bdbd{2}bcdbc{2}de(dc){2}e{2}bed(cb){3}dc{3}de{3}(cb{2}){2}|e{2}c{2}bcbeb(cd){2}e{3}cdcbced{2}ede{2}cebcdbd{2}ed{3}c{2}ecdedbebd{4}cecdcd{2}bdbc{2}d{2}bcbdbd{2}ebce{2}bd|d{2}(bd){2}eb{2}dc{2}ebcd{2}e(db){2}d{2}ecd{3}b{2}deb{2}ecbc(db){2}ce{2}b{4}(ebd){2}bcedbc{2}bd{4}bcbdc|de{3}deb{2}db{3}de{2}bec{2}be{2}d{2}ebec{2}edebc{2}d{2}e{2}bce{3}decebedbe{4}c{2}edbe{2}dc{2}b{2}cbdb|bced(ecd{2}){2}cb(bd){2}c{2}d{2}bcbde{3}b{2}c{2}db{2}ecdbdec(eb){2}(bd){2}ce{3}c{2}d{2}b(de){2}bcb|dce{3}c{3}e{3}dcbe{2}cbcd{2}ecb{2}dce{2}bdcedec{2}dbcbe{2}(de){2}c{2}bd(bdb){2}edcecd{2}bcdecd{2}|dcb(de){2}b{2}(ed){2}cbde{3}cbdbcebcd{2}ecbd{2}ed{2}cb{2}debe{2}cbdb{2}e{2}c(cde){2}ecbc{3}db{2}|cbc{3}d{2}cbdb{2}e(ce){3}(eb){2}db{3}d{2}bec{2}ebcd{2}ce{2}dcd(ec{2}){2}(bc){2}c{2}edc{2}dbeb|dbde{3}bcec{2}ed{2}e{2}dbcecbd{2}bdc{2}beb(edc){2}db(bed){2}ce{2}bed{3}
c{2}d{2}edbdeb{3}|d{2}cecbe{2}cdbdeb{2}e{4}db{3}c{4}ebecdbedcd{2}bedcded{2}cebec{2}bed{3}ede{2}bdc{2}d{2}|e(db){2}db{2}e(be){2}dc{2}ecb{2}c{2}bde{3}bdbedbdebcbe{2}db{3}dcb{2}ce{2}d{2}ec{4}ecdbd|eb{3}d{2}edb{2}cdedce{5}cdecb{2}decbd{2}b{3}cdecbdcd{2}ecb{2}cbd{2}e{2}c{2}be{2}c|dbe{3}d(ec){2}cedebed(ec){2}d{2}bdcdedbdced{2}ebc{2}b{2}edbe{2}dbdcbdedb{2}dec{2}|(dc){2}b{3}ecdbc{2}e{2}cdbcdc{3}e{2}(bd){2}cbcd(bd){2}ebcdeb(ec{2}){2}b{3}ed|decded{3}bede{3}d{2}cdbc{2}be{3}bcedbe{2}cdc(cd){2}ecedbdcd{2}ec{2}e(cd){2}|c{2}e{2}dcb{2}ecbeb{2}cec{2}dede{4}d{3}b{3}(db){2}ec{2}dcbcdec{2}bdcecdec|dcbd{3}cbcd{2}cdbecede{2}d{2}cedbc(bdc){2}cdced{2}c{2}b(bce){2}e|cbecd{2}cbdec{2}dcedb{2}de{3}be{2}cbebce{3}c{2}d{2}bec{2}bc|ed{2}cdb{3}c{2}db{2}d{2}cbdec{4}e{2}cdce{2}cdc{2}d{2}bdcbd|ecbe{3}c{3}dbecdcbede{3}cbdb(b{2}c){2}e(be){2}e{2}b{2}dceb|dedb{2}cdebcebdbe{2}c{3}ecbc{2}bedc{3}ebdbcb(ce){2}|cdc{3}de{2}cdb(dbc{2}){2}bc{2}de{2}(ec){2}be{2}|dcedebc{2}eb{2}ce{2}dcedb{2}dc{2}e{2}b{2}c|cdc(eb){3}(cb){2}db{2}edcebedeb(ebd){2}b|c{3}d{3}cd{2}bdc(ce){2}ecdcbdeced{3}cde|cdbd{2}ebdbcede{2}d{2}cbedebcdbeb{2}cd|d{3}cdbe(de){2}e{2}dcebd{2}bedce{2}bd|bdcbdbeb{2}cebc{3}dbeb(ec){2}b{2}cdc|cbdb{2}e{2}c{2}d{2}(db){2}b(be){2}e|cbdbc{2}e{2}cb{2}debecdc{2}|c{2}dedbe(bd{3}c){2}cbcedb|e{2}c{2}decbc{2}ed{3}ecb|ced{2}b{3}cd{2}|d{2}cbec{3}b{2}|b{5}ebe{2}dcd|dec{2}bebdbc|cd{2}b{4}|bd{2}|b{2}|dcdb|c