About

This has not been done before—it’s a breakthrough in Computer Science and Machine Learning.

This solves the Regular Expression Induction (REI) problem, learning regexes from a set of input strings, in a significant and practical way.

Up to 23 regexes are learned for each input set—providing a choice between Optimality, Readability, and Abstractions.

Descriptive: The input strings can be reconstructed from the learned regex.
Optimal: The shortest regex based on Significant Length.
Executable: Compatible with standard regex engines.
Matching: Matches all input strings.
Exact Matching: Matches all and only the input strings.
Abstractions: Use of character classes (e.g., \d, \w) and ranges (e.g., [3-5bg-j]).
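Plain Length: Total number of characters in the regex.
Significant Length: Number of characters in the regex that come from the input strings. Regex syntax, such as the {2} in a{2}b, is not counted.
Expansion Factor: (matched strings count) ÷ (original input strings count). An exact-matching regex has an Expansion Factor of 1.0X.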
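
The Matching, Exact Matching, and Expansion Factor definitions above can be checked mechanically. The following is a minimal Python sketch, not the tool's own implementation. It assumes a caller-supplied finite alphabet and caps candidate strings at the length of the longest input, so the matched-string count is only meaningful for regexes without unbounded repetition. The function names `matched_strings` and `evaluate` are hypothetical.

```python
import re
from itertools import product


def matched_strings(regex, alphabet, max_len):
    """Enumerate every string over `alphabet` (length 0..max_len) that the regex fully matches."""
    pattern = re.compile(regex)  # raises re.error if the regex is not executable
    found = set()
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            candidate = "".join(chars)
            if pattern.fullmatch(candidate):
                found.add(candidate)
    return found


def evaluate(regex, inputs, alphabet):
    """Return (matching, exact_matching, expansion_factor) for a regex and its input set.

    Assumption: bounding the search by the longest input is enough to count
    the matched strings; this does not hold for patterns such as `a*`.
    """
    unique_inputs = set(inputs)
    max_len = max(len(s) for s in unique_inputs)
    matched = matched_strings(regex, alphabet, max_len)
    matching = unique_inputs <= matched   # matches all input strings
    exact = unique_inputs == matched      # matches all and only the input strings
    expansion = len(matched) / len(unique_inputs)
    return matching, exact, expansion


inputs = ["a3", "a4"]
alphabet = "a0123456789"
print(evaluate(r"a[34]", inputs, alphabet))  # (True, True, 1.0)
print(evaluate(r"a\d", inputs, alphabet))    # (True, False, 5.0)
```

In this sketch the range-based regex `a[34]` is an exact match with an Expansion Factor of 1.0, while the more abstract `a\d` still matches both inputs but also accepts eight other strings, giving 10 ÷ 2 = 5.0.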

Purpose:
1. As close to optimal as possible
2. Executable
3. Readable by humans
Supports human analysis of strings and sequences
Introduces a new form of explainable machine learning
Shortest regex determined by Significant Length

Significant vs. Plain Length example for input string aab:

Id	Regex	Significant Length	Plain Length
1	`aab`	3 (a, a, b)	3 (a, a, b)
2	`a{2}b`	2 (a, b)	5 (a, {, 2, }, b)
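
The two lengths in the table can be reproduced with a short Python sketch. It is an illustration under a narrow assumption: the regex contains only literal characters and `{n}` style quantifiers, and everything inside the braces (and the braces themselves) counts as regex syntax rather than input characters. How character classes, ranges, or escapes are counted is not stated here, so the sketch does not handle them. The function names are hypothetical.

```python
def plain_length(regex: str) -> int:
    """Total number of characters in the regex."""
    return len(regex)


def significant_length(regex: str) -> int:
    """Count literal characters, skipping `{...}` quantifiers.

    Assumption: the regex holds only literals and brace quantifiers.
    """
    count = 0
    in_quantifier = False
    for ch in regex:
        if ch == "{":
            in_quantifier = True      # start of quantifier syntax
        elif ch == "}":
            in_quantifier = False     # end of quantifier syntax
        elif not in_quantifier:
            count += 1                # a character taken from the input strings
    return count


for regex in ("aab", "a{2}b"):
    print(regex, significant_length(regex), plain_length(regex))
# aab 3 3
# a{2}b 2 5
```

Under Significant Length, `a{2}b` (2) is shorter than `aab` (3), even though its Plain Length (5) is longer.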