thinking regularly, thinking universally, thinking mathematically
🎇 'Pattern-Collector'
List of Patterns: the 'awesome list' of patterns *(the only repo of its kind)*
Please edit this draft wildly
Exploration: Patterns make bite-sized tools
1. Regular Expressions (= Search Patterns = Data format definitions)
Regexes are the most common kind of pattern and the most efficient to type, even though they are one of the oldest disciplines in programming for making sense of data, converting it, cleaning it, or spell-checking it (https://en.wikipedia.org/wiki/Regular_expression).
Regexes are versatile because they work in most programming languages, most editors, and many apps.
Common Data Formats² | match | replacement | comment/justify | extra³ |
---|---|---|---|---|
ISBN | | | | |
Youtube Video ID | `[^\w-]([\w-]{11})[^\w-]` | `$1` | 11-char base64 is almost unique | `(?:https?://\|//)?(?:www\.\|m\.)?youtu/?be(?:\.com)?/(?:embed/\|v/\|watch\/?\?[&\w=]{,128}v=)([\w-]{11})[^\w-]` |
Hashes, Public Keys, Signatures | match | | | |
MD6 | | | | |
SHA256, Bitcoin, ... | | | | |
Convert | match | replacement | | |
MarkDown links to HTML links | `\[([^\]]*)\]\(([^\)]*)\)` | `<a href="$2">$1</a>` | | |
this table2Javascript | |`([^\`]*)`\|`([^\`]*)`| | replaceAll(/$1/g, "$2").replaceAll("\|","|") | | |
Javascript 2 Python | ... | $1$2$3 | | |
² date, postal code, formal greeting, formal __, ...
³ extra: also match common typos and/or add precision ('no false positives' / perfectionism)
[we could add 1000s]
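To make the Youtube Video ID row concrete, here is a minimal Python sketch. `BARE_ID` is the pattern from the match column. `URL_ID` is a stricter variant written for this example (an assumption, not the pattern from the extra column), and the function names are made up for illustration.

```python
import re

# Bare pattern from the "match" column: 11 ID characters between two non-ID characters.
BARE_ID = re.compile(r"[^\w-]([\w-]{11})[^\w-]")

# Stricter variant (an assumption of this sketch, not the table's "extra" pattern):
# only accept an ID that directly follows a known YouTube URL prefix.
URL_ID = re.compile(
    r"(?:youtube\.com/(?:embed/|v/|watch\?(?:[\w=&]*&)?v=)|youtu\.be/)"
    r"([\w-]{11})(?![\w-])"
)


def find_bare_ids(text: str) -> list[str]:
    # Pad with spaces so an ID at the very start or end still has a delimiter on each side.
    return BARE_ID.findall(f" {text} ")


def find_url_ids(text: str) -> list[str]:
    # findall returns the single capture group, i.e. the 11-character ID.
    return URL_ID.findall(text)
```

The bare pattern is short to type but also accepts any 11-character word such as "programming", which is the kind of false positive the extra column is meant to remove. In Python the replacement `$1` would be written `\1` or `\g<1>`.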
1.1 Automatic pattern generation / AI
Currently, little of this is automated. Tools such as Microsoft Power Automate for Desktop (Windows 11) aim to change some of that.
1.2 Pre-processing Patterns
A raw text or data source, or a list or category of patterns, can sometimes be analyzed for similarities so that several patterns are combined into one preprocessing step. For example, preprocessing might already reduce the input data by 90% in a fraction of the time and CPU cost.
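One possible way to combine patterns into a single preprocessing step is to join them into one alternation with named groups, so that each line is scanned once. The sketch below assumes a tiny, hypothetical pattern list (`PATTERNS`) and hypothetical helper names. It illustrates the idea and is not a benchmarked implementation.

```python
import re

# Hypothetical pattern list; in practice it would be loaded from the pattern table.
PATTERNS = {
    "sha256": r"\b[0-9a-fA-F]{64}\b",      # 64 hex characters
    "md_link": r"\[[^\]]*\]\([^\)]*\)",    # Markdown inline link
}

# One combined alternation with named groups: a single scan instead of one per pattern.
COMBINED = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in PATTERNS.items()))


def prefilter(lines):
    """Keep only the lines in which at least one pattern can match."""
    return [line for line in lines if COMBINED.search(line)]


def classify(line):
    """Return the names of the patterns found in a line, in order of appearance."""
    return [m.lastgroup for m in COMBINED.finditer(line)]
```

Lines dropped by `prefilter` never reach the slower, more precise patterns. How much input that removes depends entirely on the data.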
2. Contextual & Semantic patterns
word lists, topics, frequencies, thesauri, antonyms, semantic dictionaries, psychological & sentiment dictionaries
WordNet, FrameNet, Google Ngrams, Google Trends, ...
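As a rough illustration of word lists and frequencies used as patterns, the sketch below counts words and scores text against two tiny sentiment lists. The lists and function names are hypothetical placeholders. A real setup would load a proper lexicon.

```python
import re
from collections import Counter

# Hypothetical miniature sentiment word lists; real ones would come from a lexicon.
POSITIVE = {"good", "great", "useful"}
NEGATIVE = {"bad", "broken", "useless"}


def tokenize(text):
    # Lowercase, then keep runs of letters and apostrophes as words.
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    return Counter(tokenize(text))


def sentiment_score(text):
    """Positive minus negative word count: a crude dictionary-based score."""
    freq = word_frequencies(text)
    pos = sum(freq[w] for w in POSITIVE)
    neg = sum(freq[w] for w in NEGATIVE)
    return pos - neg
```

For example, `sentiment_score("A useful but broken and useless pattern")` counts one positive and two negative words.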
Google Search:
~synonyms a|b AROUND(3) c|d -e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z
https://ahrefs.com/blog/google-advanced-search-operators/
Human Grammar & Natural language processing (NLP):
https://github.com/edobashira/speech-language-processing#readme
3. Structured Data. Querying Public Databases & the internet. SPARQL, SQL, NoSQL
Semantic web
WikiData
AWS public databases
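A structured query against WikiData could look like the sketch below. It only builds the request URL for the public Wikidata SPARQL endpoint and sends nothing. The query is the well-known introductory example (instances of house cat), and the helper name `build_sparql_url` is an assumption made for this example.

```python
from urllib.parse import urlencode

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"

# Introductory example query: items that are an instance of (P31) house cat (Q146).
CAT_QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 5
"""


def build_sparql_url(query, endpoint=WIKIDATA_ENDPOINT):
    """Return a GET URL that asks the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": query.strip(), "format": "json"})
```

The resulting URL can be fetched with any HTTP client, for instance `urllib.request.urlopen`. A descriptive User-Agent header is advisable for automated requests.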
4. Merging the above "1.-3."
vs 5. Human work VS machine learning models
All Patterns
https://docs.google.com/spreadsheets/d/1EjeZ2RtNpM_mANdO1VPXmZmbIb5vANUXodPBFtdg3zU/edit
-
Other lists / potential sources: ___, ___, ___, ___, ___ (not a list, but one repo per regex: https://github.com/regexhq; it takes several clicks to see a single regex, e.g. regexhq/youtube-regex/index.js)
-
Compare: https://www.mulesoft.com/exchange/?type=connector&view=list (>10000 'enterprise converts')
Name | pattern match | replacement | language | comment/justify | raw³ | extra context/precision |
---|---|---|---|---|---|---|
regex | ||||||
css | ||||||
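Rows of such a table could be loaded as plain data and applied generically. The sketch below is one possible shape, assuming the field names from the table header. `PatternEntry`, `js_to_python_replacement` and `apply_entry` are hypothetical names. The sample entry is the MarkDown link row from the regex table.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternEntry:
    """One row of the pattern table (field names follow the table header)."""
    name: str
    match: str
    replacement: str
    language: str = "regex"
    comment: str = ""


def js_to_python_replacement(replacement):
    """Translate $1-style group references into Python's \\g<1> form."""
    return re.sub(r"\$(\d+)", r"\\g<\1>", replacement)


def apply_entry(entry, text):
    """Apply one table row to a text and return the converted text."""
    return re.sub(entry.match, js_to_python_replacement(entry.replacement), text)


MD_LINK = PatternEntry(
    name="MarkDown links to HTML links",
    match=r"\[([^\]]*)\]\(([^\)]*)\)",
    replacement='<a href="$2">$1</a>',
    comment="from the Convert rows of the regex table",
)
```

Keeping the replacement in its `$1` form in the table and translating it per language keeps one row usable from several languages. The translation here only handles numbered groups.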