Regular expression

A regular expression (also called regex, regexp, or rational expression) is a sequence of characters that defines a search pattern for matching text. It is not a general-purpose programming language, but regex support is built into the standard libraries of most programming languages.

Use on the sharty

It can be used on the sharty for:

Data mining during raids over large databases for names (doxing)
Setting up filters for posts
Word filters
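
A word filter of the kind listed above can be sketched in Java as follows. This is a minimal illustration, not the site's actual filter: the class name WordFilterSketch, the method applyFilter, the placeholder words foo and bar, and the *** replacement are all assumptions made for the example.

```java
import java.util.regex.Pattern;

class WordFilterSketch {
    // Hypothetical filter list: "foo" and "bar" are placeholder words.
    // \b keeps the filter from firing inside longer words such as "barn".
    private static final Pattern FILTER =
            Pattern.compile("\\b(?:foo|bar)\\b", Pattern.CASE_INSENSITIVE);

    // Replaces every filtered word in the post with "***".
    static String applyFilter(String post) {
        return FILTER.matcher(post).replaceAll("***");
    }

    public static void main(String[] args) {
        // prints "*** walks into a ***, not a barn."
        System.out.println(applyFilter("Foo walks into a bar, not a barn."));
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex for every post.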

Metacharacters

Regular expressions use special characters, called metacharacters, to control how patterns are matched. Common ones include:

. — Matches any single character except newline (wildcard)
^ — Matches the start of a string (start anchor)
$ — Matches the end of a string (end anchor)
* — Matches zero or more of the preceding element
+ — Matches one or more of the preceding element
? — Matches zero or one of the preceding element (also used for non-greedy quantifiers)
{n}, {n,}, {n,m} — Matches a specific number or range of repetitions
- {n} — Specifically n repetitions
- {n,} — n or more repetitions
- {n,m} — At least n repetitions, but not more than m repetitions
[…] — Defines a character class, e.g. [aeiou] matches vowels
[^…] — Negated character class, e.g. [^0-9] matches anything except digits
() — Groups expressions and captures matches
(?: ) — Groups expressions without capturing
| — Alternation (logical OR), e.g. cat|dog
\ — Escapes a metacharacter to match it literally, e.g. \. matches a literal period
\d, \w, \s — Common shorthand classes (digits, word characters, whitespace)
- \d — Digit characters
- \w — Word characters (letters, digits, and underscore)
- \s — Whitespace characters
\D, \W, \S — Negated versions of the shorthand classes
- \D — Non-digit characters
- \W — Non-word characters
- \S — Non-whitespace characters
\b, \A, \z — Zero-width anchors, which match a position rather than a character
- \b — Word boundary
- \A — Matches the start of the string only, never the start of an internal line
- \z — Matches the end of the string only, never the end of an internal line

These metacharacters can be combined to form complex and powerful search patterns.
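
A few of these metacharacters can be seen in action in the short Java sketch below. It is only an illustration: the class MetacharacterSketch and the helper firstMatch are names made up for this example, and the sample strings are arbitrary. Note that backslashes are doubled inside Java string literals.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MetacharacterSketch {
    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<b>bold</b>";
        System.out.println(firstMatch("<.+>", html));          // greedy +: <b>bold</b>
        System.out.println(firstMatch("<.+?>", html));         // non-greedy +?: <b>
        System.out.println(firstMatch("^\\d{3}", "123abc"));   // anchor plus {n}: 123
        System.out.println(firstMatch("colou?r", "my color")); // optional u: color
        System.out.println(firstMatch("(?m)^b", "a\nb"));      // ^ in multiline mode: b
        System.out.println(firstMatch("\\Ab", "a\nb"));        // \A ignores line starts: null
    }
}
```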

Character Classes

These examples show how classes, anchors, quantifiers, and groups can be combined and negated to create flexible patterns.

[a-zA-Z0-9]+ — matches one or more alphanumeric characters
\b\w{3,5}\b — matches whole words of 3 to 5 word characters (letters, digits, or underscores)
(cat|dog)s? — matches "cat", "cats", "dog", or "dogs"
\d{2,4}-\d{2}-\d{2} — matches dates like 2025-10-04 or 25-10-04
[^aeiou]{3,} — matches three or more consecutive characters that are not lowercase vowels (this includes digits, spaces, and punctuation)
\b(?:Mr|Ms|Dr)\. [A-Z][a-z]+\b — matches titles like "Mr. Smith" or "Dr. Jones"
^(?:https?|ftp)://[^\s/$.?#].[^\s]*$ — matches a basic URL
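
Some of the patterns above can be tried out with a small Java helper that collects every match in a string. This is a rough sketch: the class PatternExamplesSketch, the helper findAll, and the sample sentences are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternExamplesSketch {
    // Collects every non-overlapping match of regex in input, left to right.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // prints [cats, dog]
        System.out.println(findAll("(cat|dog)s?", "cats and a dog"));
        // prints [2025-10-04, 25-10-04]
        System.out.println(findAll("\\d{2,4}-\\d{2}-\\d{2}", "due 2025-10-04, sent 25-10-04"));
        // prints [quick, brown, fox] because "a" is shorter than 3 characters
        System.out.println(findAll("\\b\\w{3,5}\\b", "a quick brown fox"));
        // prints [Dr. Jones, Mr. Smith]
        System.out.println(findAll("\\b(?:Mr|Ms|Dr)\\. [A-Z][a-z]+\\b", "Ask Dr. Jones or Mr. Smith."));
    }
}
```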

Example

The following Java example uses a regex pattern to validate an email address. The pieces below are written as they appear inside a Java string literal, where \\. stands for the regex \. (a literal period).

^[a-zA-Z0-9_+&*-]+: Matches the first part of the username (letters, digits, and the characters _ + & * -).
(?:\\.[a-zA-Z0-9_+&*-]+)*: Matches any further username parts, each preceded by a single period.
@: Matches the "@" symbol separating username and domain.
(?:[a-zA-Z0-9-]+\\.)+: Matches one or more domain labels, each followed by a period, which allows subdomains.
[a-zA-Z]{2,7}$: Requires the address to end with a top-level domain (TLD) of 2 to 7 letters, such as .com or .org. It does not check that the TLD actually exists.

package party.soyjak.example;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexExample {
    public static void main(String[] args) {
        // Sample email to test
        String email = "test@example.com";

        // Regex pattern for a basic email validation
        String emailPattern = "^[a-zA-Z0-9_+&*-]+(?:\\.[a-zA-Z0-9_+&*-]+)*@(?:[a-zA-Z0-9-]+\\.)+[a-zA-Z]{2,7}$";
        Pattern pattern = Pattern.compile(emailPattern);
        Matcher matcher = pattern.matcher(email);

        // matches() requires the entire input to match, so ^ and $ are redundant here but harmless
        if (matcher.matches()) {
            System.out.printf("%s is a valid email address.%n", email);
        } else {
            System.out.printf("%s is not a valid email address.%n", email);
        }
    }
}
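
The same pattern can also pull an address apart once capturing groups are added. The sketch below is a variation on the example above, not part of it: the two extra pairs of parentheses, the class EmailPartsSketch, and the helper split are additions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailPartsSketch {
    // The validation pattern from above with two capturing groups added:
    // group 1 is the username, group 2 is the domain.
    private static final Pattern EMAIL = Pattern.compile(
            "^([a-zA-Z0-9_+&*-]+(?:\\.[a-zA-Z0-9_+&*-]+)*)"
            + "@((?:[a-zA-Z0-9-]+\\.)+[a-zA-Z]{2,7})$");

    // Returns {username, domain}, or null when the input does not match.
    static String[] split(String email) {
        Matcher m = EMAIL.matcher(email);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] samples = {
            "test@example.com", "first.last@mail.example.org", "a..b@example.com"
        };
        for (String sample : samples) {
            String[] parts = split(sample);
            if (parts == null) {
                System.out.printf("%s: no match%n", sample);
            } else {
                System.out.printf("%s: user=%s domain=%s%n", sample, parts[0], parts[1]);
            }
        }
    }
}
```

The third sample is rejected because the pattern requires at least one allowed character after every period in the username.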

If you want to create your own patterns and test them interactively, try RegExr, an online regex tester and debugger. It breaks down patterns step by step and also has a library of community-made regex snippets you can use.