
Version: post-3.2.8 (CVS)
Last doc update: 2010-05-17

Head coder/maintainer: Syzop
Coders: binki
Previous coders & contributors: Stskeeps, codemastr, Luke, aquanight, WolfSage, McSkaf, Zogg, NiQuiL, assyrian, chasm, DrBin, llthangel, Griever, nighthawk
Documentation: CKnight^ (initial documentation), Syzop (major rewrite), codemastr, and many contributors

To view this documentation you need a compatible browser; these are listed below. Up-to-date docs are available at http://www.vulnscan.org/UnrealIRCd/unreal32docs.html and a FAQ at http://www.vulnscan.org/UnrealIRCd/faq/.

-- 3.12. Anti-flood features
-- 3.13. Ban types
6. User & Channel Modes
7. User & Oper Commands
A. Regular Expressions
---A.1. Literals
---A.2. Dot Operator
---A.3. Repetition Operators
---A.4. Bracket Expressions
---A.5. Assertions
---A.6. Alternation
---A.7. Subexpressions
---A.8. Back References
---A.9. Case Sensitivity

3.12 - Anti-Flood features

Channel modes
There are also some channel modes which can be very effective against floods. To name a few:
K = no /knock, N = no nickchanges, C = no CTCPs, M = only registered users can talk, j = join throttling (per-user basis)
As of beta18 there's also a much more advanced channelmode +f...
Channel mode f
Instead of using scripts and bots to protect against channel floods, this protection is now built into the ircd.
An example +f mode is: *** Blah sets mode: +f [10j]:15
This means 10 joins per 15 seconds are allowed in the channel. If the limit is hit, the channel will be set +i automatically.
The following floodtypes are available:
type  name              default action  other avail. actions  comments
c     CTCPs             auto +C         m, M
j     joins             auto +i         R
k     knocks            auto +K                               (counted for local clients only)
m     messages/notices  auto +m         M
n     nickchanges       auto +N
t     text              kick            b                     per-user messages/notices like the old +f. Will kick or ban the user.


*** ChanOp sets mode: +f [20j,50m,7n]:15
<ChanOp> lalala
*** Evil1 (~fdsdsfddf@Clk-17B4D84B.blah.net) has joined #test
*** Evil2 (~jcvibhcih@Clk-3472A942.xx.someispcom) has joined #test
*** Evil3 (~toijhlihs@Clk-38D374A3.aol.com) has joined #test
*** Evil4 (~eihjifihi@Clk-5387B42F.dfdfd.blablalba.be) has joined #test
-- snip XX lines --
*** Evil21 (~jiovoihew@Clk-48D826C3.e.something.org) has joined #test
-server1.test.net:#test *** Channel joinflood detected (limit is 20 per 15 seconds), putting +i
*** server1.test.net sets mode: +i
<Evil2> fsdjfdshfdkjfdkjfdsgdskjgsdjgsdsdfsfdujsflkhsfdl
<Evil12> fsdjfdshfdkjfdkjfdsgdskjgsdjgsdsdfsfdujsflkhsfdl
<Evil15> fsdjfdshfdkjfdkjfdsgdskjgsdjgsdsdfsfdujsflkhsfdl
<Evil10> fsdjfdshfdkjfdkjfdsgdskjgsdjgsdsdfsfdujsflkhsfdl
<Evil8> fsdjfdshfdkjfdkjfdsgdskjgsdjgsdsdfsfdujsflkhsfdl
-- snip XX lines --
-server1.test.net:#test *** Channel msg/noticeflood detected (limit is 50 per 15 seconds), putting +m
*** server1.test.net sets mode: +m
*** Evil1 is now known as Hmmm1
*** Evil2 is now known as Hmmm2
*** Evil3 is now known as Hmmm3
*** Evil4 is now known as Hmmm4
*** Evil5 is now known as Hmmm5
*** Evil6 is now known as Hmmm6
*** Evil7 is now known as Hmmm7
*** Evil8 is now known as Hmmm8
-server1.test.net:#test *** Channel nickflood detected (limit is 7 per 15 seconds), putting +N
*** server1.test.net sets mode: +N
In fact, it can get even more advanced/complicated:
Instead of the default action, you can for some floodtypes specify another one, for example: +f [20j#R,50m#M]:15
This will set the channel +R if the joinlimit is reached (>20 joins in 15 seconds), and will set the channel +M if the msg limit is reached (>50 messages in 15 seconds).

There's also a "remove mode after X minutes" feature: +f [20j#R5]:15 will set the channel +R if the limit is reached and will set -R after 5 minutes.
A server can have a default unsettime (set::modef-default-unsettime), so if you type +f [20j]:15 it could get transformed into +f [20j#i10]:15. This is just a default: you can still set [20j#i2]:15 or something like that, and you can disable the automatic mode removal completely with +f [20j#i0]:15 (an explicit 0).

The old +f mode (msgflood per-user) is also still available as 't', +f 10:6 is now called +f [10t]:6 and +f *20:10 is now +f [20t#b]:10. Currently the ircd will automatically convert old +f mode types to new ones. Note that there's no unsettime feature available for 't' bans ([20t#b30]:15 does not work).
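
To make the parameter syntax above concrete, here is a minimal Python sketch that splits a new-style +f parameter into its parts. It is an illustration only and not the ircd's own parser: the names FloodRule and parse_modef are made up for this example, and it does not check which alternative actions are valid for which floodtype.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class FloodRule(NamedTuple):
    limit: int                    # events allowed per period
    floodtype: str                # one of c, j, k, m, n, t
    action: Optional[str]         # alternative action letter, None = default action
    unset_minutes: Optional[int]  # minutes until the mode is removed, None = not given


# <number><type>, optionally followed by #<action> and an unset time in minutes
RULE_RE = re.compile(r"^(\d+)([cjkmnt])(?:#([A-Za-z])(\d+)?)?$")
PARAM_RE = re.compile(r"^\[([^\]]+)\]:(\d+)$")


def parse_modef(param: str) -> Tuple[List[FloodRule], int]:
    """Parse e.g. '[20j#R5,50m#M]:15' into (rules, period in seconds)."""
    outer = PARAM_RE.match(param)
    if outer is None:
        raise ValueError(f"not a new-style +f parameter: {param!r}")
    rules = []
    for item in outer.group(1).split(","):
        rule = RULE_RE.match(item)
        if rule is None:
            raise ValueError(f"bad flood rule: {item!r}")
        limit, floodtype, action, minutes = rule.groups()
        rules.append(FloodRule(int(limit), floodtype, action,
                               int(minutes) if minutes is not None else None))
    return rules, int(outer.group(2))
```

With this sketch, parse_modef("[20j#R5]:15") yields one rule (limit 20, type j, action R, unset after 5 minutes) and a period of 15 seconds. An explicit 0, as in [20j#i0]:15, is kept as 0 rather than None so that "never unset" can be told apart from "use the server default".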

Which +f mode is best depends heavily on the channel: how many users does it have? Do you have a game that makes users message a lot (eg: trivia), or do users often use popups? Is it some kind of main channel, or in auto-join? And so on.
There's no perfect channelmode +f that is good for all channels, but to get you started have a look at the next example and modify it to suit your needs:
+f [30j#i10,40m#m10,7c#C15,10n#N15,30k#K10]:15
30 joins per 15 seconds, if limit is reached set channel +i for 10 minutes
40 messages per 15 seconds, if limit is reached set channel +m for 10 minutes
7 ctcps per 15 seconds, if limit is reached set channel +C for 15 minutes
10 nickchanges per 15 seconds, if limit is reached set channel +N for 15 minutes
30 knocks per 15 seconds, if limit is reached set channel +K for 10 minutes
If it's some kind of large user channel (>75 users?) you will want to raise the join limit (to eg: 50) and the message limit as well (to eg: 60 or 75).
The remove-mode times in particular are a matter of taste. Consider what happens if no op is available to handle the situation: do you want the channel locked for 15 minutes (not nice for users) or for 5 minutes (the flooders will likely just wait 5 minutes and flood again)? It also depends on the floodtype: users being unable to join (+i) or speak (+m) is worse than users being unable to change their nick (+N) or send CTCPs to the channel (+C), so you might want to use different removal times.
Channel mode j
The +f mode includes a feature to prevent join floods, however this feature is "global." For example, if it is set to 5:10 and 5 different users join in 10 seconds, the flood protection is triggered. Channel mode +j is different. This mode works on a per-user basis. Rather than protecting against join floods, it is designed to protect against join-part floods (revolving door floods). The mode takes a parameter of the form X:Y where X is the number of joins and Y is the number of seconds. If a user exceeds this limit, he/she will be prevented from joining the channel.
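
The difference between the channel-wide join counter of +f and the per-user counter of +j can be sketched in Python as the same throttle applied to different keys. This is only an illustration under assumed names (Throttle, channel_wide_key, per_user_key); it is not how the ircd implements either mode.

```python
import time
from collections import defaultdict, deque


class Throttle:
    """Allow at most `limit` events per `seconds` for each key."""

    def __init__(self, limit, seconds):
        self.limit = limit
        self.seconds = seconds
        self.events = defaultdict(deque)  # key -> timestamps of recent events

    def hit(self, key, now=None):
        """Record an event for `key`; return False if it exceeds the limit."""
        now = time.monotonic() if now is None else now
        recent = self.events[key]
        # Forget events that have fallen out of the time window.
        while recent and now - recent[0] >= self.seconds:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


def channel_wide_key(channel, user):
    # +f style: every join to the channel counts against one shared counter.
    return channel


def per_user_key(channel, user):
    # +j style: each user has a separate counter per channel.
    return (channel, user)
```

With a limit of 5 per 10 seconds, six different users joining within a few seconds trip the channel-wide counter on the sixth join, while the per-user counter only reacts when a single user joins more than 5 times in that window.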

3.13 - Ban types

Basic bantypes and cloaked hosts
UnrealIRCd supports the basic bantypes like +b nick!user@host.
Also, if someone's masked host is 'rox-ACB17294.isp.com' and you place a ban on *!*@rox-ACB17294.isp.com, the ban will still match if the user sets himself -x (and his host becomes, for example, 'dial-123.isp.com'). Bans are always checked against real hosts AND masked hosts.
IP bans are also available (eg: *!*@128.*) and are also always checked.

Bans on cloaked IPs require some explanation:
If a user has the IP 1.2.3.4 his cloaked host could be 341C6CEC.8FC6128B.303AEBC6.IP.
If you ban *!*@341C6CEC.8FC6128B.303AEBC6.IP you would ban *!*@1.2.3.4 (obvious...)
If you ban *!*@*.8FC6128B.303AEBC6.IP you ban *!*@1.2.3.*
If you ban *!*@*.303AEBC6.IP you ban *!*@1.2.*
This information might be helpful to you when deciding how broad a ban should be.
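
The ban examples above imply that each group of a cloaked IP depends on a progressively shorter prefix of the real IP. The Python sketch below illustrates that structure with a stand-in hash. The function names and the use of truncated MD5 are assumptions for illustration only; this is NOT the cloaking algorithm UnrealIRCd uses, and the resulting strings will not match real cloaked hosts.

```python
import hashlib
from fnmatch import fnmatchcase


def toy_hash(text):
    # Stand-in hash for illustration; not the ircd's cloaking function.
    return hashlib.md5(text.encode()).hexdigest()[:8].upper()


def toy_cloak(ip):
    """Build a three-group cloak: full IP, first three octets, first two octets."""
    a, b, c, d = ip.split(".")
    groups = [
        toy_hash(f"{a}.{b}.{c}.{d}"),  # changes with any octet
        toy_hash(f"{a}.{b}.{c}"),      # shared by everyone in a.b.c.*
        toy_hash(f"{a}.{b}"),          # shared by everyone in a.b.*
    ]
    return ".".join(groups + ["IP"])


def widen(cloak, drop):
    """Replace the first `drop` groups of a cloak with a single wildcard."""
    return "*." + ".".join(cloak.split(".")[drop:])
```

Here widen(toy_cloak("1.2.3.4"), 1) matches the toy cloak of any address in 1.2.3.*, and dropping two groups matches any address in 1.2.*, which mirrors the three ban widths listed above.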

Extended bantypes
Extended bans look like ~<type>:<parameter>.
They let you ban (or exempt) based on things other than the traditional nick!user@host mask. They also provide support for things like 'quieting' users.

These bantypes specify which actions are affected by a ban:

~q  quiet       People matching these bans can join but are unable to speak, unless they have +v or higher. Ex: ~q:*!*@blah.blah.com
~n  nickchange  People matching these bans cannot change nicks, unless they have +v or higher. Ex: ~n:*!*@*.aol.com
~j  join        If a user matches this, he may not join the channel. He may perform all other activities if he is already on the channel, such as speaking and changing his nick. Ex: ~j:*!*@*.aol.com
Could be useful to prevent people from one ISP from joining, but still make them able to speak/nickchange freely once in the channel, like after an /INVITE.

These bantypes introduce new criteria which can be used:

~c  channel     If the user is in this channel then (s)he is unable to join. A prefix can also be specified (+/%/@/&/~), which means that it will only match if the user has those rights or higher on the specified channel. Ex: +b ~c:#lamers, +e ~c:@#trusted
~r  realname    If the realname of a user matches this then (s)he is unable to join.
Ex: ~r:*Stupid_bot_script*
NOTE: an underscore ('_') matches both a space (' ') and an underscore ('_'), so this ban would match 'Stupid bot script v1.4'.
~R  registered  If a user has identified to services (usually NickServ) and matches this nickname, then this ban will match. This means this ban is really only useful for ban exemptions (+e).
Ex: +e ~R:Nick will allow Nick in the channel, regardless of other bans, if he identified to NickServ and is using the nickname Nick.
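
The underscore rule for ~r bans can be illustrated with a small Python matcher. This is a sketch under assumptions: the function name is made up, and it assumes the usual IRC wildcards (* and ?) and case-insensitive matching; only the underscore behaviour comes from the note above.

```python
import re


def realname_ban_matches(pattern, realname):
    """Wildcard match where '_' matches either a space or an underscore."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")       # any run of characters
        elif ch == "?":
            parts.append(".")        # exactly one character
        elif ch == "_":
            parts.append("[ _]")     # space or underscore
        else:
            parts.append(re.escape(ch))
    regex = "".join(parts)
    return re.fullmatch(regex, realname, re.IGNORECASE | re.DOTALL) is not None
```

For example, realname_ban_matches("*Stupid_bot_script*", "Stupid bot script v1.4") is true, and so is the same call with "Stupid_bot_script v1.4" as the realname.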

You may stack extended bans from the 1st group with the 2nd, such as +b ~q:~c:#lamers, which would quiet all users who have joined #lamers.
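
As a rough illustration of stacking, the Python sketch below splits a ban mask into an optional action type from the 1st group, an optional criterion type from the 2nd group, and the remaining parameter. The function name and return shape are assumptions for this example; types added by modules are not known to it and are simply left in the parameter.

```python
ACTION_TYPES = {"q": "quiet", "n": "nickchange", "j": "join"}
CRITERIA_TYPES = {"c": "channel", "r": "realname", "R": "registered"}


def _take(mask, table):
    """If mask starts with ~<type>: for a type in table, split it off."""
    if len(mask) >= 3 and mask[0] == "~" and mask[2] == ":" and mask[1] in table:
        return table[mask[1]], mask[3:]
    return None, mask


def parse_extban(mask):
    """Return (action, criterion, parameter) for a ban mask."""
    action, rest = _take(mask, ACTION_TYPES)
    criterion, rest = _take(rest, CRITERIA_TYPES)
    return action, criterion, rest
```

parse_extban("~q:~c:#lamers") gives ("quiet", "channel", "#lamers"), while a plain mask such as "*!*@128.*" comes back as (None, None, "*!*@128.*").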

Modules can add other extended ban types.

6 – User & Channel Modes

Set by the /mode #channelname command. For instance, /mode #Atrium +S to set mode +S on #Atrium, which would strip color codes from text sent to the channel.

Channel Modes
A                             Only Administrators may join
a <nick>                      Makes the user a channel admin
b <nick!user@host>            Bans the given user from the channel
c                             No ANSI color can be sent to the channel
C                             No CTCPs allowed in the channel
e <nick!user@host>            Exception ban – If someone matches this, they can join a channel even if they match an existing ban
f [<number><type>]:<seconds>  Channel flood protection. See section 3.12 above for an extended description.
G                             Makes channel G rated. Checks for words listed in the Badword Blocks, and replaces them with the words specified
h <nick>                      Gives half-op status to the user
i                             Invite required
I <nick!user@host>            Invite exceptions ("invex") - if someone matches this, they can bypass +i requirements to enter the channel.
j <joins:seconds>             Throttles joins per-user to <joins> joins per <seconds> seconds
K                             /knock is not allowed
k <key>                       Sets a key needed to join
l <##>                        Sets max number of users
L <Chan>                      If the amount set by +l has been reached, users will be sent to this channel
M                             A registered nickname (+r) is required to talk
m                             Moderated channel. Only +v/o/h users may speak
N                             No nick name changes permitted
n                             No messages from outside channels
O                             Only IRCops may join
o <nick>                      Gives a user channel operator status
p                             Makes channel private
q <nick>                      Sets channel owner
Q                             Only U:Lined servers can kick users
r                             This channel is registered (only settable by services)
R                             Requires a registered nickname to join
S                             Strips all incoming colors
s                             Makes channel secret
t                             Only chanops can set topic
T                             No NOTICEs allowed in the channel
u                             Auditorium – Makes /names and /who #channel only show channel ops
V                             /invite is not allowed
v <nick>                      Gives a voice to users. (May speak in +m channels)
z                             Only clients on a Secure (SSL) Connection may join


Mode flags set on users. Many can be set by using the /mode Nickname command. For instance, if your nickname is "Simba", you could remove mode "x" from yourself by doing /mode Simba -x

User Modes
A  Server Admin (Set in Oper Block)
a  Services Admin (Set in Oper Block)
B  Marks you as being a Bot
C  Co-Admin (Set in Oper Block)
d  Makes it so you can not receive channel PRIVMSGs (with the exception of text prefixed with certain characters, see set::channel-command-prefix)
G  Filters out all the bad words per configuration
g  Can send & read globops and locops
H  Hide IRCop Status (IRCop Only)
h  Available for help (HelpOp) (Set in Oper Block)
i  Invisible (not shown in /who)
N  Network Administrator (Set in Oper Block)
O  Local IRC Operator (Set in Oper Block)
o  Global IRC Operator (Set in Oper Block)
p  Hides the channels you are in from /whois
q  Only U:Lines can kick you (Services Admins Only)
R  Allows you to only receive PRIVMSGs/NOTICEs from registered (+r) users
r  Identifies the nick as being registered
S  Used to protect Services Daemons
s  Can listen to server notices (see section 3.3 above for more information)
T  Prevents you from receiving CTCPs
t  Says you are using a /vhost
V  Marks you as a WebTV user
v  Receives infected DCC Send Rejection notices
W  Lets you see when people do a /whois on you (IRCops Only)
w  Can listen to wallop messages
x  Gives user a hidden hostname
z  Indicates that you are an SSL client

7 – User & Oper Commands Table

NOTE: the /helpop documentation is more up to date, use /helpop command (or /helpop ?command if you are oper) to get more information on a command.

nick <newnickname> Changes your online nick name. Alerts others to the change of your nick
whois <nick> Displays information of user requested. Includes Full Name, Host, Channels User is in, and Oper Status
who <mask> Who allows you to search for users. Masks include: nickname, #channel, hostmask (*.attbi.com)
whowas <nick> <maxreplies> Displays information on a nick that has logged off. The <maxreplies> field is optional, and limits how many records will be returned.
ison <nick1 nick2 nick3 ...> Allows you to check the online status of a user, or a list of users. Simple return, best used for scripts
join <channel1,channel2, ...> Allows you to join channels. Using the /join #channel1,#channel2,#channel3 will allow you to join more than one channel at a time. The /join 0 command makes you PART all channels. All
cycle <channel1, channel2, ...> Cycles the given channel(s). This command is equivalent to sending a PART then a JOIN command. All
motd <server> Displays the server's MOTD. Adding a server name allows you to view the MOTD of other servers.
rules <server> Displays the ircd.rules of a server. Adding a server name allows you to view rules on other servers All
lusers <server> Displays current & max user loads, both global and local. Adding a server name allows you to view the statistics from other servers.
map Displays a network map All
quit <reason> Causes you to disconnect from the server. If you include a reason, it will be displayed on all channels as you quit All
ping <user> Sends a PING request to a user. Used for checking connection and lag. Servers issue pings on a timed basis to determine if users are still connected.
version <nick> Sends a CTCP Version request to the user. If configured to do so, their client will respond with the client version.
links Displays a list of all servers linked to the network All
Admin <server> Displays the admin info of a server. If a server name is included it will display the info of that server.
userhost <nick> Displays the userhost of the nick given. Generally used for scripts
topic <channel> <topic> Topic <channel> will display the current topic of the given channel. Topic <channel> <topic> will change the topic of the given channel.
invite <nick> <channel> Invites the given user to the given channel. (Must be a channel Op)
kick <channel, channel> <user, user> <reason> Kicks a user or users out of a channel, or channels. A reason may also be supplied.
away <reason> Marks you as being away. A reason may also be supplied.
Watch +-<nick> +-<nick>
Watch is a new notify-type system in UnrealIRCd which is both faster and uses less network resources than any old-style notify system. The server will send you a message when any nickname in your watch list logs on or off. The watch list DOES NOT REMAIN BETWEEN SESSIONS - you (or your script or client) must add the nicknames to your watch list every time you connect to an IRC server.
helpop ?<topic> or !<topic>
HelpOp is a new system of getting IRC Server help. You type either /HELPOP ? <help system topic> or /HELPOP ! <question>. The "?" in /HELPOP means query the help system, and if you get no response you can choose '!' to send it to the Help Operators online. Using neither ? nor ! means the command will first be queried within the help system and, if no match is found, it will be forwarded to the help operators. All
list <search string> If you don't include a search string, the default is to send you the entire unfiltered list of channels. Below are the options you can use, and what channels LIST will return when you use them.
>number List channels with more than <number> people.
<number List channels with less than <number> people.
C>number List channels created between now and <number> minutes ago.
C<number List channels created earlier than <number> minutes ago.
T>number List channels whose topics are older than <number> minutes (i.e., they have not changed in the last <number> minutes).
T<number List channels whose topics are newer than <number> minutes.
*mask* List channels that match *mask*
!*mask* List channels that do not match *mask*
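
Because the C and T options read in opposite directions, a small Python sketch may help: it turns one LIST option into a test over a channel record. The Channel fields and the function name are hypothetical, the boundary cases (exactly <number>) are a guess, and the server's own matching may differ in details such as case handling.

```python
from fnmatch import fnmatchcase
from typing import Callable, NamedTuple


class Channel(NamedTuple):
    name: str
    users: int
    created_minutes_ago: int
    topic_minutes_ago: int


def list_filter(option: str) -> Callable[[Channel], bool]:
    """Translate one LIST option into a predicate over Channel records."""
    prefix = option[:2]
    if prefix in ("C>", "C<", "T>", "T<"):
        n = int(option[2:])
        if prefix == "C>":   # created between now and n minutes ago
            return lambda ch: ch.created_minutes_ago < n
        if prefix == "C<":   # created earlier than n minutes ago
            return lambda ch: ch.created_minutes_ago > n
        if prefix == "T>":   # topic older than n minutes
            return lambda ch: ch.topic_minutes_ago > n
        return lambda ch: ch.topic_minutes_ago < n  # T<: topic newer than n minutes
    if option[:1] == ">":
        n = int(option[1:])
        return lambda ch: ch.users > n
    if option[:1] == "<":
        n = int(option[1:])
        return lambda ch: ch.users < n
    negate = option.startswith("!")
    mask = (option[1:] if negate else option).lower()
    return lambda ch: fnmatchcase(ch.name.lower(), mask) != negate
```

Note that in this reading C>60 selects recently created channels (younger than 60 minutes), whereas T>60 selects channels whose topic is older than 60 minutes.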
Knock <channel> <message>
Allows you to ‘knock’ on an invite only channel and ask for access. Will not work if channel has one of the following modes set: +K +V. Will also not work if you are banned
setname Allows users to change their ‘Real Name’ without reconnecting
vhost <login> <password> Hides your host name by using a vhost provided by the server.
mode <chan/nick> <mode>
Lets you set channel and user modes. See User & Channel Modes for a list.
credits Lists credits for everyone that has helped create UnrealIRCd
license Displays the GNU License All
time <server> Displays the servers date and time. Including a server name allows you to check other servers.
botmotd <server>
Displays the servers bot message of the day. Including a server name allows you to check other servers All
identify <password> Sends your password to the services system to identify to your nick.
identify <channel> <password> Sends your password to the services system to identify as the founder of a channel.
dns <option> Returns information about the IRC server's DNS cache. Note, since most clients have a built-in DNS command, you will most likely need to use /raw DNS to use this. Opers may specify an l as the first parameter to the command to receive a list of entries in the DNS cache. All
userip <nick>
Returns the IP address of the user in question. All
oper <userid> <password>
Command to give a user operator status if they match an Oper Block
wallops <message> Sends a message to all users with umode +w IRCop
globops <message> Sends a message to all global IRCops IRCop
chatops <message> Sends a message to all IRCops (local and global) IRCop
locops <message> Sends a message to all local IRCops IRCop
adchat <message> Sends a message to all Admins IRCop
nachat <message> Sends a message to all Net Admins IRCop
kill <nick> <reason> Kills a user from the network IRCop
kline [+|-]<user@host | nick> [<time to ban> <reason>] Bans the hostmask from the server it is issued on. A kline is not a global ban.
time to ban is either: a) a value in seconds, b) a time value, like '1d' is 1 day or c) '0' for permanent. Time and reason are optional, if unspecified set::default-bantime (default: 0/permanent) and 'no reason' are used.
To remove a kline use /kline -user@host
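
The three forms of the time to ban can be sketched as a small Python parser. This is an illustration, not the ircd's code: the function name is made up, and the set of unit letters (s, m, h, d, w) and support for combined values such as 1d12h are assumptions, since the text above only shows '1d'.

```python
import re

# Assumed unit letters; only 'd' is confirmed by the text above.
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_bantime(value):
    """Return the ban time in seconds; 0 means permanent."""
    if value.isdigit():
        return int(value)          # plain seconds, or '0' for permanent
    total = 0
    pos = 0
    for part in re.finditer(r"(\d+)([smhdw])", value):
        if part.start() != pos:    # unexpected characters between parts
            raise ValueError(f"bad time value: {value!r}")
        total += int(part.group(1)) * UNITS[part.group(2)]
        pos = part.end()
    if pos == 0 or pos != len(value):
        raise ValueError(f"bad time value: {value!r}")
    return total
```

Under these assumptions parse_bantime("1d") returns 86400 and parse_bantime("0") returns 0, which a caller would treat as a permanent ban.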
zline [+|-]<*@ip> [<time to ban> <reason>] Bans an IP Address from the local server it is issued on (not global). See kline for more syntax info. Use /zline -*@ip to remove.
gline [+|-]<user@host | nick> [<time to ban> <reason>]
Adds a global ban to anyone that matches. See kline for more syntax info. Use /gline -user@host to remove.
shun [+|-]<user@host | nick> [<time to shun> <reason>]
Prevents a user from executing ANY commands and prevents them from speaking. Shuns are global (like glines). See kline for more syntax info. Use /shun -user@host to remove a shun.
gzline [+|-]<ip> <time to ban> :<reason>
Adds a global zline. See kline for more syntax info. Use /gzline -*@ip to remove a gzline.
rehash <server> <flags> Rehashes the server's config file. Including a server name allows you to rehash a remote server's config file. Several flags are also available. They include:
-dns - Reinitializes and reloads the resolver
-motd - Only re-read all MOTD, BOTMOTD, OPERMOTD and RULES files (including those in tld{} blocks)
-garbage - Force garbage collection
-ssl - Reloads SSL certificates
restart <password> <reason>
Restarts the IRCD Process. Password is required if drpass { } is present. You may also include a reason.
die <password>
Terminates the IRCD Process. Password is required if drpass { } is present. IRCop
lag <server>
This command is like a Sonar or Traceroute for IRC server. You type in /LAG irc.fyremoon.net and it will reply from every server it passes with time and so on. Useful for looking where lag is and optional TS future/past travels
sethost <newhost> Lets you change your vhost to what ever you want it to be.
setident <newident>
Lets you set your ident to what ever you want it to be
chghost <nick> <newhost>
Lets you change the host name of a user currently on the system
chgident <nick> <newident>
Lets you change the ident of a user currently on the system
chgname <nick> <newname>
Lets you change the realname of a user currently on the system
squit <server>
Disconnects a server from the network
connect <server> <port> <server> If only one server is given, it will attempt to connect the server you are ON to the given server. If 2 servers are given, it will attempt to connect the 2 servers together. Put the leaf server as the first, and the hub server as the second.
dccdeny <filemask> <reason>
Adds a DCCDENY for that filemask. Preventing that file from being sent.
undccdeny <filemask>
Removes a DCCDENY IRCop
sajoin <nick> <channel>, <channel>
Forces a user to join a channel(s). Available to services & network admins only IRCop
sapart <nick> <channel>, <channel>
Forces a user to part a channel(s). Available to services & network admins only.
samode <channel> <mode>
Allows Network & Services admins to change modes of a channel without having ChanOps.
rping <servermask>
Will calculate in milliseconds the lag between servers
trace <servermask|nickname>
When used on a user it will give you class and lag info. If you use it on a server it gives you class/version/link info.
opermotd Displays the server's OperMotd file
addmotd :<text>
Will add the given text to the end of the Motd
addomotd :<text>
Will add the given text to the end of the OperMotd
sdesc <newdescription>
Allows server admins to change the description line of their server without restarting.
addline <text>
Appends the specified text to unrealircd.conf. You must load the m_addline module to use this command since unrealircd-3.2.9.
mkpasswd <auth-type> <password>
Will encrypt <password> using the <auth-type> hashing method. Available hash methods:
  • crypt [Windows support requires SSL]
  • md5
  • sha1 [requires SSL]
  • ripemd160 [requires SSL]

tsctl offset +/- <time>
Adjust the IRCD’s Internal clock (Do NOT use if you do not understand EXACTLY what it does)
tsctl time
Will give a TS Report IRCop
tsctl alltime Will give a TS Report of ALL servers IRCop
tsctl svstime <timestamp>
Sets the TS time of all servers (Do NOT use if you do not understand EXACTLY what it does)
htm <option>
Controls settings related to high traffic mode. High Traffic Mode (HTM) basically disables certain user commands such as: list whois who etc in response to extremely high traffic on the server. Options include:
-ON Forces server into HTM
-OFF Forces server out of HTM
-NOISY Sets the server to notify users/admins when it goes in and out of HTM
-QUIET Sets the server to NOT notify when going in and out of HTM
-TO <value> Tell HTM at what incoming rate to activate HTM
stats <option>
B - banversion - Send the ban version list
b - badword - Send the badwords list
C - link - Send the link block list
d - denylinkauto - Send the deny link (auto) block list
D - denylinkall - Send the deny link (all) block list
e - exceptthrottle - Send the except throttle block list
E - exceptban - Send the except ban and except tkl block list
f - spamfilter - Send the spamfilter list
F - denydcc - Send the deny dcc block list
G - gline - Send the gline and gzline list
  Extended flags: [+/-mrs] [mask] [reason] [setby]
    m Return glines matching/not matching the specified mask
    r Return glines with a reason matching/not matching the specified reason
    s Return glines set by/not set by clients matching the specified name
I - allow - Send the allow block list
j - officialchans - Send the official channels list
K - kline - Send the ban user/ban ip/except ban block list
l - linkinfo - Send link information
L - linkinfoall - Send all link information
M - command - Send list of how many times each command was used
n - banrealname - Send the ban realname block list
O - oper - Send the oper block list
P - port - Send information about ports
q - sqline - Send the SQLINE list
Q - bannick - Send the ban nick block list
r - chanrestrict - Send the channel deny/allow block list
R - usage - Send usage information
S - set - Send the set block list
s - shun - Send the shun list
  Extended flags: [+/-mrs] [mask] [reason] [setby]
    m Return shuns matching/not matching the specified mask
    r Return shuns with a reason matching/not matching the specified reason
    s Return shuns set by/not set by clients matching the specified name
t - tld - Send the tld block list
T - traffic - Send traffic information
u - uptime - Send the server uptime and connection count
U - uline - Send the ulines block list
v - denyver - Send the deny version block list
V - vhost - Send the vhost block list
X - notlink - Send the list of servers that are not currently linked
Y - class - Send the class block list
z - zip - Send compression information about ziplinked servers (if compiled with ziplinks support)
Z - mem - Send memory usage information
module Lists all loaded modules All
close This command will disconnect all unknown connections from the IRC server. IRCOp

A feature that isn't widely known is that normal users can also set some limited snomasks, namely +s +sk. By this they can see things like rehashes, kills and various other messages.
To disable this you can use set::restrict-usermodes like this: set { restrict-usermodes "s"; };.

Of course all of this is "information hiding", so it's not "true" security. It will however make it more difficult / increase the effort needed to attack/hack.

9 – Frequently Asked Questions (FAQ)

The FAQ is available online at http://www.vulnscan.org/UnrealIRCd/faq/

A Regular Expressions

Regular expressions are used in many places in Unreal, including badwords, spamfilter, and aliases. Regular expressions are a very complex tool used for pattern matching. They are sometimes referred to as "regexp" or "regex." Unreal uses the TRE regular expression library for its regex. This library supports some very complex and advanced expressions that may be confusing. The information below will help you understand how regexps work. If you are interested in more technical and detailed information about the regexp syntax used by Unreal, visit the TRE homepage.

A.1 Literals

Literals are the most basic component of a regexp. Basically, they are characters that are treated as plaintext. For example, the pattern "test" consists of the four literals, "t," "e," "s," and "t." In Unreal, literals are treated as case insensitive, so the previous regex would match "test" as well as "TEST." Any character that is not a "meta character" (discussed in the following sections) is treated as a literal. You can also explicitly make a character a literal by using a backslash (\). For example, the dot (.) is a metacharacter. If you wish to include a literal ., simply use \. and Unreal will treat this as a period. It is also possible that you want to check for a character that is not easily typed, say ASCII character 3 (color). Rather than having to deal with using an IRC client to create this character, you can use a special sequence, the \x. If you type \x3, then it is interpreted as being the ASCII character 3. The number after the \x is represented as hexadecimal and can be in the range from \x0 to \xFF.
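
The behaviour described above can be tried out with Python's re module, which is used here only as a stand-in. It is not the TRE library, so details differ: case insensitivity has to be requested with re.IGNORECASE, and Python requires exactly two hex digits after \x (so \x03 instead of \x3).

```python
import re

# Literals, matched case-insensitively as Unreal does by default.
word = re.compile(r"test", re.IGNORECASE)
matches_upper = word.search("this is a TEST") is not None

# A backslash turns the dot metacharacter into a literal period.
literal_dot = re.compile(r"a\.c")
dot_matches_period = literal_dot.search("a.c") is not None
dot_matches_letter = literal_dot.search("abc") is not None

# \x03 is ASCII character 3, the IRC color code.
color = re.compile(r"\x03")
has_color = color.search("\x0304red text") is not None
no_color = color.search("plain text") is not None
```

Here matches_upper, dot_matches_period and has_color are true, while dot_matches_letter and no_color are false.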

A.2 Dot Operator

The dot (.) operator is used to match "any character." It matches a single character that has any value. For example, the regex "a.c" will match "abc," "adc," etc. However, it will not match "abd" because the "a" and "c" are literals that must match exactly.

A.3 Repetition Operators

One of the common mistakes people make with regex is assuming that they work just like wildcards. That is, the * and ? characters will match just like in a wildcard. While these characters do have similar meaning in a regex, they are not exactly the same. Additionally, regular expressions also support other, more advanced methods of repetition.

The most basic repetition operator is the ? operator. This operator matches 0 or 1 of the previous character. This, "of the previous character," is where the ? in regex differs from a wildcard. In a wildcard, the expression, "a?c" matches an "a" followed by any character, followed by a "c". In regex it has a different meaning. It matches 0 or 1 of the letter "a" followed by the letter "c". Basically, the ? is modifying the a by specifying how many a's may be present. To emulate the ? in a wildcard, the . operator is used. The regex "a.c" is equivalent to the previously mentioned wildcard. It matches the letter "a" followed by any character, followed by a "c".

The next repetition operator is the *. Again, this operator is similar to a wildcard. It matches 0 or more of the previous character. Note that this "of the previous character" is something that is characteristic of all repetition operators. The regex "a*c" matches 0 or more a's followed by a "c". For example, "aaaaaac" matches. Once again, to make this work like a wildcard, you would use "a.*c" which will cause the * to modify the . (any character) rather than the "a."

The + operator is very similar to the *. However, instead of matching 0 or more, it matches 1 or more. Basically, "a*c" will match "c" (0 a's followed by a c), whereas "a+c" would not. The "a+" states that there must be "at least" 1 a. So "c" does not match but "ac" and "aaaaaaaaac" do.

The most advanced repetition operator is known as a "boundary." A boundary lets you set exact constraints on how many of the previous character must be present. For example, you may want to require exactly 8 a's, or at least 8 a's, or between 3 and 5 a's. The boundary allows you to accomplish all of these. The basic syntax is {M,N} where M is the lower bound, and N is the upper bound. For example, to match between 3 and 5 a's, you would do "a{3,5}". However, you do not have to specify both numbers. If you do "a{8}" it means there must be exactly 8 a's. Therefore, "a{8}" is equivalent to "aaaaaaaa." To specify the "at least" example, you basically create a boundary that only has a lower bound. So for at least 8 a's, you would do "a{8,}".
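
The repetition operators above can be checked with Python's re module as a stand-in for TRE (the two libraries agree on these basic operators, but this is not Unreal's own matcher). The helper name full is made up, and it tests whether the whole string matches, which keeps the examples unambiguous.

```python
import re


def full(pattern, text):
    """True if the entire text matches the pattern, ignoring case."""
    return re.fullmatch(pattern, text, re.IGNORECASE) is not None


examples = [
    ("a?c", "c", True),          # ? allows zero a's
    ("a?c", "ac", True),         # ... or one
    ("a?c", "abc", False),       # unlike the wildcard a?c
    ("a.c", "abc", True),        # . is the regex form of the wildcard ?
    ("a*c", "aaaaaac", True),    # * allows many a's
    ("a*c", "c", True),          # ... or none
    ("a.*c", "a-anything-c", True),  # .* is the regex form of the wildcard *
    ("a+c", "c", False),         # + needs at least one a
    ("a+c", "ac", True),
    ("a{3,5}", "aaaa", True),    # between 3 and 5
    ("a{3,5}", "aa", False),
    ("a{8}", "a" * 8, True),     # exactly 8
    ("a{8}", "a" * 9, False),
    ("a{8,}", "a" * 12, True),   # at least 8
]

for pattern, text, expected in examples:
    assert full(pattern, text) == expected
```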

By default, all of the repetition operators are "greedy." Greediness is a somewhat complex idea. Basically, it means that an operator will match as many characters as it can. This is best explained by an example.

Say we have the following text: HELLO
And the following regex: .+L

In this example, you might think that the .+ matches "HE." However, this is incorrect. Because the + is greedy, it matches "HEL." The reason is, it chooses the largest portion of the input text that can be matched while still allowing the entire regex to match. In this example, it chose "HEL" because the only other requirement is that the character after the text matched by .+ must be an "L". Since the text is "HELLO", "HEL" is followed by an "L," and therefore it matches. Sometimes, however, it is useful to make an operator nongreedy. This can be done by adding a ? character after the repetition operator. Modifying the above to, ".+?L" the .+? will now match "HE" rather than "HEL" since it has been placed in a nongreedy state. The ? can be added to any repetition character: ??, *?, +?, {M,N}?.
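
The greedy and nongreedy cases can be reproduced with Python's re module, used here as a stand-in for TRE. The parentheses are added only so that the part matched by .+ or .+? can be read back separately.

```python
import re

text = "HELLO"

# Greedy: .+ takes as much as it can while an "L" can still follow.
greedy = re.match(r"(.+)L", text)
greedy_part = greedy.group(1)

# Nongreedy: .+? takes as little as it can before the first "L".
lazy = re.match(r"(.+?)L", text)
lazy_part = lazy.group(1)
```

Here greedy_part is "HEL" and lazy_part is "HE", so the overall matches are "HELL" and "HEL" respectively.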

A.4 Bracket Expressions

Bracket expressions provide a convenient way to do an "or" operator. For example, if you want to say "match an a or a b." The bracket expression gets its name from the fact that it is enclosed in brackets ([]). The basic syntax is that the expression includes a series of characters. These characters are then treated as though there were an "or" between them. As an example, the expression "[abc]" matches an "a," a "b," or a "c." Therefore, the regexp "a[bd]c" matches "abc" and "adc" but not "acc."

One very common thing to do is to check for things such as a letter or a digit. Rather than having to do, for example, "[0123456789]", the bracket operator supports ranges. Ranges work by specifying the beginning and ending point with a - between them. Therefore, a simpler way to test for a digit is to do "[0-9]". The same thing can be used on letters, or in fact, any range of ASCII values. If you want to match a letter, simply do "[a-z]". Since Unreal is case insensitive, this will match all letters. You can also include multiple ranges in the same expression. To match a letter or a number, use "[0-9a-z]". One complication that this creates is that the - is a special character in a bracket expression. To have it match a literal -, the easiest way is to place it as either the first or last character in the expression. For example, "[0-9-]" matches a digit or a -.
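The bracket examples above can be checked with a small sketch. It uses Python's re module as a stand-in for Unreal's regex engine; because Python is case sensitive by default, the hypothetical helper matches passes re.IGNORECASE to imitate Unreal's behaviour.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Whole-string, case-insensitive match (imitating Unreal's default)."""
    return re.fullmatch(pattern, text, re.IGNORECASE) is not None

# A list of characters acts like "or"
print(matches(r"a[bd]c", "abc"))   # True
print(matches(r"a[bd]c", "adc"))   # True
print(matches(r"a[bd]c", "acc"))   # False

# Ranges, and several ranges in one expression
print(matches(r"[0-9]", "7"))      # True
print(matches(r"[0-9a-z]", "Q"))   # True (case insensitive)

# A trailing - is a literal -
print(matches(r"[0-9-]", "-"))     # True
print(matches(r"[0-9-]", "x"))     # False
```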

To make things even simpler, there are several "character classes" that may be used within a bracket expression. These character classes eliminate the need to define certain ranges. A character class is written by enclosing its name in [: and :], and it is only valid inside a bracket expression. For example, "[0-9]" could also be written as "[[:digit:]]". The list below shows all of the available character classes and what they do:

One important note about character classes is that the [:name:] form is not a bracket expression by itself; it MUST appear inside the enclosing brackets. For example, "[:digit:-]" is NOT a legal way to match a digit or a -. Instead, nest the class inside the bracket expression: to do the same thing as "[0-9-]" using a character class, you could do "[[:digit:]-]".

The last feature of the bracket expression is negation. Sometimes it is useful to say "anything except these characters." For example, if you want to check if the character is "not a letter," it is easier to list a-z and say "not these," than it is to list all the non-letters. Bracket expressions allow you to handle this through negation. You negate the expression by specifying a "^" as the first character. For example, "[^a-z]" would match any non-letter. As with the -, if you want to include a literal ^, place it anywhere but the first position, for example "[a-z^]". Also, to negate a character class, you must once again use nesting: "[^[:digit:]]" would match any non-digit.
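Negation and the literal ^ can be illustrated with a short sketch, again using Python's re module as a stand-in. Be aware that Python's re does not support the [:name:] character classes at all, so only the range-based forms are shown; the helper name one_char is made up for this example.

```python
import re

def one_char(pattern: str, ch: str) -> bool:
    """Return True if the single character ch matches the bracket expression."""
    return re.fullmatch(pattern, ch) is not None

# ^ in the first position negates the expression
print(one_char(r"[^a-z]", "5"))   # True, 5 is not a letter
print(one_char(r"[^a-z]", "q"))   # False

# ^ anywhere else is a literal ^
print(one_char(r"[a-z^]", "^"))   # True
print(one_char(r"[a-z^]", "q"))   # True
print(one_char(r"[a-z^]", "5"))   # False
```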

A.5 Assertions

Assertions allow you to test for certain conditions that are not representable by character strings, as well as providing shortcuts for some common bracket expressions.

The ^ character is referred to as the "left anchor." This character matches the beginning of a string. If you simply specify a regex such as "test", it will match, for example "this is a test" since that string contains "test." But, sometimes it is useful to ensure that the string actually starts with the pattern. This can be done with ^. For example "^test" means that the text must start with "test." Additionally, the $ character is the "right anchor." This character matches the end of the string. So if you were to do "^test$", then the string must be exactly the word "test."
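The effect of the two anchors can be sketched as follows, using Python's re module as a stand-in for Unreal's regex engine. The sample strings are made up for illustration.

```python
import re

def found(pattern: str, text: str) -> bool:
    """Return True if the pattern matches anywhere in text."""
    return re.search(pattern, text) is not None

# Without anchors, the pattern may match anywhere in the string
print(found(r"test", "this is a test"))    # True

# Left anchor: the string must start with "test"
print(found(r"^test", "this is a test"))   # False
print(found(r"^test", "testing 1 2 3"))    # True

# Both anchors: the string must be exactly "test"
print(found(r"^test$", "test"))            # True
print(found(r"^test$", "testing"))         # False
```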

Similar tests also exist for words. All of the other assertions are specified using a \ followed by a specific character. For example, to test for the beginning and ending of a word, you can use \< and \> respectively.

The remaining assertions all come with two forms, a positive and a negative. These assertions are listed below:

A.6 Alternation

Alternation is a method of saying "or." The alternation operator is the vertical bar (|). For example, if you wanted to say "a or b" you could do "a|b". For normal letters, this could be replaced by a bracket expression, but alternation can also be used with subexpressions (discussed in the next section).
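As a quick illustration (Python's re module is used as a stand-in, and the words in the second pattern are arbitrary), alternation between single letters behaves like a bracket expression, while alternation between longer strings has no bracket equivalent:

```python
import re

def whole(pattern: str, text: str) -> bool:
    """Return True if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

# "a|b" accepts the same single characters as "[ab]"
for ch in ["a", "b", "c"]:
    print(ch, whole(r"a|b", ch), whole(r"[ab]", ch))
# a True True
# b True True
# c False False

# Alternatives may be longer than one character
print(whole(r"cat|dog", "dog"))   # True
print(whole(r"cat|dog", "cow"))   # False
```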

A.7 Subexpressions

A subexpression is a portion of a regex that is treated as a single entity. There are two ways to create a subexpression. The two methods differ with regard to "back references," which will be explained later. To declare a subexpression that uses back references, simply enclose it in parentheses (). To create a subexpression that does not use back references, replace the open-parenthesis with "(?:". For example, "([a-z])" and "(?:[a-z])". The reason subexpressions are useful is that you can then apply operators to the expression. All of the repetition operators, for example, that were mentioned as "X or more of the previous character," can also be used for "X or more of the previous subexpression." For example, suppose you have a regex of "[0-9][a-z][0-9]", to match a digit, followed by a letter, followed by a digit, and then you decide you want to match this sequence twice. Normally, you would do "[0-9][a-z][0-9][0-9][a-z][0-9]". With subexpressions, however, you can simply do "([0-9][a-z][0-9]){2}".
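The following sketch checks that the written-out regex and the two subexpression forms accept the same strings. It uses Python's re module as a stand-in for Unreal's regex engine, and the test strings are made up for illustration.

```python
import re

long_form     = r"[0-9][a-z][0-9][0-9][a-z][0-9]"
with_backref  = r"([0-9][a-z][0-9]){2}"     # subexpression that can be back referenced
no_backref    = r"(?:[0-9][a-z][0-9]){2}"   # subexpression without a back reference

def agree(text: str) -> list:
    """Return the whole-string match result of each of the three patterns."""
    return [re.fullmatch(p, text) is not None
            for p in (long_form, with_backref, no_backref)]

print(agree("1a22b3"))   # [True, True, True]
print(agree("1a2"))      # [False, False, False]
print(agree("1a2b3c"))   # [False, False, False]
```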

A.8 Back References

Back references allow you to reference the string that matched one of the subexpressions of the regexp. You use a back reference by specifying a backslash (\) followed by a number, 0-9, for example \1. \0 is a special back reference that refers to the entire regexp, rather than a subexpression. Back references are useful when you want to match something that contains the same string twice. For example, say you have a nick!user@host. You know that there is a trojan that uses a nickname and username that matches "[0-9][a-z]{5}", and both the nickname and username are the same. Using "[0-9][a-z]{5}![0-9][a-z]{5}@.+" will not work because it would allow the nickname and username to be different. For example, the nickname could be 1abcde and the username 2fghij. Back references allow you to overcome this limitation. Using "([0-9][a-z]{5})!\1@.+" will work exactly as expected. This searches for a nickname matching the given subexpression, then it uses a back reference to say that the username must be the same text.
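The trojan example can be sketched like this, using Python's re module as a stand-in for Unreal's regex engine. The hostname host.example is a placeholder invented for the illustration.

```python
import re

loose  = r"[0-9][a-z]{5}![0-9][a-z]{5}@.+"   # nick and user may differ
strict = r"([0-9][a-z]{5})!\1@.+"            # \1 must repeat the nick exactly

same_names = "1abcde!1abcde@host.example"
diff_names = "1abcde!2fghij@host.example"

def hit(pattern: str, mask: str) -> bool:
    """Return True if the whole nick!user@host mask matches the pattern."""
    return re.fullmatch(pattern, mask) is not None

print(hit(loose, same_names))    # True
print(hit(loose, diff_names))    # True, which is too permissive
print(hit(strict, same_names))   # True
print(hit(strict, diff_names))   # False
```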

You can only have 9 back references, which is why the (?:) notation is useful: it allows you to create a subexpression without wasting a back reference. Additionally, since back reference information does not need to be saved, it is also faster. Because of this, non-back reference subexpressions should be used whenever back references are not needed.

A.9 Case Sensitivity

As was already mentioned, Unreal makes all regexps case insensitive by default. The main reason for this is that there seem to be many more instances where you want case insensitive searching rather than sensitive. For example, if you block the text "www.test.com", you presumably want to block "WWW.TEST.COM" as well. However, there are instances where you may want case sensitivity, for example, matching for certain trojans. Because of this, a method is provided to dynamically turn case insensitivity on/off. To turn it off, simply use "(?-i)" and to turn it on, "(?i)". For example, "(?-i)[a-z](?i)[a-z]" will match a lowercase letter (case insensitivity is off) followed by either an uppercase or lowercase letter (case insensitivity is on). Additionally, rather than having to always remember to turn the flag back on when you are finished, you can also specify that the flag change should only apply to a subexpression. For example, "(?-i:[a-z])[a-z]" is equivalent to the previous regexp because the -i only applies to the given subexpression.
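The subexpression form can be tried with the sketch below. It uses Python's re module (3.6 or later) as a stand-in, with two caveats: Python is case sensitive by default, so re.IGNORECASE is passed to imitate Unreal's default, and Python does not accept a bare "(?-i)" in the middle of a pattern, so only the "(?-i:...)" form is shown.

```python
import re

# First letter must be lowercase; second letter may be either case.
pattern = re.compile(r"(?-i:[a-z])[a-z]", re.IGNORECASE)

for text in ["ab", "aB", "Ab", "AB"]:
    print(text, pattern.fullmatch(text) is not None)
# ab True
# aB True
# Ab False
# AB False
```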