WO2006128224A1 - A method for filtering online chat - Google Patents
- Publication number
- WO2006128224A1 (PCT/AU2006/000724)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- components
- online chat
- verb
- text
- noun
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L12/00—Data switching networks
- H04L12/02—Details
- H04L12/16—Arrangements for providing special services to substations
- H04L12/18—Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
- H04L12/1813—Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
- H04L12/1822—Conducting the conference, e.g. admission, detection, selection or grouping of participants, correlating users to one or more conference sessions, prioritising transmission
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L51/00—User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
- H04L51/04—Real-time or near real-time messaging, e.g. instant messaging [IM]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L51/00—User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
- H04L51/21—Monitoring or handling of messages
- H04L51/212—Monitoring or handling of messages using filtering or selective blocking
Definitions
- the 813-word dictionary of allowed components shown in flictionary.php is not by any means complete, and just serves as an example.
- the actual dictionary of allowed components used is immaterial to the method of the present invention and any dictionary of allowed components could be substituted.
- a one-word dictionary would result in very meaningless online chat, whilst a 50,000-word dictionary might make it too easy to construct "acceptable" unsavoury or otherwise unsuitable messages.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Machine Translation (AREA)
Abstract
The present invention is directed at a method for filtering online chat, where online chat is text entered by one person on a computing device such as a personal computer, games console, mobile phone or interactive television, such that the text entered is made visible to others, who are connected to each other via some type of network such as the Internet or a telephone network. The method for filtering such online chat described by the present invention requires the chat to be broken down into its components each of which is validated against a predefined dictionary of allowed components before being made visible to other participants in the online chat facility.
Description
A Method for Filtering Online Chat
The present invention is directed at a method for filtering online chat, where online chat is text entered by one person on a computing device such as a personal computer, games console, mobile phone or interactive television, such that the text entered is made visible to others, who are connected to each other via some type of network such as the Internet or a telephone network. Whilst the principles of the present invention apply equally well to text so entered in any language, the examples shown and discussed herein will be restricted to the English language.
Most systems for supporting online chat permit the participants complete freedom to enter whatever text they wish. This clearly enables people to enter chat that might offend or upset other participants. In particular it may expose participants to foul language to which they would prefer not to be exposed. This can become a serious problem where the participants are children who might not only be exposed to foul or otherwise unsuitable language, but where some other unscrupulous participants in the online chat facility might be paedophiles trying to make contact with these unsuspecting and vulnerable children. There are currently three methods in use to assist with this problem, none of which is particularly satisfactory. They are:
Education
Children are educated by their parents and teachers on techniques for recognising and avoiding chatting with participants who use such language and in particular who attempt to make contact with them. They are advised to never enter any identifying information. Because children's use of these online chat networks cannot always be monitored one can never really be sure that children are putting these principles into practice or that they won't still fall prey to the cunning deployed by some paedophiles.
Moderation
Many of the online chat networks provided by large, well-funded Internet portals employ moderators to monitor the chat being entered by participants and to delete posted messages that they deem unsuitable. This is not only an extremely difficult and expensive exercise, but it is highly unlikely that the moderators will be able to remove all the unsuitable material. It is also almost impossible for them to remove the unsuitable messages before at least some participants in the network have already been exposed to them.
Filtering
Some providers of online chat, especially those where the participants in the network are typically children, attempt to provide automatic filtering such that unsavoury and foul language is either removed, or messages containing any such language are not posted and therefore not made visible to other participants. However, as the frequency with which junk email defeats spam filters shows, these techniques are never entirely successful. Furthermore, a paedophile attempting to make contact with a child could very easily do so without resorting to the use of unsavoury language.
The present invention seeks to overcome the limitations of the methods currently in use for filtering online chat, and in so doing the present invention will enable the providers of online chat facilities to offer them such that they are completely free of foul and unsavoury language, and perhaps more importantly where it would be all but impossible for a paedophile to make contact with a child using those facilities. It is expected that the present invention will become the dominant technique for filtering online chat, whenever the participants are predominantly children. Essentially the present invention seeks to filter by inclusion (that is by stating explicitly what is permitted), whereas the current technology seeks to filter by exclusion (that is by stating explicitly what is not permitted).
Accordingly, in one aspect the present invention is directed to a method for filtering online chat, where online chat entered by each participant is automatically filtered prior to display to the other participants, wherein:
• The text forming the online chat entered by each participant is broken down into its components, where, in the case of alphabet based languages such as English, these components are words, but in character based languages like Japanese, the components may be individual Kanji characters;
• Each and every component (word in the case of English) comprising the online chat is then verified for suitability by checking that it exists in a predefined dictionary of allowed components (words in the case of English);
• The filtering method then returns text containing only those components from the original chat verified as present in the predefined dictionary of allowed components;
• Only the filtered text would be made visible to other participants in the online chat facility.
A second list, containing the components filtered out because they were not present in the predefined dictionary of allowed components, could be made available for logging purposes but would not be made visible to other participants in the online chat facility. This may be useful to those maintaining the online chat facility in deciding whether or not some new components should be added to the predefined dictionary.
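The steps listed above, together with the optional second list of rejected components, can be sketched as follows. This is a minimal illustration only, not the implementation listed later in this document: the function name `filter_by_inclusion` and the tiny `$allowed` dictionary are assumptions made for the example, and punctuation, case and derivative rules are deliberately ignored.

```php
<?php
// Hypothetical allow-list; a real facility would load a much larger dictionary.
$allowed = ['hello' => true, 'do' => true, 'you' => true, 'like' => true, 'games' => true];

/**
 * Splits $chat on whitespace and keeps only components found in $allowed.
 * Returns [filtered text, rejected components], both space-separated.
 */
function filter_by_inclusion(string $chat, array $allowed): array
{
    $kept = [];
    $rejected = [];
    foreach (preg_split('/\s+/', trim($chat), -1, PREG_SPLIT_NO_EMPTY) as $word) {
        if (isset($allowed[strtolower($word)])) {
            $kept[] = $word;       // present in the dictionary: retained
        } else {
            $rejected[] = $word;   // absent: filtered out, available for logging
        }
    }
    return [implode(' ', $kept), implode(' ', $rejected)];
}

[$visible, $log] = filter_by_inclusion('hello do you like scary games', $allowed);
// $visible would be shown to other participants; $log only to the maintainers.
```

Because the check is for presence in the dictionary rather than absence from a blacklist, a word the maintainers have never seen is rejected by default.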
The method of the present invention thereby ensures that, provided the predefined dictionary of components does not contain any foul or unsavoury language, it is not possible for the online chat made visible to other participants in the facility, to contain such language.
By also not allowing the many proper nouns that form the majority of place and street names to be in the predefined dictionary, it would be extremely difficult for participants to arrange a physical meeting using the facility.
In breaking down the entered online chat text into its components, any supported punctuation marks would need to be removed before the dictionary look-up is performed and reinstated when forming the filtered text.
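One way of removing and reinstating punctuation is sketched below. It assumes, following Rule 7 in the listing later in this document, that only a single trailing `.`, `?`, `!` or `,` is supported; the function names and the sample dictionary are hypothetical.

```php
<?php
/**
 * Separates one trailing punctuation mark (. ? ! or ,) from a component.
 * Returns [bare word, punctuation mark or empty string].
 */
function split_trailing_punct(string $component): array
{
    if ($component !== '' && strpos('.?!,', substr($component, -1)) !== false) {
        return [substr($component, 0, -1), substr($component, -1)];
    }
    return [$component, ''];
}

/** Looks up each bare word in $allowed and reinstates its punctuation mark. */
function filter_with_punct(string $chat, array $allowed): string
{
    $out = [];
    foreach (preg_split('/\s+/', trim($chat), -1, PREG_SPLIT_NO_EMPTY) as $component) {
        [$word, $punct] = split_trailing_punct($component);
        if (isset($allowed[strtolower($word)])) {
            $out[] = $word . $punct;   // accepted word: mark reinstated
        } elseif ($punct !== '') {
            $out[] = $punct;           // rejected word: only the mark survives
        }
    }
    return implode(' ', $out);
}
```

Keeping the mark when its word is rejected mirrors the behaviour of the listing, where the punctuation of a filtered word is still appended to the output.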
The aforementioned description of the filtering method of the present invention makes no mention of text that might be entered that contains numeric data, such as numbers, monetary values, dates and times. Clearly such data cannot be looked up in the predefined dictionary of components, as there is an infinite number of possibilities. In one aspect of the present invention it may be desirable to allow all such numeric data to be retained as entered without filtering it at all. In another aspect however, particularly in a situation where the participants of the online chat facility are predominantly children, it may be preferred to deliberately discard all such numeric data. Filtering out all numeric data would make it all but impossible for participants in the online chat facility to communicate to other participants their telephone number, their address, or a time, date and place for a meeting. Whilst we may educate our children in not providing such identifying information, if the chat facility filtered out all numeric data as suggested by this aspect of the present invention, it would make it extremely difficult for participants to do so.
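The two aspects described above (retain numeric data unfiltered, or discard it) could be expressed as a single switch, as in the following sketch. The function name and the pattern used to recognise numeric data are assumptions for illustration: here a component counts as numeric if it starts with an optional dollar sign and a digit and otherwise contains only digits and common separators. Numbers written out in words are not handled.

```php
<?php
/**
 * Applies a numeric-data policy to whitespace-separated components.
 * With $keepNumeric = true, numeric components are retained as entered;
 * with false they are discarded. Other components pass through unchanged
 * (they would still be subject to the dictionary check elsewhere).
 */
function apply_numeric_policy(string $chat, bool $keepNumeric): string
{
    $out = [];
    foreach (preg_split('/\s+/', trim($chat), -1, PREG_SPLIT_NO_EMPTY) as $component) {
        // Assumed definition of "numeric data": numbers, money, dates and times.
        $isNumeric = preg_match('/^\$?\d[\d.,:\/\-]*$/', $component) === 1;
        if (!$isNumeric || $keepNumeric) {
            $out[] = $component;
        }
    }
    return implode(' ', $out);
}

echo apply_numeric_policy('call me on 9876-5432 at 10:30', false), "\n"; // call me on at
```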
It would be possible with a chat facility using the filtering method of the present invention, as hereinbefore described, and in particular with respect to the aspect of it where numeric data is discarded, for participants to develop a code for conveying numeric data using components that exist in the predefined dictionary of allowed components. For example, English capital letters may be used to indicate numeric data, where for example:
0 (zero) might be represented by O,
1 by I,
2 by Z,
3 by E,
4 by X,
5 by S,
6 by G,
7 by Y,
8 by B and
9 by P.
Using such a code to indicate the telephone number 9876-5432, the following message could be used: "PuBlicitY GetS eXpErt seiZed". A different capitalisation of the same meaningless phrase, such as "publicity getS EXpert seized", might also be used by some participants as a method of conveying unsavoury or foul language within the scope of the filtering method of the present invention as hereinbefore described.
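To make the weakness concrete, the short sketch below reads the capital letters of the example message back into digits using the mapping given above. The function name `decode_capitals` is hypothetical; the point is only that the information is carried entirely by letter case, so it disappears once case is no longer under the participant's control.

```php
<?php
/** Reads only the capital letters of $message and maps them to digits. */
function decode_capitals(string $message): string
{
    // Mapping taken from the example in the text: capital letter => digit.
    $code = ['O' => '0', 'I' => '1', 'Z' => '2', 'E' => '3', 'X' => '4',
             'S' => '5', 'G' => '6', 'Y' => '7', 'B' => '8', 'P' => '9'];
    $digits = '';
    foreach (str_split($message) as $ch) {
        if (isset($code[$ch])) {
            $digits .= $code[$ch];
        }
    }
    return $digits;
}

echo decode_capitals('PuBlicitY GetS eXpErt seiZed'), "\n";              // 98765432
echo decode_capitals(strtolower('PuBlicitY GetS eXpErt seiZed')), "\n";  // empty line: the code is destroyed
```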
Accordingly in another aspect of the filtering method of the present invention, it may be desirable, as a means of thwarting attempts to use codes of the nature described in the preceding paragraph, to ignore the case of the letters forming the words in the online chat entered by a participant, and use a standard set of rules for determining the case of the letters comprising components verified as existing in the predefined dictionary of allowed components. One possible set of rules that would make the use of such codes extremely difficult is to return, as a part of the filtering method, all letters as lower-case letters, regardless of how the participant entered them. One might further make an exception and as is the convention in normal written English, always return the first person singular subjective pronoun "I" as upper case. Furthermore if punctuation is supported one might again adopt the normal written English convention of always beginning a sentence with a capital letter, regardless of whether or not the participant did so. If the predefined dictionary of components also identifies the part-of-speech of its components, one might also always return words defined as proper nouns, such that they begin with an upper case (or capital) letter, which is again the usual written English convention.
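The set of case rules suggested above might be sketched as follows. This is an illustration under stated assumptions rather than the listed implementation: `normalise_case` and the `$properNouns` set are hypothetical names, the input is assumed to have already passed the dictionary check, and only the sentence-ending marks `.`, `?` and `!` are recognised.

```php
<?php
/**
 * Re-cases words using fixed rules, ignoring how the participant typed them:
 * everything lower-case, except "I", sentence-initial words and proper nouns.
 * $properNouns is a hypothetical set keyed by lower-case proper noun.
 */
function normalise_case(string $chat, array $properNouns): string
{
    $out = [];
    $startOfSentence = true;
    foreach (preg_split('/\s+/', trim($chat), -1, PREG_SPLIT_NO_EMPTY) as $component) {
        $word = strtolower($component);
        $bare = rtrim($word, '.?!,');   // word without trailing punctuation
        if ($bare === 'i' || $startOfSentence || isset($properNouns[$bare])) {
            $word = ucfirst($word);
        }
        $out[] = $word;
        // The next word begins a sentence if this one ends with . ? or !
        $startOfSentence = strpos('.?!', substr($word, -1)) !== false;
    }
    return implode(' ', $out);
}

echo normalise_case('PuBlicitY GetS eXpErt seiZed', []), "\n"; // Publicity gets expert seized
```

With the participant's capitalisation discarded, the letter-for-digit code described earlier no longer survives filtering.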
It might also be desirable in another aspect of the present invention to permit the use of some heavily used, well recognised, standard SMS style abbreviations such as C for see, U for you, l8r for later, g2g for "got to go" and lol for "laughing out loud", by including them (or similar equivalents in languages other than English) explicitly in the dictionary of allowed components. Furthermore one could if one desired permit (by adding them to the dictionary) well understood slang words such as "gonna", "wanna" and "dunno". Alternatively one might find leaving such things out of the dictionary of allowed components a desirable feature.
AN IMPLEMENTATION
The present invention was developed when the principals of Crossout Pty Ltd were developing the already patented word game Crossout ®. In developing the WebSite for Crossout ®, a multi-player crossword solving game aimed at primary school children, it was felt that it would be desirable for players, playing a game against each other across the Internet, to be able to chat to each other. However after researching the methods currently in use for making such online chat safe for children, it was discovered that there was no method currently in use that provided the level of protection the principals of Crossout Pty Ltd were looking for. They were about to give up and not bother to include a chat facility, when their 10 year old daughter Felicity, who was disappointed that Crossout ® would not have chat, came up with the idea that is the basis of the present invention. Accordingly the chat facility on the Crossout ® WebSite is called FlickChat ™, and the dictionary of allowed components is called the Flictionary ™.
The code shown on the following pages is from an early development stage of the computer programs deployed on the Crossout ® WebSite to implement FlickChat ™. The programs are written in the common and very popular Web programming language PHP. Flickchat.php defines a function "flickchat" that takes two arguments, the first being a string of characters entered by a participant in the online chat facility, the second being an optional array of additional words to be added to the Flictionary ™ for this invocation only. In Crossout ® this optional argument is used to temporarily add the username of the participant's opponent in the game to the Flictionary ™ so that the participant is allowed to use their opponent's username in their entered chat. The username is defined as a proper noun. This feature could be used to temporarily add the usernames of all the participants in a multi-user chat facility. Such temporary additions to the Flictionary ™ do not become permanent entries, and will change dynamically depending on which and how many participants are communicating in the chat facility. The function "flickchat" returns an array of two character strings, the first element of which is the user entered chat filtered using one implementation of the method of the present invention. The second element is a string of characters being those components (words), separated by spaces, that were filtered out of the user entered chat by the method of the present invention.
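The idea of per-invocation additions can be illustrated in isolation. The sketch below is an assumption-laden simplification: the listing that follows uses a global array reloaded on each call, whereas this example relies on PHP's copy-by-value arrays; `with_extra_words`, the sample dictionary and the username are all hypothetical.

```php
<?php
// Hypothetical permanent dictionary: lower-case word => type code.
$permanent = ['well' => 'avx', 'played' => 'vbx'];

/**
 * Returns a copy of $dictionary with $extraWords merged in for one call only.
 * PHP passes arrays by value, so the caller's dictionary is left untouched.
 */
function with_extra_words(array $dictionary, array $extraWords): array
{
    foreach ($extraWords as $word => $type) {
        $dictionary[strtolower($word)] = $type;   // keys are stored lower-case
    }
    return $dictionary;
}

// The opponent's username is allowed, as a proper noun, for this call only.
$forThisCall = with_extra_words($permanent, ['Kim42' => 'propn']);
```

After the call, `$forThisCall` contains the username while `$permanent` does not, which is the property the paragraph above describes.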
Flictionary.php is a piece of code that defines the permanent dictionary of components (words) allowed in this particular implementation of the method of the present invention. It defines the elements of the PHP associative array $flictionary, in which each key is the lower-case version of the word, whilst each corresponding value defines what "type" of word this is. Whilst roughly representing the part of speech of each word, the specially coded allowed "types" are defined in the comments in flickchat.php. Their purpose is to enable a number of rules to be defined as to how derivatives of each primary word may be formed and still be seen as acceptable (and therefore would not be filtered out). These rules, whilst possibly imperfect, do allow the size of the Flictionary ™ to be significantly reduced. If one isn't concerned about the size of the Flictionary ™ (e.g. if one has no intention of displaying its contents), one can simply set the value of each element in the $flictionary associative array to some arbitrary value and list all derivative words separately.
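How type codes let one entry stand for several derivatives can be shown with a much reduced sketch. It uses a handful of the type codes named in the listing's comments but only four suffix rules, and it compares types exactly rather than by substring search as the listing does; the function names and the miniature dictionary are hypothetical.

```php
<?php
// Hypothetical miniature dictionary using type codes from the listing's comments.
$dict = ['ball' => 'noun', 'box' => 'ne', 'ask' => 'verb', 'ate' => 'vbx'];

/** True if $base is a key of $dict whose type is one of $types. */
function has_type(array $dict, string $base, array $types): bool
{
    return isset($dict[$base]) && in_array($dict[$base], $types, true);
}

/** Accepts exact entries plus a few regular derivatives of typed entries. */
function is_allowed(array $dict, string $word): bool
{
    $w = strtolower($word);
    if (isset($dict[$w])) {
        return true;
    }
    // suffix => types whose entries may take that suffix
    $rules = ['s' => ['noun', 'verb'], 'es' => ['ne', 'vbe'],
              'ed' => ['verb', 'vbe'], 'ing' => ['verb', 'vbe']];
    foreach ($rules as $suffix => $types) {
        $len = strlen($w) - strlen($suffix);
        if ($len > 0 && substr($w, $len) === $suffix
            && has_type($dict, substr($w, 0, $len), $types)) {
            return true;
        }
    }
    return false;
}
```

Here "balls", "boxes", "asked" and "asking" are accepted from four entries, while "ates" is rejected because "ate" carries an "x" type whose derivatives must be listed separately.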
flickchat.php
<?php
#
# Crossout & the Logo are Registered Trademarks of Crossout Pty Ltd.
# FlickChat & Flictionary are Trademarks of F, G & E Shalless.
# Crossout Pty Ltd holds UK Patent No GB2341106, US Patent No 6,378,867
# and Australian Patent No 746678 for the game known as Crossout.
#
# Copyright F, G & E Shalless 2002-2005
#
function inflict($word, $shortby=0, $reqdtype="", $extra="")
{
# inflict checks if the supplied $word, optionally truncated by $shortby characters,
# and optionally extended by the characters in $extra, exists in the array $flictionary,
# and if $reqdtype is supplied that $word's value (type or part of speech) is contained in
# $reqdtype. It returns the $word's value (type or part of speech) which will evaluate
# true if the $word is found, or the empty string (which will evaluate false) if not.
# If $word exists in the associative array $flictionary it will be a key to that array.
#
# $shortby always truncates whether positive or negative.
  global $flictionary;
  $wdlen=strlen($word) - abs($shortby);
  if ($wdlen < 1) return "";
  else
  {
    $wordfound=$partofspeech=$flictionary[strtolower(substr($word, 0, $wdlen).$extra)];
    if ($reqdtype && $wordfound) $wordfound=strpos("#".$reqdtype, $partofspeech);
    if ($wordfound) return $partofspeech; else return "";
  }
}
function doubleconst($c, $vowel)
{
# returns the character $c, if it is a consonant that is allowed to be doubled
# when forming derivatives, and if $vowel is a vowel. Returns the empty string if not.
  if (strpos("#bdglmnprstvz", $c) && strpos("#aeiou", $vowel)) return $c; else return "";
}
function endsentence($punctuation)
{
# returns a positive integer (which evaluates true) if $punctuation is a recognised
# sentence ending punctuation mark, and returns zero (which evaluates false) otherwise.
  if ($punctuation) return strpos("#.?!", $punctuation); else return 0;
}
function validpunct($punctuation)
{
# returns a positive integer (which evaluates true) if $punctuation is an allowed
# punctuation mark, and returns zero (which evaluates false) otherwise.
  if ($punctuation) return strpos("#.?!,", $punctuation); else return 0;
}
function flickchat($chatline, $extrawords=NULL)
{
# returns a two element array of strings:
# the first element being the filtered $chatline
# the second element being all rejected words
#
# $extrawords is an optional associative array of additional words allowed on this invocation
#
# Words are acceptable if their lower case form is a key to the $flictionary associative array
# The value of an element in that array indicates the type of word as follows:
#
# adj = adjective taking ~er ~est
# adv = adverb taking ~er ~est
# ajlcd = adjective taking ~er ~est with last consonant doubled
# ajx = adjective - any derivatives must be separately listed
# art = article
# avlcd = adverb taking ~er ~est with last consonant doubled
# avx = adverb - any derivatives must be separately listed
# conj = conjunction
# contr = contraction
# int = interjection
# noun = noun plural ~s
# ne = noun plural ~es
# nx = noun - any derivatives must be separately listed
# prep = preposition
# pron = pronoun taking 'll (will), 'd (would) and 've (have)
# propn = proper noun
# prox = pronoun - any derivatives must be separately listed
# verb = verb present tense ~s
# vbe = verb present tense ~es
# vblcd = verb where last consonant is doubled forming participles
# vbx = verb - any derivatives must be separately listed
#
# To ignore all derivative rules and only allow words explicitly listed in $flictionary set
# the value to any of the values with x in it, or any non-listed value that evaluates true
#
# Otherwise the following Rules apply:
#
# 1 Only words (spelt exactly as they appear) in the Flictionary are allowed.
# 2 Verbs whose participles are formed in a standard way only require the primary verb to
# be listed.
# 3 Nouns whose plurals are formed in a standard way only require the singular to be listed.
# 4 Adjectives (or adverbs) whose comparatives and superlatives are formed in a standard way
# only require the primary word to be listed.
# 5 Pronouns taking all of the 'll (will), 've (have) and 'd (would) contractions only require
# the primary pronoun to be listed.
# 6 User entered case is ignored. Capitals will only be used for defined Proper Nouns, the word
# "I" and to begin sentences.
# 7 Word terminating punctuation is restricted to the use of the following marks: . ? ! and ,.
# 8 Apostrophes may be used as expected with nouns to indicate possession, but for contractions
# only if separately listed.
# 9 Hyphenated words are permitted only if listed (with the hyphen).
#10 Numbers written either as digits or in words are not permitted.
#11 All consecutive white-space characters are replaced with a single space.
#
# Note: although not correct (except when a verb is also a noun) possessive apostrophes
# are allowed on verbs as if they were all also nouns.
  global $flictionary;
  include("flictionary.php");
  foreach ($extrawords as $extraword => $partofspeech)
    $flictionary[strtolower($extraword)]=$partofspeech;
if ($chatline)
{
$chatword=preg_split("Λs+/",$chatline); $chatwords=count($chatword); $chatlost=$flickchat=""; $endsentence=l; for ($i=0; $i < $chatwords; $i++)
{
$thisword=strtolower(stripslashes($chatword[$i])); if($thisword)
{
$lastc=$thisword{strlen($thisword)-l } ; $last2=$thisword{strlen($thisword)-2}.$lastc; if (validpunct($lastc))
{
$punct=$lastc; $thisword=substr($thisword, 0, strlen($thisword)-l);
} else $punct="M;
if ( $thisword="i" || substr($thisword,052)="im || $endsentence || inflict($thisword)=="propn" || (inflict($thisword, -2)="propn" && $last2=='"s") ) $thisword=ucwords($thisword);
if(inflict($thisword)) $flickchat .= $thisword . $punct . " "; else
{
$lastc=strtolower($thisword{strlen($thisword)-l}); $c2ndlast=strtolower($thisword{strlen($thisword)-2}); $last2= $c2ndlast . $lastc;
$c3rdlast=strtolower($thisword{strlen($thisword)-3}); $last3= $c3rdlast . $last2;
$c4thlast=strtolower($thisword{strlen($thisword)-4}); $last4= $c4thlast . $last3;
$c5thlast=strtolower($thisword{strlen($thisword)-5}); $last5= $c5thlast . $last4; $c6thlast=strtolo wer($this word { strlen($this word)-6 } ) ;
// Each branch accepts the word if removing the suffix yields a dictionary entry of a permitted type
if ($lastc=="s" && inflict($thisword, -1, "noun|verb" ))
$flickchat .= $thisword . $punct . " ";
elseif ($last2=="es" && inflict($thisword, -2, "ne|vbe" ))
$flickchat .= $thisword . $punct . " ";
elseif ($last2=="ed" &&
(inflict($thisword, -2, "verb|vbe" ) || inflict($thisword, -1, "verb" )))
$flickchat .= $thisword . $punct . " ";
elseif ($last2=="'s" && inflict($thisword, -2, "noun|propn|ne|nx|verb|vbe" ))
$flickchat .= $thisword . $punct . " ";
elseif ($last3=="ing" && inflict($thisword, -3, "verb|vbe" ))
$flickchat .= $thisword . $punct . " ";
elseif ($last2=="'d" && inflict($thisword, -2, "pron" ))
$flickchat .= $thisword . $punct . " ";
elseif ($last3=="'ll" && inflict($thisword, -3, "pron" ))
$flickchat .= $thisword . $punct . " ";
elseif ($last3=="'ve" && $last5!="he've" && inflict($thisword, -3, "pron" ))
$flickchat .= $thisword . $punct . " ";
elseif ($last2=="ed" && doubleconst($c3rdlast,$c5thlast)==$c4thlast && inflict($thisword, -3, "vblcd" ))
$flickchat .= $thisword . $punct . " ";
elseif ($last3=="ing" && doubleconst($c4thlast,$c6thlast)==$c5thlast && inflict($thisword, -4, "vblcd" ))
$flickchat .= $thisword . $punct . " ";
elseif ($last3=="ies" && inflict($thisword, -3, "noun|verb", "y"))
$flickchat .= $thisword . $punct . " ";
elseif ($last3=="ied" && inflict($thisword, -3, "verb", "y"))
$flickchat .= $thisword . $punct . " ";
elseif ($last3=="ing" && inflict($thisword, -3, "verb", "e"))
$flickchat .= $thisword . $punct . " ";
elseif ($last2=="er" &&
(inflict($thisword, -2, "adj|adv" ) || inflict($thisword, -1, "adj|adv")))
$flickchat .= $thisword . $punct . " ";
elseif ($last3=="est" &&
(inflict($thisword, -3, "adj|adv" ) || inflict($thisword, -2, "adj|adv")))
$flickchat .= $thisword . $punct . " ";
elseif ($last3=="ier" && inflict($thisword, -3, "adj|adv", "y"))
$flickchat .= $thisword . $punct . " ";
elseif ($last4=="iest" && inflict($thisword, -4, "adj|adv", "y"))
$flickchat .= $thisword . $punct . " ";
elseif ($last2=="er" && doubleconst($c3rdlast,$c5thlast)==$c4thlast && inflict($thisword, -3, "ajlcd|avlcd" ))
$flickchat .= $thisword . $punct . " ";
elseif ($last3=="est" && doubleconst($c4thlast,$c6thlast)==$c5thlast && inflict($thisword, -4, "ajlcd|avlcd" ))
$flickchat .= $thisword . $punct . " ";
elseif ( ($last2=="s'" && inflict($thisword, -2, "noun|verb")) || ($last3=="es'" && inflict($thisword, -3, "ne|vbe")) || ($last4=="ies'" && inflict($thisword, -4, "noun|verb", "y")) )
$flickchat .= $thisword . $punct . " ";
else
{
// Word not allowed: keep its punctuation, record the word as lost
if ($punct) $flickchat .= $punct . " ";
if ($thisword) $chatlost .= $thisword . " ";
}
}
$endsentence=endsentence($punct);
}
}
// Trim the trailing space left by the last appended word
if ($flickchat{strlen($flickchat)-1}==" ")
$flickchat=substr($flickchat, 0, strlen($flickchat)-1);
if ($chatlost) $chatlost=strtolower($chatlost);
return array($flickchat, $chatlost);
}
else return array("", "");
}
?>
flictionary.php
, "anything" => "nx" , "anything's" => "nx" , "apologise" => "verb" , "apology" => "noun"
, "are" => "vbx" , "argue" => "verb" , "argument" => "noun" , "around" => "avx"
, "art" => "noun" , "as" => "prep" , "ask" => "verb" , "asleep" => "avx" , "assume" => "verb"
, "at" => "prep" , "ate" => "vbx" , "australia" => "propn" , "australian" => "propn"
, "automatic" => "ajx" , "automatically" => "avx" , "awake" => "avx" , "away" => "avx"
, "awful" => "ajx"
, "back" => "verb" , "bad" => "ajx" , "badly" => "avx" , "ball" => "verb" , "baseball" => "noun"
, "basketball" => "noun" , "bat" => "vblcd" , "bath" => "verb" , "bathe" => "verb"
, "bathroom" => "noun" , "be" => "vbx" , "beautiful" => "ajx" , "because" => "prep"
, "boy" => "noun" , "break" => "vbx" , "breaking" => "vbx" , "breaks" => "vbx"
, "breath" => "noun" , "breathe" => "verb" , "bring" => "vbx" , "bringing" => "vbx"
, "brings" => "vbx" , "broke" => "vbx" , "broken" => "ajx" , "brother" => "noun"
, "brought" => "vbx" , "budge" => "verb" , "bus" => "ne" , "business" => "ne" , "busy" => "adj"
, "but" => "vblcd" , "buy" => "vbx" , "buying" => "vbx" , "buys" => "vbx" , "by" => "prep"
, "bye" => "noun"
, "cage" => "verb" , "can" => "vblcd" , "cannot" => "vbx" , "can't" => "contr" , "car" => "noun"
, "care" => "verb" , "carry" => "verb" , "casual" => "ajx" , "cat" => "noun" , "catch" => "vbe"
, "catching" => "vbx" , "caught" => "vbx" , "centering" => "vbx" , "centre" => "verb"
, "certain" => "ajx" , "certainty" => "noun" , "change" => "verb" , "child" => "nx"
, "children" => "nx" , "city" => "noun" , "class" => "vbe" , "clean" => "verb" , "close" => "verb"
, "closer" => "ajx" , "closest" => "ajx" , "cloth" => "noun" , "clothe" => "verb"
, "clue" => "verb" , "cold" => "adj" , "colour" => "verb" , "comb" => "verb"
, "common" => "adj" , "complete" => "verb" , "complex" => "ne" , "complicate" => "verb"
, "conquer" => "verb" , "cook" => "verb" , "cool" => "adj" , "correct" => "verb"
, "could" => "vbx" , "couldn't" => "contr" , "could've" => "contr" , "count" => "verb"
, "country" => "noun" , "cricket" => "noun" , "crossout" => "propn" , "crossword" => "noun"
, "cryptic" => "ajx" , "cut" => "vblcd" , "dad" => "noun" , "day" => "noun" , "deep" => "adj"
, "did" => "vbx" , "didn't" => "vbx" , "different" => "ajx" , "disappoint" => "verb"
, "do" => "ne" , "doesn't" => "vbx" , "doing" => "noun" , "done" => "vbx" , "don't" => "contr"
, "doubt" => "verb" , "down" => "verb" , "draw" => "noun" , "drawing" => "noun"
, "drawn" => "vbx" , "dream" => "verb" , "drew" => "vbx" , "drier" => "ajx" , "driest" => "ajx"
, "drink" => "verb" , "dry" => "verb" , "during" => "prep" , "each" => "ajx" , "ear" => "noun"
, "early" => "adj" , "earth" => "verb" , "easy" => "adj" , "eat" => "vbx" , "eating" => "vbx"
, "eats" => "vbx" , "ecstatic" => "ajx" , "either" => "prep" , "else" => "prep" , "enable" => "verb"
, "end" => "verb" , "english" => "propn" , "enough" => "ajx" , "enter" => "verb"
, "entry" => "noun" , "error" => "noun" , "especially" => "avx" , "even" => "adj"
, "evening" => "noun" , "ever" => "avx" , "every" => "ajx" , "everybody" => "nx"
, "everybody's" => "nx" , "everyone" => "nx" , "everyone's" => "nx" , "everything" => "nx"
, "everything's" => "nx" , "except" => "verb" , "excite" => "verb" , "eye" => "verb"
, "face" => "verb" , "fact" => "noun" , "fair" => "adj" , "fall" => "noun" , "falling" => "vbx"
, "family" => "noun" , "fare" => "verb" , "farewell" => "noun" , "fast" => "verb"
, "faster" => "ajx" , "fastest" => "ajx" , "father" => "noun" , "favour" => "verb"
, "favourite" => "ajx" , "fed" => "ajx" , "feed" => "noun" , "feeding" => "noun" , "feel" => "verb"
, "feeling" => "noun" , "feet" => "nx" , "fell" => "verb" , "few" => "adj" , "field" => "verb"
, "fight" => "noun" , "fighting" => "vbx" , "film" => "verb" , "final" => "ajx"
, "finally" => "avx" , "find" => "noun" , "finding" => "noun" , "fine" => "verb" , "finer" => "ajx"
, "finest" => "ajx" , "finish" => "verb" , "fire" => "verb" , "first" => "ajx" , "fish" => "vbe"
, "fishy" => "adj" , "fix" => "vbe" , "flickchat" => "propn" , "flictionary" => "propn"
, "focus" => "vblcd" , "follow" => "verb" , "food" => "noun" , "fool" => "verb"
, "foot" => "verb" , "football" => "noun" , "for" => "prep" , "forget" => "noun"
, "forgetting" => "vbx" , "forgot" => "vbx" , "forgotten" => "vbx" , "form" => "verb"
, "fortunate" => "ajx" , "fought" => "vbx" , "found" => "vbx" , "free" => "ajx" , "freer" => "ajx"
, "freest" => "ajx" , "friend" => "noun" , "friendly" => "adj" , "from" => "prep"
, "front" => "verb" , "full" => "adj" , "fun" => "ajx" , "funny" => "adj" , "future" => "noun"
, "game" => "verb" , "gave" => "vbx" , "get" => "noun" , "getting" => "vbx" , "gift" => "verb"
, "girl" => "noun" , "give" => "noun" , "given" => "vbx" , "giving" => "vbx" , "go" => "ne"
, "goal" => "noun" , "going" => "noun" , "gone" => "vbx" , "good" => "noun"
, "[…]" => "vblcd" , "rare" => "adj" , "[…]ise" => "verb" , "really" => "avx" , "recognise" => "verb"
, "remember" => "verb"
, "[…]" => "vbx" , "running" => "vbx"
, "work" => "verb" , "world" => "noun" , "worn" => "vbx" , "worse" => "ajx" , "worst" => "ajx"
, "would" => "vbx" , "wouldn't" => "contr" , "would've" => "contr" , "write" => "noun"
, "writing" => "noun" , "written" => "vbx" , "wrong" => "ajx" , "wrote" => "vbx" , "year" => "noun"
, "yell" => "verb" , "yellow" => "verb" , "yellower" => "ajx" , "yellowest" => "ajx" , "yes" => "ne"
, "you" => "pron" , "your" => "prox" , "you're" => "contr" , "yours" => "prox" , "zip" => "vblcd"
)
?>
The following PHP code segment shows how the flickchat function might be called. In this code segment $chatline is the user-entered online chat. The filtered chat is saved, any words filtered out are logged, and if no words were filtered out the player is returned to the game:
// Allow the opponent's name as a temporary extra proper noun
$opponent[$otherplayername] = "propn";
$oldchat=$existparams["CHATLINE"];
if($chatline)
{
list($flickchat, $chatlost) = flickchat($chatline, $opponent);
// Save the filtered chat only if it has changed
if ($flickchat!=$oldchat)
{
$existparams["CHATLINE"]=$flickchat;
$existparams["CHATTIME"]=time();
savehashfile("$playerdata/$username",$existparams);
}
if($chatlost)
{
// Log the words that were filtered out
crossoutlog("CHATLOST","$sessionname,$username,$chatlost");
$errormessage="One or more words not in Flictionary™";
}
else
{
header("Location: grid.php");
exit;
}
}
else
$chatline=$oldchat;
The table below shows some examples of what the flickchat function would return, assuming:
$otherplayername = "felicity" and thus $opponent["felicity"] = "propn"
So after the call:
list($flickchat, $chatlost) = flickchat($chatline, $opponent);
The following values for $chatline would return $flickchat and $chatlost as shown in the table:
Note the following points of interest in the results of the flickchat function as shown above:
• The misspelled word "doin" is discarded.
• Although the word "felicity" was not defined in flictionary.php it was passed to the flickchat function as a temporary extra word (via the optional second argument) and because it was defined as a proper noun it is always returned with a capital first letter.
• The words "meet" and "call" have been deliberately left out of flictionary.php to reduce the chances of participants trying to arrange a meeting, or to call one another. These words are lost and would not be displayed to the other participants.
• Attempts to give telephone numbers and addresses are thwarted.
• "Hurrying" and "bored" are not explicitly listed in flictionary.php but are allowed through because "hurry" and "bore" are defined as type "verb" which means the standard rules for forming participles are used.
• In the last example an attempt to convey a coded unsuitable message is thwarted, by ignoring the user-entered case of the letters.
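The point about "hurrying" and "bored" can be illustrated with a minimal sketch. It assumes a tiny hypothetical dictionary $allowed and two hypothetical helpers, stem_is_verb() and is_allowed_derivative(); these are not the inflict() and doubleconst() functions called in flickchat.php, and only the plain "-ing" and "-ed" rules for type "verb" are covered.

```php
<?php
// Hypothetical mini-dictionary (word => type); not the patent's flictionary.php.
$allowed = array("hurry" => "verb", "bore" => "verb", "cat" => "noun");

// True if removing $strip trailing characters from $word and appending
// $append gives a dictionary entry of type "verb".
function stem_is_verb(array $allowed, string $word, int $strip, string $append = ""): bool
{
    if (strlen($word) <= $strip) {
        return false;
    }
    $stem = substr($word, 0, strlen($word) - $strip) . $append;
    return isset($allowed[$stem]) && $allowed[$stem] === "verb";
}

// True if $word is listed, or is a standard participle of a listed verb.
function is_allowed_derivative(array $allowed, string $word): bool
{
    $word = strtolower($word);
    if (isset($allowed[$word])) {
        return true;
    }
    if (substr($word, -3) === "ing") {
        // "hurrying" -> "hurry"; "boring" -> "bor" + "e"
        return stem_is_verb($allowed, $word, 3) || stem_is_verb($allowed, $word, 3, "e");
    }
    if (substr($word, -2) === "ed") {
        // Remove "ed", or just "d" for verbs ending in "e": "bored" -> "bore"
        return stem_is_verb($allowed, $word, 2) || stem_is_verb($allowed, $word, 1);
    }
    return false;
}
```

With this dictionary, "Hurrying" and "bored" are accepted, while "doin" and "meeting" are not, because neither the word nor any stem it reduces to is listed as a verb.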
Please note that the 813-word dictionary of allowed components shown in flictionary.php is not by any means complete, and just serves as an example. The actual dictionary of allowed components used is immaterial to the method of the present invention and any dictionary of allowed components could be substituted. Obviously a one-word dictionary would result in very meaningless online chat, whilst a 50,000-word dictionary might make it too easy to construct "acceptable" unsavoury or otherwise unsuitable messages.
It should be understood that various modifications and variations may be made to the method as hereinbefore described without departing from the spirit and ambit of the present invention which basically hinges on the previously unused technique of filtration by inclusion of allowed components rather than filtration by exclusion of disallowed ones.
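Filtration by inclusion can be sketched in a few lines. The function name filter_by_inclusion() and the three-word dictionary below are hypothetical, and the sketch is deliberately simpler than flickchat(): it splits on whitespace, treats only trailing ".", ",", "!" and "?" as allowed punctuation, keeps capitalisation as entered, and drops the punctuation of a rejected word along with the word.

```php
<?php
// Hypothetical sketch of filtering by inclusion; not the patent's flickchat().
// Returns array(filtered text, lost words), the same shape flickchat() returns.
function filter_by_inclusion(string $chat, array $allowed): array
{
    $kept = array();
    $lost = array();
    $words = preg_split('/\s+/', trim($chat), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $raw) {
        // Separate trailing punctuation so the look-up sees the bare word.
        $word  = rtrim($raw, ".,!?");
        $punct = substr($raw, strlen($word));
        if ($word === "") {
            continue; // token was punctuation only
        }
        if (isset($allowed[strtolower($word)])) {
            $kept[] = $word . $punct;     // reinstate the punctuation
        } else {
            $lost[] = strtolower($word);  // for logging, never displayed
        }
    }
    return array(implode(" ", $kept), implode(" ", $lost));
}

$dictionary = array("hello" => "noun", "there" => "avx", "me" => "pron");
list($filtered, $lost) = filter_by_inclusion("Hello there, call me!", $dictionary);
echo $filtered, "\n"; // prints "Hello there, me!"
echo $lost, "\n";     // prints "call"
```

Because the check is membership in the allowed list, a new unsuitable word needs no rule of its own: it is rejected simply because nobody added it.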
Claims
1. A method for filtering online chat, where online chat entered by participants in an online chat facility is automatically filtered prior to display to other participants in the facility, wherein:
• The text forming the online chat entered by each participant is broken down into its components, where, in the case of alphabet based languages such as English, these components are words, but in character based languages like Japanese, the components may be individual Kanji characters;
• Each and every component (word in the case of English) comprising the online chat is then verified for suitability by checking that it exists in a predefined dictionary of allowed components (words in the case of English);
• The filtering method then returns a text string containing only those components from the original chat verified as present in the predefined dictionary of allowed components;
• Only the text filtered according to the method of this claim would be made visible to other participants in the online chat facility.
2. The method of Claim 1 but where those components filtered out because they were not present in the predefined dictionary of allowed components, would be made available for logging purposes but which are not intended to be made visible to other participants in the online chat facility. This may be useful to those maintaining the online chat facility in deciding whether or not some new components should be added to the predefined dictionary.
3. The method of Claims 1 and 2 where characters in the entered online chat text that are allowed punctuation marks are removed before the dictionary look-up is performed and subsequently reinstated when forming the filtered text.
4. The method of Claims 1, 2 and 3 where if a component is not present in the predefined dictionary of allowed components, before filtering it out, an attempt is made to see if it is a valid derivative of a component that is present, such as: the plural of a predefined noun; the present or past participle or the present or past tense of a predefined verb; or the comparative or superlative of a predefined adjective or adverb, where those derivatives are formed in a standard way.
5. The method of Claims 1, 2, 3 and 4 where text consisting of numeric data, such as numbers, monetary values, dates and times, is retained as entered without filtering it at all.
6. The method of Claims 1, 2, 3 and 4 where text consisting of numeric data, such as numbers, monetary values, dates and times, is always filtered out unless explicitly listed in the predefined dictionary of allowed components.
7. The method of Claims 1, 2, 3 and 4 where text consisting of numeric data, such as numbers, monetary values, dates and times, is always filtered out.
8. The method of Claims 1, 2, 3 and 4 where text consisting of numeric data, such as numbers, monetary values, dates and times, is filtered according to a set of rules.
9. The method of Claims 1, 2, 3 and 4 where after verification in the dictionary of allowed components, the capitalisation of those components entered by the participant in the online chat facility and found to be present, is retained as entered.
10. The method of Claims 1, 2, 3 and 4 where after verification in the dictionary of allowed components, the capitalisation of those components entered by the participant in the online chat facility and found to be present, is ignored and instead a set of capitalisation rules is used to determine the case of the validated text.
11. The method of Claim 10 where the capitalisation rules force validated text to be returned as lower case, except in the following cases where the first letter of the validated component is returned as upper case including: when the component begins a sentence; when the component is the first person subjective pronoun "I"; or the component is defined in the predefined dictionary of allowed components as a proper noun.
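The capitalisation rules of Claim 11 might be sketched as follows. The function apply_case() and the small $types dictionary are hypothetical illustrations; only the type name "propn" is taken from flictionary.php, and sentence detection is reduced to a flag supplied by the caller.

```php
<?php
// Hypothetical sketch of the Claim 11 capitalisation rules.
// $type is the component's dictionary type; "propn" marks a proper noun.
function apply_case(string $word, string $type, bool $startsSentence): string
{
    $word = strtolower($word); // ignore the case as entered
    if ($startsSentence || $word === "i" || $type === "propn") {
        return ucfirst($word); // upper-case first letter only
    }
    return $word;
}

// Assumed types for the example words; not the patent's dictionary.
$types = array("hello" => "noun", "felicity" => "propn", "i" => "pron", "am" => "vbx");
$out = array();
$start = true;
foreach (array("hELLO", "FELICITY", "i", "AM") as $w) {
    $out[] = apply_case($w, $types[strtolower($w)], $start);
    $start = false; // only the first word begins the sentence here
}
echo implode(" ", $out), "\n"; // prints "Hello Felicity I am"
```

Discarding the entered case before applying the rules is what defeats messages encoded in the pattern of upper- and lower-case letters.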
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2005902786A AU2005902786A0 (en) | 2005-05-31 | | A Method For Filtering Online Chat |
AU2005902786 | 2005-05-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2006128224A1 (en) | 2006-12-07 |
Family
ID=37481131
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/AU2006/000724 WO2006128224A1 (en) | 2005-05-31 | 2006-05-31 | A method for filtering online chat |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2006128224A1 (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2009109046A1 (en) * | 2008-03-04 | 2009-09-11 | Ganz | Multiple-layer chat filter system and method |
US8323068B2 (en) | 2010-04-23 | 2012-12-04 | Ganz | Villagers in a virtual world with upgrading via codes |
US8380725B2 (en) | 2010-08-03 | 2013-02-19 | Ganz | Message filter with replacement text |
US8458602B2 (en) | 2009-08-31 | 2013-06-04 | Ganz | System and method for limiting the number of characters displayed in a common area |
US8719730B2 (en) | 2010-04-23 | 2014-05-06 | Ganz | Radial user interface and system for a virtual world game |
US8788943B2 (en) | 2009-05-15 | 2014-07-22 | Ganz | Unlocking emoticons using feature codes |
US9022868B2 (en) | 2011-02-10 | 2015-05-05 | Ganz | Method and system for creating a virtual world where user-controlled characters interact with non-player characters |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2001033371A1 (en) * | 1999-11-05 | 2001-05-10 | Surfmonkey.Com, Inc. | System and method of filtering adult content on the internet |
WO2001080214A1 (en) * | 2000-04-18 | 2001-10-25 | Genesys Telecommunication Laboratories, Inc. | Method and apparatus for summarizing previous threads in a communication-center chat session |
AU5365400A (en) * | 2000-08-25 | 2002-02-28 | Gala Incorporated | Electronic bulletin board system |
US20020198940A1 (en) * | 2001-05-03 | 2002-12-26 | Numedeon, Inc. | Multi-tiered safety control system and methods for online communities |
US20040154022A1 (en) * | 2003-01-31 | 2004-08-05 | International Business Machines Corporation | System and method for filtering instant messages by context |
US6842773B1 (en) * | 2000-08-24 | 2005-01-11 | Yahoo ! Inc. | Processing of textual electronic communication distributed in bulk |
-
2006
- 2006-05-31 WO PCT/AU2006/000724 patent/WO2006128224A1/en active Application Filing
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2009109046A1 (en) * | 2008-03-04 | 2009-09-11 | Ganz | Multiple-layer chat filter system and method |
US8316097B2 (en) | 2008-03-04 | 2012-11-20 | Ganz | Multiple-layer chat filter system and method |
US8321513B2 (en) | 2008-03-04 | 2012-11-27 | Ganz | Multiple-layer chat filter system and method |
US8788943B2 (en) | 2009-05-15 | 2014-07-22 | Ganz | Unlocking emoticons using feature codes |
US8458602B2 (en) | 2009-08-31 | 2013-06-04 | Ganz | System and method for limiting the number of characters displayed in a common area |
US9403089B2 (en) | 2009-08-31 | 2016-08-02 | Ganz | System and method for limiting the number of characters displayed in a common area |
US8323068B2 (en) | 2010-04-23 | 2012-12-04 | Ganz | Villagers in a virtual world with upgrading via codes |
US8719730B2 (en) | 2010-04-23 | 2014-05-06 | Ganz | Radial user interface and system for a virtual world game |
US9050534B2 (en) | 2010-04-23 | 2015-06-09 | Ganz | Achievements for a virtual world game |
US8380725B2 (en) | 2010-08-03 | 2013-02-19 | Ganz | Message filter with replacement text |
US9022868B2 (en) | 2011-02-10 | 2015-05-05 | Ganz | Method and system for creating a virtual world where user-controlled characters interact with non-player characters |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| NENP | Non-entry into the national phase | Ref country code: DE |
| WWW | Wipo information: withdrawn in national office | Country of ref document: DE |
| NENP | Non-entry into the national phase | Ref country code: RU |
| WWW | Wipo information: withdrawn in national office | Country of ref document: RU |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 06752601 Country of ref document: EP Kind code of ref document: A1 |