WO2006128224A1 - A method for filtering online chat - Google Patents

A method for filtering online chat

Info

Publication number
WO2006128224A1
Authority
WO
WIPO (PCT)
Prior art keywords
components
online chat
verb
text
noun
Prior art date
Application number
PCT/AU2006/000724
Other languages
French (fr)
Inventor
Felicity Shalless
Original Assignee
Shalless, Greg
Shalless, Elizabeth
Shalless, Shane
Priority date
Filing date
Publication date
Priority claimed from AU2005902786A
Application filed by Shalless, Greg; Shalless, Elizabeth; Shalless, Shane
Publication of WO2006128224A1

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 - Data switching networks
    • H04L12/02 - Details
    • H04L12/16 - Arrangements for providing special services to substations
    • H04L12/18 - Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/1813 - Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
    • H04L12/1822 - Conducting the conference, e.g. admission, detection, selection or grouping of participants, correlating users to one or more conference sessions, prioritising transmission
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/284 - Lexical analysis, e.g. tokenisation or collocates
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/04 - Real-time or near real-time messaging, e.g. instant messaging [IM]
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/21 - Monitoring or handling of messages
    • H04L51/212 - Monitoring or handling of messages using filtering or selective blocking

Definitions

  • The 813-word dictionary of allowed components shown in flictionary.php is by no means complete and serves only as an example.
  • The actual dictionary of allowed components used is immaterial to the method of the present invention, and any dictionary of allowed components could be substituted.
  • A one-word dictionary would result in meaningless online chat, whilst a 50,000-word dictionary might make it too easy to construct "acceptable" unsavoury or otherwise unsuitable messages.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)

Abstract

The present invention is directed at a method for filtering online chat, where online chat is text entered by one person on a computing device such as a personal computer, games console, mobile phone or interactive television, such that the text entered is made visible to others, who are connected to each other via some type of network such as the Internet or a telephone network. The method for filtering such online chat described by the present invention requires the chat to be broken down into its components each of which is validated against a predefined dictionary of allowed components before being made visible to other participants in the online chat facility.

Description

A Method for Filtering Online Chat
The present invention is directed at a method for filtering online chat, where online chat is text entered by one person on a computing device such as a personal computer, games console, mobile phone or interactive television, such that the text entered is made visible to others, who are connected to each other via some type of network such as the Internet or a telephone network. Whilst the principles of the present invention apply equally well to text so entered in any language, the examples shown and discussed herein will be restricted to the English language.
Most systems for supporting online chat permit the participants complete freedom to enter whatever text they wish. This clearly enables people to enter chat that might offend or upset other participants. In particular it may expose participants to foul language to which they would prefer not to be exposed. This can become a serious problem where the participants are children who might not only be exposed to foul or otherwise unsuitable language, but where some other unscrupulous participants in the online chat facility might be paedophiles trying to make contact with these unsuspecting and vulnerable children. There are currently three methods in use to assist with this problem, none of which is particularly satisfactory. They are:
Education
Children are educated by their parents and teachers on techniques for recognising and avoiding chatting with participants who use such language and in particular who attempt to make contact with them. They are advised never to enter any identifying information. Because children's use of these online chat networks cannot always be monitored, one can never really be sure that children are putting these principles into practice or that they won't still fall prey to the cunning deployed by some paedophiles.
Moderation
Many of the online chat networks provided by big, well-funded Internet portals employ moderators to monitor the chat being entered by participants and delete posted messages that they deem to be unsuitable. This is not only an extremely difficult and expensive exercise, but it is highly unlikely that the moderators will be able to remove all the unsuitable material. It is also almost impossible for them to remove the unsuitable messages before at least some participants in the network have already been exposed to them.
Filtering
Some providers of online chat, especially those whose participants are typically children, attempt to provide automatic filtering, such that unsavoury and foul language is removed or messages containing any such language are not posted and therefore not made visible to other participants. However, as the frequency with which junk email defeats spam filters shows, these techniques are never entirely successful. Furthermore, a paedophile attempting to make contact with a child could very easily do so without resorting to the use of unsavoury language.
The present invention seeks to overcome the limitations of the methods currently in use for filtering online chat, and in so doing the present invention will enable the providers of online chat facilities to offer them such that they are completely free of foul and unsavoury language, and perhaps more importantly where it would be all but impossible for a paedophile to make contact with a child using those facilities. It is expected that the present invention will become the dominant technique for filtering online chat, whenever the participants are predominantly children. Essentially the present invention seeks to filter by inclusion (that is, by stating explicitly what is permitted), whereas the current technology seeks to filter by exclusion (that is, by stating explicitly what is not permitted).
Accordingly, in one aspect the present invention is directed to a method for filtering online chat, where online chat entered by each participant is automatically filtered prior to display to the other participants, wherein:
• The text forming the online chat entered by each participant is broken down into its components, where, in the case of alphabet based languages such as English, these components are words, but in character based languages like Japanese, the components may be individual Kanji characters;
• Each and every component (word in the case of English) comprising the online chat is then verified for suitability by checking that it exists in a predefined dictionary of allowed components (words in the case of English);
• The filtering method then returns text containing only those components from the original chat verified as present in the predefined dictionary of allowed components;
• Only the filtered text would be made visible to other participants in the online chat facility.
A second list of the components filtered out because they were not present in the predefined dictionary of allowed components, could be made available for logging purposes but would not be made visible to other participants in the online chat facility. This may be useful to those maintaining the online chat facility in deciding whether or not some new components should be added to the predefined dictionary.
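The steps listed above can be sketched in a few lines of PHP. This is a minimal illustration only, not the implementation given later in this description: the function name `filter_by_inclusion`, the sample dictionary and the choice of whitespace as the component separator are all assumptions made for the example.

```php
<?php
// Hypothetical helper: keeps only the words found in an allow-list.
// $allowed maps lower-case words to any non-null value (an assumption of this sketch).
function filter_by_inclusion(string $text, array $allowed): array
{
    $kept = [];
    $rejected = [];
    // Break the text into components; for English these are whitespace-separated words.
    foreach (preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY) as $word) {
        if (isset($allowed[strtolower($word)])) {
            $kept[] = $word;        // verified as present in the dictionary
        } else {
            $rejected[] = $word;    // filtered out, available for logging only
        }
    }
    // First element: text shown to other participants; second: the log-only list.
    return [implode(' ', $kept), implode(' ', $rejected)];
}

$allowed = ['good' => 1, 'game' => 1, 'you' => 1, 'win' => 1];
[$shown, $logged] = filter_by_inclusion('good game loser you win', $allowed);
// $shown is "good game you win" and $logged is "loser"
```

Because the look-up is by inclusion, a misspelt or invented word is dropped without anyone having had to anticipate it.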
The method of the present invention thereby ensures that, provided the predefined dictionary of components does not contain any foul or unsavoury language, it is not possible for the online chat made visible to other participants in the facility, to contain such language.
By also not allowing the many proper nouns that form the majority of place and street names to be in the predefined dictionary, it would be extremely difficult for participants to arrange a physical meeting using the facility.
In breaking down the entered online chat text into its components, any supported punctuation marks would need to be removed before the dictionary look-up is performed and reinstated when forming the filtered text.
The aforementioned description of the filtering method of the present invention makes no mention of text that might be entered that contains numeric data, such as numbers, monetary values, dates and times. Clearly such data cannot be looked up in the predefined dictionary of components, as there is an infinite number of possibilities. In one aspect of the present invention it may be desirable to allow all such numeric data to be retained as entered without filtering it at all. In another aspect however, particularly in a situation where the participants of the online chat facility are predominantly children, it may be preferred to deliberately discard all such numeric data.
Filtering out all numeric data would make it all but impossible for participants in the online chat facility to communicate to other participants their telephone number, their address, or a time, date and place for a meeting. Whilst we may educate our children in not providing such identifying information, if the chat facility filtered out all numeric data as suggested by this aspect of the present invention, it would make it extremely difficult for participants to do so.
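One way to picture the punctuation handling and the two numeric policies is the sketch below. It is illustrative only: `filter_token`, its `$keepNumbers` switch and the rule that any token containing a digit counts as numeric data are assumptions of this example, and unlike the listing later in this description it drops the punctuation along with a rejected word.

```php
<?php
// Hypothetical helper, not part of the patent listing.
function filter_token(string $token, array $allowed, bool $keepNumbers = false): string
{
    // Remove one trailing punctuation mark (. ? ! or ,) before the look-up.
    $punct = '';
    if ($token !== '' && strpos('.?!,', substr($token, -1)) !== false) {
        $punct = substr($token, -1);
        $token = (string) substr($token, 0, -1);
    }
    // Any token containing a digit is treated as numeric data (an assumption).
    if (preg_match('/\d/', $token)) {
        return $keepNumbers ? $token . $punct : '';
    }
    // Reinstate the punctuation only when the bare word is allowed.
    return isset($allowed[strtolower($token)]) ? $token . $punct : '';
}

$allowed = ['call' => 1, 'me' => 1, 'now' => 1];
echo filter_token('now!', $allowed), "\n";            // now!
echo filter_token('5551234,', $allowed), "\n";        // empty line: numeric data discarded
echo filter_token('5551234,', $allowed, true), "\n";  // 5551234,
```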
It would be possible with a chat facility using the filtering method of the present invention, as hereinbefore described, and in particular with respect to the aspect of it, where numeric data is discarded, for participants to develop a code for conveying numeric data using components that exist in the predefined dictionary of allowed components. For example, English Capital letters may be used to indicate numeric data, where for example:
0 (zero) might be represented by O,
1 by I,
2 by Z,
3 by E,
4 by X,
5 by S,
6 by G,
7 by Y,
8 by B and
9 by P.
Using such a code to indicate the telephone number 9876-5432, the following message could be used, "PuBlicitY GetS eXpErt seiZed". Different capitalisation of the same meaningless phrase, thus "publicity getS EXpert seized" might also be used by some participants as a method of conveying unsavoury or foul language, within the scope of the filtering method of the present invention as hereinbefore described.
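The example message can be checked mechanically. The short sketch below reads the digits back out of the capital letters using the mapping given above; the function name `decode_capitals` is hypothetical and exists only for this illustration.

```php
<?php
// Hypothetical helper: reads off the digits hidden in a message's capital letters.
function decode_capitals(string $message): string
{
    // Letter-to-digit mapping taken from the example code in the text.
    $code = ['O' => '0', 'I' => '1', 'Z' => '2', 'E' => '3', 'X' => '4',
             'S' => '5', 'G' => '6', 'Y' => '7', 'B' => '8', 'P' => '9'];
    $digits = '';
    foreach (str_split($message) as $ch) {
        if (isset($code[$ch])) {
            $digits .= $code[$ch];
        }
    }
    return $digits;
}

echo decode_capitals('PuBlicitY GetS eXpErt seiZed'), "\n";             // 98765432
echo decode_capitals(strtolower('PuBlicitY GetS eXpErt seiZed')), "\n"; // empty line
```

The second call shows why discarding the participant's own capitalisation defeats this particular code: once every letter is lower case, no digits can be recovered.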
Accordingly in another aspect of the filtering method of the present invention, it may be desirable, as a means of thwarting attempts to use codes of the nature described in the preceding paragraph, to ignore the case of the letters forming the words in the online chat entered by a participant, and use a standard set of rules for determining the case of the letters comprising components verified as existing in the predefined dictionary of allowed components. One possible set of rules that would make the use of such codes extremely difficult is to return, as a part of the filtering method, all letters as lower-case letters, regardless of how the participant entered them. One might further make an exception and, as is the convention in normal written English, always return the first person singular subjective pronoun "I" as upper case. Furthermore if punctuation is supported one might again adopt the normal written English convention of always beginning a sentence with a capital letter, regardless of whether or not the participant did so. If the predefined dictionary of components also identifies the part-of-speech of its components, one might also always return words defined as proper nouns, such that they begin with an upper case (or capital) letter, which is again the usual written English convention.
It might also be desirable in another aspect of the present invention to permit the use of some heavily used well recognised standard SMS style abbreviations such as C for see, U for you, l8r for later, g2g for "got to go" and lol for "laughing out loud", by including them (or similar equivalents in languages other than English) explicitly in the dictionary of allowed components. Furthermore one could if one desired permit (by adding them to the dictionary) well understood slang words such as "gonna", "wanna" and "dunno". Alternatively one might find leaving such things out of the dictionary of allowed components a desirable feature.
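The case rules just described can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `normalise_case` is hypothetical, proper nouns are passed in as a plain list rather than read from a typed dictionary, and no dictionary look-up is performed here at all.

```php
<?php
// Hypothetical sketch of the case rules: everything lower case except "I",
// sentence starts and words listed as proper nouns.
function normalise_case(string $text, array $properNouns = []): string
{
    $out = [];
    $startOfSentence = true;
    foreach (preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY) as $word) {
        $word = strtolower($word);
        $bare = rtrim($word, '.?!,');   // the word without trailing punctuation
        if ($bare === 'i' || $startOfSentence || in_array($bare, $properNouns, true)) {
            $word = ucfirst($word);
        }
        $out[] = $word;
        // A word ending in . ? or ! closes the sentence.
        $startOfSentence = (bool) preg_match('/[.?!]$/', $word);
    }
    return implode(' ', $out);
}

echo normalise_case('PuBlicitY GetS eXpErt seiZed'), "\n";
// Publicity gets expert seized
echo normalise_case('well DONE felicity. i think YOU won!', ['felicity']), "\n";
// Well done Felicity. I think you won!
```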
AN IMPLEMENTATION
The present invention was developed when the principals of Crossout Pty Ltd were developing the already patented word game Crossout ®. In developing the WebSite for Crossout ®, a multi-player crossword solving game aimed at primary school children, it was felt that it would be desirable for players, playing a game against each other across the Internet, to be able to chat to each other. However after researching the methods currently in use for making such online chat safe for children, it was discovered that there was no method currently in use that provided the level of protection the principals of Crossout Pty Ltd were looking for. They were about to give up and not bother to include a chat facility, when their 10 year old daughter Felicity, who was disappointed that Crossout ® would not have chat, came up with the idea that is the basis of the present invention. Accordingly the chat facility on the Crossout ® WebSite is called FlickChat ™, and the dictionary of allowed components is called the Flictionary ™.
The code shown on the following pages is an early development version of the computer programs deployed on the Crossout ® WebSite to implement FlickChat ™. They are written in the common and very popular Web programming language PHP. Flickchat.php defines a function "flickchat" that takes two arguments, the first being a string of characters entered by a participant in the online chat facility, the second being an optional array of additional words to be added to the Flictionary ™ for this invocation only. In Crossout ® this optional argument is used to temporarily add the username of the participant's opponent in the game to the Flictionary ™ so that the participant is allowed to use their opponent's username in their entered chat. The username is defined as a proper noun. This feature could be used to temporarily add the usernames of all the participants in a multi-user chat facility. Such temporary additions to the Flictionary ™ do not become permanent entries, and will change dynamically depending on which and how many participants are communicating in the chat facility. The function "flickchat" returns an array of two character strings, the first element of which is the user entered chat filtered using one implementation of the method of the present invention. The second element is a string of characters being those components (words), separated by spaces, that were filtered out of the user entered chat by the method of the present invention.
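The idea of per-invocation additions that never become permanent entries can be illustrated with a small sketch. It is not taken from the listing: the helper `dictionary_for_call` and the three sample entries are hypothetical, and the sketch builds a separate copy of the dictionary rather than modifying a shared one.

```php
<?php
// Hypothetical stand-in for the permanent dictionary: lower-case word => type code.
$permanent = ['hello' => 'int', 'nice' => 'adj', 'move' => 'noun'];

// Hypothetical helper: builds the dictionary for one invocation without
// changing the permanent one. Extra words (e.g. usernames) are typed "propn".
function dictionary_for_call(array $permanent, array $usernames): array
{
    $dictionary = $permanent;            // PHP arrays are copied by value
    foreach ($usernames as $name) {
        $dictionary[strtolower($name)] = 'propn';
    }
    return $dictionary;
}

$forThisGame = dictionary_for_call($permanent, ['Flick']);
var_dump(isset($forThisGame['flick']));   // bool(true)
var_dump(isset($permanent['flick']));     // bool(false)
```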
Flictionary.php is a piece of code that defines the permanent dictionary of components (words) allowed in this particular implementation of the method of the present invention. It defines the elements of the PHP associative array $flictionary, in which each key is the lower-case version of the word, whilst each corresponding value defines what "type" of word this is. Whilst roughly representing the part of speech of each word, the specially coded allowed "types" are defined in the comments in flickchat.php. Their purpose is to enable a number of rules to be defined as to how derivatives of each primary word may be formed and still be seen as acceptable (and therefore would not be filtered out). These rules, whilst possibly imperfect, do allow the size of the Flictionary ™ to be significantly reduced. If one isn't concerned about the size of the Flictionary ™ (e.g. if you have no intention of displaying its contents), one can simply set the value of each element in the $flictionary associative array to some arbitrary value and list all derivative words separately.
flickchat.php
<?php
#
# Crossout & the Logo are Registered Trademarks of Crossout Pty Ltd.
# FlickChat & Flictionary are Trademarks of F, G & E Shalless.
# Crossout Pty Ltd holds UK Patent No GB2341106, US Patent No 6,378,867
# and Australian Patent No 746678 for the game known as Crossout.
#
# Copyright F, G & E Shalless 2002-2005
#
function inflict($word, $shortby=0, $reqdtype="", $extra="")
{
  # inflict checks if the supplied $word, optionally truncated by $shortby characters,
  # and optionally extended by the characters in $extra, exists in the array $flictionary,
  # and if $reqdtype is supplied that $word's value (type or part of speech) is contained in
  # $reqdtype. It returns the $word's value (type or part of speech) which will evaluate
  # true if the $word is found, or the empty string (which will evaluate false) if not.
  # If $word exists in the associative array $flictionary it will be a key to that array.
  #
  # $shortby always truncates whether positive or negative.
  global $flictionary;
  $wdlen=strlen($word) - abs($shortby);
  if ($wdlen < 1) return "";
  else
  {
    # A key that is missing from $flictionary counts as "not found".
    $key=strtolower(substr($word, 0, $wdlen).$extra);
    $wordfound=$partofspeech=isset($flictionary[$key]) ? $flictionary[$key] : "";
    if ($reqdtype && $wordfound) $wordfound=strpos("#".$reqdtype, $partofspeech);
    if ($wordfound) return $partofspeech; else return "";
  }
}

function doubleconst($c, $vowel)
{
  # returns the character $c, if it is a consonant that is allowed to be doubled
  # when forming derivatives, and if $vowel is a vowel. Returns the empty string if not.
  if (strpos("#bdglmnprstvz", $c) && strpos("#aeiou", $vowel)) return $c; else return "";
}

function endsentence($punctuation)
{
  # returns a positive integer (which evaluates true) if $punctuation is a recognised
  # sentence ending punctuation mark, and returns zero (which evaluates false) otherwise.
  if ($punctuation) return strpos("#.?!", $punctuation); else return 0;
}

function validpunct($punctuation)
{
  # returns a positive integer (which evaluates true) if $punctuation is an allowed
  # punctuation mark, and returns zero (which evaluates false) otherwise.
  if ($punctuation) return strpos("#.?!,", $punctuation); else return 0;
}
function flickchat($chatline, $extrawords=NULL)
{
  # returns a two element array of strings:
  #   the first element being the filtered $chatline
  #   the second element being all rejected words
  #
  # $extrawords is an optional associative array of additional words allowed on this invocation
  #
  # Words are acceptable if their lower case form is a key to the $flictionary associative array
  # The value of an element in that array indicates the type of word as follows:
  #
  #   adj   = adjective taking ~er ~est
  #   adv   = adverb taking ~er ~est
  #   ajlcd = adjective taking ~er ~est with last consonant doubled
  #   ajx   = adjective - any derivatives must be separately listed
  #   art   = article
  #   avlcd = adverb taking ~er ~est with last consonant doubled
  #   avx   = adverb - any derivatives must be separately listed
  #   conj  = conjunction
  #   contr = contraction
  #   int   = interjection
  #   noun  = noun plural ~s
  #   ne    = noun plural ~es
  #   nx    = noun - any derivatives must be separately listed
  #   prep  = preposition
  #   pron  = pronoun taking 'll (will), 'd (would) and 've (have)
  #   propn = proper noun
  #   prox  = pronoun - any derivatives must be separately listed
  #   verb  = verb present tense ~s
  #   vbe   = verb present tense ~es
  #   vblcd = verb where last consonant is doubled forming participles
  #   vbx   = verb - any derivatives must be separately listed
  #
  # To ignore all derivative rules and only allow words explicitly listed in $flictionary set
  # the value to any of the values with x in it, or any non-listed value that evaluates true
  #
  # Otherwise the following Rules apply:
  #
  #  1 Only words (spelt exactly as they appear) in the Flictionary are allowed.
  #  2 Verbs whose participles are formed in a standard way only require the primary verb to
  #    be listed.
  #  3 Nouns whose plurals are formed in a standard way only require the singular to be listed.
  #  4 Adjectives (or adverbs) whose comparatives and superlatives are formed in a standard way
  #    only require the primary word to be listed.
  #  5 Pronouns taking all of the 'll (will), 've (have) and 'd (would) contractions only require
  #    the primary pronoun to be listed.
  #  6 User entered case is ignored. Capitals will only be used for defined Proper Nouns, the word
  #    "I" and to begin sentences.
  #  7 Word terminating punctuation is restricted to the use of the following marks: . ? ! and ,.
  #  8 Apostrophes may be used as expected with nouns to indicate possession, but for contractions
  #    only if separately listed.
  #  9 Hyphenated words are permitted only if listed (with the hyphen).
  # 10 Numbers written either as digits or in words are not permitted.
  # 11 All consecutive white-space characters are replaced with a single space.
  #
  # Note: although not correct (except when a verb is also a noun) possessive apostrophes
  # are allowed on verbs as if they were all also nouns.
  global $flictionary;
  include("flictionary.php");
  # $extrawords defaults to NULL, so it is only merged in when it was supplied.
  if ($extrawords)
    foreach ($extrawords as $extraword => $partofspeech)
      $flictionary[strtolower($extraword)]=$partofspeech;
if ($chatline)
{
$chatword=preg_split("Λs+/",$chatline); $chatwords=count($chatword); $chatlost=$flickchat=""; $endsentence=l; for ($i=0; $i < $chatwords; $i++)
{
$thisword=strtolower(stripslashes($chatword[$i])); if($thisword)
{
$lastc=$thisword{strlen($thisword)-l } ; $last2=$thisword{strlen($thisword)-2}.$lastc; if (validpunct($lastc))
{
$punct=$lastc; $thisword=substr($thisword, 0, strlen($thisword)-l);
} else $punct="M; if ( $thisword="i" || substr($thisword,052)="im || $endsentence || inflict($thisword)=="propn" || (inflict($thisword, -2)="propn" && $last2=='"s") ) $thisword=ucwords($thisword);
if(inflict($thisword)) $flickchat .= $thisword . $punct . " "; else
{
$lastc=strtolower($thisword{strlen($thisword)-l}); $c2ndlast=strtolower($thisword{strlen($thisword)-2}); $last2= $c2ndlast . $lastc;
$c3rdlast=strtolower($thisword{strlen($thisword)-3}); $last3= $c3rdlast . $last2;
$c4thlast=strtolower($thisword{strlen($thisword)-4}); $last4= $c4thlast . $last3;
$c5thlast=strtolower($thisword{strlen($thisword)-5}); $last5= $c5thlast . $last4; $c6thlast=strtolo wer($this word { strlen($this word)-6 } ) ;
                if ($lastc=="s" && inflict($thisword, -1, "noun|verb" ))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last2=="es" && inflict($thisword, -2, "ne|vbe" ))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last2=="ed" &&
                    (inflict($thisword, -2, "verb|vbe" ) || inflict($thisword, -1, "verb" )))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last2=="'s" && inflict($thisword, -2, "noun|propn|ne|nx|verb|vbe" ))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last3=="ing" && inflict($thisword, -3, "verb|vbe" ))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last2=="'d" && inflict($thisword, -2, "pron" ))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last3=="'ll" && inflict($thisword, -3, "pron" ))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last3=="'ve" && $last5!="he've" && inflict($thisword, -3, "pron" ))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last2=="ed" && doubleconst($c3rdlast,$c5thlast)==$c4thlast && inflict($thisword, -3, "vblcd" ))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last3=="ing" && doubleconst($c4thlast,$c6thlast)==$c5thlast && inflict($thisword, -4, "vblcd" ))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last3=="ies" && inflict($thisword, -3, "noun|verb", "y"))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last3=="ied" && inflict($thisword, -3, "verb", "y"))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last3=="ing" && inflict($thisword, -3, "verb", "e"))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last2=="er" &&
                    (inflict($thisword, -2, "adj|adv" ) || inflict($thisword, -1, "adj|adv")))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last3=="est" &&
                    (inflict($thisword, -3, "adj|adv" ) || inflict($thisword, -2, "adj|adv")))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last3=="ier" && inflict($thisword, -3, "adj|adv", "y"))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last4=="iest" && inflict($thisword, -4, "adj|adv", "y"))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last2=="er" && doubleconst($c3rdlast,$c5thlast)==$c4thlast && inflict($thisword, -3, "ajlcd|avlcd" ))
                    $flickchat .= $thisword . $punct . " ";
                elseif ($last3=="est" && doubleconst($c4thlast,$c6thlast)==$c5thlast && inflict($thisword, -4, "ajlcd|avlcd" ))
                    $flickchat .= $thisword . $punct . " ";
                elseif ( ($last2=="s'" && inflict($thisword, -2, "noun|verb"))
                    || ($last3=="es'" && inflict($thisword, -3, "ne|vbe"))
                    || ($last4=="ies'" && inflict($thisword, -4, "noun|verb", "y")) )
                    $flickchat .= $thisword . $punct . " ";
                else
                {
                    if ($punct) $flickchat .= $punct . " ";
                    if ($thisword) $chatlost .= $thisword . " ";
                }
            }
            $endsentence=endsentence($punct);
        }
    }
    if ($flickchat{strlen($flickchat)-1}==" ")
        $flickchat=substr($flickchat, 0, strlen($flickchat)-1);
    if ($chatlost) $chatlost=strtolower($chatlost);
    return array($flickchat, $chatlost);
}
else return array("", "");
}
?>
flictionary.php
Figure imgf000012_0001
, "anything" => "nx" , "anything's" => "nx" , "apologise" => "verb" , "apology" => "noun"
, "are" => "vbx" , "argue" => "verb" , "argument" => "noun" , "around" => "avx"
, "art" => "noun" , "as" => "prep" , "ask" => "verb" , "asleep" => "avx" , "assume" => "verb"
, "at" => "prep" , "ate" => "vbx" , "australia" => "propn" , "australian" => "propn"
, "automatic" => "ajx" , "automatically" => "avx" , "awake" => "avx" , "away" => "avx"
, "awful" => "ajx"
, "back" => "verb" , "bad" => "ajx" , "badly" => "avx" , "ball" => "verb" , "baseball" => "noun"
, "basketball" => "noun" , "bat" => "vblcd" , "bath" => "verb" , "bathe" => "verb"
, "bathroom" => "noun" , "be" => "vbx" , "beautiful" => "ajx" , "because" => "prep" , "boy" => "noun" , "break" => "vbx" , "breaking" => "vbx" , "breaks" => "vbx"
, "breath" => "noun" , "breathe" => "verb" , "bring" => "vbx" , "bringing" => "vbx"
, "brings" => "vbx" , "broke" => "vbx" , "broken" => "ajx" , "brother" => "noun"
, "brought" => "vbx" , "budge" => "verb" , "bus" => "ne" , "business" => "ne" , "busy" => "adj"
, "but" => "vblcd" , "buy" => "vbx" , "buying" => "vbx" , "buys" => "vbx" , "by" => "prep"
, "bye" => "noun"
, "cage" => "verb" , "can" => "vblcd" , "cannot" => "vbx" , "can't" => "contr" , "car" => "noun"
, "care" => "verb" , "carry" => "verb" , "casual" => "ajx" , "cat" => "noun" , "catch" => "vbe"
, "catching" => "vbx" , "caught" => "vbx" , "centering" => "vbx" , "centre" => "verb"
, "certain" => "ajx" , "certainty" => "noun" , "change" => "verb" , "child" => "nx"
, "children" => "nx" , "city" => "noun" , "class" => "vbe" , "clean" => "verb" , "close" => "verb"
, "closer" => "ajx" , "closest" => "ajx" , "cloth" => "noun" , "clothe" => "verb"
, "clue" => "verb" , "cold" => "adj" , "colour" => "verb" , "comb" => "verb"
, "common" => "adj" , "complete" => "verb" , "complex" => "ne" , "complicate" => "verb"
, "conquer" => "verb" , "cook" => "verb" , "cool" => "adj" , "correct" => "verb"
, "could" => "vbx" , "couldn't" => "contr" , "could've" => "contr" , "count" => "verb"
, "country" => "noun" , "cricket" => "noun" , "crossout" => "propn" , "crossword" => "noun"
, "cryptic" => "ajx" , "cut" => "vblcd" , "dad" => "noun" , "day" => "noun" , "deep" => "adj"
, "did" => "vbx" , "didn't" => "vbx" , "different" => "ajx" , "disappoint" => "verb"
, "do" => "ne" , "doesn't" => "vbx" , "doing" => "noun" , "done" => "vbx" , "don't" => "contr"
, "doubt" => "verb" , "down" => "verb" , "draw" => "noun" , "drawing" => "noun"
, "drawn" => "vbx" , "dream" => "verb" , "drew" => "vbx" , "drier" => "ajx" , "driest" => "ajx"
, "drink" => "verb" , "dry" => "verb" , "during" => "prep" , "each" => "ajx" , "ear" => "noun"
, "early" => "adj" , "earth" => "verb" , "easy" => "adj" , "eat" => "vbx" , "eating" => "vbx"
, "eats" => "vbx" , "ecstatic" => "ajx" , "either" => "prep" , "else" => "prep" , "enable" => "verb"
, "end" => "verb" , "english" => "propn" , "enough" => "ajx" , "enter" => "verb"
, "entry" => "noun" , "error" => "noun" , "especially" => "avx" , "even" => "adj"
, "evening" => "noun" , "ever" => "avx" , "every" => "ajx" , "everybody" => "nx"
, "everybody's" => "nx" , "everyone" => "nx" , "everyone's" => "nx" , "everything" => "nx"
, "everything's" => "nx" , "except" => "verb" , "excite" => "verb" , "eye" => "verb"
, "face" => "verb" , "fact" => "noun" , "fair" => "adj" , "fall" => "noun" , "falling" => "vbx"
, "family" => "noun" , "fare" => "verb" , "farewell" => "noun" , "fast" => "verb"
, "faster" => "ajx" , "fastest" => "ajx" , "father" => "noun" , "favour" => "verb"
, "favourite" => "ajx" , "fed" => "ajx" , "feed" => "noun" , "feeding" => "noun" , "feel" => "verb"
, "feeling" => "noun" , "feet" => "nx" , "fell" => "verb" , "few" => "adj" , "field" => "verb"
, "fight" => "noun" , "fighting" => "vbx" , "film" => "verb" , "final" => "ajx"
, "finally" => "avx" , "find" => "noun" , "finding" => "noun" , "fine" => "verb" , "finer" => "ajx"
, "finest" => "ajx" , "finish" => "verb" , "fire" => "verb" , "first" => "ajx" , "fish" => "vbe"
, "fishy" => "adj" , "fix" => "vbe" , "flickchat" => "propn" , "flictionary" => "propn"
, "focus" => "vblcd" , "follow" => "verb" , "food" => "noun" , "fool" => "verb"
, "foot" => "verb" , "football" => "noun" , "for" => "prep" , "forget" => "noun"
, "forgetting" => "vbx" , "forgot" => "vbx" , "forgotten" => "vbx" , "form" => "verb"
, "fortunate" => "ajx" , "fought" => "vbx" , "found" => "vbx" , "free" => "ajx" , "freer" => "ajx"
, "freest" => "ajx" , "friend" => "noun" , "friendly" => "adj" , "from" => "prep"
, "front" => "verb" , "full" => "adj" , "fun" => "ajx" , "funny" => "adj" , "future" => "noun"
, "game" => "verb" , "gave" => "vbx" , "get" => "noun" , "getting" => "vbx" , "gift" => "verb"
, "girl" => "noun" , "give" => "noun" , "given" => "vbx" , "giving" => "vbx" , "go" => "ne"
, "goal" => "noun" , "going" => "noun" , "gone" => "vbx" , "good" => "noun"
, "good-bye" => "noun" , "got" => "vbx" , "gotten" => "vbx" , "grade" => "verb"
Figure imgf000014_0001
Figure imgf000015_0001
, […] => "vblcd" , "rare" => "adj" , "[…]ise" => "verb" , "really" => "avx" , "recognise" => "verb"
, "remember" => "verb"
, […] => "vbx" , "running" => "vbx"
Figure imgf000015_0002
Figure imgf000016_0001
, "work" => "verb" , "world" => "noun" , "worn" => "vbx" , "worse" => "ajx" , "worst" => "ajx"
, "would" => "vbx" , "wouldn't" => "contr" , "would've" => "contr" , "write" => "noun"
, "writing" => "noun" , "written" => "vbx" , "wrong" => "ajx" , "wrote" => "vbx"
, "year" => "noun" , "yell" => "verb" , "yellow" => "verb" , "yellower" => "ajx" , "yellowest" => "ajx"
, "yes" => "ne" , "you" => "pron" , "your" => "prox" , "you're" => "contr" , "yours" => "prox"
, "zip" => "vblcd"
) ?>
The following PHP code segment shows how the flickchat function might be called. In this code segment, $chatline is the user-entered online chat. The filtered chat is saved, any words filtered out are logged, and if no words were filtered out, the player is returned to the game:
$opponent[$otherplayername] = "propn";
$oldchat=$existparams["CHATLINE"];
if ($chatline)
{
    list($flickchat, $chatlost) = flickchat($chatline, $opponent);
    if ($flickchat!=$oldchat)
    {
        $existparams["CHATLINE"]=$flickchat;
        $existparams["CHATTIME"]=time();
        savehashfile("$playerdata/$username",$existparams);
    }
    if ($chatlost)
    {
        crossoutlog("CHATLOST","$sessionname,$username,$chatlost");
        $errormessage="One or more words not in Flictionary&#0153;";
    }
    else
    {
        header("Location: grid.php");
        exit;
    }
}
else
    $chatline=$oldchat;
The table below shows some examples of what the flickchat function would return, assuming, for example:
$otherplayername = "felicity" and thus $opponent["felicity"] = "propn"
So after the call:
list($flickchat, $chatlost) = flickchat($chatline, $opponent);
The following values for $chatline would return $flickchat and $chatlost as shown in the table:
Figure imgf000017_0001
Note the following points of interest in the results of the flickchat function as shown above:
• The misspelled word "doin" is discarded.
• Although the word "felicity" was not defined in flictionary.php it was passed to the flickchat function as a temporary extra word (via the optional second argument) and because it was defined as a proper noun it is always returned with a capital first letter.
• The words "meet" and "call" have been deliberately left out of flictionary.php to reduce the chances of participants trying to arrange a meeting, or to call one another. These words are lost and would not be displayed to the other participants.
• Attempts to give telephone numbers and addresses are thwarted.
• "Hurrying" and "bored" are not explicitly listed in flictionary.php but are allowed through because "hurry" and "bore" are defined as type "verb" which means the standard rules for forming participles are used.
• In the last example, an attempt to convey a coded unsuitable message is thwarted by ignoring the user-entered case of the letters.
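The derivation and case-folding behaviour noted in the points above can be sketched as follows. This is a minimal illustration under stated assumptions, not the flickchat listing itself: the three-word dictionary $sketchDict and the helper sketch_is_allowed are hypothetical, and only the regular "-ing" and "-ed" participle rules are modelled.

```php
<?php
// Hypothetical three-entry dictionary for illustration only;
// the flictionary.php listing is far larger.
$sketchDict = array("hurry" => "verb", "bore" => "verb", "cat" => "noun");

// Returns true if $word is in $dict as entered, or is a regular
// participle of an entry tagged "verb". A simplified stand-in for
// the suffix checks in the listing, not the original code.
function sketch_is_allowed($word, array $dict)
{
    $w = strtolower($word);   // the user-entered case is ignored
    if (isset($dict[$w])) {
        return true;
    }
    $candidates = array();
    if (substr($w, -3) === "ing") {
        $candidates[] = substr($w, 0, -3);        // hurrying -> hurry
        $candidates[] = substr($w, 0, -3) . "e";  // boring   -> bore
    }
    if (substr($w, -2) === "ed") {
        $candidates[] = substr($w, 0, -2);        // cooked -> cook
        $candidates[] = substr($w, 0, -1);        // bored  -> bore
    }
    foreach ($candidates as $base) {
        if (isset($dict[$base]) && $dict[$base] === "verb") {
            return true;
        }
    }
    return false;
}
```

With this sketch, sketch_is_allowed("Hurrying", $sketchDict) and sketch_is_allowed("bored", $sketchDict) return true, while "doin" and "meet" return false because neither they nor any derivable base form are in the dictionary.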
Please note that the 813-word dictionary of allowed components shown in flictionary.php is not by any means complete, and just serves as an example. The actual dictionary of allowed components used is immaterial to the method of the present invention and any dictionary of allowed components could be substituted. Obviously, a one-word dictionary would result in meaningless online chat, whilst a 50,000-word dictionary might make it too easy to construct "acceptable" unsavoury or otherwise unsuitable messages.
It should be understood that various modifications and variations may be made to the method as hereinbefore described without departing from the spirit and ambit of the present invention which basically hinges on the previously unused technique of filtration by inclusion of allowed components rather than filtration by exclusion of disallowed ones.
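The contrast between filtration by inclusion and filtration by exclusion can be illustrated with a short sketch that applies both to the same input. The word lists and the function names (sketch_split_words, filter_by_inclusion, filter_by_exclusion) are hypothetical and chosen only for this illustration; punctuation, derivations and capitalisation are deliberately left out.

```php
<?php
// Hypothetical word lists for illustration only.
$allowed = array("hi", "how", "are", "you");
$banned  = array("badword");

// Breaks the text into its components (words) on whitespace.
function sketch_split_words($text)
{
    return preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Filtration by inclusion: keep only words found in the allowed list.
function filter_by_inclusion($text, array $allowed)
{
    $kept = array();
    foreach (sketch_split_words($text) as $word) {
        if (in_array(strtolower($word), $allowed, true)) {
            $kept[] = $word;
        }
    }
    return implode(" ", $kept);
}

// Filtration by exclusion: drop only words found in the banned list.
function filter_by_exclusion($text, array $banned)
{
    $kept = array();
    foreach (sketch_split_words($text) as $word) {
        if (!in_array(strtolower($word), $banned, true)) {
            $kept[] = $word;
        }
    }
    return implode(" ", $kept);
}
```

For the input "hi b4dword how are you", filter_by_inclusion returns "hi how are you", whereas filter_by_exclusion returns the input unchanged, because the disguised spelling does not exactly match any banned entry.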

Claims

1. A method for filtering online chat, where online chat entered by participants in an online chat facility is automatically filtered prior to display to other participants in the facility, wherein:
• The text forming the online chat entered by each participant is broken down into its components, where, in the case of alphabet based languages such as English, these components are words, but in character based languages like Japanese, the components may be individual Kanji characters;
• Each and every component (word in the case of English) comprising the online chat is then verified for suitability by checking that it exists in a predefined dictionary of allowed components (words in the case of English);
• The filtering method then returns a text string containing only those components from the original chat verified as present in the predefined dictionary of allowed components;
• Only the text filtered according to the method of this claim would be made visible to other participants in the online chat facility.
2. The method of Claim 1 but where those components filtered out because they were not present in the predefined dictionary of allowed components, would be made available for logging purposes but which are not intended to be made visible to other participants in the online chat facility. This may be useful to those maintaining the online chat facility in deciding whether or not some new components should be added to the predefined dictionary.
3. The method of Claims 1 and 2 where characters in the entered online chat text that are allowed punctuation marks are removed before the dictionary look-up is performed and subsequently reinstated when forming the filtered text.
4. The method of Claims 1, 2 and 3 where if a component is not present in the predefined dictionary of allowed components, before filtering it out, an attempt is made to see if it is a valid derivative of a component that is present, such as: the plural of a predefined noun; the present or past participle or the present or past tense of a predefined verb; or the comparative or superlative of a predefined adjective or adverb, where those derivatives are formed in a standard way.
5. The method of Claims 1, 2, 3 and 4 where text consisting of numeric data, such as numbers, monetary values, dates and times, is retained as entered without filtering it at all.
6. The method of Claims 1, 2, 3 and 4 where text consisting of numeric data, such as numbers, monetary values, dates and times, is always filtered out unless explicitly listed in the predefined dictionary of allowed components.
7. The method of Claims 1, 2, 3 and 4 where text consisting of numeric data, such as numbers, monetary values, dates and times, is always filtered out.
8. The method of Claims 1, 2, 3 and 4 where text consisting of numeric data, such as numbers, monetary values, dates and times, is filtered according to a set of rules.
9. The method of Claims 1, 2, 3 and 4 where after verification in the dictionary of allowed components, the capitalisation of those components entered by the participant in the online chat facility and found to be present, is retained as entered.
10. The method of Claims 1, 2, 3 and 4 where after verification in the dictionary of allowed components, the capitalisation of those components entered by the participant in the online chat facility and found to be present, is ignored and instead a set of capitalisation rules is used to determine the case of the validated text.
11. The method of Claim 10 where the capitalisation rules force validated text to be returned as lower case, except in the following cases where the first letter of the validated component is returned as upper case including: when the component begins a sentence; when the component is the first person subjective pronoun "I"; or the component is defined in the predefined dictionary of allowed components as a proper noun.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2005902786A AU2005902786A0 (en) 2005-05-31 A Method For Filtering Online Chat
AU2005902786 2005-05-31

Publications (1)

Publication Number Publication Date
WO2006128224A1 true WO2006128224A1 (en) 2006-12-07

Family

ID=37481131

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2006/000724 WO2006128224A1 (en) 2005-05-31 2006-05-31 A method for filtering online chat

Country Status (1)

Country Link
WO (1) WO2006128224A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001033371A1 (en) * 1999-11-05 2001-05-10 Surfmonkey.Com, Inc. System and method of filtering adult content on the internet
WO2001080214A1 (en) * 2000-04-18 2001-10-25 Genesys Telecommunication Laboratories, Inc. Method and apparatus for summarizing previous threads in a communication-center chat session
AU5365400A (en) * 2000-08-25 2002-02-28 Gala Incorporated Electronic bulletin board system
US20020198940A1 (en) * 2001-05-03 2002-12-26 Numedeon, Inc. Multi-tiered safety control system and methods for online communities
US20040154022A1 (en) * 2003-01-31 2004-08-05 International Business Machines Corporation System and method for filtering instant messages by context
US6842773B1 (en) * 2000-08-24 2005-01-11 Yahoo ! Inc. Processing of textual electronic communication distributed in bulk

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009109046A1 (en) * 2008-03-04 2009-09-11 Ganz Multiple-layer chat filter system and method
US8316097B2 (en) 2008-03-04 2012-11-20 Ganz Multiple-layer chat filter system and method
US8321513B2 (en) 2008-03-04 2012-11-27 Ganz Multiple-layer chat filter system and method
US8788943B2 (en) 2009-05-15 2014-07-22 Ganz Unlocking emoticons using feature codes
US8458602B2 (en) 2009-08-31 2013-06-04 Ganz System and method for limiting the number of characters displayed in a common area
US9403089B2 (en) 2009-08-31 2016-08-02 Ganz System and method for limiting the number of characters displayed in a common area
US8323068B2 (en) 2010-04-23 2012-12-04 Ganz Villagers in a virtual world with upgrading via codes
US8719730B2 (en) 2010-04-23 2014-05-06 Ganz Radial user interface and system for a virtual world game
US9050534B2 (en) 2010-04-23 2015-06-09 Ganz Achievements for a virtual world game
US8380725B2 (en) 2010-08-03 2013-02-19 Ganz Message filter with replacement text
US9022868B2 (en) 2011-02-10 2015-05-05 Ganz Method and system for creating a virtual world where user-controlled characters interact with non-player characters

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

NENP Non-entry into the national phase

Ref country code: RU

WWW Wipo information: withdrawn in national office

Country of ref document: RU

122 Ep: pct application non-entry in european phase

Ref document number: 06752601

Country of ref document: EP

Kind code of ref document: A1