BE1025360B1

BE1025360B1 - MANAGING AND USING LEGAL-TAX DOCUMENTS FROM CASE LAW

Info

Publication number: BE1025360B1
Application number: BE2017/6017A
Authority: BE
Inventors: Thierry Lauwers
Original assignee: Advocatenkantoor Thierry Lauwers Bv Bvba
Priority date: 2017-12-28
Filing date: 2017-12-28
Publication date: 2019-01-30

Abstract

A computer-implemented method for sorting a collection of legal-tax documents from case law, in which a list of responsive documents is obtained from the collection, and in which ordering the list comprises a step of attributing a weight parameter to each legal-tax document by making a summary that is tested against at least one criterion, the criteria being categorized.

Description

MANAGING AND USING LEGAL-TAX DOCUMENTS FROM CASE LAW

TECHNICAL FIELD

The invention relates to a computer-implemented method for sorting a collection of legal-tax documents from case law, and to a tangible non-transitory computer-readable information carrier comprising a computer program product for sorting a collection of legal-tax documents from case law.

STATE OF THE ART

The Belgian legal system, like a number of other legal systems around the world, relies heavily on written legal opinions and the written judgments of judges to explain or interpret laws concerning the resolution of disputes. As a result, judges and lawyers search for the most relevant earlier opinions or case law in order to resolve or prevent disputes, while the volume of this information keeps growing. Cases that are found are studied for relevance and are eventually cited and discussed in documents, called litigation material, which for example argue for legal action, advise clients on probable legal actions, or inform clients and lawyers about laws in particular jurisdictions.

In addition to the legal system, advance decisions or rulings are also relevant in Belgium. On the basis of these rulings, FPS Finance (FOD Financiën) determines how tax laws will be applied to a specific situation or transaction that has not yet had any tax effect.

EP2008/1 586 055 describes a method for managing and using information at law firms and potentially other enterprises. One system, for example, provides a user interface for researching case law in online legal research and for identifying and accessing law firm documents. Through the interface a user, such as a lawyer, can initiate or submit a legal research query and view search results that identify not only relevant external documents from the online legal research service, but also relevant internal documents, such as specifications, client letters, and legal memoranda from the law firm's own document collection.

The known methods, however, are aimed only at use by lawyers or other highly trained personnel, whereas access to legal-tax documents of relevant cases can also inform and advise laypersons.

BE2017/6017

Another problem with the way these legal-tax documents are ordered is the lack of a user-friendly computer-implemented search terminal in which the results are ordered by relevance to the search query. A Boolean search algorithm such as that of Google is rather unsuitable for such ordering, since case law is a dynamic and discontinuous system of laws, judgments, judges and courts, in which interpretations of laws may be tied to a zeitgeist, social or societal circumstances, different courts or judges, anticipation of future laws, and so on.

The present invention aims to provide a solution to at least some of the problems mentioned above.

SUMMARY OF THE INVENTION

In a first aspect, the present invention relates to a computer-implemented method for sorting a collection of legal-tax documents from case law, according to claim 1.

In a second aspect, the present invention relates to a computer system for sorting a collection of legal-tax documents from case law, according to claim 12.

In a third aspect, the present invention relates to a tangible non-transitory computer-readable information carrier comprising a computer program product suitable for sorting a collection of legal-tax documents from case law, according to claim 13.

The present invention is advantageous because a layperson can, by using the computer system and without the intervention of a lawyer, obtain information on legal-tax issues.

In addition, the invention is advantageous because the legal-tax documents are ordered according to current relevance.

DETAILED DESCRIPTION

The invention relates to a computer-implemented method and to a tangible non-transitory computer-readable information carrier comprising a computer program product for sorting a collection of legal-tax documents from case law. In what follows, the invention is described in detail, preferred embodiments are explained, and the invention is illustrated by means of examples.

Unless defined otherwise, all terms used in the description of the invention, including technical and scientific terms, have the meaning commonly understood by the person skilled in the technical field of the invention. For a better appreciation of the description of the invention, the following terms are explained explicitly.

"A", "an" and "the" refer to both the singular and the plural unless the context clearly dictates otherwise. For example, "a segment" means one or more than one segment.

The terms "comprise", "comprising", "consist of", "consisting of", "provided with", "contain", "containing", "include", "including", "involve" and "involving" are synonyms and are inclusive or open terms that indicate the presence of what follows, and that do not exclude or prevent the presence of other components, features, elements, members or steps known from or described in the prior art.

The term "currently relevant" or "current relevance" refers to the degree to which a particular document corresponds to the current dynamic and discontinuous system of laws, judgments, judges and courts, given that interpretations of laws may be tied to a zeitgeist, social or societal circumstances, different courts or judges, anticipation of future laws, and so on. In this invention, the current relevance of a document is determined by attributing to its summary a numerical value in the form of a weight parameter. This weight parameter is determined by a mathematical operation on several numerical weighting factors, each weighting factor describing a particular criterion. The document whose summary has the largest weight parameter is thus the most currently relevant. Related criteria are preferably grouped into categories.
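
By way of illustration only, the weight parameter could be computed as sketched below. The description does not specify which mathematical operation combines the weighting factors, so a weighted sum is assumed here; the category names, criterion names and factor values are all hypothetical.

```python
# Hypothetical criteria grouped into categories. Each value is the numerical
# weighting factor for that criterion (illustrative numbers, not from the patent).
CRITERIA_BY_CATEGORY = {
    "court": {"court_level": 0.5, "judge_consistency": 0.2},
    "time": {"recency": 0.3},
}


def weight_parameter(scores, criteria_by_category=CRITERIA_BY_CATEGORY):
    """Combine per-criterion scores of one summary into a single weight parameter.

    Assumption: the "mathematical operation" is a weighted sum. A criterion
    that was not scored for this summary contributes zero.
    """
    total = 0.0
    for criteria in criteria_by_category.values():
        for name, factor in criteria.items():
            total += factor * scores.get(name, 0.0)
    return total


def order_by_relevance(summaries):
    """Sort (document_id, scores) pairs so the largest weight parameter comes first."""
    return sorted(summaries, key=lambda item: weight_parameter(item[1]), reverse=True)


ranked = order_by_relevance([
    ("doc_b", {"court_level": 0.5}),
    ("doc_a", {"court_level": 1.0, "recency": 1.0}),
])
print([doc_id for doc_id, _ in ranked])
```

Other operations, such as a product of factors or a per-category maximum, would fit the description equally well; the weighted sum is chosen only because it is the simplest to read.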

The term "classifier" refers to the abstract metaclass of classification concepts that serve as a mechanism for collecting elements with common properties and/or content into sets. In this document, the term refers to an algorithm designed to order the summaries and documents according to their characteristics.

The term "user" refers to a person who uses the content of data collections of an information system. In particular, the term in this document refers to the person who consults the content of the responsive legal-tax documents after passing instructions to the computer system.

The term "input device" refers to a device suitable for supplying input to a user device. This input is not limited to a particular modality and may include mechanical movement, sound, images, and the like. The input can be discrete and/or continuous. Nor is the input limited by the number of degrees of freedom. The input can be direct or indirect. When input is provided with respect to a position or a change in position, such as an indicator on a screen, the input can be absolute or relative. A non-limiting list of examples of input devices includes a keyboard, a computer mouse, a touchpad, a touchscreen, a camera, a scanner, a joystick, a microphone, a light pen, a trackball, a projected keyboard, and a game controller.

A "legal-tax document" refers to a written or electronic record that deals with the legal aspects of taxation, that is, the aspects of taxation that pertain to the law.

The term "parsing" refers to analysing the grammatical structure of an input according to a fixed grammar. A parser converts the input text into a data structure, comparable to filling in a form with data at predefined places in a predefined text format.

The term "server" refers to a computer system that provides services to users of the proposed computer-implemented method. The server preferably comprises at least a power supply, a memory, a processor, a hard disk, a motherboard, and a housing.

In a first aspect, the invention relates to a computer-implemented method for sorting a collection of legal-tax documents from case law, characterized in that a list of responsive documents is obtained from this collection, wherein ordering the list comprises steps in which a weight parameter is attributed to each legal-tax document by making a summary that is tested against at least one criterion, these criteria being categorized.

To arrive at a summary, each document goes through a number of steps. A summary is made per document by extracting the text and the metadata from the supplied documents and then processing them further.

First, the plain text must be extracted from the documents, because plain text can be interpreted much better by a computer. Textual data exists in various forms, such as HTML pages, PDF documents, Word documents, etc., each of which has its own form and formatting that upgrades the representation, as it were, from plain text to something usable within its own environment. In HTML, for example, tags are added to enhance the layout, and a PDF document uses a fixed layout so that the document looks the same under all circumstances. To work with these different kinds of text within the proposed system, it is important to extract the plain text from these different document formats. This creates a common basis that serves as input for further processing, which then no longer depends on the type of the document. In addition to the extraction of text, metadata is collected. This data can be important for choosing the language-dependent stemming algorithm used later in this step, or makes it possible to convert different document formats, such as pdf, doc, docx, csv, html, etc., into plain text without formatting. The metadata can also include content that is relevant for classifying the documents, such as dates, number of pages, and so on.
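
As a minimal sketch of the extraction step for one of the formats mentioned, the following uses Python's standard `html.parser` module to reduce an HTML page to plain text. The class and function names are hypothetical, and other formats (PDF, Word) would need their own extractors, which are not shown.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collects text nodes and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the text nodes and collapse all runs of whitespace to one space.
        return " ".join(" ".join(self._chunks).split())


def html_to_plain_text(html):
    """Return the visible text of an HTML document as one plain-text string."""
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


page = ("<html><head><style>p{color:red}</style></head>"
        "<body><h1>Ruling</h1><p>Tax <b>law</b> applies.</p></body></html>")
print(html_to_plain_text(page))
```

Joining text nodes with a space keeps words from adjacent elements apart, at the cost of occasionally inserting a space before punctuation that directly follows a closing tag; the later tokenization step removes punctuation anyway.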

The document is then parsed, which comprises two parts. The first part is language detection and the second part is the actual parsing of the text. For each document, the parser is chosen that will give the best result for that type. To obtain all the words used in a given text, a tokenization process is applied. This process splits a text document into a stream of words by removing all punctuation marks and replacing all tabs and non-textual characters with spaces. The set of distinct words from all text documents is merged into the dictionary of the document collection. Preferably an n-gram tokenizer is used, since it makes it possible to take two or three words together and treat them as a single word in the document collection. This can be advantageous for, among other things, personal names, which often consist of two words that could be important for a text. A proper name is of course also picked up if all words are considered separately.

To reduce the size of the dictionary, and thus the dimensionality of the description of the document collection, the set of words is further reduced by applying filters or stemming algorithms. Filters remove words from the dictionary and therefore from the summaries. For example, filtering can be applied to remove stop words that carry little or no content, such as articles, conjunctions, prepositions, etc. Stemming algorithms try to reduce a word to its standard form, for example by converting plural nouns to the singular or by reducing verbs to their stem.
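
The tokenization, n-gram, stop-word and stemming steps described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the stop-word list is a tiny hypothetical English sample, and `naive_stem` is a toy suffix stripper standing in for a real language-dependent stemming algorithm.

```python
import re

# Illustrative stop-word sample; a real list would be language-specific and longer.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to"}


def tokenize(text):
    """Lowercase the text and return its runs of letters, dropping punctuation,
    digits, tabs and other non-textual characters."""
    return re.findall(r"[^\W\d_]+", text.lower())


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, each joined into one term."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def naive_stem(word):
    """Toy stemmer: strip a trailing 'ing' or 's' if at least 3 letters remain."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text, max_n=2):
    """Return stemmed single words followed by unstemmed n-grams (2..max_n)."""
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    terms = [naive_stem(t) for t in tokens]
    for n in range(2, max_n + 1):
        terms += ngrams(tokens, n)
    return terms


print(preprocess("The judges of the Court, ruling."))
```

In this sketch the n-grams are built after stop-word removal and are left unstemmed, so that a two-word name such as "jan peeters" survives as a single term; building them before filtering is an equally valid design choice.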

To make the body of summaries easier to interpret, the summaries must first be transformed into feature vectors that can then serve as input for a computer. The plain text in the document collection by itself contains too little structure to be processed efficiently by a computer system. Despite its simple data structure, the vector space model allows large collections of documents to be analysed efficiently. It represents documents as vectors in an m-dimensional space. Each document d from the document collection is described by a numerical feature vector $w(d) = (x(d, t_1), \ldots, x(d, t_m))$, where $T = \{t_1, \ldots, t_m\}$ represents the dictionary. The main task in the vector space representation of documents is to find a suitable encoding of the feature vector. Each element of the vector usually represents a word (or group of words) from the document collection. The simplest way to encode a document is to use binary term vectors: if a word occurs in the document, the corresponding element is set to one; if it does not occur, the element is zero. The encoding is thus reduced to a simple Boolean comparison.

With this encoding, however, every word is considered equally important. To improve performance, weighting factors are used. The weight assigned to a word is an indication of the importance or relevance of that word in a specific document or collection. A word that occurs with high frequency in certain documents, but rarely or not at all in the rest of the document collection, is given a large weight. A weight $w(d, t)$ for term t in document d is calculated as the term frequency $\mathrm{tf}(d, t)$ multiplied by the inverse document frequency $\mathrm{idf}(t)$, defined as $\mathrm{idf}(t) = \log(N / n_t)$, where, by the usual convention, N is the number of documents in the collection and $n_t$ the number of documents that contain term t. This describes the specificity of a given term in a document collection. In addition to term frequency and inverse document frequency, a normalization is applied to ensure that all documents have the same chance of being found, regardless of their length. A more elaborate alternative to binary term vectors is a term vector whose elements take numerical values other than 0 or 1. For example, an algorithm can be designed that counts the number of times a given word occurs in the document, and an even more complex algorithm can be used that derives from the words a particular interpretation of the summary in the form of a numerical value.
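
A minimal sketch of this weighting, using only the Python standard library, is given below. It computes tf(d, t) · log(N / n_t) per term and then scales each document vector to unit Euclidean length. The description does not say which normalization is used, so the unit-length scaling is an assumption, as is the function name.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns (vocabulary, normalized tf-idf vectors).

    Each weight is tf(d, t) * log(N / n_t); each vector is then scaled to unit
    Euclidean length (assumed normalization) so document length does not dominate.
    """
    vocabulary = sorted({t for doc in docs for t in doc})
    n_docs = len(docs)
    # n_t: number of documents that contain term t at least once.
    doc_freq = Counter(t for doc in docs for t in set(doc))
    idf = {t: math.log(n_docs / doc_freq[t]) for t in vocabulary}

    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = [tf[t] * idf[t] for t in vocabulary]
        norm = math.sqrt(sum(x * x for x in vec))
        vectors.append([x / norm for x in vec] if norm > 0 else vec)
    return vocabulary, vectors


vocab, vecs = tfidf_vectors([["tax", "ruling", "tax"], ["tax", "court"]])
print(vocab)
print(vecs)
```

In this two-document example the term "tax" occurs in every document, so its idf is log(2/2) = 0 and it contributes nothing to either vector, which is exactly the behaviour the paragraph describes for terms that are not specific to any document.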

Attributes that are not relevant can also be removed by means of feature selection, in which a subset of the features that are relevant to the target concept is sought. Among other things, this makes the data set smaller, so that less computing power and search space are required for the actual processing of the vector model. It reduces the number of features, removes irrelevant or redundant data, and consequently makes the algorithms run faster. It also improves the accuracy of predictions and the comprehensibility of the results.
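
The description does not name a specific selection technique. One simple possibility, shown here purely as an assumption, is to filter terms by document frequency: terms that occur in too few documents are dropped as noise, and terms that occur in almost all documents are dropped as uninformative. The thresholds and function names are hypothetical.

```python
from collections import Counter


def select_terms(docs, min_df=2, max_df_ratio=0.9):
    """Keep terms found in at least min_df documents and in at most
    max_df_ratio of all documents. docs is a list of token lists."""
    n_docs = len(docs)
    doc_freq = Counter(t for doc in docs for t in set(doc))
    return sorted(
        t for t, df in doc_freq.items()
        if df >= min_df and df / n_docs <= max_df_ratio
    )


def project(doc, kept_terms):
    """Drop every token of doc that is not among the selected terms."""
    kept = set(kept_terms)
    return [t for t in doc if t in kept]


corpus = [
    ["tax", "court", "vat"],
    ["tax", "court"],
    ["tax", "ruling"],
    ["tax", "vat"],
]
kept = select_terms(corpus)
print(kept)
print(project(corpus[0], kept))
```

Here "tax" is removed because it appears in every document and "ruling" because it appears in only one, which shrinks the dictionary from four terms to two.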

After preprocessing, the text is automatically placed in a structure of predefined categories, so that information can be obtained about texts with similar content. Texts from the same category are more likely to appeal to the user and may therefore be useful to offer together. Classification requires a structure that is known in advance, which in this case is tailored to legal-tax documents. Based on a training set, a classifier can be trained to assign documents to particular categories. The aim is to choose the classifiers so that, for each category, a maximum number of summaries is classified correctly. To this end, an optimal classifier is determined in advance for each branch of the tree structure of a data set, in order to arrive at the best possible and most reliable result. The classification can also be based on the extracted metadata, or on a combination of both methods.

This invention has the advantage that the method can be used to order document lists for different types of searches. On the one hand, the method can be used for a categorical search, in which the user passes instructions via an input device to search within a predetermined category. For example, the user can search by year, date or location by navigating to these categories with the input device. On the other hand, the method can also be used for a search in which the user enters keywords via an input device in order to search all or part of the collection of legal-tax documents for documents related to the entered keywords. It goes without saying that the present invention is not limited to these types of searches. For example, the two searches can be combined, starting a keyword search within a categorical search, or vice versa.
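The combination of a categorical search and a keyword search can be sketched as two filters applied in sequence. This is only an illustration under assumptions: the `Doc` record, its field names and the substring matching are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    # Hypothetical metadata fields for a legal-tax document
    title: str
    year: int
    court: str
    text: str


def categorical_search(docs, **criteria):
    """Keep documents whose fields equal every given category value."""
    return [d for d in docs
            if all(getattr(d, field) == value for field, value in criteria.items())]


def keyword_search(docs, keywords):
    """Keep documents whose text contains at least one keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return [d for d in docs if any(k in d.text.lower() for k in wanted)]


docs = [
    Doc("A", 2015, "Court of Appeal", "Dispute on inheritance tax"),
    Doc("B", 2016, "Court of Cassation", "Ruling on income tax"),
    Doc("C", 2016, "Court of Appeal", "Ruling on inheritance tax rates"),
]

# Keyword search started within a categorical search on the year
hits = keyword_search(categorical_search(docs, year=2016), ["inheritance"])
```

Because both functions take and return a list of documents, the order can be reversed (categorical search within keyword results) without changing either function.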

The present invention has the effect that a layperson using the computer system can become better informed about legal-tax issues without the intervention of a lawyer or other highly trained personnel. A potential consequence is that the number of hours billable by the lawyer is reduced, which translates into a lower cost for the client and a potentially larger client portfolio for the lawyer.

In particular, the computer system can serve as an additional source of advice and information, all the more so because the method focuses specifically on a user-friendly and logical ranking of the legal-tax documents.

In a preferred embodiment, the aforementioned summary consists of at least one category, each category consisting of at least one criterion, with a numerical weighting factor attributed to each criterion. The value of the weighting factor indicates the current relevance of the corresponding criterion for the summary. The greater the numerical value of the weighting factor, the more currently relevant the corresponding criterion is. The criterion itself is determined on the basis of a characteristic vector, which may or may not be a binary term vector.

In a further embodiment, the value of the numerical weight parameter of the summary is determined by performing a mathematical operation on the weighting factors. More preferably, this operation is an addition.

In a further embodiment, the tax-legal documents are ordered in a list by decreasing value of the weight parameter of the summary. As a result, the top document in the list is also the most currently relevant document, and all subsequent documents are likewise ordered by decreasing current relevance.
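The addition of weighting factors and the ordering by decreasing weight parameter can be sketched as follows. The document identifiers and the weighting factors listed for each summary are invented for the example, and the function name `weight_parameter` is hypothetical.

```python
def weight_parameter(weighting_factors):
    """Weight parameter of a summary: here the sum of its weighting factors."""
    return sum(weighting_factors)


# Hypothetical summaries: document id -> weighting factors attributed to it
summaries = {
    "doc_a": [1, 2, 3],
    "doc_b": [1, 1],
    "doc_c": [2, 2, 3, 1],
}

# Order the documents by decreasing weight parameter
ranked = sorted(
    summaries,
    key=lambda doc_id: weight_parameter(summaries[doc_id]),
    reverse=True,
)
```

With these values the weight parameters are 6, 2 and 8, so `doc_c` heads the list, followed by `doc_a` and `doc_b`.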


In a preferred form, this collection of legal-tax documents is logically grouped in a storage folder in a memory unit, in order to facilitate the maintenance of the legal-tax documents.

In a further preferred form, this memory unit forms part of a computer system that is connected to the internet. As a result, users can also access the computer-implemented method remotely, for example by connecting to the computer system via a personal computer or smartphone.

More preferably, this computer system is a server, which has the advantage of being designed to manage, process, store and transmit data continuously, so that the computer-implemented method is potentially available 24 hours a day.

In a further preferred form, this collection of legal-tax documents is maintained by an authorized company. As a result, the logical grouping of the legal-tax documents can be expanded, adjusted and/or maintained.

In one embodiment, the responsive documents contain information fields. This allows the user to distinguish the individual documents from one another on the basis of this information, which makes the search more user-friendly. These information fields preferably comprise the title, the article number, the court and the date of the document.

In one embodiment, the responsive documents contain action fields that give the user access to the content of the document. Preferably, this is an action field that calls up a short summary and/or an action field that calls up the full content of the document.

In a further embodiment, the short summary of a responsive document preferably comprises a title, an article, a court, a date, a book, and one or more tags.

In a further embodiment, the aforementioned tag refers to a keyword that occurs frequently in the collection of legal-tax documents. In a preferred form, this tag is an action field that initiates a search for the keyword in question.

In a second aspect, the invention relates to a computer system for sorting a collection of legal-tax documents from the case law according to claims 1-11, characterized in that the computer system is configured for:

(a) receiving legal-tax documents via at least one communication network;

(b) storing these documents in a local non-transitory computer-readable information carrier;

(c) obtaining one or more keywords from the user via at least one communication network in the form of search instructions;

(d) selecting a source group of tax-legal documents;

(e) identifying search information contained in the source group;

(f) creating an ordered results list based on the user's instructions from step (c) according to step (e);

(g) storing the ordered results list in a local memory;

(h) retrieving the ordered results list stored in the local memory;

(i) sending the ordered results list via at least one communication network to the computer system of the user from step (c).

In a third aspect, the invention relates to a tangible non-transitory computer-readable information carrier comprising a computer program product suitable for sorting a collection of legal-tax documents from the case law according to claims 1-11, the computer program product comprising instructions for:

(a) receiving legal-tax documents via at least one communication network;

(b) storing these documents in a local non-transitory computer-readable information carrier;

(c) obtaining instructions from the user via at least one communication network;

(e) identifying information contained in the source group;

(f) creating an ordered list based on the user's instructions from step (c) according to step (e);

(g) storing the ordered list in a local memory;

(h) retrieving the ordered list stored in the local memory;

(i) transmitting the ordered list via at least one communication network to the computer system of the user from step (c).

In what follows, the invention is described by means of non-limiting examples that illustrate the invention and that are not intended, and should not be interpreted, to limit the scope of the invention.

EXAMPLE

The invention will now be further elucidated with reference to the following example, without being limited thereto.


The following example concerns an embodiment in which the summary is drawn up on the basis of the categories, criteria and weighting factors shown in Table 1. The criteria in this case correspond to the aforementioned characteristic vectors, binary or otherwise.

The summary according to Table 1 contains the five categories "judge", "law", "time", "content" and "specificity". Each category consists of several subcategories, here called criteria.

The "judge" category comprises criteria that relate to the judge who delivered the court ruling in the document. The content of the document is checked for the name of the judge, the judge's reputation, whether the ruling brought about changes in the case law, whether the ruling included a precedent, and whether the ruling was delivered by a higher court such as the Court of Appeal or the Court of Cassation.

The "law" category comprises criteria that relate to the legislation mentioned in the document. The content of the document is checked for whether changes in legislation are mentioned, whether a given piece of legislation was applicable at the time of the ruling, and whether a given piece of legislation was applicable at the time of the facts.

The "time" category comprises criteria that relate to dates and durations. The content of the document is checked for the date of the pleadings, the date of the ruling, the date of the facts, the date of the dispute, and the time between the facts and the ruling.

The "content" category comprises criteria that relate to the general content. The content of the document is checked for references to other legal literature, references to other case law, the content of the ruling and its novelty, as well as the number of pages, the number of conflicting rulings, the number of past rulings cited, and the number of rulings on the same matter.

The "specificity" category comprises criteria that relate to the novelty of the rulings. The content of the document is checked for any innovative passages and precedents.

TABLE 1

| CATEGORY | WEIGHTING FACTOR | CRITERION |
| --- | --- | --- |
| Judge | 1 | Does the document contain a change in case law? |
| | 1 | Does the document include the name of the judge? |
| | 2 | Does the document include rulings from higher courts? |
| | 2 | Is the ruling innovative? Does the ruling have a certain topical value? |
| | 2 | Does the judge have a known reputation? |
| | 3 | Are precedents known in case law? |
| Law | 1 | Does the document contain a change in legislation? |
| | 2 | Does the document include the legislation applicable at the time of the ruling? |
| | 3 | Does the document include the legislation applicable at the time of the facts? |
| Time | 1 | Does the document include the date of the pleadings? |
| | 1 | Does the document include the time between the facts and the ruling? |
| | 2 | Does the document include the date of the ruling? |
| | 2 | Does the document include the date of the facts? |
| | 2 | Does the document include the date of the dispute? |
| Content | 1 | Does the document include references to other legal literature? |
| | 1 | Does the document include references to other case law? |
| | 1 | How many pages does the ruling comprise? After all, this is a criterion that points to the thoroughness of the analysis. |
| | 1 | Have there already been several conflicting rulings on the same matter? After all, it can be an added value that this point is clarified in order to achieve a degree of legal certainty. |
| | 2 | How many rulings by the same judge have already been delivered in the past? After all, a ruling is less relevant the more rulings have already been delivered. |
| | 2 | How many rulings on the same matter have already been delivered in the past? After all, a ruling is less relevant the more rulings have already been delivered. |
| | 3 | To what extent can the content of the ruling and the innovative passage be used in other cases? |
| Specificity | 1 | Is there an innovative passage in the ruling? |
| | 1 | Does the document refer to a precedent? |
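As a worked illustration, Table 1 can be encoded as a nested mapping from category to criterion to weighting factor. The short criterion keys are hypothetical labels introduced here, and the sketch assumes that every criterion is evaluated as met or not met and that only the weighting factors of met criteria are added; the source does not specify how count-type criteria such as the number of pages are evaluated.

```python
# Weighting factors from Table 1; the criterion keys are hypothetical labels.
TABLE_1 = {
    "judge": {"change_in_case_law": 1, "judge_name": 1, "higher_court": 2,
              "innovative_ruling": 2, "judge_reputation": 2, "precedents_known": 3},
    "law": {"change_in_legislation": 1, "law_at_ruling": 2, "law_at_facts": 3},
    "time": {"pleading_date": 1, "time_facts_to_ruling": 1, "ruling_date": 2,
             "facts_date": 2, "dispute_date": 2},
    "content": {"refs_legal_literature": 1, "refs_case_law": 1, "page_count": 1,
                "conflicting_rulings": 1, "rulings_same_judge": 2,
                "rulings_same_matter": 2, "reusable_in_other_cases": 3},
    "specificity": {"innovative_passage": 1, "refers_to_precedent": 1},
}


def weight_parameter(met_criteria):
    """Add the weighting factors of all (category, criterion) pairs that are met."""
    return sum(TABLE_1[category][criterion] for category, criterion in met_criteria)


# Highest attainable value: every criterion in every category is met
max_weight = sum(sum(criteria.values()) for criteria in TABLE_1.values())

# Hypothetical document meeting three criteria
example = {("judge", "higher_court"), ("law", "law_at_facts"), ("time", "ruling_date")}
example_weight = weight_parameter(example)
```

Under these assumptions the example document obtains a weight parameter of 2 + 3 + 2 = 7 out of a maximum of 38.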


Claims


A computer-implemented method for sorting a collection of legal-tax documents from the case law, characterized in that a list of responsive documents is obtained from this collection of legal-tax documents from the case law, wherein the ordering of the list includes a step in which a weight parameter is attributed to each legal-tax document by making a summary that is tested against at least one criterion, these criteria being categorized, the summaries first being transformed into a characteristic vector that can then serve as input for a computer, the summaries being represented as vectors in an m-dimensional space, each document $d$ from the document collection being described as a numerical characteristic vector $w(d) = (x(d, t_1), \ldots, x(d, t_m))$, where $T = \{t_1, \ldots, t_m\}$ represents a dictionary.

A computer-implemented method according to claim 1, characterized in that each category consists of at least one criterion, with a weighting factor attributed to each criterion.

A computer-implemented method according to claim 2, characterized in that the weight parameter is determined by performing a mathematical operation on the weighting factors.

A computer-implemented method according to claim 3, characterized in that tax-legal documents are ordered in the results list according to decreasing value of the weight parameter.

A computer-implemented method according to claim 1, characterized in that this collection of legal-tax documents is logically grouped in a storage folder in a memory unit.

A computer-implemented method according to claim 5, characterized in that this memory unit forms part of a computer system connected to the internet.

A computer-implemented method according to claim 1, characterized in that this collection of legal-tax documents is maintained by an authorized company.

A computer-implemented method according to claim 1, characterized in that the responsive documents contain information fields, preferably a title, an article, a court and a date.

A computer-implemented method according to claim 8, characterized in that the responsive documents contain action fields, preferably an action field that calls up a short summary or an action field that calls up the entire content of the document.

A computer-implemented method according to claim 9, characterized in that the short summary of a responsive document is preferably characterized by a title, an article, a court, a date, a book, and one or more tags.


A computer-implemented method according to claim 10, characterized in that the aforementioned tag refers to a keyword of the document.

A computer system for sorting a collection of legal-tax documents from the case law according to claims 1-11, characterized in that the computer system is configured for:

a. receiving legal-tax documents via at least one communication network;

b. storing these documents in a local non-transitory computer-readable information carrier;

c. obtaining instructions from the user via at least one communication network;

d. selecting a source group with tax-legal documents;

e. identifying information contained in the source group;

f. creating an ordered list based on the user's instructions from step (c) according to step (e);

g. storing the ordered list in a local memory;

h. retrieving the ordered list stored in the local memory;

i. transmitting via the at least one communication network the ordered list to the user's computer system from step (c).

A tangible non-transitory computer-readable information carrier comprising a computer program product suitable for sorting a collection of legal-tax documents from the case law according to claims 1-11, the computer program product comprising instructions for:

a. receiving legal-tax documents via at least one communication network;

c. obtaining instructions from the user via at least one communication network;

d. selecting a source group with tax-legal documents;

e. identifying information contained in the source group;

g. storing the ordered list in a local memory;

h. retrieving the ordered list stored in the local memory.