
WO2008030111A2 - Method of searching one or more databases - Google Patents

Method of searching one or more databases

Info

Publication number
WO2008030111A2
Authority
WO
WIPO (PCT)
Prior art keywords
user
sequences
searching
search
user identifier
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/NZ2007/000248
Other languages
English (en)
Other versions
WO2008030111A3 (fr)
Inventor
Matthias Thomas Frei
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CARTESIAN GRIDSPEED Ltd
Original Assignee
CARTESIAN GRIDSPEED Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CARTESIAN GRIDSPEED Ltd
Publication of WO2008030111A2
Publication of WO2008030111A3
Anticipated expiration
Ceased

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/10 Ontologies; Annotations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics

Definitions

  • The present invention relates to a method of searching one or more databases. More particularly, but not exclusively, it relates to methods for providing search services to authorised users.
  • the databases typically consist of collections of data on genomes and genetic sequences, whether DNA, RNA, or protein.
  • Patent specification WO 2005/124596 describes a data collection cataloguing method. The patent also describes a catalogued database.
  • the invention comprises a method of searching one or more subject genetic sequences maintained in at least one database.
  • the method comprises receiving a user identifier from a user; checking the received user identifier against a plurality of stored user identifiers; checking an authorisation record of the user identifier; on detecting a match between the received user identifier and one of the stored user identifiers, receiving a query genetic sequence; searching the or at least one of the subject genetic sequences in the database(s) for one or more sub-sequences of the query genetic sequence; determining a search result based at least partly on the results of searching the database(s) for the one or more sub-sequences; returning the search result to the user; and debiting the user for the search result.
  • This invention may also be said broadly to consist in the parts, elements and features referred to or indicated in the specification of the application, individually or collectively, and any or all combinations of any two or more of said parts, elements or features, and where specific integers are mentioned herein which have known equivalents in the art to which this invention relates, such known equivalents are deemed to be incorporated herein as if individually set forth.
  • Figure 1 shows a preferred form system in which the method operates.
  • Figure 1 illustrates one preferred form system 100 suitable for implementation of the method described in more detail below.
  • a principal component of the system 100 is a search engine or portal 105.
  • Search engine 105 accepts requests from authorised users, for example users 110 1...N, to search databases 115 1...M.
  • the client software includes a software library that can be integrated into other software applications.
  • the client software is a stand-alone program with a user interface, or web page content that runs inside an internet browser application.
  • a user operating the computing device first sends a user identifier over a data network to the search engine 105.
  • the search engine 105 has access to an authorisation database 120.
  • the authorisation database 120 contains a plurality of stored user identifiers, each with a related authorisation record. Each of the stored user identifiers represents a user that, based on the user's authorisation record, is authorised to request searches of the search engine 105.
  • a user installs client software or accesses a webpage related to search engine 105 on the user computing device.
  • the user must license the software from an authorised supplier and provide personal details to the search engine 105.
  • the search engine stores the user details in the authorisation database and assigns a user identifier to the user.
  • the user must also pay an access fee at the time of registration.
  • the search requests typically comprise a query genetic sequence.
  • the search pattern sequences in one form represent fragments of genomes or whole genomes, gene data, nucleotide sequences or proteins.
  • the user transmits a user identifier, for example a user name and password, to the search engine.
  • the search engine checks the authorisation database 120. If there is a match between the received user identifier and a stored user identifier in the authorisation database, and the user identifier has a valid authorisation record, then the search engine elicits a search request from the user. The user then transmits the search request in the form of a search pattern sequence to the search engine.
  • the subject genetic sequence could include a nucleotide sequence. In one search technique, this nucleotide sequence is divided into several consecutive sub-sequences of a length defined by the user.
  • One example given is ACGTCGTTCAGCATACCGT. This sequence is divided into four sub-sequences each of five characters in length where the user has requested divisions of 5.
  • the search engine 105 is interfaced to one or more databases 115 1...M. Databases 115 are available to the search engine 105.
  • the databases include arbitrary collections of data on genomes.
  • the arbitrary collections include publicly available data as well as privately collected data and proprietary data. Included in these databases are databases on known genes of humans, animals, other creatures and plants as well as other genetic sequences of known or unknown function.
  • the databases contain one or more subject genetic sequences.
  • a query genetic sequence is broken into different sub-sequences.
  • the search engine 105 searches databases 115 1...M for one or more of these sub-sequences of the search pattern sequence.
  • Patent specification WO 2005/124596 describes how the databases are catalogued using an index array and a location array to assist with searching.
  • the search engine 105 determines a search result based at least partly on the results of searching the or at least one of the subject genetic sequences in databases 115 1...M for one or more of the sub-sequences. Where, for example, each sub-sequence provided to the search engine is found in a consecutive sequence stored in one of the databases, the result of the search will be that sequence. Where some but not all of the sub-sequences are found in the databases, search engine 105 determines the success or otherwise of the search and provides the search result back to the user.
  • the search results are returned to the user or users 110 1...N.
  • a usage fee is due.
  • the user is required to pay up front for any number of searches.
  • When the search result is returned to the user, the user is debited a fee for the search result.
  • the authorisation database 120, or an associated accounting component, keeps a usage account for each user to track paid fees against performed searches. If the number of prepaid searches falls below a specific threshold, the user is reminded to top up the usage fee account by a payment.
  • subject genetic sequences are maintained in databases 115 1...M. It is envisaged that more than one subject genetic sequence be maintained in each database.
  • Each subject genetic sequence is maintained as a separate record.
  • Each record includes at least two data fields. These data fields include a sequence identifier and the actual sequence data.
  • annotation fields include details of submitter, function and source species. These additional annotation fields are associated with individual subject genetic sequences by storing the annotation fields in the same data record as the subject genetic sequences. Alternatively, the annotation fields are stored in a record that is linked or cross-referenced to the data record containing the genetic sequence.
  • This private set of annotations includes the user's personal reassessment of the function, what evidence would support that special function annotation, and what subsequent work is being done with that sequence.
  • the or at least one of the annotation fields is associated with a user identifier.
  • This association in one form allows annotations to record the submitter and time and date of that submission.
  • association of annotation fields with the user identifier enables the user to view annotation fields that are associated with that user. This means that those annotation fields associated with the user can be displayed to the associated user and not to any other user. This permits a user to maintain a private set of annotations.
  • It is envisaged that the user be permitted to make modifications directly to one or more of the annotation fields that are associated with that user identifier.
  • Prior art techniques require a request to be put to a database manager who then makes the required changes. Those required changes are then made available to all other users.
  • the user is able to select one or more subject genetic sequences to search using a query genetic sequence.
  • the user is able to transmit, for example by FTP, at least one subject genetic sequence in order to make that subject genetic sequence available for searching with the query genetic sequence.
  • the user maintains the query genetic sequence in computer memory.
  • Computer memory includes for example a data cache or searchable hard disk.
  • the query genetic sequence is able to be searched for one or more sub-sequences of the or at least one of the subject genetic sequences.
  • a search result is then determined based at least partly on the results of searching the query genetic sequence for the one or more sub-sequences.
  • the results are returned to the user as search results.
  • a user search in which a subject genetic sequence and a query genetic sequence are each searched for sub-sequences of the other is known as an "all by all search". It is envisaged that the user be provided with the facility to conduct all by all searches.
  • One example of the use of an all by all search would be to search all possum genomic reads by all possum genomic reads to cluster them into related groups.
  • a further example would be to search all lactation genes from all mammals (for example possum, mouse, human and cow) by way of an all by all search to cluster them into functional groups.
  • Suitable protection mechanisms are in place to prevent individual users accessing the search results of other users.
  • Search engine 105 does not store the results of the searches for a longer period than is strictly necessary to return the search result to the user.
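
The division of a query into fixed-length sub-sequences, and the search of subject sequences for those sub-sequences, can be sketched as follows. This is a minimal illustration only: the function names and the in-memory dictionary of subjects are assumptions, and the naive substring scan stands in for the index array and location array cataloguing of WO 2005/124596, which is not reproduced here. Note that the example sequence as printed above has 19 characters, so with divisions of 5 the last sub-sequence in this sketch has four characters.

```python
def split_into_subsequences(sequence: str, length: int) -> list[str]:
    """Divide a sequence into consecutive sub-sequences of `length` characters."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [sequence[i:i + length] for i in range(0, len(sequence), length)]


def search_subjects(query: str, length: int, subjects: dict[str, str]) -> dict[str, dict]:
    """Search every subject sequence for the sub-sequences of the query.

    Returns, per subject with at least one hit, the sub-sequences found and
    whether the whole query occurs as one consecutive run in that subject.
    """
    chunks = split_into_subsequences(query, length)
    results = {}
    for subject_id, subject in subjects.items():
        found = [chunk for chunk in chunks if chunk in subject]
        if found:
            results[subject_id] = {"found": found, "complete": query in subject}
    return results


if __name__ == "__main__":
    query = "ACGTCGTTCAGCATACCGT"
    subjects = {  # hypothetical subject sequences
        "s1": "TT" + query + "AA",
        "s2": "GGGTTCAGGG",
        "s3": "AAAAAAA",
    }
    print(split_into_subsequences(query, 5))
    print(search_subjects(query, 5, subjects))
```

A subject in which every sub-sequence is found consecutively corresponds to the "complete" case; a subject with only some hits corresponds to the partial case for which the search engine decides the success or otherwise of the search.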
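
The authorisation check and the prepaid usage account described above might look roughly like the sketch below. The class names, the threshold value and the plain-text password comparison are all assumptions made to keep the example short; the specification does not say how identifiers are stored or what the reminder threshold is.

```python
from dataclasses import dataclass


@dataclass
class Account:
    password: str          # kept in clear text only for brevity (assumption)
    authorised: bool       # stands in for the authorisation record
    prepaid_searches: int  # usage account balance, counted in searches


class SearchPortal:
    LOW_BALANCE_THRESHOLD = 2  # hypothetical reminder threshold

    def __init__(self, accounts: dict[str, Account]):
        self.accounts = accounts  # plays the role of authorisation database 120

    def is_authorised(self, user_id: str, password: str) -> bool:
        account = self.accounts.get(user_id)
        return (account is not None
                and account.password == password
                and account.authorised)

    def run_search(self, user_id: str, password: str, query: str, search_fn):
        """Check the user, run the search, debit one search, flag a reminder."""
        if not self.is_authorised(user_id, password):
            raise PermissionError("unknown user or invalid authorisation record")
        account = self.accounts[user_id]
        if account.prepaid_searches <= 0:
            raise RuntimeError("no prepaid searches left")
        result = search_fn(query)
        account.prepaid_searches -= 1  # debit only once a result exists
        remind = account.prepaid_searches < self.LOW_BALANCE_THRESHOLD
        return result, remind


if __name__ == "__main__":
    portal = SearchPortal({"alice": Account("secret", True, 3)})
    print(portal.run_search("alice", "secret", "ACGT", lambda q: q.lower()))
```

The debit is placed after the search call so that a failed search does not consume a prepaid search; that ordering is a design choice of this sketch, not something the specification states.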
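
The record layout with user-associated annotation fields can be illustrated with a small data model. The sketch assumes annotations are stored in the same record as the sequence (the first of the two storage options described above); the class and field names are hypothetical, and shared annotations visible to all users are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    owner: str   # user identifier the annotation field is associated with
    name: str    # e.g. "function" or "evidence"
    text: str


@dataclass
class SequenceRecord:
    sequence_id: str
    sequence: str
    annotations: list[Annotation] = field(default_factory=list)

    def visible_to(self, user_id: str) -> list[Annotation]:
        """Return only the annotation fields associated with this user."""
        return [a for a in self.annotations if a.owner == user_id]

    def set_annotation(self, user_id: str, name: str, text: str) -> None:
        """Let a user create or modify their own annotation field directly."""
        for annotation in self.annotations:
            if annotation.owner == user_id and annotation.name == name:
                annotation.text = text
                return
        self.annotations.append(Annotation(user_id, name, text))


if __name__ == "__main__":
    record = SequenceRecord("seq-1", "ACGTCGTT")
    record.set_annotation("alice", "function", "putative lactation gene")
    record.set_annotation("bob", "function", "unknown")
    print(record.visible_to("alice"))
```

Because each annotation carries its owner, a user can edit a private note without a request to a database manager and without the change becoming visible to other users.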
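
An all by all search used for clustering, as in the possum reads example, could be sketched as below. The relatedness rule (two sequences are related if one contains a fixed-length sub-sequence of the other) and the grouping by connected components are assumptions chosen for illustration; the specification does not define how clusters are formed.

```python
from itertools import combinations


def chunks(sequence: str, length: int) -> set[str]:
    """Consecutive full-length sub-sequences of a sequence."""
    return {sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, length)}


def related(a: str, b: str, length: int) -> bool:
    """Search each sequence for sub-sequences of the other."""
    return (any(c in b for c in chunks(a, length))
            or any(c in a for c in chunks(b, length)))


def cluster(sequences: dict[str, str], length: int) -> list[set[str]]:
    """Group sequence ids into connected components of the 'related' relation."""
    parent = {sid: sid for sid in sequences}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(sequences, 2):
        if related(sequences[a], sequences[b], length):
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for sid in sequences:
        groups.setdefault(find(sid), set()).add(sid)
    return sorted(groups.values(), key=sorted)


if __name__ == "__main__":
    reads = {"r1": "ACGTACGTAA", "r2": "TTACGTAGG", "r3": "CCCCCCCC"}  # made-up reads
    print(cluster(reads, 4))
```

The pairwise loop is quadratic in the number of sequences, which is acceptable for a sketch but would need an index for genome-scale read sets.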

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biophysics (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method of searching one or more subject genetic sequences maintained in at least one database. The method comprises receiving a user identifier from a user; checking the received user identifier against a plurality of stored user identifiers; checking an authorisation record of the user identifier; on detecting a match between the received user identifier and one of the stored user identifiers, receiving a query genetic sequence; searching the or at least one of the subject genetic sequences in the database(s) for one or more sub-sequences of the query genetic sequence; determining a search result based at least partly on the results of searching the database(s) for the one or more sub-sequences; returning the search result to the user; and debiting the user for the search result.
PCT/NZ2007/000248 2006-09-06 2007-09-06 Method of searching one or more databases Ceased WO2008030111A2 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
NZ549692 2006-09-06
NZ54969206 2006-09-06

Publications (2)

Publication Number Publication Date
WO2008030111A2 (fr) 2008-03-13
WO2008030111A3 (fr) 2008-06-26

Family

ID=39157691

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/NZ2007/000248 Ceased WO2008030111A2 (fr) 2006-09-06 2007-09-06 Method of searching one or more databases

Country Status (1)

Country Link
WO (1) WO2008030111A2 (fr)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6941317B1 (en) * 1999-09-14 2005-09-06 Eragen Biosciences, Inc. Graphical user interface for display and analysis of biological sequence data
JP2004500048A (ja) * 1999-10-26 2004-01-08 バイオロジカル・ターゲッツ・インコーポレーテッド Gene search system and method
US20030113756A1 (en) * 2001-07-18 2003-06-19 Lawrence Mertz Methods of providing customized gene annotation reports
US20030220844A1 (en) * 2002-05-24 2003-11-27 Marnellos Georgios E. Method and system for purchasing genetic data

Also Published As

Publication number Publication date
WO2008030111A3 (fr) 2008-06-26

Similar Documents

Publication Publication Date Title
KR100481141B1 (ko) System and method for providing search-term advertising that extracts a search listing order in response to a given search request
US7266551B2 (en) Method and system for generating a set of search terms
CN100437581C (zh) Method and system for managing search listing impressions based on advertisement groups
US7032229B1 (en) Automatic tracking of user progress in a software application
JP3877188B2 (ja) Electronic currency system
WO2003023563A2 (fr) System and method for modifying the ranking of search results based on the number of votes cast by end users and advertisers
CN1201197A (zh) Network billing server
US20070276800A1 (en) Method For Controlling Display Of Keyword Advertisement In Internet Search Engine And A System Thereof
US6920426B2 (en) Information ranking system, information ranking method, and computer-readable recording medium recorded with information ranking program
CA2391829A1 (fr) Time-shared electronic catalog system and associated method
WO2003014865A2 (fr) System and method for providing rank and price protection in a search listing generated by a computer network search engine
US20150106883A1 (en) System and method for researching and accessing documents online
TWI521466B (zh) Computing device for data management and decision-making
KR20040059115A (ko) Internet advertising using a keyword-bidding search engine
KR100460010B1 (ko) Search-term advertising service method and system for providing advertisers' search information in response to search requests from partner sites
JP3464881B2 (ja) Dictionary construction apparatus and method
WO2008030111A2 (fr) Method of searching one or more databases
JP2009086727A (ja) Image display device and program
KR101144426B1 (ko) Knowledge advertisement exposure method and system using a knowledge search service
KR100479363B1 (ko) Knowledge advertisement exposure method and system using a knowledge search service
EP4443369A1 (fr) Computer system enabling targeted advertising using personal information stored in a plurality of database units of a plurality of partner companies, and method and program executed in said computer system
KR20050097154A (ko) Keyword bid control method and keyword bid control system for Internet search advertising
US20020072981A1 (en) Search engine adapted to permit real time querying of a set of internet sites
US20020078217A1 (en) Online alcoholic beverage license verification system
KR100488887B1 (ko) Keyword bid control method and keyword bid control system for Internet search advertising

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07834852

Country of ref document: EP

Kind code of ref document: A2

122 Ep: pct application non-entry in european phase

Ref document number: 07834852

Country of ref document: EP

Kind code of ref document: A2