
WO2015092873A1 - Information processing system and information processing method - Google Patents

Information processing system and information processing method

Info

Publication number
WO2015092873A1
Authority
WO
WIPO (PCT)
Prior art keywords
computer
computers
data processing
execution
response time
Prior art date
Application number
PCT/JP2013/083818
Other languages
French (fr)
Japanese (ja)
Inventor
仁史 藪崎
洋 中越
耕一 村山
崇利 加藤
Original Assignee
株式会社日立製作所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 株式会社日立製作所
Priority to PCT/JP2013/083818
Publication of WO2015092873A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs

Definitions

  • The present invention relates to an information processing system and an information processing method for improving performance by changing the arrangement configuration of a system including a plurality of computers.
  • As a technique for collectively managing servers in distributed data centers, there is the technique described in Patent Document 1.
  • Patent Document 1 discloses a data center management device. On receiving from a user terminal information indicating the data to be migrated and the conditions for the migration destination, the device identifies, according to a predetermined standard, candidate data centers among a plurality of data centers whose data center information (which indicates their characteristics) meets those conditions, and transmits the candidates to the user terminal.
  • On receiving from the user terminal the selection of a migration destination data center from among the candidates, the device receives the data to be migrated from the migration source data center and transmits it to the migration destination data center.
  • Patent Document 1 does not disclose updating the data center information that indicates the characteristics of each data center managed by the data center management device.
  • Because the data center information in Patent Document 1 is static information such as location, cost, and SLA, it does not consider collecting and updating, in real time, dynamic information that varies with time, such as response time. For this reason, depending on the operation state and load state of the data center, the migration destination data center may fail to satisfy the conditions requested by the user.
  • The present invention has been made in view of the circumstances described above. Its object is to improve the performance of the entire system by dynamically changing the data arrangement configuration, even in a large-scale system in which it is difficult to collect and analyze measurement information indicating the operation state and load state of each computer constituting the system.
  • The computer system in this embodiment places applications and data on different computers on a trial basis, measures the response time, compares it with the response time in the original configuration, and adopts the arrangement configuration with the smaller response time.
  • By repeating this trial placement and evaluation, the response time is improved step by step.
  • One aspect is a computer system in which a plurality of terminal devices and a plurality of computers are connected via a network, and each of one or more first computers among the plurality of computers holds a program having the same function in an executable state.
  • One of the first computers copies the program to one or more second computers, which are other computers among the plurality of computers.
  • The second computer sets the copied program to an executable state, and the first and second computers execute the program in response to requests from the plurality of terminal devices.
  • Each of the plurality of terminals measures the response time of its request.
  • Any one of the first and second computers obtains, from the plurality of terminals, the response times of the requests and information on the computers that processed the requests, identifies a relatively long response time among them, and issues an instruction to stop the program being executed by the computer that processed the request with the identified response time.
  • Schematic examples of the present invention are shown in the figures.
  • Applications and data are arranged on different execution bases on a trial basis, the response time is measured and compared with the response time in the original arrangement configuration, and the arrangement configuration with the smaller response time is adopted.
  • By repeating this, the response time is improved step by step.
  • The response time may be improved by copying the application or data to multiple execution platforms so that each terminal accesses the nearest copy.
  • However, the cost of computing resources then increases. Therefore, when the number of replicas reaches a specific number, unnecessary execution platforms are stopped.
  • Likewise, performing trial placement on many execution platforms at the same time makes it possible to find a configuration with a small response time in a small number of trials and to recover early when responsiveness deteriorates, but the cost associated with the trials increases. Therefore, trial placement is performed within a specific number according to the situation.
  • The sub-application (Sub App1) and data (Data1) placed in the EU data center (DC1) are copied to the Asian data center (DC2) on a trial basis.
  • After copying, processing is executed by both sub-applications (FIG. 1 (b)), and when the response time of Sub App1 in DC2 is shorter than that of Sub App1 in DC1, Sub App1 in DC1 is stopped (FIG. 1 (c)).
  • The trial placement and evaluation process is then repeated in the same manner for Sub App1 in DC2 (FIG. 1 (d)).
  • Sub App1 of DC2 is copied to DC3.
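  • The DC1, DC2, DC3 walk-through above can be sketched as a simple loop. This is a minimal illustration, not the patented implementation: the function name `trial_placement`, the `measure` callback, and the response-time figures are all hypothetical, and the trials are run one at a time rather than in parallel.

```python
from typing import Callable, Iterable


def trial_placement(current: str,
                    candidates: Iterable[str],
                    measure: Callable[[str], float]) -> str:
    """Return the location that is kept after trying each candidate in turn."""
    best = current
    best_time = measure(best)
    for site in candidates:
        # Trial: a copy is made operable at `site` next to the current placement.
        trial_time = measure(site)
        if trial_time < best_time:
            # The copy responds faster, so the previous placement is stopped.
            best, best_time = site, trial_time
        # Otherwise the trial copy is stopped and the placement stays unchanged.
    return best


# Hypothetical response times in milliseconds, for illustration only.
SAMPLE_MS = {"DC1": 180.0, "DC2": 95.0, "DC3": 120.0}
chosen = trial_placement("DC1", ["DC2", "DC3"], SAMPLE_MS.__getitem__)
print(chosen)
```

  • With these sample figures the copy in DC2 replaces DC1, and the later trial in DC3 is discarded because it is slower than DC2.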
  • The system in the present embodiment is better suited to applications that do not require strong consistency than to applications that do, such as transaction processing.
  • In this embodiment, the application and data are temporarily copied to different locations by trial and error. For this reason, in the case of an application that requires strong consistency, data updated at the time of replication needs to be synchronized between a plurality of locations. Further, in this embodiment, the system at a location judged to be unnecessary is stopped, but the updated data must be merged before it is stopped. From the viewpoint of ease of implementation, an application that allows reference to old data (that is, one for which weak consistency is acceptable) is suitable for the system in this embodiment.
  • The information processing system of this embodiment includes a network 110, a data processing system 100 (100-1 to 100-n), a management computer 130, and a terminal 120 (120-1 to 120-n).
  • The network 110 is a WAN (Wide Area Network) or a LAN (Local Area Network), and may be a virtual network.
  • The data processing system 100 is a system that processes data in response to access from the terminal 120.
  • The data processing system 100 includes a processor 200, a main storage device 300, a communication interface (I / F) 400, and an external storage device 500.
  • The main storage device 300 or the external storage device 500 stores a data processing program 630, an execution base 650, a location server 620, and a management function 640.
  • Although the data processing system 100 is shown as a single computer in FIG. 3, it may be a system composed of a plurality of computers. Further, it may be multi-tenant, accommodating different applications, or single-tenant, accommodating a single application.
  • The execution platform 650 is middleware for operating the data processing program 630, and includes an OS (Operating System).
  • The location server is a system that performs name resolution, such as DNS (Domain Name System) or a global name service, and holds a name resolution table composed of combinations of URLs and IP addresses.
  • The data processing program 630 is, for example, an application program such as an SNS (Social Networking Service) or e-commerce, or a program that operates to measure and control the state of a server, storage, or network.
  • The data processing program 630 may be a part of a program constituting the application, such as an application plug-in or a web front end.
  • The location server 620 manages the location of the execution base 650 where the data processing program 630 and data are stored and, in response to inquiries from the terminal 120, the data processing program 630, and the management function 640, notifies them of that location.
  • Since the location server 620 is equivalent to existing technologies such as a global name service and DNS (Domain Name System), detailed description is omitted.
  • the management function 640 will be described later with reference to FIG.
  • the management computer 130 is a management device that manages the data processing system 100, and includes an input device, an operation screen, a processor, a storage device, and the like used by the administrator of the data processing system 100.
  • the management computer 130 generates and manages application status information 1000, execution platform status information 1100, and data processing program status information 1200, and notifies the data processing system 100 of them. For this reason, the operation screen of the management computer 130 displays such information, a setting button, a change button, and a delete button so that the administrator can change the setting.
  • In this embodiment, the data processing system 100 is configured as an autonomous distributed control architecture having a management function 640.
  • Alternatively, the management function 640 may reside not in the data processing system 100 but in the management computer 130, or in both.
  • the management computer 130 may exist in one or a plurality of execution platforms 650.
  • the terminal 120 is a device for generating and utilizing data, such as a smartphone, a tablet, a notebook PC, a construction device, a medical diagnostic device, a smart meter, a farming device, a car, an elevator, an escalator, and the like.
  • The terminal may include the management function 640 for managing the data processing program 630 on the terminal and managing the response time. Further, the terminal requests the same or similar data processing from a plurality of data processing systems 100, selects the data processing program 630 that responds with a short response time, and has it execute the next data processing.
  • Alternatively, the terminal notifies the location server 620 of the response time from requesting the data processing until the data processing system 100 responds, and requests the next data processing from the data processing program 630 or execution platform designated by the location server 620 based on the notified response time or the like.
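  • The terminal-side selection described above (send the same request to several data processing systems, then keep using the one that answers fastest) might look like the following sketch. The names `pick_fastest` and `FakeClock` and the delay values are assumptions made for illustration; a real terminal would issue network requests and would probably send them concurrently.

```python
import time
from typing import Callable, Dict, Tuple


def pick_fastest(endpoints: Dict[str, Callable[[], object]],
                 clock: Callable[[], float] = time.perf_counter
                 ) -> Tuple[str, Dict[str, float]]:
    """Send the same request to every endpoint and return the quickest one."""
    timings: Dict[str, float] = {}
    for name, send_request in endpoints.items():
        start = clock()
        send_request()                      # request + wait for the response
        timings[name] = clock() - start     # response time seen by the terminal
    fastest = min(timings, key=timings.get)
    return fastest, timings


class FakeClock:
    """Deterministic clock so the example does not depend on real latency."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


clock = FakeClock()
# Hypothetical execution platforms with made-up response delays (seconds).
endpoints = {
    "650-1": lambda: clock.advance(0.20),
    "650-2": lambda: clock.advance(0.05),
}
fastest, timings = pick_fastest(endpoints, clock)
print(fastest)
```

  • The measured `timings` are also what the terminal would report to the location server in the alternative described above.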
  • FIG. 3 shows the hardware configuration of the data processing system 100 and the logical configuration of the management function.
  • the management function 640 includes a management calculation unit 642 and a management information storage unit 644.
  • the management calculation unit 642 includes an application quality evaluation function 6410, an execution history analysis function 6420, an execution base management function 6430, a communication function 6440, and a data processing program management function 6460.
  • the application quality evaluation function 6410 evaluates a response time, which is a time from when the terminal 120 requests data processing when using an application until a response is returned. For the measurement and evaluation, for example, an existing method such as collecting and averaging response times measured by a plurality of terminals can be applied, and thus detailed description is omitted.
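  • As one possible reading of "collecting and averaging response times measured by a plurality of terminals", the sketch below averages terminal reports per execution base and picks the base with the longest average. The report format, the function names, and the rule "longest average" are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

# One report = (terminal id, execution base id, response time in seconds).
Report = Tuple[str, str, float]


def average_response_times(reports: Iterable[Report]) -> Dict[str, float]:
    """Average the response times reported by terminals, per execution base."""
    by_base: Dict[str, List[float]] = defaultdict(list)
    for _terminal, base, seconds in reports:
        by_base[base].append(seconds)
    return {base: mean(values) for base, values in by_base.items()}


def slowest_base(averages: Dict[str, float]) -> str:
    """Return the execution base with the longest average response time."""
    return max(averages, key=averages.get)


# Hypothetical measurements, for illustration only.
reports = [
    ("120-1", "650-1", 0.30), ("120-2", "650-1", 0.50),
    ("120-1", "650-2", 0.10), ("120-2", "650-2", 0.20),
]
averages = average_response_times(reports)
stop_candidate = slowest_base(averages)
print(stop_candidate)
```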
  • The data processing program management function 6460 confirms the requirements of the data processing program 630, operates or stops the data processing program 630, and specifies the management group to which the data processing program 630 belongs.
  • The execution base management function 6430 calculates the control cycle for operating or stopping the data processing program 630 and the number of data processing programs 630 to be operated or stopped, and specifies the execution base on which the data processing program 630 is operated or stopped.
  • The communication function 6440 exchanges information between management functions. When there are a plurality of roles for the management functions, the task management function 6450 determines which role its own management function takes on and executes it.
  • The execution history analysis function 6420 derives, from history information on the response times observed when the data processing program 630 was operated or stopped in the past, information that assists in or determines the choice of the execution base on which the data processing program 630 is operated or stopped.
  • the application / terminal information storage unit 6470 holds application state information 1000.
  • the execution base group information storage unit 6475 holds execution base state information 1100.
  • The execution history information storage unit 6480 holds past execution history information, such as the change in response time when the data processing program 630 was moved in the past and the change in response time when a failure occurred or the access load from terminals suddenly increased.
  • the management group information storage unit 6485 holds status information of the management function, task list information carried by the management function, and the like.
  • the data processing program information storage unit 6490 holds data processing program status information 1200, data processing program requirements, and the like. Each information is shown in detail below.
  • the application state information is information indicating conditions such as SLA (Service Level Agreement) requested by the application and the current state. This information is referred to when controlling the operation and stop of the data processing program 630 in consideration of the state and characteristics of the application.
  • the application status information includes, for example, an application, a data processing program, an arrangement execution base, a response time, a request response time, a quality degradation allowable time, an operation cost, a related terminal group, a terminal average position, and the like.
  • The application indicates an identifier of the application.
  • The data processing program indicates an identifier of the data processing program 630 constituting the application.
  • The placement execution base indicates an identifier of the execution base on which the data processing program 630 is placed.
  • The response time is the time from when a terminal requests data processing from the data processing program stored on that placement execution base until the response is returned to the terminal. It is not the processing delay in the server.
  • The request response time is the response time required by the application (that is, a target or a constraint).
  • The quality degradation time is the time from when the response time stops meeting the request response time until it meets it again; the quality degradation allowable time is the upper limit allowed for that time.
  • the operating cost is the cost required to operate.
  • the operation cost may be classified into two, an initial cost for operating in the initial stage and a running cost that is a constant cost.
  • the initial cost includes a communication cost generated when an application or data is deployed on an execution platform, and a storage write cost.
  • the running cost includes, for example, an instance use cost and a communication cost associated with data transmission / reception with a terminal.
  • the operation cost may be managed by dividing it into an initial cost and a running cost.
  • The related terminal group is a set of terminals that use the application.
  • The set of terminals is defined, for example, by a customer segment such as the region where the terminals are located, the language, the user age, or the importance to the company providing the application.
  • the related terminal group may be specified for each data processing program 630.
  • the terminal average position indicates a physical average position of the related terminal group or a logical position on the network.
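  • The items of the application state information listed above can be pictured as one record per data processing program. The sketch below is only an assumed layout: the field names, units, and the two helper methods are hypothetical and are not taken from the figure of the application state information 1000.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ApplicationState:
    """One row of application state information (assumed layout)."""
    application: str                    # identifier of the application
    data_processing_program: str        # identifier of the program 630
    placement_execution_base: str       # identifier of the execution base
    response_time_ms: float             # measured at the terminal
    request_response_time_ms: float     # target or constraint of the application
    quality_degradation_allowable_s: float
    initial_cost: float                 # deployment traffic, storage writes
    running_cost: float                 # instance use, terminal traffic
    related_terminals: List[str] = field(default_factory=list)
    terminal_average_position: Tuple[float, float] = (0.0, 0.0)

    def meets_request_response_time(self) -> bool:
        """True when the measured response time satisfies the requirement."""
        return self.response_time_ms <= self.request_response_time_ms

    def operation_cost(self) -> float:
        """Operation cost as the sum of the initial cost and the running cost."""
        return self.initial_cost + self.running_cost


# Hypothetical values, for illustration only.
row = ApplicationState(
    application="App1",
    data_processing_program="SubApp1",
    placement_execution_base="650-1",
    response_time_ms=240.0,
    request_response_time_ms=200.0,
    quality_degradation_allowable_s=60.0,
    initial_cost=5.0,
    running_cost=12.5,
    related_terminals=["120-1", "120-2"],
)
print(row.meets_request_response_time(), row.operation_cost())
```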
  • Execution base state information 1100 is illustrated in FIG.
  • The status information of the execution base is information indicating the characteristics of the execution base 650 and the characteristics of the execution base group to which the execution base 650 belongs.
  • The execution base state information is referred to when performing control related to the operation and stop of the data processing program 630 in consideration of the state and characteristics of the execution base and the state and characteristics of the execution base group.
  • The execution base state information includes, for example, an execution base, execution base characteristics, an execution base group, and execution base group characteristics.
  • The execution base and the execution base group are the identifier of the execution base 650 and the identifier of the execution base group to which the execution base 650 belongs.
  • The execution platform characteristics include, for example, the execution platform type, the operator that provides the execution platform 650, the charging model for the service provided by the execution platform 650, and the location of the execution platform.
  • The execution base group characteristics include, for example, the degree of distribution of the positions of the execution bases belonging to the execution base group, the control cycle for operating and stopping the data processing program 630, and new data in the execution base group.
  • FIG. 6 illustrates status information 1200 of the data processing program.
  • the status information of the data processing program is information indicating the system configuration and operation status of the data processing program 630.
  • the status information 1200 of the data processing program is referred to when performing control related to the operation and stop of the data processing program 630 in consideration of the configuration and operation status of the data processing program 630.
  • The status information 1200 of the data processing program includes, for example, a data processing program, a data processing program attribute, a management group, an operation / stop flag, a reference data processing program, a referenced data processing program, and an execution platform.
  • the data processing program is an identifier of the data processing program 630.
  • the data processing program attribute indicates an attribute of the data processing program 630, and indicates, for example, an attribute such as a Web front server, an App server, a DB server, or a required high level of data consistency.
  • the management group indicates the identifier of the management group to which the data processing program 630 belongs.
  • the operation / stop flag indicates whether or not the data processing program 630 can be newly operated on an arbitrary execution platform 650 and whether or not the data processing program 630 can be stopped.
  • The reference data processing program indicates an identifier of a different data processing program 630 to which the data processing program 1210 refers.
  • The referenced data processing program indicates an identifier of a different data processing program 630 that refers to the data processing program 1210.
  • the execution base indicates an identifier of the execution base for operating the data processing program 1210.
  • The data processing program management function holds the requirements of the data processing program 630 for each data processing program 630.
  • The requirements of a data processing program here are requirements related to computing resources such as CPU, memory, storage capacity, and communication bandwidth; SLAs such as response time, availability, and failure recovery time; KPIs (Key Performance Indicators) such as PV (Page View), service sales, ROI (Return On Investment, indicating sales relative to the costs associated with cloud use), and customer satisfaction; and the costs associated with cloud use.
  • Fig. 7 (a) shows the status information of the management function.
  • the management function status information is information indicating the status of the management function 640.
  • the management function status information holds information necessary for the management function 640 to control the operation and stop of the data processing program 630.
  • the management function status information is classified into, for example, a management function, a management group, a task, an operation status, an arrangement execution base, and a management target data processing program.
  • the management group is as described above.
  • The management function indicates an identifier of the management function 640.
  • the task is an identifier of a task that the management function 640 bears.
  • the tasks include, for example, an inter-management group information transmission task, an analysis task for analyzing the status within the management group, and a resource task for requesting computing resources.
  • the operating status is the operating status of the management function.
  • the management target data processing program is an identifier of a data processing program managed by the management function.
  • the placement execution base is an identifier of
  • Fig. 7 (b) shows the task information of the management function.
  • the task information is information that the management function 640 refers to when determining a task that the management function 640 is responsible for.
  • the task information is uniquely set for the same management group.
  • the task information includes information such as task name and priority.
  • Each management function 640 can execute any task and, for example, executes, according to the computing resources available to it, a task that has a high priority and is not assigned to another management function 640.
  • a task may be fixedly assigned to the management function 640.
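  • A minimal sketch of this autonomous task selection follows. It assumes that a larger number means a higher priority and that each task has a known resource cost; both assumptions, the task names, and the function name `choose_task` are hypothetical, since the text does not specify how priorities or resources are encoded.

```python
from typing import Dict, Optional, Set


def choose_task(priorities: Dict[str, int],
                resource_cost: Dict[str, int],
                assigned: Set[str],
                available_resources: int) -> Optional[str]:
    """Pick the highest-priority unassigned task that fits the resources."""
    candidates = [
        task for task in priorities
        if task not in assigned                      # not taken by another 640
        and resource_cost[task] <= available_resources
    ]
    if not candidates:
        return None                                  # nothing suitable right now
    return max(candidates, key=lambda task: priorities[task])


# Hypothetical task list shared within one management group.
priorities = {"aggregate": 3, "exchange": 2, "analyze": 1}
resource_cost = {"aggregate": 4, "exchange": 2, "analyze": 1}

picked = choose_task(priorities, resource_cost,
                     assigned={"aggregate"}, available_resources=2)
print(picked)
```

  • A fixed assignment, as mentioned above, would simply bypass this selection for the management function concerned.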
  • FIG. 8 is a sequence diagram showing an outline of the operation of the data processing system in the present embodiment. First, a flow of a series of operations will be described, and a specific example of this operation will be described.
  • In step 810, the management function 640-1 of the data processing system 1 selects the data processing system 2 based on a predetermined condition and makes a program 630-2, identical to the data processing program 630-1, ready for operation on the execution platform 650-2 of the data processing system 2.
  • the management function 640-1 may send a request to the management function 640-2 so that the management function 640-2 can operate the data processing program 630-2.
  • The management function 640-1 updates the name resolution table managed by the location server 620-1 based on the information it holds on the location of the execution base on which the data processing program 630-1 operates.
  • Because the program 630-1 (630-2) operates on the execution bases of both data processing systems 1 and 2, information on the locations of both execution bases is stored in the table. For example, if the name resolution table is a combination of a URL and an IP address, the IP address of the execution base is added to the row of the URL indicating the data processing program 630-1.
  • The table may be updated when the information on the location of the execution base where the data processing program 630 operates, held by the management function 640, is updated, or at fixed time intervals based on a timer. Hereafter, for simplicity, description of updating the name resolution table is omitted, but the name resolution table is updated at the above timing.
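  • The name resolution table update can be illustrated with a small in-memory table that maps one URL to several IP addresses. This is a sketch under assumptions: the class name, the example URL, and the addresses (taken from documentation-only address ranges) are hypothetical, and a real location server would be a DNS or global name service rather than a Python dictionary.

```python
from typing import Dict, List


class NameResolutionTable:
    """URL -> list of execution base IP addresses (assumed table layout)."""

    def __init__(self) -> None:
        self._rows: Dict[str, List[str]] = {}

    def add(self, url: str, ip: str) -> None:
        """Register an execution base on which the program became operable."""
        addresses = self._rows.setdefault(url, [])
        if ip not in addresses:
            addresses.append(ip)

    def remove(self, url: str, ip: str) -> None:
        """Drop an execution base on which the program was stopped."""
        addresses = self._rows.get(url, [])
        if ip in addresses:
            addresses.remove(ip)

    def resolve(self, url: str) -> List[str]:
        """Answer an inquiry with every known access destination."""
        return list(self._rows.get(url, []))


table = NameResolutionTable()
URL = "http://app.example/subapp1"           # hypothetical program URL
table.add(URL, "192.0.2.10")                 # execution base 650-1
table.add(URL, "198.51.100.20")              # execution base 650-2 (trial copy)
both = table.resolve(URL)
table.remove(URL, "192.0.2.10")              # program on 650-1 stopped
remaining = table.resolve(URL)
print(both, remaining)
```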
  • In step 820, when the application platform 720 of the terminal 120 inquires of the location server 620-1 about an access destination necessary for executing the data processing program 630-1 (630-2), the location server 620-1 notifies the application platform 720 of the access destinations (execution platforms 650-1 and 650-2) on which the data processing program 630-1 (630-2) can be executed.
  • The inquiry is made when the user makes an input to the terminal 120-1, at regular intervals, or at a predetermined time.
  • the location server 620 grasps all or any execution platform 650 in which the data processing program 630 can operate by sharing information between the management functions 640.
  • the access destination notified by the location server 620 is one or a plurality of execution platforms 650 determined by the management function 640 in Step 3240 and Step 3250. Since there is an existing technology such as DNS for dealing with inquiries about access destinations, a detailed description is omitted.
  • In step 830, the application 710 of the terminal 120 requests data processing from the execution platforms 650-1 and 650-2, which are the access destinations grasped in step 820, selects the one that returns the result quickly (in this case, the execution platform 650-2), and continues to request data processing from the execution platform 650-2.
  • Data processing is the contents of data processing specified by the application.
  • In step 840, as in step 810, the management function 640-2 selects the data processing system 3 based on a predetermined condition and puts a program 630-3, identical to the data processing program 630-2 (630-1), into an operable state on its execution base 650-3.
  • The management function 640-2 specifies a management function (here, the management function 640-1) and exchanges updated status information with it (in this case, that the data processing program 630-3 is operable). Similar to step 810, the application platform 720-1 makes an inquiry to the location server 620-2.
  • In step 860, when the application platform 720-1 of the terminal 120-1 inquires of the location server 620-1 about the access destination, the location server 620-1 notifies the application platform 720-1 of the access destination.
  • In step 870, when the application 710-1 requests data processing from the access destination (execution base 650-1) grasped in step 860, the execution base 650-1 returns the processing result of the data processing program 630-1.
  • The management functions of the data processing systems are set so that each of them can autonomously make its own data processing program executable on the execution bases of other systems. That is, a program having the same function is executed in various systems, and since the terminal is notified of the systems (execution bases) that execute the program, the terminal requests processing from all of the notified systems or from one of them.
  • When the terminal receives the processing results from the systems from which it requested processing, it selects a system with a short response time and requests the next processing from it.
  • Because the program whose response time is shorter continues to be used, the execution destination of the program can be determined so that the response time gradually decreases.
  • Each management function in this embodiment belongs to a plurality of management groups and shares information within the same management group. Moreover, each management function bears a different task. Each management function autonomously selects the task that it takes.
  • The task is, for example, a task of totaling the response times notified to or measured by each data processing system, a task of exchanging information with different management groups, a task of determining the next placement destination by Bayesian estimation, or a task of analyzing an execution history or the like. Details of the management group and the task determination method will be described later.
  • the execution platform management function 6430 determines the execution platform group to which the data processing program 630 to be managed belongs.
  • the execution platform group is a set of execution platforms 650 that are candidates for execution platforms for newly operating the data processing program 630.
  • An execution platform may belong to a plurality of execution platform groups.
  • the execution platform group may be defined in advance by an application developer or a distributed processing system administrator. Therefore, the GUI of the management computer includes an input field and a setting button for setting a determination policy for the execution base group. The method for determining the execution platform group is shown below.
  • The first example of the execution base group determination method is a method of determining the execution base group based on the physical location of the execution base or its logical position on the network, and on the execution base type. Specifically, for example, an execution platform group is selected whose member execution platforms are of types different from its own execution platform type. As a result, even if a failure or a cyber attack occurs on a certain execution base, the possibility that all execution bases stop operating simultaneously can be reduced.
  • In addition, when the data processing program 630 suits some execution bases better than others, so that the data processing time differs depending on the execution base type, the data processing program 630 is not concentrated on execution bases with long data processing times.
  • The data processing program 630 can thus be operated on an execution platform with a short data processing time.
  • Compatibility here means, for example, whether the execution platform has the performance and functions that match the characteristics of the data processing, such as requiring a large amount of memory, CPU performance, or I / O performance, or being suited to a KVS or to an RDB.
  • Each execution base group has an upper limit on its degree of distribution, and is determined so that the degree of distribution of its execution bases does not exceed the upper limit.
  • An execution platform can belong both to a local execution platform group, which is a set of execution platforms that are physically or logically close to each other on the network, and to a wide-area execution platform group, which is a set of execution platforms distributed over a wide area.
  • the second example of the execution base group determination method is a method of determining based on the management group to which the data processing program 630 and the management function 640 running on the execution base 650 belong. In other words, execution bases on which the data processing programs 630 belonging to the same management group are deployed belong to the same execution base group.
  • the third example of the calculation method of the execution base group is a method of determining based on the execution history.
  • the execution history analysis function calculates the response time when the data processing program 630 is newly executed on the execution base from the past execution history.
  • the past execution history is, for example, the change in response time observed when a data processing program 630 was moved on an execution base whose physical position or logical position on the network is similar, the change in response time observed when a data processing program 630 with similar characteristics was moved, or the change in response time observed when a data processing program 630 was moved under a similar environmental change such as a failure or a sudden increase in access load from terminals.
  • the execution platform group management function groups each execution platform so as to belong to one or a plurality of groups based on the calculated response time.
  • the processing from step 3210 to step 3260 is performed for each execution base group.
  • the execution base groups may have a hierarchical structure in which an execution base group collects a plurality of execution base groups.
  • in that case, the processing from step 3210 to step 3260 may be performed in units of a plurality of execution base groups.
  • in step 3210, the execution base management function 6430 determines a control cycle for operating and stopping the data processing program 630.
  • a method for determining the control period is exemplified below.
  • the first example of the control cycle determination method is shown.
  • the execution base belonging to the execution base group and its position are grasped.
  • the degree of dispersion is calculated from the position information by, for example, an average value of physical distance and communication delay.
  • a control cycle is determined based on the degree of dispersion. For example, the control cycle is determined using a linear function that increases monotonically as the degree of dispersion increases.
  • information on the business operator, the charging model, and the execution base type may be used.
  • the control cycle may be increased by a certain rate or a certain value may be added.
  • if the billing model indicates that traffic is charged on a pay-as-you-go basis rather than at a fixed rate, or if an increase in the number of execution platforms leads to a significant increase in fees, the control cycle is increased by a certain rate or a certain value is added to it.
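The first example can be illustrated with a minimal sketch. It assumes the degree of dispersion is the average communication delay between execution bases and that the control cycle is a linear function of it; the names `dispersion_degree` and `control_cycle`, the slope, the intercept, and the surcharge rate are hypothetical values chosen for illustration and do not appear in the description.

```python
from statistics import mean


def dispersion_degree(pairwise_delays_ms):
    """Degree of dispersion, taken here as the average communication delay (ms)."""
    return mean(pairwise_delays_ms)


def control_cycle(dispersion, slope=2.0, intercept=10.0,
                  pay_as_you_go=False, surcharge_rate=1.5):
    """Control cycle in seconds as a linear, monotonically increasing
    function of the degree of dispersion.

    slope, intercept and surcharge_rate are illustrative assumptions.
    """
    cycle = slope * dispersion + intercept
    if pay_as_you_go:
        # Lengthen the cycle by a fixed rate when traffic is metered.
        cycle *= surcharge_rate
    return cycle
```

With these assumed parameters, a widely distributed group receives a longer control cycle than a local group, and a pay-as-you-go billing model lengthens it further.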
  • a second example of the control cycle calculation method is shown below.
  • a value obtained by multiplying the quality degradation allowable time by a coefficient is set as the control period.
  • the response time, the requested response time, and the allowable quality degradation time in the application state information 1000 are obtained, and the difference between the response time and the requested response time is calculated. When the expected range of response time improvement in the execution base state information 1100 is larger than this difference, a value obtained by multiplying the allowable quality degradation time by a coefficient is set as the control cycle.
  • the operation cost may also be taken into consideration, so that a large number of data processing programs 630 are not operated unnecessarily for an application whose operation, such as the placement of the application and data, is costly.
  • the product of the control cycle calculated above and the operation cost is calculated as the control cycle.
  • the control cycle calculated in the first and second examples may be adjusted in consideration of one or more of the surplus bandwidth of the network connecting the execution bases, the surplus I/O performance of the execution bases, and the communication cost.
  • the product of the control cycle calculated above, the surplus bandwidth of the network connecting the execution bases, the surplus performance of the execution base I / O, and the communication cost is calculated as the control cycle.
  • the fourth example of the control cycle calculation method is shown below.
  • the expected range of response time improvement and the cost are estimated using the past execution history and simulation results.
  • as the estimation method, for example, Bayesian estimation can be used.
  • the expected value of the response time improvement expected range is estimated with reference to the response time improvement expected range under similar conditions obtained from the execution history. Since Bayesian estimation is a common existing technology, a detailed explanation of the calculation method is omitted.
  • the execution infrastructure management function 6430 determines the number of temporary operations and the number of operations of the data processing program 630.
  • the number of temporary operations indicates the number of execution platforms 650 that operate the same data processing program 630 on a trial basis.
  • the trial operation is to operate for a certain period of time so that it can be easily stopped or deleted if unnecessary.
  • the state that can be easily stopped or deleted includes, for example, limited operations such as permitting data reference but not permitting writing of data that requires merging of distributed data later.
  • the number of operations is the number of data processing programs 630 that continue operation without being stopped or deleted when the data processing programs 630 that are temporarily operated on the execution platforms 650 are evaluated.
  • the number of temporary operations and the number of operations are determined based on the group characteristics of the execution base and the characteristics of the execution base, as with the control cycle. Further, they may be determined based on application state information and characteristics. Alternatively, they may be determined based on the surplus bandwidth of the network connecting the execution bases, the surplus I/O performance of the execution bases, and the communication cost. Alternatively, they may be determined by Bayesian estimation based on the execution history or the simulation result.
  • the execution base management function 6430 determines an execution base that makes the data processing program 630 temporarily operable or an execution base that is to be stopped.
  • candidates for execution platforms to be operated are appropriately limited as execution platform groups based on information such as location information, management groups, and application types. Therefore, in this step, the execution base is selected from the execution base group either at random or based on the execution history. A method of selecting based on the past execution history is shown below. Using the past execution history and simulation results, the execution base that minimizes the response time is estimated from the response times obtained when the data processing program is placed on the execution bases included in the execution base group. As the estimation method, for example, Bayesian estimation can be used.
  • the result of measuring the average response time of a terminal accessing the execution base as a result of placing an application on a test execution base as a test.
  • the response time when placing on the same execution base under similar conditions and the placement configuration that minimizes the response time are given as a probability distribution.
  • $P_t(y_k)$ is obtained by a Bayesian update from the result of the $(t-1)$-th trial. Assuming that the trial placement location at the $(t-1)$-th trial is $i'$, $P_t(y_k)$ is calculated by Equations 3 and 4 when the response time is equal to or less than a certain threshold value.
  • the initial value is $P_1(y_k) = 1/N$, or the probability for $y_k$ obtained in a prior simulation.
  • the site to be placed in the T-th trial is the site having the maximum probability P t (y k
  • the probability calculation method is illustrated when the trial placement is performed twice.
  • x i , ⁇ x j ) is calculated by Equation 5.
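Equations 3 to 5 are not reproduced in this text, so the following is only a generic sketch of a Bayesian update, not the equations of the description. It starts from the uniform prior $P_1(y_k) = 1/N$ mentioned above, multiplies it by a likelihood for the observed trial result, normalizes, and picks the site with the maximum posterior probability for the next trial. The function names and the likelihood values are hypothetical.

```python
def bayes_update(prior, likelihood):
    """Return the posterior probability for each candidate site k.

    prior[k]      -- P(y_k) before the trial
    likelihood[k] -- assumed P(observed trial result | y_k); a hypothetical
                     input, since the likelihood model is not given in the text
    """
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observation has zero probability under the prior")
    return [u / total for u in unnormalized]


def next_trial_site(posterior):
    """Index of the site with the maximum posterior probability."""
    return max(range(len(posterior)), key=posterior.__getitem__)


# Uniform prior over N = 4 candidate sites, then one update.
n_sites = 4
prior = [1 / n_sites] * n_sites
likelihood = [0.2, 0.6, 0.1, 0.1]  # illustrative values only
posterior = bayes_update(prior, likelihood)
site = next_trial_site(posterior)
```

Repeating the update after every trial concentrates the probability on the sites that have produced short response times, which is the behaviour the trial placement relies on.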
  • in step 3240, the data processing program management function 6460 temporarily places the data processing program 630 in an operable state, or stops it, according to the number of operations and the number of temporary operations determined in step 3230, and measures and evaluates the response time and cost. To make the program operable, it confirms whether the necessary computing resources have been allocated to the execution platform 650 on which the application and data are to operate and whether the application and data have been placed there, and makes the necessary settings.
  • if computing resources are not secured, they are secured. If the computing resources are managed by another cloud operator, they are requested from that operator. If the application or data is not deployed, it is deployed. If initial settings are required, the necessary settings are made and the application is activated. Otherwise, it is checked whether the necessary computing resources are secured.
  • the management function 640, which keeps track of the execution platform 650 on which the application or data operates, needs to notify the terminal 120 of the access destination. Therefore, the management function 640 notifies the terminal 120, through the location server 620, of the execution platform 650 to be accessed. Specifically, when the terminal 120 makes an inquiry, the terminal 120 is notified of an identifier, such as a URL, that uniquely indicates an application running on the execution platform 650.
  • the timing at which the terminal 120 inquires about the access destination is a certain interval, a timing at which an application built in the terminal 120 is activated, a timing at which a reload button or the like on the terminal is pressed by the user, or the like.
  • the first example of the response time measurement method measures not the response time itself but the number of executions of data processing on each execution platform.
  • when the terminal accesses the data processing programs 630 arranged on the plurality of execution bases 650 and selects the execution base 650 with a short response time, the number of executions of the data processing program 630 increases on the execution base 650 near the terminal.
  • alternatively, the terminal or the management function 640 measures the response time, and the data processing program 630 arranged on the execution base 650 with a short response time is maintained in an operable state.
  • when the terminal performs the measurement, the terminal notifies one or a plurality of location servers 620 of the result.
  • the location server notified of the response time measured by the terminal uses the information sharing mechanism between the plurality of management functions 640 shown in Step 3030 of FIG. 10 to share the response time information.
  • the data processing program management function 6460 determines a data processing program to be operated or stopped based on the evaluation result in step 3240.
  • when the number of executions is measured instead of the response time, a data processing program 630 whose number of executions is larger than that of the data processing programs placed on the other execution bases 650, or is equal to or greater than a predetermined threshold value, is maintained in an operable state. Note that a data processing program 630 with a small number of executions may be stopped.
  • when the response time is measured, it is decided to continue the data processing program 630 operating on an execution platform 650 with a small average response time and to stop the data processing program 630 operating on an execution platform 650 with a large average response time.
  • when the number of executions and the response time are compared among the plurality of execution bases 650, the number of executions and the response time are shared between the management functions 640 of the different data processing systems 100.
  • a method of sharing information between different management functions 640 will be described later in step 3030 of FIG. 10.
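The response-time variant of this decision can be sketched as follows. The sketch assumes the measurements have already been shared and collected per execution base; `select_programs` and the example base names are hypothetical.

```python
from statistics import mean


def select_programs(samples, keep_count):
    """Split execution bases into those to keep and those to stop.

    samples    -- dict mapping an execution base id to its measured
                  response times (ms); bases without samples are ignored
    keep_count -- number of operations, i.e. how many copies stay running
    Returns (keep, stop), both ordered by ascending average response time.
    """
    averages = {base: mean(times) for base, times in samples.items() if times}
    ranked = sorted(averages, key=averages.get)
    return ranked[:keep_count], ranked[keep_count:]


keep, stop = select_programs(
    {"dc1": [120, 140], "dc2": [40, 60], "dc3": [80, 80]},
    keep_count=2,
)
```

The same shape works for the execution-count variant by ranking on the count in descending order instead of the average response time.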
  • in step 3260, the data processing program management function 6460 maintains or stops the operation of the data processing program determined in step 3250.
  • the procedure for operation or stop in this step is the same as the operation / stop procedure shown in step 3240.
  • in step 3270, the execution infrastructure management function 6430 determines whether or not the control cycle needs to be changed. If it is determined that the control cycle is to be changed, the process proceeds to step 3210. If it is determined that the control cycle is not to be changed, the process proceeds to step 3220. Note that step 3220 need not be executed from the second iteration onward.
  • an execution platform can thus belong both to a local execution platform group, which is a set of execution platforms that are physically or logically close to each other on the network, and to an execution platform group distributed over a wide area, which is a set of distributed execution platforms. For example, it is possible to shorten the control cycle of the local execution platform group and lengthen the control cycle of the execution platform group distributed over a wide area. This enables both a control loop that quickly recovers the response time locally and a control loop that aims at overall optimization based on evaluation over a wide area.
  • the flow in which the management function 640 determines a management group, transmits and receives information between the management functions 640 belonging to the same management group, and determines and executes a task of the management function 640 will be described with reference to FIG. 10.
  • the data processing program management function 6460 refers to the management information storage unit 644 and grasps the requirements of the data processing program arranged on the execution base to be managed.
  • the data processing program management function 6460 determines the management group to which the data processing program 630 to be managed belongs.
  • the management group is a group of management functions 640 indicating a range in which information related to the status information 1200 of the data processing program is transmitted / received in the plurality of data processing programs 630 distributed to the plurality of execution platforms 650.
  • information shared by the management functions in the same group is used by the management function 640 to determine which of the plurality of data processing programs 630 executes the data processing requested by the user or the analysis processing performed in the background, to determine which of the data processing programs 630 capable of performing similar processing are operated, stopped, or deleted, to determine the location of the data handled by the data processing program 630, and to share information such as the excess or deficiency of computing resources of the execution base 650 on which each data processing program 630 is executed.
  • as a result, the management function 640 can limit the parties with which information such as the status information 1200 of the data processing program is exchanged, so that even when the number of data processing programs 630 and management functions 640 increases, the amount of data accompanying information transmission and reception can be suppressed.
  • the management group determination method is exemplified below.
  • the management group decision method is roughly divided into three categories. One is a method based on information related to the data processing program 630, one is based on information related to the management function 640, and the other is a method based on information related to the execution platform 650.
  • the determination method is illustrated in each case. The following determination methods may be combined.
  • the first management group determination method is a method for determining based on data processing program attributes. Specifically, the data processing program and the data processing program attribute of the status information 1200 of the data processing program are grasped, and data processing programs having the same data processing program attribute are set as the same management group. When a data processing program holds a plurality of data processing program attributes, those having the same combination of data processing program attributes may be used as the same management group. Note that an application developer or a distributed processing system administrator may predefine attributes and combinations of attributes that are the same management group.
  • the second management group determination method is a method for determining based on the task of the management function 640. Specifically, the task of the management function 640 is grasped, and the management function 640 having the same task is set as the same management group. When the management function 640 holds a plurality of tasks, the same combination of tasks may be set as the same management group. Note that an application developer or a distributed processing system administrator may predefine tasks and combinations of tasks for the same management group.
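Both of the determination methods above amount to grouping entities that hold the same combination of attributes (or tasks). A minimal sketch, assuming the attributes are available as plain strings per data processing program; `group_by_attributes` and the sample attribute names are hypothetical.

```python
from collections import defaultdict


def group_by_attributes(program_attributes):
    """Group data processing programs that hold the same attribute combination.

    program_attributes -- dict mapping a program id to an iterable of attributes
    Returns a dict mapping each attribute combination (a frozenset, so the
    order of attributes does not matter) to the list of program ids in it.
    """
    groups = defaultdict(list)
    for program_id, attributes in program_attributes.items():
        groups[frozenset(attributes)].append(program_id)
    return dict(groups)


groups = group_by_attributes({
    "p1": ["video", "eu"],
    "p2": ["eu", "video"],   # same combination as p1, different order
    "p3": ["chat"],
})
```

Passing the tasks held by each management function 640 instead of program attributes gives the task-based grouping in the same way.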
  • the third management group determination method determines the management group based on the execution base group described later. For example, the execution base group whose execution base value in the execution base group information 1100 matches the placement execution base of the data processing program in the application status information 1000 is identified. The identified execution base group is then entered in the management group column of the row for that data processing program in the status information 1200 of the data processing program. As a result, management functions 640 that are physically close, or logically close on the network, form a group, and the communication generated by transmitting and receiving information such as the status information 1200 of the data processing program can be limited to a physically or logically local area.
  • the communication function 6440 transmits and receives, with the management functions belonging to the same management group as the data processing program 630, the status information 1200 of each data processing program, the application status information 1000, the execution base group information 1100, the response time information measured in step 3240, and the like. Since the information sharing range is limited to the management group, an increase in the amount of communication required for information sharing between the management functions 640 can be prevented.
  • the task management function 6450 determines a task that the management function 640 takes charge of. Specifically, the task management function 6450 refers to the task list information of the management group to which the management function 640 belongs and identifies a high-priority task among the tasks for which the number of management functions 640 in charge (the number of task execution management functions) has not reached the number of management functions 640 required (the number of necessary management functions). If the management function 640 is already in charge of a task and the newly identified task has a higher priority than that task, the management function 640 takes charge of the newly identified task and stops handling the task it had been in charge of.
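The selection rule above can be sketched as follows, assuming that a larger number means a higher priority and that the task list is available as simple records; the `Task` class and `choose_task` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    priority: int   # larger value = higher priority (assumption)
    required: int   # number of necessary management functions
    assigned: int   # number of task execution management functions


def choose_task(task_list: List[Task], current: Optional[Task] = None) -> Optional[Task]:
    """Pick the highest-priority task that is still understaffed.

    Keeps the current task unless an understaffed task with a strictly
    higher priority exists.
    """
    open_tasks = [t for t in task_list if t.assigned < t.required]
    if not open_tasks:
        return current
    best = max(open_tasks, key=lambda t: t.priority)
    if current is not None and best.priority <= current.priority:
        return current
    return best


tasks = [
    Task("monitor", priority=1, required=2, assigned=2),    # fully staffed
    Task("rebalance", priority=3, required=1, assigned=0),
    Task("backup", priority=2, required=1, assigned=0),
]
chosen = choose_task(tasks)
```

A fuller version would also weight the choice by the processing load of each task and by the computing resources available to the management function, as noted in the following items.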
  • each management function 640 may change the number of tasks it takes charge of based on the amount of computing resources available to it.
  • the task processing load may be used as a weighting factor.
  • avoidance of task assignment may be determined based on the execution base type and distribution of the execution base 650 in which the management function 640 is arranged, or the data processing program attribute.
  • an input field and a setting / change button for the determination policy information are displayed on the GUI of the management computer so that the application developer or the distributed processing system administrator can set the determination policy for the availability.
  • in step 3050, the task management function 6450 executes the task in charge.
  • in step 3060, the data processing program management function 6460 determines whether or not to change the management group. If it is determined that a change is necessary, the process proceeds to step 3020. If it is determined that a change is unnecessary, the process proceeds to step 3030.
  • the method of judging is illustrated below. The following methods may be combined.
  • the first determination method is a method that is periodically performed based on a timer. That is, the management group is re-determined when a certain time has passed.
  • the period may be a constant value or a dynamically changing value.
  • As a method for calculating the period in the case of dynamic change there is a method of determining based on the execution history. For example, when the management group is re-determined and the management group is not changed, the cycle is lengthened, and when the management group is changed, the cycle is shortened.
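The history-based adjustment of the period can be sketched as a simple multiplicative rule. The factor and the lower and upper bounds are assumptions added for illustration; the description only states that the period is lengthened when the group is unchanged and shortened when it changes.

```python
def next_period(period, group_changed, factor=2.0,
                min_period=10.0, max_period=3600.0):
    """Return the next re-determination period in seconds.

    The period is shortened when the last re-determination changed the
    management group and lengthened when it did not. factor, min_period
    and max_period are illustrative assumptions.
    """
    if group_changed:
        period /= factor
    else:
        period *= factor
    # Clamp so the period neither collapses to zero nor grows without bound.
    return min(max(period, min_period), max_period)
```

In a stable environment the period therefore grows until it reaches the upper bound, and a change in the management group brings it back down.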
  • the second judgment method is triggered not by a timer but by environmental changes. For example, it is judged that the management group is to be re-determined when the virtual environment is migrated (in the case where the execution base is built on a virtual environment), when the response time shown in the application status information 1000 increases, or when the average position of the terminals changes.
  • the management group can be changed according to changes in the system environment, so that it is possible to reduce the increase in the amount of communication required for information sharing between the management functions 640.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

Provided are an information processing system and an information processing method that improve overall system performance in a large-scale system by dynamically varying the arrangement of applications and data. In a system that comprises a plurality of terminal devices and a plurality of computers, applications and data are experimentally arranged in different computers and response time is measured, the result is compared with the response time in the original arrangement and evaluated, and the arrangement exhibiting the smallest response time is adopted. Response time is improved in a stepwise manner by repeatedly performing experimental arrangement and evaluation.

Description

Information processing system and information processing method
The present invention relates to an information processing system and an information processing method that improve performance by changing the placement configuration of a system composed of a plurality of computers.
As a technique for collectively managing servers in distributed data centers, there is the technique described in Patent Document 1. Patent Document 1 discloses a data center management device that, upon receiving from a user terminal information indicating the data to be migrated and the conditions for the migration destination, identifies from among a plurality of data centers the candidate data centers whose data center information, which indicates their characteristics, matches the migration destination conditions according to a predetermined criterion, and transmits them to the user terminal. Upon receiving from the user terminal information on the selection of a migration destination data center from among the candidates, the device receives the data to be migrated from the migration source data center and transmits it to the migration destination data center.
JP 2012-093992 A
The technique disclosed in Patent Document 1 does not disclose updating the data center information that indicates the characteristics of each data center managed by the data center management device. That is, because the data center information in Patent Document 1 is static information such as location, cost, and SLA, no consideration is given to collecting and updating in real time dynamic information such as response time, which varies with time. Consequently, depending on the operating state and load state of a data center, the migration destination data center may fail to satisfy the conditions requested by the user.
In recent cloud computing services, it is increasingly common to provide a service by linking multiple data centers distributed over a wide area spanning domestic and overseas locations. Furthermore, cloud computing services are expected in the future to use, in addition to conventional large-scale data centers, an enormous number of sites of various scales, such as container-type data centers, small data centers of about one rack, and servers at corporate sites.
In such a large-scale and complex system environment, changing the placement of data and applications so as to improve quality such as response time, while taking into account the constantly changing operating status and load status of the data centers, raises problems that differ from those of a small-scale system. For example, suppose that a management server such as the data center management device of Patent Document 1 collects measurement information from the computers at each site and decides the placement of applications and data according to the analysis results. Because the amount of information to be collected is enormous, the network load and the collection and analysis time increase. In addition, because deciding the placement of data and applications takes time, the operating status and load status may have changed between the time of measurement and the time the data and applications are moved based on that decision.
The present invention has been made in view of the circumstances described above. Its object is to improve the performance of the system as a whole, in a large-scale system in which it is difficult to collect and analyze measurement information indicating the operating state and load state of each computer constituting the system, by dynamically changing the placement configuration of applications and data.
To achieve the above object, the computer system of this embodiment places applications and data on different computers on a trial basis, measures the response time, compares it with the response time in the original placement configuration, and adopts the placement configuration with the smaller response time. Repeating this trial placement and evaluation improves the response time step by step.
Specifically, the computer system is one in which a plurality of terminal devices and a plurality of computers are connected via a network. Each of one or more first computers among the plurality of computers holds a program in an executable state that provides the same function. One of the first computers copies the program to one or more second computers, which are other computers among the plurality of computers, and the second computers set the copied program to an executable state. The first and second computers execute the program in response to requests from the plurality of terminal devices, and each of the plurality of terminals measures the response time of its requests. A third computer, which is one of the first and second computers, receives from the plurality of terminals the response times of the requests and information on the computers that processed them, identifies the relatively long response times among them, and instructs the computers that processed the requests with the identified response times to stop the program they are running.
According to the present invention, in a large-scale distributed system, the placement configuration of data and applications can be changed dynamically so that the performance of the system as a whole improves.
A diagram illustrating an overview of this embodiment. A diagram illustrating the configuration of the distributed processing system. A diagram illustrating the configuration of the data processing system. A diagram illustrating the application status information. A diagram illustrating the execution base status information. A diagram illustrating the status information of the data processing program. A diagram illustrating the status information and task information of the management function. A diagram illustrating the flow in which a data processing program placed on a distant execution base is placed on an execution base near the terminal. A diagram illustrating the flow in which the management function determines the execution bases on which the data processing program is newly operated or stopped and carries this out. A diagram illustrating the flow in which the management function determines a management group and executes a task.
An outline of the embodiment of the present invention is shown in FIGS. 1(a) to 1(d). The system of this embodiment places applications and data on different execution platforms on a trial basis, measures the response time, compares it with the response time of the original placement, and adopts the placement with the shorter response time. Repeating this trial placement and evaluation improves the response time step by step.
Replicating applications and data on multiple execution platforms can improve response time because each terminal accesses the nearest copy, but the cost of computing resources rises as the number of replicas grows. Therefore, when the number of replicas reaches a specified number, unneeded execution platforms are stopped. Likewise, trial placement on many execution platforms at once finds a configuration with a short response time in fewer trials and allows early recovery when responsiveness degrades, but it increases the cost of the trials. Trial placement is therefore carried out within a specified range of counts, according to the situation.
For example, in the initial placement (FIG. 1(a)), the sub-application (Sub App1) and data (Data1) located in the EU data center (DC1) are copied on a trial basis to the Asian data center (DC2) (FIG. 1(b)). After the copy, both sub-applications process requests, and if the response time of Sub App1 in DC2 is shorter than that of Sub App1 in DC1, Sub App1 in DC1 is stopped (FIG. 1(c)). The same trial placement and evaluation are then repeated for Sub App1 in DC2 (FIG. 1(d)); here, Sub App1 in DC2 has been copied to DC3.
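The trial-and-evaluate cycle of FIG. 1 can be sketched roughly as follows. This is only an illustrative sketch: the function name `trial_relocate`, the `measure` callback, and the response-time figures are assumptions made for the example and do not appear in the specification.

```python
from typing import Callable, Dict, List


def trial_relocate(current: str,
                   candidates: List[str],
                   measure: Callable[[str], float]) -> str:
    """Try each candidate site in turn and keep whichever site responds faster."""
    for candidate in candidates:
        # Trial placement: the copy and the original both serve requests.
        current_rt = measure(current)
        candidate_rt = measure(candidate)
        if candidate_rt < current_rt:
            # The copy is faster, so the original site is stopped.
            current = candidate
        # Otherwise the trial copy is dropped and the original keeps running.
    return current


# Hypothetical response times (milliseconds) observed by the terminals.
SAMPLE_RT: Dict[str, float] = {"DC1": 180.0, "DC2": 95.0, "DC3": 120.0}
final_site = trial_relocate("DC1", ["DC2", "DC3"], SAMPLE_RT.__getitem__)
```

With these assumed figures the sub-application moves from DC1 to DC2 and then stays there, because the trial copy in DC3 does not respond faster.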
The system of this embodiment is better suited to applications that do not require strong consistency than to applications, such as transactional ones, that do. In this embodiment, applications and data are temporarily replicated to different locations by trial and error. For an application that requires strong consistency, data updated during replication must therefore be synchronized among the locations. In addition, this embodiment stops the system at a location judged unnecessary, and the updated data must be merged before the stop. From the standpoint of ease of implementation, applications that tolerate reads of stale data (that is, for which weak consistency suffices) are suited to the system of this embodiment.
An outline of the configuration of the distributed processing system of this embodiment is shown in FIG. 2. The information processing system of this embodiment includes a network 110, data processing systems 100 (100-1 to 100-n), a management computer 130, and terminals 120 (120-1 to 120-n). The network 110 is a WAN (Wide Area Network) or a LAN (Local Area Network), and may be a virtual network.
The data processing system 100 is a system that processes data in response to access from the terminals 120. The data processing system 100 includes a processor 200, a main storage device 300, a communication interface (I/F) 400, and an external storage device 500. The main storage device 300 or the external storage device 500 stores a data processing program 630, an execution platform 650, a location server 620, and a management function 640. Although FIG. 3 shows the data processing system 100 as a single computer, it may be a system composed of a plurality of computers. It may also be multi-tenant, hosting different applications, or single-tenant, hosting a single application.
The execution platform 650 is middleware for running the data processing program 630, and includes an OS (Operating System). The location server is a system that performs name resolution, such as DNS (Domain Name System) or a global name service, and holds a name resolution table made up of, for example, pairs of URLs and IP addresses. The data processing program 630 is, for example, an application program such as an SNS (social networking service) or e-commerce application, or a program that measures and controls the state of servers, storage, or networks. The data processing program 630 may also be one part of the programs that make up an application, such as an application plug-in or a web front end.
The location server 620 manages the locations of the execution platforms 650 on which the data processing program 630 and data are stored and, in response to inquiries from the terminal 120, the data processing program 630, or the management function 640, reports the location of the execution platform 650 on which the data processing program 630 or data is stored. Because the location server 620 is equivalent to existing technologies such as a global name service or DNS, a detailed description is omitted. The management function 640 is described later with reference to FIG. 3.
The management computer 130 is a management device that manages the data processing systems 100, and includes an input device and an operation screen used by the administrator of the data processing systems 100, as well as a processor, a storage device, and the like. The management computer 130 generates and manages the application state information 1000, the execution platform state information 1100, and the data processing program state information 1200, and notifies the data processing systems 100 of them. Accordingly, the operation screen of the management computer 130 displays this information together with a set button, a change button, and a delete button so that the administrator can change settings.
This embodiment is described mainly for the case of an autonomous distributed control architecture in which each data processing system 100 includes the management function 640. However, the management function 640 may instead reside in the management computer 130, or in both. The management computer 130 may also reside on one or more of the execution platforms 650.
The terminal 120 is a device for generating and using data, for example a smartphone, tablet, notebook PC, construction machine, medical diagnostic device, smart meter, agricultural machine, automobile, elevator, or escalator. The terminal may include the management function 640 described above, which manages the data processing program 630 on the terminal and manages response times. The terminal also includes an execution platform that does one of the following: (a) requests the same or similar data processing from a plurality of data processing systems 100, selects the data processing program 630 that responded in a short time, and sends the next data processing request to it; or (b) reports to the location server 620 the response time from the data processing request until the data processing system 100 responds, and sends the next data processing request to the data processing program 630 that the location server 620 designates on the basis of the reported response time and other information.
FIG. 3 shows the hardware configuration of the data processing system 100 and the logical configuration of the management function. The management function 640 consists of a management calculation unit 642 and a management information storage unit 644. The management calculation unit 642 consists of an application quality evaluation function 6410, an execution history analysis function 6420, an execution platform management function 6430, a communication function 6440, and a data processing program management function 6460. The application quality evaluation function 6410 evaluates the response time, that is, the time from when the terminal 120 requests data processing while using an application until the response is returned. Existing techniques can be used for the measurement and evaluation, for example collecting and averaging the response times measured by a plurality of terminals, so a detailed description is omitted.
The data processing program management function 6460 checks the requirements of the data processing program 630. It also puts the data processing program 630 into an operable state or a stopped state. If making the program operable requires changing parameters, such as those in a configuration file, for example because the type of execution platform differs, the function makes those changes. It also identifies the management group to which the data processing program 630 belongs. The execution platform management function 6430 calculates the control cycle for starting and stopping the data processing program 630 and the number of data processing programs 630 to start or stop, and identifies the execution platforms on which to start or stop the data processing program 630. The communication function 6440 exchanges information between management functions. The task management function 6450, when the management functions have several roles to fill, determines which role it takes on and carries it out. The execution history analysis function 6420 uses history information on response times from past starts and stops of the data processing program 630 to decide on which execution platform to start or stop the data processing program 630, or to calculate information that supports that decision.
The application/terminal information storage unit 6470 holds the application state information 1000. The execution platform group information storage unit 6475 holds the execution platform state information 1100. The execution history information storage unit 6480 holds past execution history information, such as changes in response time when the data processing program 630 was run in the past and changes in response time when a failure occurred or the access load from terminals surged. The management group information storage unit 6485 holds the state information of the management functions, the list of tasks handled by the management functions, and the like. The data processing program information storage unit 6490 holds the data processing program state information 1200, the requirements of the data processing programs, and the like. Each item of information is described in detail below.
FIG. 4 illustrates the application state information 1000. The application state information indicates the conditions the application requires, such as an SLA (Service Level Agreement), and its current state. It is referred to when controlling the starting and stopping of the data processing program 630 in light of the state and characteristics of the application. The application state information includes, for example, the application, data processing program, placement execution platform, response time, required response time, allowable quality degradation time, operating cost, related terminal group, and terminal average position. Here, the application field holds the identifier of the application, the data processing program field holds the identifier of a data processing program 630 that makes up the application, and the placement execution platform field holds the identifier of the execution platform on which that data processing program 630 is placed.
The response time is the time from when a terminal requests data processing from the data processing program stored on the placement execution platform in the same row until the processing completes on that platform and the result is returned to the terminal; it is not the processing delay inside the server. The required response time is the response time that the application requires (that is, sets as a target or constraint). The quality degradation time is the period from when the response time stops satisfying the required response time until it satisfies it again, and the allowable quality degradation time is the quality degradation time that is tolerated.
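To make the relationship between these quantities concrete, the following sketch derives quality degradation times from a series of timestamped response-time samples. The function names, the sample format, and the numbers are assumptions for illustration only; the specification does not prescribe how the degradation time is computed.

```python
from typing import List, Tuple


def degradation_periods(samples: List[Tuple[float, float]],
                        required_rt: float) -> List[float]:
    """Return the length of each completed period in which the response time
    exceeded required_rt. samples holds (timestamp, response_time) pairs
    sorted by timestamp. A period still open at the last sample is ignored."""
    periods: List[float] = []
    started_at = None
    for timestamp, response_time in samples:
        if response_time > required_rt and started_at is None:
            started_at = timestamp           # requirement stops being met
        elif response_time <= required_rt and started_at is not None:
            periods.append(timestamp - started_at)  # requirement met again
            started_at = None
    return periods


def exceeds_allowance(samples: List[Tuple[float, float]],
                      required_rt: float,
                      allowed_degradation: float) -> bool:
    """True if any degradation period lasted longer than the allowable time."""
    return any(p > allowed_degradation
               for p in degradation_periods(samples, required_rt))


# Hypothetical samples: (seconds since start, response time in ms).
samples = [(0, 80), (10, 150), (20, 160), (30, 90), (40, 200), (50, 70)]
periods = degradation_periods(samples, required_rt=100)
```

With a required response time of 100 ms, these assumed samples contain two degradation periods, one of 20 seconds and one of 10 seconds.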
The operating cost is the cost of running the program. It may be managed as two separate components: the initial cost incurred to start operation and the running cost incurred continuously. The initial cost includes, for example, the communication cost of deploying applications and data to an execution platform and the cost of writing to storage. The running cost includes, for example, the instance usage cost and the communication cost of exchanging data with terminals.
The related terminal group is the set of terminals that use the application. The set of terminals is, for example, a customer segment defined by the region where the terminals are located, language, user age, or importance to the company providing the application. A related terminal group may be specified for each data processing program 630. The terminal average position indicates the physical average position of the related terminal group, or its logical position on the network.
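One row of the application state information described above might be represented as in the following sketch. The field names, units, and the per-hour running cost are assumptions chosen for the example; FIG. 4 itself is not reproduced here and may organize the fields differently.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ApplicationState:
    """One row of application state information (field names are illustrative)."""
    application: str                   # application identifier
    data_processing_program: str       # identifier of a program in the application
    placement_platform: str            # execution platform hosting the program
    response_time_ms: float            # measured end-to-end at the terminals
    required_response_time_ms: float   # target or constraint set by the application
    allowed_degradation_s: float       # allowable quality degradation time
    initial_cost: float                # deployment cost (assumed unit)
    running_cost_per_hour: float       # steady-state cost (assumed unit)
    related_terminal_group: str        # e.g. a customer segment
    terminal_average_position: Tuple[float, float]  # e.g. (latitude, longitude)

    def meets_requirement(self) -> bool:
        return self.response_time_ms <= self.required_response_time_ms

    def operating_cost(self, hours: float) -> float:
        # Initial cost plus running cost accumulated over the given period.
        return self.initial_cost + self.running_cost_per_hour * hours


row = ApplicationState("App1", "SubApp1", "650-1", 180.0, 100.0, 60.0,
                       5.0, 0.5, "asia-users", (35.0, 139.0))
```

Keeping the measured and required response times in the same record lets a management function check the requirement without a second lookup, which is one plausible reason for grouping these fields.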
FIG. 5 illustrates the execution platform state information 1100. The execution platform state information indicates the characteristics of an execution platform 650 and of the execution platform group to which it belongs. It is referred to when controlling the starting and stopping of the data processing program 630 in light of the state and characteristics of the execution platform and of the execution platform group. The execution platform state information includes, for example, the execution platform, execution platform characteristics, execution platform group, and execution platform group characteristics. Here, the execution platform and execution platform group fields hold the identifier of the execution platform 650 and the identifier of the execution platform group to which it belongs. The execution platform characteristics include, for example, the execution platform type, the operator that provides the execution platform 650, the billing model of the service that provides the execution platform 650, and the location of the execution platform. The execution platform group characteristics include, for example, the degree of geographic dispersion of the execution platforms belonging to the group, the control cycle (the period at which starting and stopping of the data processing program 630 is controlled), and the improvement in response time expected from starting a new data processing program 630 within the group.
FIG. 6 illustrates the data processing program state information 1200. The data processing program state information indicates the system configuration and operating status of the data processing program 630. It is referred to when controlling the starting and stopping of the data processing program 630 in light of the program's configuration and operating status. The data processing program state information 1200 includes, for example, the data processing program, data processing program attribute, management group, start/stop permission flag, referencing data processing program, referenced data processing program, and execution platform. Here, the data processing program field holds the identifier of the data processing program 630.
The data processing program attribute indicates an attribute of the data processing program 630, for example whether it is a web front-end server, an application server, or a DB server, or the level of data consistency it requires. The management group field holds the identifier of the management group to which the data processing program 630 belongs. The start/stop permission flag carries two items of information: whether the data processing program 630 may be newly started on an arbitrary execution platform 650, and whether it may be stopped. The referencing data processing program field holds the identifiers of the other data processing programs 630 that the data processing program 1210 refers to. The referenced data processing program field indicates the other data processing programs 630 that refer to the data processing program 1210. The execution platform field holds the identifier of the execution platform on which the data processing program 1210 runs.
The data processing program management function also holds the requirements of each data processing program 630. The requirements of a data processing program are requirements concerning computing resources such as CPU, memory, storage capacity, and communication bandwidth; SLAs such as response time, availability, and failure recovery time; KPIs (Key Performance Indicators) such as PV (page views), service revenue, ROI (Return On Investment, meaning revenue relative to the cost of cloud usage), and customer satisfaction; and the cost of cloud usage.
FIG. 7(a) shows the state information of the management function. This state information indicates the state of the management function 640 and holds the information the management function 640 needs in order to control the starting and stopping of the data processing program 630. It is divided into, for example, the management function, management group, task, operating status, placement execution platform, and managed data processing program. The management group is as described above. The management function field holds the identifier of the management function 640. The task field holds the identifier of the task that the management function 640 handles. Tasks include, for example, an inter-group communication task that relays information between management groups, an analysis task that analyzes the state within a management group, and a resource task that requests computing resources. The operating status is the operating status of the management function. The managed data processing program field holds the identifier of a data processing program managed by the management function. The placement execution platform field holds the identifier of the execution platform (or data processing system) on which the management function is placed.
FIG. 7(b) shows the task information of the management function. The task information is what the management function 640 refers to when deciding which task it will handle. Task information is set uniquely for each management group and includes information such as the task name and priority. For example, each management function 640 is capable of executing any task and, from among those tasks, executes one that has a high priority, fits within its available computing resources, and is not already being handled by another management function 640. Tasks may also be assigned to the management functions 640 statically.
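The autonomous task selection described above could look like the following sketch. The convention that a larger number means a higher priority, the per-task resource requirement, and all names are assumptions for illustration; the specification only states that priority, available computing resources, and existing assignments are taken into account.

```python
from typing import Dict, Optional, Set


def choose_task(priorities: Dict[str, int],
                resource_needs: Dict[str, int],
                taken: Set[str],
                available_resources: int) -> Optional[str]:
    """Pick the highest-priority task that nobody else handles and that fits
    within the resources available to this management function."""
    candidates = [task for task in priorities
                  if task not in taken
                  and resource_needs[task] <= available_resources]
    if not candidates:
        return None
    # Assumed convention: a larger number means a higher priority.
    return max(candidates, key=lambda task: priorities[task])


# Hypothetical task information shared within one management group.
priorities = {"inter_group_relay": 3, "analysis": 2, "resource_request": 1}
resource_needs = {"inter_group_relay": 4, "analysis": 2, "resource_request": 1}
chosen = choose_task(priorities, resource_needs,
                     taken={"inter_group_relay"}, available_resources=2)
```

Here the highest-priority task is already taken by another management function, so the function falls back to the analysis task, which fits within the assumed resource budget.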
FIG. 8 is a sequence diagram outlining the operation of the data processing system in this embodiment. The overall flow of operations is described first, followed by a specific example.
In step 810, the management function 640-1 of data processing system 1 selects data processing system 2 on the basis of predetermined conditions and makes a program 630-2, identical to the data processing program 630-1, operable on the execution platform 650-2 of data processing system 2. Alternatively, the management function 640-1 may send a request to the management function 640-2, and the management function 640-2 may make the data processing program 630-2 operable.
The management function 640-1 also updates the name resolution table managed by the location server 620-1, using the information it holds on the locations of the execution platforms on which the data processing program 630-1 runs. Here, because the program 630-1 (630-2) runs on the execution platforms of both data processing systems 1 and 2, the location information of both execution platforms is stored in the table. For example, if the name resolution table consists of pairs of URLs and IP addresses, the IP address of the execution platform is added to the row for the URL that denotes the data processing program 630-1. The table may be updated whenever the execution platform location information held by the management function 640 for the data processing program 630 changes, or at fixed intervals driven by a timer. For brevity, updates to the name resolution table are not mentioned again below, but they take place at the timing described here.
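As a rough illustration of the URL-to-IP-address table described above, the sketch below keeps a list of execution platform addresses per URL. The function names, the URL, and the addresses (taken from the documentation address ranges) are hypothetical; an actual location server would typically rely on DNS or a comparable name service.

```python
from typing import Dict, List


def register_platform(table: Dict[str, List[str]], url: str, ip: str) -> None:
    """Add an execution platform address to the row for the given program URL."""
    addresses = table.setdefault(url, [])
    if ip not in addresses:          # avoid duplicate entries on repeated updates
        addresses.append(ip)


def unregister_platform(table: Dict[str, List[str]], url: str, ip: str) -> None:
    """Remove an address, for example after the program on that platform stops."""
    if ip in table.get(url, []):
        table[url].remove(ip)


# Hypothetical table: the URL stands for data processing program 630-1.
table: Dict[str, List[str]] = {}
register_platform(table, "http://app.example/sub-app1", "192.0.2.10")     # platform 650-1
register_platform(table, "http://app.example/sub-app1", "198.51.100.20")  # platform 650-2
```

After both registrations, a lookup of the URL returns both addresses, which matches the state in which the program is operable on two execution platforms.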
How the number of data processing programs 630 to start, the timing, and the execution platforms on which to start them are decided is described together, after the overall flow of operations has been explained using the sequence.
In step 820, the application platform 720 of the terminal 120 asks the location server 620-1 for the access destinations needed to execute the data processing program 630-1 (630-2), and the location server 620-1 notifies the application platform 720 of the access destinations (execution platforms 650-1 and 650-2) on which the data processing program (630-2) can be executed. The inquiry is made, for example, when the user provides input to the terminal 120-1, periodically, or at a predetermined time. Through information sharing among the management functions 640, the location server 620 knows all, or some, of the execution platforms 650 on which the data processing program 630 can run. The access destinations reported by the location server 620 are the one or more execution platforms 650 determined by the management function 640 in steps 3240 and 3250. Because existing technologies such as DNS handle access destination inquiries, a detailed description is omitted.
In step 830, the application 710 of the terminal 120 requests data processing from the execution platforms 650-1 and 650-2, which are the access destinations learned in step 820, selects the execution platform that returned its result sooner (here, the execution platform 650-2), and continues to send its data processing requests to the execution platform 650-2. The data processing is whatever processing the application defines.
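The terminal-side selection in this step can be sketched as follows. For simplicity the sketch probes the platforms one after another and times each request, whereas a real terminal might issue the requests concurrently; the function name, the injectable clock, and the request stubs are assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


def probe_platforms(platforms: Dict[str, Callable[[], object]],
                    clock: Callable[[], float] = time.perf_counter
                    ) -> Tuple[str, Dict[str, float]]:
    """Send one request to every platform, time each one, and return the
    fastest platform together with all measured response times."""
    timings: Dict[str, float] = {}
    for name, send_request in platforms.items():
        started = clock()
        send_request()                     # the application-defined data processing
        timings[name] = clock() - started  # end-to-end time seen by the terminal
    fastest = min(timings, key=timings.get)
    return fastest, timings


# Hypothetical stand-ins for requests to execution platforms 650-1 and 650-2.
ticks = iter([0.0, 0.18, 1.0, 1.09])       # fake clock readings for a repeatable run
fastest, timings = probe_platforms(
    {"650-1": lambda: "result", "650-2": lambda: "result"},
    clock=lambda: next(ticks),
)
```

With the fake clock, platform 650-1 appears to take 0.18 s and platform 650-2 about 0.09 s, so subsequent requests would go to 650-2.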
In step 840, as in step 810, the management function 640-2 selects data processing system 3 on the basis of predetermined conditions and makes a program 630-3, identical to the data processing program 630-2 (630-1), operable on its execution platform 650-3.
In step 850, the management function 640-2 identifies a management function (the management function 640-1) and exchanges the updated state information with it (here, the fact that the data processing program 630-3 is operable). As in step 820, the application platform 720-1 queries the location server 620-2.
In step 860, the application platform 720-1 of the terminal 120-1 asks the location server 620-1 for the access destination, and the location server 620-1 notifies the application platform 720-1 of the access destination.
In step 870, the application 710-1 requests data processing from the access destination learned in step 860 (the execution platform 650-1), and the execution platform 650-1 returns the processing result of the data processing program 630-1.
As described above, the management function of each data processing system autonomously arranges for its own data processing program to be executable on the execution platforms of other systems as well. Programs with the same function therefore come to run on various systems, but because the terminal is notified of the systems (execution platforms) that execute the program, the terminal requests processing from all, or any one, of the notified systems.
When the terminal receives processing results from the systems it asked, it selects a system with a short response time from among them and sends it the next request. As a result, only programs with shorter response times remain in continued use, so the execution destination of the program can be determined in a way that shortens the response time step by step. As detailed later, stopping programs with long response times keeps the number of running programs at a fixed number and limits resource consumption.
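The terminal-side selection described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name `choose_platform` and the stand-in callables, which simply return an observed response time in seconds, are hypothetical.

```python
from typing import Callable, Dict

def choose_platform(platforms: Dict[str, Callable[[str], float]], request: str) -> str:
    """Send the same request to every notified platform and pick the quickest.

    Each callable is a hypothetical stand-in for a remote call; it returns the
    response time (in seconds) that the terminal observed for that platform.
    """
    observed = {pid: call(request) for pid, call in platforms.items()}
    # The platform with the shortest response time receives later requests.
    return min(observed, key=observed.get)

# Hypothetical latencies for the two notified execution platforms.
notified = {
    "650-1": lambda req: 0.12,
    "650-2": lambda req: 0.05,
}
selected = choose_platform(notified, "process")
```

With these made-up latencies, `selected` is the identifier of the second platform, which matches the example in which execution platform 650-2 answers first.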
Next, the control mechanism in the management function 640 for improving the response time step by step through repeated trial placement and evaluation is described with reference to FIG. 9.
Each management function in this embodiment belongs to a plurality of management groups and shares information within the same management group. Each management function takes on a different task and autonomously selects the task it takes on. Examples of tasks include aggregating the response times reported to or measured by each data processing system, exchanging information with other management groups, determining the next placement destination by Bayesian estimation or the like, and analyzing execution histories. How management groups and tasks are determined is described in detail later.
In step 3205, the execution platform management function 6430 determines the execution platform group to which the managed data processing program 630 belongs. An execution platform group is a set of execution platforms 650 that are candidates for newly running the data processing program 630. This makes it possible, even in a distributed processing system composed of many execution platforms, to select with little computation an execution platform that is likely to improve the response time. An execution platform may belong to a plurality of execution platform groups. Execution platform groups may also be defined in advance by an application developer or an administrator of the distributed processing system; for that purpose, the GUI of the management computer provides input fields and setting buttons for configuring the policy for determining execution platform groups. Methods for determining the execution platform group are described below.
The first example of a method for determining the execution platform group is based on the physical location of the execution platform, its logical location on the network, or its execution platform type. Specifically, for example, the management function selects an execution platform group whose member platforms have an execution platform type different from its own. This reduces the possibility that all execution platforms stop operating at the same time, even when a failure or a cyber attack hits a particular execution platform. In addition, when the data processing program 630 suits some execution platforms better than others, so that the data processing time differs by execution platform type, the data processing program 630 is not placed disproportionately on slow execution platforms and can run on execution platforms with short data processing times. Here, suitability means, for example, whether the execution platform offers the performance and functions that match the characteristics of the data processing, such as needing a large amount of memory, CPU performance, or I/O performance, or being better suited to a KVS or to an RDB.
As a method of determining the execution platform group based on location information, for example, each execution platform group is given an upper limit on its degree of dispersion, and membership is determined so that the platforms are spread out without the degree of dispersion exceeding that limit. By using several execution platform groups with different upper limits, an execution platform can belong both to a local execution platform group, which is a set of execution platforms that are physically close or logically close on the network, and to a wide-area execution platform group, which is a set of dispersed execution platforms. As a result, when the response time degrades, it becomes possible to plan a local recovery within the local group and a recovery from a global-optimization viewpoint within the wide-area group.
The second example of a method for determining the execution platform group is based on the management group to which the data processing program 630 or the management function 640 running on the execution platform 650 belongs. That is, execution platforms on which data processing programs 630 belonging to the same management group are deployed belong to the same execution platform group.
The third example of a method for determining the execution platform group is based on the execution history. Specifically, the execution history analysis function calculates, from past execution history, the response time expected if the data processing program 630 were newly run on an execution platform. The past execution history is, for example, the change in response time observed when an arbitrary data processing program 630 was run on an execution platform whose physical location or logical location on the network is similar; the change observed when an arbitrary data processing program 630 with similar characteristics was run; or the change observed when an arbitrary data processing program 630 was run under a similar environmental change, such as a failure or a surge in access load from terminals. The execution platform group management function then groups the execution platforms, based on the calculated response times, so that each belongs to one or more groups.
Returning to FIG. 9: when the execution platform 650 managed by the management function 640 belongs to a plurality of execution platform groups, steps 3210 through 3260 are performed for each execution platform group. The groups may, however, be hierarchical, with an execution platform group that bundles several execution platform groups, and in that case steps 3210 through 3260 may be performed once per bundle.
In step 3210, the execution platform management function 6430 determines the control cycle at which data processing programs 630 are started and stopped. Methods for determining the control cycle are exemplified below.
A first example of determining the control cycle is as follows. The status information 1100 of the execution platforms is consulted to identify the execution platforms belonging to the execution platform group and their locations. From the location information, the degree of dispersion is calculated, for example as the average physical distance or the average communication delay. The control cycle is then determined from the degree of dispersion, for example with a linear function that increases monotonically with the degree of dispersion. When calculating the control cycle, information on the operator, the billing model, and the execution platform type may be used in addition to the degree of dispersion. For example, when operators differ, the network or gateway (GW) connecting the operators may become a communication bottleneck, so the control cycle may be enlarged by a certain ratio or by adding a certain value. Likewise, when the billing model indicates that traffic is charged by usage rather than at a flat rate, or when an increase in the number of execution platforms would raise charges substantially, the control cycle is enlarged by a certain ratio or by adding a certain value.
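The first example can be sketched as a small calculation. The base value, slope, and enlargement ratio below are illustrative assumptions, as are the function names; the text only states that the cycle grows linearly with the degree of dispersion and may be enlarged for cross-operator or usage-billed configurations.

```python
from statistics import mean
from typing import Sequence

def dispersion_ms(delays_ms: Sequence[float]) -> float:
    """Degree of dispersion, taken here as the average communication delay."""
    return mean(delays_ms)

def control_cycle_s(dispersion: float,
                    base_s: float = 10.0,         # assumed intercept
                    slope_s_per_ms: float = 0.5,  # assumed slope
                    cross_operator: bool = False,
                    usage_billed: bool = False,
                    ratio: float = 1.5) -> float:  # assumed enlargement ratio
    """Linear, monotonically increasing function of the degree of dispersion."""
    cycle = base_s + slope_s_per_ms * dispersion
    if cross_operator:   # inter-operator network or GW may be a bottleneck
        cycle *= ratio
    if usage_billed:     # traffic is charged by usage
        cycle *= ratio
    return cycle

d = dispersion_ms([10.0, 30.0])                    # average delay of the group
local_cycle = control_cycle_s(d)
cross_cycle = control_cycle_s(d, cross_operator=True)
```

A group with a larger average delay receives a longer cycle, which is consistent with the later remark that local groups can be controlled on a short cycle and wide-area groups on a long one.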
A second example of calculating the control cycle is as follows. The application status information 1000 is consulted, and the tolerable quality degradation time multiplied by a coefficient is used as the control cycle. Alternatively, the response time, the required response time, and the tolerable quality degradation time are read from the application status information 1000, the difference between the response time and the required response time is calculated, and, if the expected response time improvement in the execution platform status information 1100 is larger than that difference, the tolerable quality degradation time multiplied by a coefficient is used as the control cycle. The operating cost may also be taken into account, so that data processing programs 630 of applications that are costly to bring up, for example because of application and data placement, are not started in unnecessarily large numbers. For example, the product of the control cycle calculated above and the operating cost is used as the control cycle.
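A minimal sketch of the second example follows. The parameter names and the default coefficient are assumptions, and the text does not say what happens when the expected improvement does not exceed the gap, so this sketch returns `None` in that case.

```python
from typing import Optional

def control_cycle_from_app(response_s: float,
                           required_s: float,
                           tolerable_degradation_s: float,
                           expected_improvement_s: float,
                           coeff: float = 0.5,           # assumed coefficient
                           operating_cost: float = 1.0   # assumed cost factor
                           ) -> Optional[float]:
    """Control cycle from application status, per the second example."""
    gap = response_s - required_s   # how far the response time misses its target
    if expected_improvement_s <= gap:
        return None                 # case not specified in the text
    # Tolerable degradation time times a coefficient, scaled by operating cost.
    return tolerable_degradation_s * coeff * operating_cost

cycle = control_cycle_from_app(0.8, 0.5, 60.0, 0.4)
costly = control_cycle_from_app(0.8, 0.5, 60.0, 0.4, operating_cost=2.0)
```

Multiplying by the operating cost lengthens the cycle for applications that are expensive to bring up, so such programs are started less often.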
A third example of calculating the control cycle is as follows. The control cycle obtained by the first or second calculation is adjusted by taking into account one or more of the spare bandwidth of the network connecting the execution platforms, the spare I/O performance of the execution platforms, and the communication cost. For example, the product of the control cycle calculated above and the spare bandwidth of the network connecting the execution platforms, the spare I/O performance of the execution platforms, and the communication cost is used as the control cycle.
A fourth example of calculating the control cycle is as follows. In the second and third calculations, the expected response time improvement and the cost are estimated using past execution history or simulation results. Bayesian estimation, for example, can be used for the estimation. Specifically, for example, the expected value of the response time improvement is estimated by referring to the improvements observed under similar conditions in the execution history. Because Bayesian estimation is a well-known existing technique, a detailed description of the calculation is omitted.
Returning to FIG. 9: in step 3220, the execution platform management function 6430 determines the temporary operation count and the operation count of the data processing program 630. The temporary operation count is the number of execution platforms 650 on which the same data processing program 630 is run on a trial basis. Trial operation means running the program for a certain period in a state in which it can easily be stopped or deleted if it proves unnecessary. Such a state is, for example, a restricted mode of operation that permits reading data but does not permit writes that would later require merging distributed data. The operation count is the number of data processing programs 630 that are kept running, rather than stopped or deleted, when the data processing programs 630 temporarily run on the execution platforms 650 are evaluated.
As with the control cycle, the temporary operation count and the operation count are determined based on the characteristics of the execution platform group and of the execution platforms. They may also be determined based on the status information and characteristics of the application; on the spare bandwidth of the network connecting the execution platforms, the spare I/O performance of the execution platforms, and the communication cost; or by Bayesian estimation based on execution history or simulation results.
In step 3230, the execution platform management function 6430 determines the execution platforms on which the data processing program 630 is to be made temporarily operable, or on which it is to be stopped. The candidate platforms have already been appropriately narrowed down to an execution platform group in step 3205, based on information such as location, management group, and application type. In this step, therefore, platforms are selected from the execution platform group either at random or based on execution history. Selection based on past execution history works as follows. Using past execution history or simulation results, the execution platform that minimizes the response time is estimated from the response times observed when the data processing program was placed on execution platforms in the group. Bayesian estimation, for example, can be used for the estimation.
Specifically, suppose for example that an application has been placed on a certain execution platform on a trial basis and that the average response time of the terminals accessing that platform has been measured. Suppose further that past execution history or prior simulation provides, as probability distributions, the response time obtained when placing on the same execution platform under similar conditions and the placement configuration that minimizes the response time.
Let x_i (i = 1 to N) be the event that, when the program is placed at site i, a set of physically separate execution platforms, the average response time r_i(t) of the users connected to site i is below a value α. Let ¬x_i (i = 1 to N) be the event that, when the program is placed at site i, the average response time r_i(t) of the users connected to site i is α or more. Let y_k (k = 1 to N) be the event that placement at site k minimizes the response time.
In the t-th placement, the probability P_t(y_k | x_i) that placement at site k is the configuration minimizing the response time, given that the responsiveness was at or below the value α when placed at site i, is calculated by Formula 1 and Formula 2.
[Formula 1: image JPOXMLDOC01-appb-M000001]
[Formula 2: image JPOXMLDOC01-appb-M000002]
Here, P_t(y_k) is obtained by Bayesian updating from the result of the (t-1)-th trial. With i' denoting the site of the (t-1)-th trial placement, if the response time was at or below the value α, P_t(y_k) is calculated by Formula 3 and Formula 4.
[Formula 3: image JPOXMLDOC01-appb-M000003]
[Formula 4: image JPOXMLDOC01-appb-M000004]
Here, P_1(y_k) = 1/N, or P(y_k) from the prior simulation. The site at which the t-th trial placement should be made is the site with the largest probability P_t(y_k | x_i) or P_t(y_k | ¬x_i).
Probabilities for multiple trial placements are calculated in the same way. As an example, consider the case of two trial placements. The probability P_t(y_k | x_i, ¬x_j) that placement at site k is the configuration minimizing the response time, given that the responsiveness was at or below β when placed at site i and at or below the value β when placed at site j, is calculated by Formula 5.
[Formula 5: image JPOXMLDOC01-appb-M000005]
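The formulas themselves are available only as images, so the following is a sketch of one plausible reading rather than a reproduction of them. It assumes the standard Bayes rule, P(y_k | x_i) ∝ P(x_i | y_k) P(y_k), with a hypothetical likelihood table `likelihood[i][k]` = P(x_i | y_k) taken from execution history or simulation. Applying the update once per trial, as done here for two trials, additionally assumes that trial outcomes are conditionally independent given y_k.

```python
from typing import List

def bayes_update(prior: List[float], likelihood: List[List[float]],
                 site: int, fast: bool) -> List[float]:
    """Return the updated belief P(y_k | trial result at `site`) for every k.

    likelihood[i][k] is the assumed P(x_i | y_k): the chance that a trial at
    site i comes in under the threshold when site k is the true optimum.
    `fast` is True for event x_i and False for event not-x_i.
    """
    weights = []
    for k, p_k in enumerate(prior):
        p_obs = likelihood[site][k] if fast else 1.0 - likelihood[site][k]
        weights.append(p_obs * p_k)
    total = sum(weights)
    if total == 0.0:
        raise ValueError("observation has zero probability under the prior")
    return [w / total for w in weights]

def most_probable_site(belief: List[float]) -> int:
    """Index of the site most likely to minimize the response time."""
    return max(range(len(belief)), key=belief.__getitem__)

prior = [0.5, 0.5]                      # P_1(y_k) = 1/N with N = 2
table = [[0.8, 0.2], [0.3, 0.9]]        # made-up likelihoods
after_first = bayes_update(prior, table, site=0, fast=True)
after_second = bayes_update(after_first, table, site=1, fast=False)
next_site = most_probable_site(after_second)
```

Feeding each posterior back in as the next prior mirrors the statement that P_t(y_k) is updated from the result of the (t-1)-th trial.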
The description now returns to FIG. 9.
In step 3240, the data processing program management function 6460 temporarily keeps data processing programs 630 operable, or stops them, according to the operation count and temporary operation count determined in step 3230, and measures and evaluates the response time and cost. The procedure for making a program operable includes checking, on the execution platform 650 where it is to run, whether computing resources are secured and whether the necessary applications and data have been placed and configured.
If the necessary computing resources are not secured, they are secured. If the computing resources are managed by another cloud operator, they are requested from that operator. If the application or data has not been placed, it is deployed. If initial configuration is needed, the necessary settings are applied and the application is activated. If nothing has been placed, whether the necessary computing resources are secured is checked.
For the terminal 120 to switch its access destination to the execution platform 650 on which the application and data run, the management function 640 that knows which execution platform 650 runs them must notify the terminal 120 of the access destination. The management function 640 therefore notifies the terminal 120, through the location server 620, of the execution platform 650 it should access. Specifically, when the terminal 120 makes an inquiry, it is given an identifier, such as a URL, that uniquely identifies the application running on the execution platform 650. The terminal 120 inquires about the access destination at fixed intervals, when an application built into the terminal 120 is launched, or when the user presses a reload button or the like on the terminal.
The first example of measuring the response time measures not the response time itself but the number of times data processing is executed on each execution platform. When terminals access the data processing programs 630 placed on several execution platforms 650 and select the execution platform 650 with the short response time, the data processing program 630 is executed more often on execution platforms 650 near the terminals.
In the second example of the measurement method, the terminal or the management function 640 measures the response time, and the data processing program 630 placed on an execution platform 650 with a short response time is kept operable. When the terminal performs the measurement, it reports to one or more location servers 620. A location server that has been notified of the response time measured by the terminal shares it using the mechanism for sharing information among management functions 640 shown in step 3030 of FIG. 10.
In step 3250, the data processing program management function 6460 determines, based on the evaluation in step 3240, which data processing programs to run or stop. When the execution count was measured instead of the response time, data processing programs 630 whose execution count is high compared with the data processing programs stored on other execution platforms 650, or is at or above a predetermined threshold, are kept operable. Data processing programs 630 with low execution counts may be stopped. When the response time was measured, it is decided that data processing programs 630 running on execution platforms 650 with a small average response time continue to run and that those running on execution platforms 650 with a large average response time are stopped. When execution counts or response times are compared across execution platforms 650 to make this decision, they are shared among the management functions 640 of the different data processing systems 100. The method of sharing information among management functions 640 is described later under step 3030 of FIG. 10.
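The decision in step 3250 can be illustrated with two small helpers. This is a sketch under assumptions: the function names, the sample data, and the use of the operation count as a simple cut-off on a ranking by mean response time are illustrative choices, not details given in the text.

```python
from statistics import mean
from typing import Dict, List, Tuple

def split_by_response_time(samples: Dict[str, List[float]],
                           operation_count: int) -> Tuple[List[str], List[str]]:
    """Rank platforms by mean response time; keep the best, stop the rest."""
    ranked = sorted(samples, key=lambda pid: mean(samples[pid]))
    return ranked[:operation_count], ranked[operation_count:]

def keep_by_execution_count(counts: Dict[str, int], threshold: int) -> List[str]:
    """Keep programs whose execution count is at or above the threshold."""
    return [pid for pid, n in counts.items() if n >= threshold]

# Made-up measurements for three trial placements (seconds).
measured = {
    "650-1": [0.3, 0.5],
    "650-2": [0.1, 0.1],
    "650-3": [0.2, 0.2],
}
keep, stop = split_by_response_time(measured, operation_count=2)
busy = keep_by_execution_count({"650-1": 3, "650-2": 40}, threshold=10)
```

Here the two platforms with the smallest mean response time stay in `keep`, and the remaining one lands in `stop`.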
In step 3260, the data processing program management function 6460 keeps running or stops the data processing programs as decided in step 3250. The procedure for running or stopping a program in this step is the same as the procedure shown for step 3240.
In step 3270, the execution platform management function 6430 decides whether the control cycle needs to be changed. If it decides to change the control cycle, processing proceeds to step 3210; if not, processing proceeds to step 3220. From the second iteration onward, step 3220 need not necessarily be executed.
By executing the above control in a plurality of execution platform groups, an execution platform can belong both to a local execution platform group, which is a set of execution platforms that are physically close or logically close on the network, and to a wide-area execution platform group, which is a set of dispersed execution platforms. For example, the control cycle of the local group can be made short and that of the wide-area group long. This yields one control loop that restores the response time quickly and locally and another that works toward global optimization based on evaluation across a wide area.
Next, the procedure by which the management function 640 calculates management groups, exchanges information with other management functions 640 in the same management group, and determines and executes its tasks is described with reference to FIG. 10.
In step 3010, the data processing program management function 6460 refers to the management information storage unit 644 to ascertain the requirements of the data processing programs placed on the execution platforms it manages. In step 3020, the data processing program management function 6460 determines the management group to which the managed data processing program 630 belongs. A management group is a group of management functions 640 that defines the scope within which information such as the data processing program status information 1200 is exchanged for the data processing programs 630 distributed over a plurality of execution platforms 650.
Information shared among the management functions in the same group is used by the management function 640, for example, to decide which data processing program 630 should execute data processing requested by users or analysis run in the background; to decide which of several data processing programs 630 capable of the same or similar processing to run and which to stop or delete; to decide where to place the data handled by the data processing programs 630; and to share information such as surpluses or shortages of computing resources on the execution platforms 650 where the data processing programs 630 are placed.
Managing data processing programs 630 in units of management groups limits the peers with which a management function 640 exchanges information such as the data processing program status information 1200, so the volume of data exchanged can be kept down even as the number of data processing programs 630 and management functions 640 grows. Methods for determining the management group are exemplified below.
 Methods for determining the management group fall broadly into three categories: methods based on information about the data processing program 630, methods based on information about the management function 640, and methods based on information about the execution base 650. A determination method is exemplified for each case. The determination methods below may be combined.
 The first management group determination method is based on data processing program attributes. Specifically, the data processing programs and their data processing program attributes are read from the data processing program status information 1200, and data processing programs with the same attribute are placed in the same management group. When a data processing program holds a plurality of data processing program attributes, programs with the same combination of attributes may be placed in the same management group. An application developer or a distributed processing system administrator may define in advance which attributes or attribute combinations form the same management group.
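 The attribute-based grouping above can be sketched as follows. This is a minimal illustration, not the implementation described in the specification: the dictionary `status_1200` stands in for the data processing program status information 1200, and the program names and attribute values are hypothetical.

```python
from collections import defaultdict


def group_by_attributes(program_attrs):
    """Group programs whose attribute combinations are identical.

    program_attrs: program name -> iterable of attribute labels
    (a hypothetical stand-in for the status information 1200).
    Returns: frozenset of attributes -> list of program names.
    """
    groups = defaultdict(list)
    for program, attrs in program_attrs.items():
        # frozenset makes the combination order-independent and hashable.
        groups[frozenset(attrs)].append(program)
    return dict(groups)


# Hypothetical example data.
status_1200 = {
    "prog-a": ["stateless", "read-only"],
    "prog-b": ["read-only", "stateless"],
    "prog-c": ["stateful"],
}
groups = group_by_attributes(status_1200)
```

 Using the whole attribute set as the key corresponds to the "same combination of attributes" variant; grouping on a single attribute would only need a different key.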
 The second management group determination method is based on the tasks of the management functions 640. Specifically, the tasks of each management function 640 are identified, and management functions 640 with the same task are placed in the same management group. When a management function 640 holds a plurality of tasks, management functions with the same combination of tasks may be placed in the same management group. An application developer or a distributed processing system administrator may define in advance which tasks or task combinations form the same management group.
 The third management group determination method determines the management group based on the execution base group described later. For example, the method identifies the execution base group whose execution base value in the execution base group information 1100 matches the placement execution base recorded for the data processing program in the application status information 1000. It then writes the identified execution base group into the management group column of the row for that data processing program in the data processing program status information 1200. As a result, the traffic generated when management functions 640 that are physically close, or logically close on the network, exchange information such as the data processing program status information 1200 can be confined to a physically or logically local area.
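 The lookup described above can be sketched as a join between two tables. The sketch assumes simplified shapes for the tables: `base_groups` stands in for the execution base group information 1100 and `placements` for the program-to-execution-base columns of the application status information 1000. All identifiers and values are hypothetical.

```python
def assign_groups_by_execution_base(base_groups, placements):
    """Derive each program's management group from where it is placed.

    base_groups: execution base group id -> set of execution base ids
    placements:  program name -> execution base id it is placed on
    Returns:     program name -> management group id (None if the
                 execution base belongs to no known group).
    """
    # Invert the group table so each execution base maps to its group.
    base_to_group = {
        base: group
        for group, bases in base_groups.items()
        for base in bases
    }
    return {
        program: base_to_group.get(base)
        for program, base in placements.items()
    }


# Hypothetical example data.
base_groups = {"G1": {"base-1", "base-2"}, "G2": {"base-3"}}
placements = {"prog-a": "base-1", "prog-b": "base-3", "prog-c": "base-9"}
management_groups = assign_groups_by_execution_base(base_groups, placements)
```

 The returned mapping corresponds to the value that would be written into the management group column of the status information 1200.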
 Returning to the description of FIG. 10: in step 3030, the communication function 6440 exchanges information with the management functions 640 that belong to the same management group as the data processing program 630. The information exchanged includes each side's data processing program status information 1200, application status information 1000, execution base group information 1100, and the response time information measured in step 3240. Because the range of information sharing is limited to the management group, the traffic required for information sharing among the management functions 640 is kept from growing excessively.
 In step 3040, the task management function 6450 determines the task that the management function 640 takes on. Specifically, the task management function 6450 refers to the task list information of the management group to which its own management function 640 belongs. Among the tasks for which the number of management functions 640 currently in charge (the task execution management function count) has not reached the number of management functions 640 required (the required management function count), it identifies the task with the highest priority. If the management function 640 is already in charge of a task and the identified task has a higher priority than the current one, the management function 640 takes on the identified task and gives up the task it had been in charge of. In that case, the task execution management function count of the newly taken task is incremented and that of the former task is decremented. For simplicity, the above describes the case where each management function 640 takes on at most one task, but each management function 640 may change the number of tasks it takes on according to the amount of computing resources available to it.
 When a management function takes on multiple tasks, the processing load of each task may be used as a weighting factor. Whether a task may be taken on may also be decided based on the execution base type or degree of distribution of the execution base 650 on which the management function 640 is placed, or on the data processing program attributes. To let an application developer or a distributed processing system administrator set the policy for this decision, the GUI of the management computer displays input fields for the policy information and buttons for setting and changing it.
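 The single-task selection rule of step 3040 can be sketched as follows. This is an illustrative sketch under stated assumptions: the `Task` record and its field names are hypothetical, a larger `priority` number is assumed to mean higher priority, and the weighting by processing load and the assignment policy mentioned above are omitted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    priority: int      # assumption: larger value = higher priority
    required: int      # required management function count
    assigned: int = 0  # task execution management function count


def choose_task(tasks: List[Task], current: Optional[Task] = None) -> Optional[Task]:
    """Pick the task a management function should be in charge of."""
    # Only tasks that still lack management functions are candidates.
    candidates = [t for t in tasks if t.assigned < t.required and t is not current]
    if not candidates:
        return current
    best = max(candidates, key=lambda t: t.priority)
    # Keep the current task unless the candidate has strictly higher priority.
    if current is not None and best.priority <= current.priority:
        return current
    best.assigned += 1
    if current is not None:
        current.assigned -= 1
    return best


# Hypothetical example: one management function holds a low-priority task
# while a higher-priority task is still unstaffed.
low = Task("analyze-history", priority=1, required=1, assigned=1)
high = Task("evaluate-quality", priority=5, required=1)
chosen = choose_task([low, high], current=low)
```

 After the call, the management function has switched to the higher-priority task and both counters have been updated, mirroring the increment and decrement described in step 3040.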
 In step 3050, the task management function 6450 executes the task it is in charge of. In step 3060, the data processing program management function 6460 determines whether the management group needs to be changed. If a change is judged necessary, the process proceeds to step 3020; if not, it proceeds to step 3030. Methods for making this determination are exemplified below. The following methods may be combined.
 The first determination method is to act periodically, based on a timer. That is, the management group is re-determined once a certain time has elapsed. The period may be a fixed value or a value that changes dynamically. One way to compute a dynamically changing period is to base it on the execution history. For example, if the management group is re-determined and does not change, the period is lengthened; if it does change, the period is shortened.
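 The history-based period adjustment can be sketched as below. The specification only states that the period is lengthened when nothing changed and shortened when something changed; the multiplicative factor and the lower and upper bounds used here are assumptions made for illustration.

```python
def next_period(period, changed, factor=2.0, min_period=10.0, max_period=3600.0):
    """Return the next re-determination period in seconds.

    changed: True if the last re-determination altered the management group.
    factor, min_period, max_period: illustrative assumptions, not values
    taken from the specification.
    """
    if changed:
        # The grouping is unstable: check again sooner.
        return max(min_period, period / factor)
    # The grouping is stable: check less often.
    return min(max_period, period * factor)


# Hypothetical run: two stable rounds followed by one change.
period = 60.0
history = []
for changed in (False, False, True):
    period = next_period(period, changed)
    history.append(period)
```

 Clamping the period keeps a long stable phase from delaying the next check indefinitely and keeps a burst of changes from driving the period toward zero.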
 The second determination method uses a trigger caused by an environmental change rather than a timer. For example, when the execution base is built on a virtual environment, the management group is judged to need re-determination when the virtual environment is migrated, when the response time shown in the application status information 1000 increases, or when the terminal average position changes.
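 The trigger-based judgment can be sketched as a comparison of two observations. The `Snapshot` record, its field names, and the 20% response-time threshold are hypothetical; the specification names the three trigger conditions but does not say how large an increase counts.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Snapshot:
    host: str                                    # physical host of the virtual environment
    response_time_ms: float                      # response time from status information 1000
    terminal_avg_position: Tuple[float, float]   # terminal average position


def should_redetermine(prev: Snapshot, cur: Snapshot, rt_increase_ratio: float = 1.2) -> bool:
    """Return True if any environmental trigger has fired."""
    migrated = cur.host != prev.host
    # Assumption: an increase beyond the given ratio counts as "increased".
    slower = cur.response_time_ms > prev.response_time_ms * rt_increase_ratio
    moved = cur.terminal_avg_position != prev.terminal_avg_position
    return migrated or slower or moved


# Hypothetical observations.
before = Snapshot("host-1", 100.0, (35.0, 139.0))
after_migration = Snapshot("host-2", 100.0, (35.0, 139.0))
after_slowdown = Snapshot("host-1", 150.0, (35.0, 139.0))
unchanged = Snapshot("host-1", 105.0, (35.0, 139.0))
```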
 With the above configuration and processing, the management group can be changed in response to changes in the system environment, so growth in the traffic required for information sharing among the management functions 640 can be suppressed.
100: data processing system, 110: network, 120: terminal, 130: management computer, 640: management function, 642: management operation unit, 6410: application quality evaluation function, 6420: execution history analysis function, 6430: execution base management function, 6440: communication function, 6450: task management function, 6460: data processing program management function, 644: management information storage unit, 6470: application and terminal information storage unit, 6475: execution base group information storage unit, 6480: execution history information storage unit, 6485: management group information storage unit, 6490: data processing program information storage unit

Claims (14)

  1.  A computer system in which a plurality of terminal devices and a plurality of computers are connected via a network, wherein
     each of one or more first computers among the plurality of computers holds a program in an executable state having the same function,
     any one of the first computers copies the program to one or more second computers, which are other computers among the plurality of computers,
     the second computers set the copied program to an executable state,
     the first and second computers execute the program in response to requests from the plurality of terminal devices,
     each of the plurality of terminals measures a response time of the request, and
     a third computer, which is any one of the first and second computers, receives from the plurality of terminals the response times of the requests and information on the computers that processed the requests, identifies a relatively long response time among the response times, and issues an instruction to stop the program running on the computer that processed the request having the identified response time.
  2.  The computer system according to claim 1, wherein
     the plurality of computers are classified into a plurality of groups, and any one of the first computers copies the program to a second computer belonging to the same group as itself.
  3.  The computer system according to claim 2, wherein
     the groups are classified according to the type of program executable by each of the plurality of computers.
  4.  The computer system according to claim 1, wherein
     any one of the first computers copies the program to the second computers at a predetermined period.
  5.  The computer system according to claim 1, wherein
     the second computers are each classified into either a first group or a second group,
     any one of the first computers copies the program to the second computers belonging to the first group at a first period,
     said computer copies the program to the second computers belonging to the second group at a second period, and
     the first period is longer than the second period.
  6.  The computer system according to claim 1, wherein
     the plurality of computers are classified into a plurality of groups, and
     the third computer shares the response times of the requests received from the plurality of terminals and the information on the computers that processed the requests with the computers in the group to which it belongs.
  7.  The computer system according to claim 6, wherein
     roles are set for the computers belonging to the same group, and
     the third computer is determined from among the first and second computers based on the setting.
  8.  A control method for a computer system in which a plurality of terminal devices and a plurality of computers are connected via a network, wherein
     each of one or more first computers among the plurality of computers holds a program in an executable state having the same function,
     any one of the first computers copies the program to one or more second computers, which are other computers among the plurality of computers,
     the second computers set the copied program to an executable state,
     the first and second computers execute the program in response to requests from the plurality of terminal devices,
     each of the plurality of terminals measures a response time of the request, and
     a third computer, which is any one of the first and second computers, receives from the plurality of terminals the response times of the requests and information on the computers that processed the requests, identifies a relatively long response time among the response times, and issues an instruction to stop the program running on the computer that processed the request having the identified response time.
  9.  The control method for a computer system according to claim 8, wherein
     the plurality of computers are classified into a plurality of groups, and any one of the first computers copies the program to a second computer belonging to the same group as itself.
  10.  The control method for a computer system according to claim 9, wherein
     the groups are classified according to the type of program executable by each of the plurality of computers.
  11.  The control method for a computer system according to claim 8, wherein
     any one of the first computers copies the program to the second computers at a predetermined period.
  12.  The control method for a computer system according to claim 8, wherein
     the second computers are each classified into either a first group or a second group,
     any one of the first computers copies the program to the second computers belonging to the first group at a first period,
     said computer copies the program to the second computers belonging to the second group at a second period, and
     the first period is longer than the second period.
  13.  The control method for a computer system according to claim 8, wherein
     the plurality of computers are classified into a plurality of groups, and
     the third computer shares the response times of the requests received from the plurality of terminals and the information on the computers that processed the requests with the computers in the group to which it belongs.
  14.  The control method for a computer system according to claim 13, wherein
     roles are set for the computers belonging to the same group, and
     the third computer is determined from among the first and second computers based on the setting.
PCT/JP2013/083818 2013-12-18 2013-12-18 Information processing system and information processing method WO2015092873A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2013/083818 WO2015092873A1 (en) 2013-12-18 2013-12-18 Information processing system and information processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2013/083818 WO2015092873A1 (en) 2013-12-18 2013-12-18 Information processing system and information processing method

Publications (1)

Publication Number Publication Date
WO2015092873A1 true WO2015092873A1 (en) 2015-06-25

Family

ID=53402268

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2013/083818 WO2015092873A1 (en) 2013-12-18 2013-12-18 Information processing system and information processing method

Country Status (1)

Country Link
WO (1) WO2015092873A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017022233A1 (en) * 2015-08-06 2017-02-09 日本電気株式会社 Information processing device, request process delay control method, and storage medium
US20220308869A1 (en) * 2021-03-26 2022-09-29 International Business Machines Corporation Computer management of microservices for microservice based applications
CN115658275A (en) * 2022-11-21 2023-01-31 统信软件技术有限公司 Executable program migration method and device and computing equipment
US11805039B1 (en) * 2023-01-20 2023-10-31 Dell Products L.P. Method and apparatus for detecting degraded network performance

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004092971A1 (en) * 2003-04-14 2004-10-28 Fujitsu Limited Server allocation control method
JP2006285316A (en) * 2005-03-31 2006-10-19 Hitachi Ltd Server performance measurement method, server performance measurement system, and computer program used therefor
WO2011102219A1 (en) * 2010-02-19 2011-08-25 日本電気株式会社 Real time system task configuration optimization system for multi-core processors, and method and program


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13899872

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13899872

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP