
WO2008128901A1 - Heterogeneous image processing system - Google Patents

Publication number
WO2008128901A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
processor
image processing
images
heterogeneous
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/EP2008/054331
Other languages
French (fr)
Inventor
William Hyun-Kee Chung
Moon Ju Kim
James Randal Moulic
Toshiyuki Sanuki
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
IBM United Kingdom Ltd
International Business Machines Corp
Original Assignee
IBM United Kingdom Ltd
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US11/738,711 (US8331737B2)
Priority claimed from US11/738,723 (US8326092B2)
Application filed by IBM United Kingdom Ltd, International Business Machines Corp
Publication of WO2008128901A1
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing

Definitions

  • the present invention relates to image processing/inspection. Specifically, the present invention relates to a heterogeneous image processing system that provides accelerated image processing as compared to previous approaches.
  • each image recording device e.g., camera
  • a separate and distinct (general purpose) processor 12A-N that itself is linked/integrated with a separate and distinct node 14A-N (e.g., PC).
  • a separate and distinct node 14A-N e.g., PC
  • the present invention relates to machine vision computing environments, and more specifically relates to a system and method for accelerating the execution of image processing applications using a hybrid computing system.
  • a hybrid system is generally defined as one that is multi-platform, and potentially distributed via a network or other connection.
  • the invention provides a machine vision system and method for executing image processing applications on a hybrid image processing system referred to herein as an image co-processor that comprises (among other things) a plurality of special purpose engines (SPEs) that work collectively to process multiple images in an accelerated fashion.
  • SPEs special purpose engines
  • implementations of the invention provide a machine vision system and method for distributing and managing the execution of image processing applications at a fine-grained level via a switch-connected hybrid system.
  • This method allows one system to be used to manage and control the system functions, and one or more other systems to execute image processing applications.
  • the invention allows the management and control system components to be reused, and the image processing components to be used as an image processing accelerator or image co-processor.
  • the system components can be run using different operating systems, such as Windows (Windows and related terms are trademarks of Microsoft Corp. in the United States and/or other countries), Linux (Linux and related terms are trademarks of Linus Torvalds in the United States and/or other countries), Macintosh (Macintosh and related terms are trademarks of Apple Inc. in the United States and/or foreign countries), etc.
  • the present invention provides a heterogeneous image processing system, comprising: an image co-processor comprising a plurality of special purpose engines (SPEs), the plurality of SPEs being configured to: receive a plurality of images; and process the plurality of images to determine associated image data.
  • Fig. 1 shows a system for processing images according to the related art.
  • Fig. 2 shows a heterogeneous, hybrid image processing/inspection system according to the present invention.
  • Fig. 3 shows an approach for integrating the system of Fig. 2 with an existing system according to the present invention.
  • Fig. 4 shows an approach for a heterogeneous, hybrid image processing/inspection system having parallel processors according to the present invention.
  • Fig. 5 shows a flow diagram for an illustrative software process according to the present invention.
  • Fig. 6 shows an approach for a heterogeneous, multi-core processor image processing/inspection system in which image grabbers are used in dual image co-processors according to the present invention.
  • the present invention relates to machine vision computing environments, and more specifically relates to a system and method for selectively accelerating the execution of image processing applications using a hybrid computing system.
  • the invention provides a machine vision system and method for executing image processing applications on a hybrid image processing system referred to herein as an image co-processor (also referred to herein as a "cell") that comprises (among other things) a plurality of special purpose engines (SPEs) that work to process multiple images in an accelerated fashion.
  • implementations of the invention provide a machine vision system and method for distributing and managing the execution of image processing applications at a fine-grained level via a switch-connected hybrid system.
  • the invention allows the management and control system components to be reused, and the image processing components to be used as an image processing accelerator or image co-processor.
  • the system components can be run using different operating systems, such as Windows (Windows and related terms are trademarks of Microsoft Corp. in the United States and/or other countries), Linux (Linux and related terms are trademarks of Linus Torvalds in the United States and/or other countries), Macintosh (Macintosh and related terms are trademarks of Apple Computer Inc. in the United States and/or other countries), etc.
  • a heterogeneous image processing/inspection system according to the present invention is shown.
  • image recordation mechanisms e.g., cameras
  • the image co-processor 52 in turn is connected to a control processor 50.
  • These components are connected in a single, monolithic, tightly integrated system. All image processing is done completely within the single system. Each component can only be used with a limited set of other components. Each component, and thus the entire system, can only run a single operating system.
  • the current image inspection system can be used in a manufacturing line to detect defects in items such as LCD panels or semiconductor wafers.
  • the system performs one or more scans to detect defect points.
  • Image analysis is conducted on a magnified version of each defect point.
  • a single Field of View (FOV) has multiple inspection threads, which run different algorithms, potentially at the same time for different areas in an image. One algorithm may take significantly longer to run than others.
  • Image processing software libraries are used to implement the algorithms. Large amounts of image and log data need to be moved, processed, and stored during inspection, requiring high I/O speeds and bandwidth.
  • This new design approach is a processing/inspection system based on hybrid, reusable components/systems 54A-N that are combined with special purpose engines/accelerators.
  • Image processing applications use algorithms that often have specialized functions that can benefit from special purpose processors.
  • special purpose processors can be used as accelerators to speed up image processing algorithms in a fine-grained, selective fashion that takes advantage of the strengths of both general purpose and special purpose processors.
  • the present invention combines image recording mechanisms/devices 58A-N such as cameras with a special purpose processor for image processing as well as a general processor 50 for determining control information.
  • images are received by hybrid systems 54A-N of image coprocessor 52, which process the images to determine image data.
  • This image data (and optionally the images themselves) are then communicated to control processor 50 and staging storage unit 60.
  • Control processor then processes the image data to determine control information.
  • the images, image data, and/or control information can then be stored in archive storage unit 62.
  • image grabber 56 acquires images from one or more image recordation mechanisms 58A and passes the image(s) to an input/output (I/O) processor 62.
  • I/O processor 62 generally includes a set of express peripheral component interconnects (PCIs) 68A-N, a pure load balancer (PLB) 64 coupled to the set of express PCIs 68A-N, and a network interface 66 (e.g., GbE) coupled to the PLB 64 for interfacing with at least one legacy application 32 in IA-based PC 14A.
  • PCIs peripheral component interconnects
  • PLB pure load balancer
  • GbE network interface
  • An I/O buffer 60 is also shown coupled to the I/O processor 62.
  • a power processing element (PPE) 76, an element interconnect bus (EIB) 74 coupled to the PPE, and a set (e.g., one or more, but typically a plurality) of special purpose engines (SPEs) 54A-N.
  • SPEs 54A-N share the load involved with processing image(s) into image data. The division of work among SPEs 54A-N was not previously performed, and hence, previous systems are not suitable for current-day and future image technology.
  • SPEs 54A-N feed image data, image processing routines, arithmetic/statistical operations, inspect processes, etc. to main memory 70
  • Image co-processor 52 will leverage legacy application in IA-based PC 14A to have general purpose or control processor 24 process the image or image data to determine control information.
  • IA-based PC system 14A of the related art obtains an image from image recordation mechanism 10A via image grabber 20, and passes the image to a general purpose image processor 24 for processing (e.g., utilizing image buffer 22). This sparsely processed image data is then passed to bridge chip 26, IA CPU 30, and (DDR) main memory 28.
  • the previous system utilizes only a single general-purpose processor to process the image.
  • the present invention utilizes an image co-processor having a plurality of SPEs 54A-N as well as general purpose control processor 24 of IA-based PC system 14A. This is accomplished by communicating through legacy application(s) 32 in IA-based PC system 14A.
  • the present invention not only provides improved and accelerated image processing, but it does so by utilizing existing infrastructure.
  • the heterogeneous image processing system of the present invention is operable with multiple different computing platforms (e.g., Windows, Linux, etc.).
  • Fig. 4 shows another embodiment of the present invention.
  • Fig. 4 depicts the convergence of a host machine with PCI express to accommodate multiple image grabbers 20A-N in IA-based PC system 14A, as well as parallel multi-core processors (MCPs) 84A-N within image co-processor 52 (MCPs 84A-N can be media communications processors, which are a type of processor specifically designed for the creation and distribution of digital media/content).
  • MCPs 84A-N can be media communications processors, which are a type of processor specifically designed for the creation and distribution of digital media/content.
  • image recordation mechanisms 10A-N will capture images and feed the same to separate image grabbers 20A-N, which utilize existing O/S image grabber driver images and an existing O/S library for image grabbers 20A-N.
  • PCI express host interface board PCIe HIB 34
  • bridge chip 26 is coupled to (DDR) main memory 28 and IA CPU 30.
  • Switch 64 couples IA-based PC system 14A to image co-processor 52 via PCIe HIBs 34 and 68A.
  • image co-processor includes parallel PCIe HIBs 68A-N, I/O processors 62A-N, MCPs 84A-N, and external data representation (XDR) modules 82A-N.
  • a gigabit Ethernet (GbE) network interface and switch 80 can then be used to couple image co-processor 52 and IA-based PC system 14A to control processor 50 and archive storage 62.
  • GbE gigabit Ethernet
  • library functions are written for MCPs 84A-N and off-loaded by (e.g., x86) O/S to image co-processor 52 at runtime.
  • Fig. 6 shows another embodiment of the present invention. Specifically, Fig. 6 depicts the native use of image grabbers 56A-N within image co-processors 52A-N. As shown, image grabbers 56A-N each receive a feed from an image recording mechanism such as cameras 58A-N. Once received by image grabbers 56A-N, the image is passed to I/O processors 62A-N, then to MCPs 54A-N, and then to external data representation (XDR) modules 82A-N. As further shown, image co-processors 52A-N utilize a set of express peripheral component interconnects shown in Fig. 6 as PCIe 4X/16X.
  • image co-processors 52A-N will process the images received to yield image data and then pass the image data and/or the images (along with other information) to control processor 50 for temporary storage in staging storage unit 60.
  • control processor 50 will further process this information to determine control information for the images.
  • library functions can be written for MCPs 54A-N and off-loaded by (e.g., x86) O/S to image co-processors 52A-N at runtime.
  • the library developed for the present invention typically has at least one of the following features (among others):
  • step S1 the process is started with the OpenCV open source library.
  • OpenCV running on x86 is obtained with a frame grabber and a camera (or other image recording device).
  • no cell-specific optimizations are present at this point.
  • step S3 underlying pieces of OpenCV are offloaded to the cell (image co-processor 52) incrementally as required by necessary algorithms. In this step, information about what algorithms and APIs are required could be needed.
  • step S4 the offloaded pieces are optimized, with the process repeating to step S3.
  • the present invention could be deployed within a computer infrastructure. This is intended to demonstrate, among other things, that the present invention could be implemented within a network environment (e.g., the Internet, a wide area network (WAN), a local area network (LAN), a virtual private network (VPN), etc.), or on a stand-alone computer system.
  • a network environment e.g., the Internet, a wide area network (WAN), a local area network (LAN), a virtual private network (VPN), etc.
  • communication throughout the network can occur via any combination of various types of communications links.
  • the communication links can comprise addressable connections that may utilize any combination of wired and/or wireless transmission methods.
  • connectivity could be provided by conventional TCP/IP sockets-based protocol, and an Internet service provider could be used to establish connectivity to the
  • the computer infrastructure is intended to demonstrate that some or all of the components of such an implementation could be deployed, managed, serviced, etc. by a service provider who offers to implement, deploy, and/or perform the functions of the present invention for others.
  • such computer systems can be in communication with external I/O devices/resources.
  • processing units execute computer program code, such as the software and functionality described above, which is stored in memory. While executing computer program code, the processing unit can read and/or write data to/from memory, I/O interfaces, etc.
  • the bus provides a communication link between each of the components in a computer.
  • External devices can comprise any device (e.g., keyboard, pointing device, display, etc.) that enables a user to interact with the computer system and/or any devices (e.g., network card, modem, etc.) that enable the computer to communicate with one or more other computing devices.
  • the hardware used to implement the present invention can comprise any specific purpose computing article of manufacture comprising hardware and/or computer program code for performing specific functions, any computing article of manufacture that comprises a combination of specific purpose and general purpose hardware/software, or the like.
  • the program code and hardware can be created using standard programming and engineering techniques, respectively.
  • the processing unit therein may comprise a single processing unit, or be distributed across one or more processing units in one or more locations, e.g., on a client and server.
  • the memory medium can comprise any combination of various types of data storage and/or transmission media that reside at one or more physical locations.
  • the I/O interfaces can comprise any system for exchanging information with one or more external devices. Still further, it is understood that one or more additional components (e.g., system software, math co-processing unit, etc.) can be included in the hardware.
  • the invention provides a computer-readable/useable medium that includes computer program code to enable a computer infrastructure to heterogeneously process images.
  • the computer-readable/useable medium includes program code that implements the process(es) of the invention. It is understood that the terms computer-readable medium and computer useable medium each comprise one or more of any type of physical embodiment of the program code.
  • the computer-readable/useable medium can comprise program code embodied on one or more portable storage articles of manufacture (e.g., a compact disc, a magnetic disk, a tape, etc.), on one or more data storage portions of a computing device (e.g., a fixed disk, a read-only memory, a random access memory, a cache memory, etc.), and/or as a data signal (e.g., a propagated signal) traveling over a network (e.g., during a wired/wireless electronic distribution of the program code).
  • portable storage articles of manufacture e.g., a compact disc, a magnetic disk, a tape, etc.
  • data storage portions of a computing device e.g., a fixed disk, a read-only memory, a random access memory, a cache memory, etc.
  • a data signal e.g., a propagated signal traveling over a network

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The present invention relates to machine vision computing environments, and more specifically relates to a system and method for selectively accelerating the execution of image processing applications using a hybrid computing system. A hybrid system is generally defined as one that is multi-platform, and potentially distributed via a network or other connection. The invention provides a machine vision system and method for executing image processing applications on a hybrid image processing system referred to herein as an image co-processor that comprises (among other things) a plurality of special purpose engines (SPEs) or multi-core processors (MCP) that work to process multiple images in an accelerated fashion.

Description

HETEROGENEOUS IMAGE PROCESSING SYSTEM
FIELD OF THE INVENTION
In general, the present invention relates to image processing/inspection. Specifically, the present invention relates to a heterogeneous image processing system that provides accelerated image processing as compared to previous approaches.
BACKGROUND OF THE INVENTION
Current image processing/inspection systems have limited processing power. Specifically, current systems perform all image processing functions within a single, general-purpose system. The processor used in current image processing/inspection systems is not powerful enough to handle the image processing demands, data rates, and algorithms for much of the current generation of systems (e.g., manufacturing inspection systems), let alone the next generation of systems. Next-generation manufacturing systems have a need for a fast image processing system in order to complete image inspection within required times. As the size of the inspection area and the amount of gray scale data double, the data per one scan area increases dramatically. Therefore, the image inspection processing time is drastically increased. Thus, the current inspection system(s) will not adequately handle the requirements for future manufacturing systems.
Although image processing functions are sometimes offloaded to another system, this other system also uses a general purpose processor that fails to actually perform any image processing acceleration. In addition, image processing functions in current systems are tied to a specific processor and platform, making it difficult to offload and accelerate specific functions at a fine-grained level. An example of this is shown in Fig. 1. Specifically, as shown, each image recording device (e.g., camera) 10A-N is linked with a separate and distinct (general purpose) processor 12A-N that itself is linked/integrated with a separate and distinct node 14A-N (e.g., PC). Such an embodiment fails to provide the accelerated image processing needed by current and emerging image generations. Because the development of a new inspection system will increase cost and development time, it is desirable to use reusable system components without impacting system performance. In view of the foregoing, there exists a need for an approach that solves at least one of the above-referenced deficiencies of the current art.
DISCLOSURE OF THE INVENTION
In general, the present invention relates to machine vision computing environments, and more specifically relates to a system and method for accelerating the execution of image processing applications using a hybrid computing system. To this extent, a hybrid system is generally defined as one that is multi-platform, and potentially distributed via a network or other connection. The invention provides a machine vision system and method for executing image processing applications on a hybrid image processing system referred to herein as an image co-processor that comprises (among other things) a plurality of special purpose engines (SPEs) that work collectively to process multiple images in an accelerated fashion.
Moreover, implementations of the invention provide a machine vision system and method for distributing and managing the execution of image processing applications at a fine-grained level via a switch-connected hybrid system. This method allows one system to be used to manage and control the system functions, and one or more other systems to execute image processing applications. The invention allows the management and control system components to be reused, and the image processing components to be used as an image processing accelerator or image co-processor. The system components can be run using different operating systems, such as Windows (Windows and related terms are trademarks of Microsoft Corp. in the United States and/or other countries), Linux (Linux and related terms are trademarks of Linus Torvalds in the United States and/or other countries), Macintosh (Macintosh and related terms are trademarks of Apple Inc. in the United States and/or foreign countries), etc.
The present invention provides a heterogeneous image processing system, comprising: an image co-processor comprising a plurality of special purpose engines (SPEs), the plurality of SPEs being configured to: receive a plurality of images; and process the plurality of images to determine associated image data.
BRIEF DESCRIPTION OF THE DRAWINGS
These and other features of this invention will be more readily understood from the following detailed description of the various aspects of the invention taken in conjunction with the accompanying drawings in which:
Fig. 1 shows a system for processing images according to the related art.
Fig. 2 shows a heterogeneous, hybrid image processing/inspection system according to the present invention.
Fig. 3 shows an approach for integrating the system of Fig. 2 with an existing system according to the present invention.
Fig. 4 shows an approach for a heterogeneous, hybrid image processing/inspection system having parallel processors according to the present invention.
Fig. 5 shows a flow diagram for an illustrative software process according to the present invention.
Fig. 6 shows an approach for a heterogeneous, multi-core processor image processing/inspection system in which image grabbers are used in dual image co-processors according to the present invention.
The drawings are not necessarily to scale. The drawings are merely schematic representations, not intended to portray specific parameters of the invention. The drawings are intended to depict only typical embodiments of the invention, and therefore should not be considered as limiting the scope of the invention. In the drawings, like numbering represents like elements.
DETAILED DESCRIPTION OF THE INVENTION
As indicated above, the present invention relates to machine vision computing environments, and more specifically relates to a system and method for selectively accelerating the execution of image processing applications using a hybrid computing system. To this extent, a hybrid system is generally defined as one that is multi-platform, and potentially distributed via a network or other connection. The invention provides a machine vision system and method for executing image processing applications on a hybrid image processing system referred to herein as an image co-processor (also referred to herein as a "cell") that comprises (among other things) a plurality of special purpose engines (SPEs) that work to process multiple images in an accelerated fashion. Moreover, implementations of the invention provide a machine vision system and method for distributing and managing the execution of image processing applications at a fine-grained level via a switch-connected hybrid system. This method allows one system to be used to manage and control the system functions, and one or more other systems to execute image processing applications. The invention allows the management and control system components to be reused, and the image processing components to be used as an image processing accelerator or image co-processor. The system components can be run using different operating systems, such as Windows (Windows and related terms are trademarks of Microsoft Corp. in the United States and/or other countries), Linux (Linux and related terms are trademarks of Linus Torvalds in the United States and/or other countries), Macintosh (Macintosh and related terms are trademarks of Apple Computer Inc. in the United States and/or other countries), etc.
Referring now to Fig. 2, a heterogeneous image processing/inspection system according to the present invention is shown. As depicted, image recordation mechanisms (e.g., cameras) 58A-N record images and are attached to an image co-processor 52 (via one or more image frame acquisition mechanisms 56 for image processing). The image co-processor 52 in turn is connected to a control processor 50. These components are connected in a single, monolithic, tightly integrated system. All image processing is done completely within the single system. Each component can only be used with a limited set of other components. Each component, and thus the entire system, can only run a single operating system. The current image inspection system can be used in a manufacturing line to detect defects in items such as LCD panels or semiconductor wafers. The system performs one or more scans to detect defect points. Image analysis is conducted on a magnified version of each defect point. A single Field of View (FOV) has multiple inspection threads, which run different algorithms, potentially at the same time for different areas in an image. One algorithm may take significantly longer to run than others. Image processing software libraries are used to implement the algorithms. Large amounts of image and log data need to be moved, processed, and stored during inspection, requiring high I/O speeds and bandwidth.
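By way of illustration only, the notion of multiple inspection threads running different algorithms on different areas of a single FOV can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the function names (count_dark_pixels, max_row_gradient, inspect_fov), the threshold value, and the representation of an image as a list of pixel rows are all hypothetical, and ordinary host threads stand in for the inspection threads.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical inspection algorithms; each takes an area as a list of pixel rows.
def count_dark_pixels(region, threshold=50):
    """Count pixels darker than the threshold (a stand-in defect metric)."""
    return sum(1 for row in region for px in row if px < threshold)

def max_row_gradient(region):
    """Largest absolute difference between horizontally adjacent pixels."""
    return max(abs(a - b) for row in region for a, b in zip(row, row[1:]))

def crop(image, top, left, height, width):
    """Return a rectangular area of the field of view."""
    return [row[left:left + width] for row in image[top:top + height]]

def inspect_fov(image, jobs):
    """Run each (name, algorithm, area) job in its own inspection thread."""
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = {
            name: pool.submit(algorithm, crop(image, *area))
            for name, algorithm, area in jobs
        }
        # Collect results; a slow algorithm only delays its own entry.
        return {name: fut.result() for name, fut in futures.items()}

# A 4x4 gray-scale field of view with one dark pixel at row 1, column 1.
fov = [
    [200, 200, 200, 200],
    [200,  10, 200, 200],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
jobs = [
    ("dark_top_left", count_dark_pixels, (0, 0, 2, 2)),   # top, left, height, width
    ("gradient_bottom", max_row_gradient, (2, 0, 2, 4)),
]
results = inspect_fov(fov, jobs)
```

Because each job is submitted independently, an algorithm that takes significantly longer than the others does not block their submission; only the final collection waits for it.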
This new design approach is a processing/inspection system based on hybrid, reusable components/systems 54A-N that are combined with special purpose engines/accelerators. Image processing applications use algorithms that often have specialized functions that can benefit from special purpose processors. These special purpose processors can be used as accelerators to speed up image processing algorithms in a fine-grained, selective fashion that takes advantage of the strengths of both general purpose and special purpose processors.
Thus, the present invention combines image recording mechanisms/devices 58A-N such as cameras with a special purpose processor for image processing as well as a general processor 50 for determining control information.
In a typical embodiment, images are received by hybrid systems 54A-N of image co-processor 52, which process the images to determine image data. This image data (and optionally the images themselves) is then communicated to control processor 50 and staging storage unit 60. Control processor 50 then processes the image data to determine control information. The images, image data, and/or control information can then be stored in archive storage unit 62.
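The data flow just described (images to image data, image data to control information, then staging and archive storage) can be sketched as follows. This is an illustrative sketch only: the class and function names (ImageData, Storage, co_process, control_process, run_pipeline), the defect-count metric, and the accept/reject decision are assumptions introduced for the example and are not taken from the disclosure.

```python
from dataclasses import dataclass, field

@dataclass
class ImageData:
    """Hypothetical result of image processing for one image."""
    image_id: int
    defect_count: int

@dataclass
class Storage:
    """Stand-in for a staging or archive storage unit."""
    records: list = field(default_factory=list)

    def store(self, item):
        self.records.append(item)

def co_process(image_id, pixels, threshold=50):
    """Image co-processor stage: reduce an image to image data."""
    defects = sum(1 for px in pixels if px < threshold)
    return ImageData(image_id, defects)

def control_process(image_data, max_defects=0):
    """Control processor stage: derive control information from image data."""
    return "reject" if image_data.defect_count > max_defects else "accept"

def run_pipeline(images, staging, archive):
    """Pass each image through both stages, staging and archiving results."""
    decisions = []
    for image_id, pixels in enumerate(images):
        data = co_process(image_id, pixels)       # image -> image data
        staging.store(data)                       # temporary staging storage
        decision = control_process(data)          # image data -> control info
        archive.store((pixels, data, decision))   # image, data, control info
        decisions.append(decision)
    return decisions

staging, archive = Storage(), Storage()
decisions = run_pipeline([[200, 200, 200], [200, 10, 200]], staging, archive)
```

The separation of co_process from control_process mirrors the division between the special purpose and general purpose processors: either stage could be replaced without changing the other.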
Referring now to Fig. 3, a more detailed diagram of the system of the present invention, as well as its integration with an existing system, is shown. As depicted, image grabber 56 acquires images from one or more image recordation mechanisms 58A and passes the image(s) to an input/output (I/O) processor 62. As depicted, I/O processor 62 generally includes a set of express peripheral component interconnects (PCIs) 68A-N, a pure load balancer (PLB) 64 coupled to the set of express PCIs 68A-N, and a network interface 66 (e.g., GbE) coupled to the PLB 64 for interfacing with at least one legacy application 32 in IA-based PC 14A. An I/O buffer 60 is also shown coupled to the I/O processor 62.
Further shown within image co-processor 52 are a power processing element (PPE) 76, an element interconnect bus (EIB) 74 coupled to the PPE, and a set (e.g., one or more, but typically a plurality) of special purpose engines (SPEs) 54A-N. SPEs 54A-N share the load involved with processing image(s) into image data. The division of work among SPEs 54A-N was not previously performed, and hence, previous systems are not suitable for current day and future image technology. As further shown, SPEs 54A-N feed image data, image processing routines, arithmetic/statistical operations, inspect processes, etc. to main memory 70 (which could be realized as staging storage unit 60 of Fig. 2). Image co-processor 52 will leverage legacy application 32 in IA-based PC 14A to have general purpose or control processor 24 process the image or image data to determine control information.
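The division of work among SPEs 54A-N can be sketched as follows. This is an illustration under stated assumptions, not the partitioning scheme of the invention: Python threads stand in for the SPEs, the image is split into horizontal strips, and a coarse intensity histogram stands in for the image data. The function names `split_rows`, `strip_histogram`, and `parallel_histogram` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

Image = List[List[int]]


def split_rows(image: Image, parts: int) -> List[Image]:
    """Split image rows into at most `parts` contiguous, near-equal strips."""
    n = len(image)
    parts = max(1, min(parts, n))
    base, extra = divmod(n, parts)
    strips, start = [], 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        strips.append(image[start:end])
        start = end
    return strips


def strip_histogram(strip: Image, bins: int = 4,
                    max_value: int = 256) -> List[int]:
    """Work done by one engine: a coarse intensity histogram of its strip."""
    hist = [0] * bins
    width = max_value // bins
    for row in strip:
        for pixel in row:
            hist[min(pixel // width, bins - 1)] += 1
    return hist


def parallel_histogram(image: Image, workers: int = 4,
                       bins: int = 4) -> List[int]:
    """Share the load across workers, then merge the partial results."""
    strips = split_rows(image, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda s: strip_histogram(s, bins), strips))
    # Element-wise sum of the per-strip histograms.
    return [sum(column) for column in zip(*partials)]
```

Because each strip is processed independently and the partial histograms are merged by simple addition, the result is the same as processing the whole image on a single engine; only the work is shared.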
As further depicted, IA-based PC system 14A of the related art obtains an image from image recordation mechanism 10A via image grabber 20, and passes the image to a general purpose image processor 24 for processing (e.g., utilizing image buffer 22). This sparsely processed image data is then passed to bridge chip 26, IA CPU 30, and (DDR) main memory 28. As can be seen, the previous system utilizes only a single general-purpose processor to process the image. By contrast, the present invention utilizes an image co-processor having a plurality of SPEs 54A-N as well as general purpose control processor 24 of IA-based PC system 14A. This is accomplished by communicating through legacy application(s) 32 in IA-based PC system 14A. Thus, the present invention not only provides improved and accelerated image processing, but it does so by utilizing existing infrastructure. It should be noted that the heterogeneous image processing system of the present invention is operable with multiple different computing platforms (e.g., Windows, Linux, etc.).
Fig. 4 shows another embodiment of the present invention. Specifically, Fig. 4 depicts the convergence of a host machine with PCI express to accommodate multiple image grabbers 20A-N in IA-based PC system 14A, as well as parallel multi-core processors (MCPs) 84A-N within image co-processor 52. (MCPs 84A-N can be media communications processors, which are a type of processor specifically designed for the creation and distribution of digital media/content.) Specifically, image recordation mechanisms 10A-N will capture images and feed the same to separate image grabbers 20A-N, which utilize existing O/S image grabber driver images and an existing O/S library for image grabbers 20A-N. Then, using PCI express host interface board (PCIe HIB) 34, the image is communicated to bridge chip 26 for processing and placement in temporary/staging storage unit 60. As further shown, bridge chip 26 is coupled to (DDR) main memory 28 and IA CPU 30. Switch 64 couples IA-based PC system 14A to image co-processor 52 via PCIe HIBs 34 and 68A. As shown, image co-processor 52 includes parallel PCIe HIBs 68A-N, I/O processors 62A-N, MCPs 84A-N, and external data representation (XDR) modules 82A-N. A gigabit Ethernet (GbE) network interface and switch 80 can then be used to couple image co-processor 52 and IA-based PC system 14A to control processor 50 and archive storage 62. As further shown in Fig. 4, library functions are written for MCPs 84A-N and off-loaded by (e.g., x86) O/S to image co-processor 52 at runtime.
Fig. 6 shows another embodiment of the present invention. Specifically, Fig. 6 depicts the native use of image grabbers 56A-N within image co-processors 52A-N. As shown, image grabbers 56A-N each receive a feed from an image recording mechanism such as cameras 58A-N. Once received by image grabbers 56A-N, the image is passed to I/O processors 62A-N, then to MCPs 54A-N, and then to external data representation (XDR) modules 82A-N. As further shown, image co-processors 52A-N utilize a set of express peripheral component interconnects shown in Fig. 6 as PCIe 4X/16X. Communication with host system/control processor 50 can occur via gigabit Ethernet (GbE) technology including GbE switch 80. As indicated above, image co-processors 52A-N will process the images received to yield image data and then pass the image data and/or the images (along with other information) to control processor 50 for temporary storage in staging storage unit 60. Thereafter, control processor 50 will further process this information to determine control information for the images.
Similar to the above-incorporated application, library functions can be written for MCPs 54A-N and off-loaded by (e.g., x86) O/S to image co-processors 52A-N at runtime. Along these lines, the library developed for the present invention typically has at least one of the following features (among others):
(1) It is typically structured as a reusable library with many useful functions and algorithms available.
(2) It is usable cross-platform (Windows, Linux, Mac).
(3) It is optimized for a specific processor architecture, but optimizations are encapsulated in a separate library. The architecture supports plugging in other optimized libraries, such as one for cell.
(4) It includes wrappers for scripting languages such as Python, and graphical user interfaces (GUIs) to make rapid prototyping easier.
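Feature (3), in which optimizations are encapsulated in a separate, pluggable library, can be sketched as a small dispatch table. The following Python example is a hypothetical illustration and does not reflect the actual library interface: the registry `_BACKENDS`, the helpers `register` and `dispatch`, and the backend name "accelerated" are assumptions introduced here. A routine is looked up in the chosen backend first and falls back to the portable reference version when no optimized version has been plugged in.

```python
from typing import Callable, Dict, List

Image = List[List[int]]

# Registry mapping backend name -> routine name -> implementation.
_BACKENDS: Dict[str, Dict[str, Callable]] = {"reference": {}}


def register(backend: str, name: str) -> Callable:
    """Decorator that plugs an implementation into the named backend."""
    def decorator(fn: Callable) -> Callable:
        _BACKENDS.setdefault(backend, {})[name] = fn
        return fn
    return decorator


def dispatch(name: str, backend: str = "reference") -> Callable:
    """Return the backend's routine if present, else the reference one."""
    optimized = _BACKENDS.get(backend, {})
    if name in optimized:
        return optimized[name]
    return _BACKENDS["reference"][name]


@register("reference", "invert")
def invert_reference(image: Image) -> Image:
    return [[255 - p for p in row] for row in image]


@register("reference", "threshold")
def threshold_reference(image: Image, level: int = 128) -> Image:
    return [[255 if p >= level else 0 for p in row] for row in image]


@register("accelerated", "invert")
def invert_accelerated(image: Image) -> Image:
    # Stand-in for an architecture-specific routine; it must return the
    # same result as the reference implementation.
    return [list(map((255).__sub__, row)) for row in image]
```

Callers only name the routine and the backend, so an additional optimized library can be added by registering functions under a new backend name without changing the calling code.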
Referring now to Fig. 5, a method flow diagram of an illustrative software implementation of the present invention is shown. As depicted, in step S1, the process is started with the OpenCV open source library. In step S2, OpenCV running on x86 is obtained with a frame grabber and a camera (or other image recording device). In a typical embodiment, no cell-specific optimizations are present at this point. In step S3, underlying pieces of OpenCV are offloaded to the cell (image co-processor 52) incrementally as required by necessary algorithms. In this step, information about what algorithms and APIs are required could be needed. In step S4, the offloaded pieces are optimized, with the process repeating to step S3.
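The incremental loop between steps S3 and S4 can be sketched in Python. The selection policy below is an assumption made only for illustration, since the description does not state how pieces are chosen: routines required by the algorithms are offloaded one at a time, costliest first, until the cost remaining on the host falls within a budget. The function `plan_offload`, the cost figures, and the budget are hypothetical.

```python
from typing import Dict, List, Tuple


def plan_offload(required: List[str], cost: Dict[str, float],
                 budget: float) -> Tuple[List[str], Dict[str, float]]:
    """Choose routines to offload incrementally (assumed policy).

    Only routines that the algorithms actually require are considered.
    Each pass offloads the costliest routine still on the host and then
    re-checks the remaining host cost against the budget.
    """
    offloaded: List[str] = []
    remaining = {name: cost[name] for name in required}
    while remaining and sum(remaining.values()) > budget:
        name = max(remaining, key=remaining.get)  # next piece to offload (S3)
        offloaded.append(name)                    # it would be optimized in S4
        del remaining[name]
    return offloaded, remaining
```

For example, with required routines `resize`, `filter`, and `threshold` costing 5, 20, and 2 units and a host budget of 6, this sketch offloads `filter` and then `resize`, and leaves `threshold` on the host.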
It should be understood that the present invention could be deployed within a computer infrastructure. This is intended to demonstrate, among other things, that the present invention could be implemented within a network environment (e.g., the Internet, a wide area network (WAN), a local area network (LAN), a virtual private network (VPN), etc.), or on a stand-alone computer system. In the case of the former, communication throughout the network can occur via any combination of various types of communications links. For example, the communication links can comprise addressable connections that may utilize any combination of wired and/or wireless transmission methods. Where communications occur via the Internet, connectivity could be provided by conventional TCP/IP sockets-based protocol, and an Internet service provider could be used to establish connectivity to the Internet. Still yet, the computer infrastructure is intended to demonstrate that some or all of the components of such an implementation could be deployed, managed, serviced, etc. by a service provider who offers to implement, deploy, and/or perform the functions of the present invention for others.
Where hardware is provided, it is understood that any such computers utilized will include standard elements such as a processing unit, a memory medium, a bus, and input/output (I/O) interfaces. Further, such computer systems can be in communication with external I/O devices/resources. In general, processing units execute computer program code, such as the software and functionality described above, which is stored in memory. While executing computer program code, the processing unit can read and/or write data to/from memory, I/O interfaces, etc. The bus provides a communication link between each of the components in a computer. External devices can comprise any device (e.g., keyboard, pointing device, display, etc.) that enables a user to interact with the computer system and/or any devices (e.g., network card, modem, etc.) that enable the computer to communicate with one or more other computing devices.
The hardware used to implement the present invention can comprise any specific purpose computing article of manufacture comprising hardware and/or computer program code for performing specific functions, any computing article of manufacture that comprises a combination of specific purpose and general purpose hardware/software, or the like. In each case, the program code and hardware can be created using standard programming and engineering techniques, respectively. Moreover, the processing unit therein may comprise a single processing unit, or be distributed across one or more processing units in one or more locations, e.g., on a client and server. Similarly, the memory medium can comprise any combination of various types of data storage and/or transmission media that reside at one or more physical locations. Further, the I/O interfaces can comprise any system for exchanging information with one or more external devices. Still further, it is understood that one or more additional components (e.g., system software, math co-processing unit, etc.) can be included in the hardware.
While shown and described herein as a heterogeneous image processing system and method, it is understood that the invention further provides various alternative embodiments. For example, in one embodiment, the invention provides a computer-readable/useable medium that includes computer program code to enable a computer infrastructure to heterogeneously process images. To this extent, the computer-readable/useable medium includes program code that implements the process(es) of the invention. It is understood that the terms computer-readable medium and computer useable medium comprise one or more of any type of physical embodiment of the program code. In particular, the computer-readable/useable medium can comprise program code embodied on one or more portable storage articles of manufacture (e.g., a compact disc, a magnetic disk, a tape, etc.), on one or more data storage portions of a computing device (e.g., a fixed disk, a read-only memory, a random access memory, a cache memory, etc.), and/or as a data signal (e.g., a propagated signal) traveling over a network (e.g., during a wired/wireless electronic distribution of the program code).

Claims

1. A heterogeneous image processing system, comprising: an image co-processor comprising a plurality of special purpose engines (SPEs), the plurality of SPEs being configured to: receive a plurality of images; and process the plurality of images to determine associated image data.
2. A heterogeneous image processing system as claimed in claim 1 wherein: the special purpose engines (SPE) are multi-core processors (MCP).
3. A heterogeneous image processing system as claimed in claim 1 or claim 2, further comprising a control processor being configured to receive the image data from the image co-processor, and to determine control information for the plurality of images.
4. The heterogeneous image processing system of claim 3, further comprising: a staging storage unit for storing the image data for use by the control processor; and an archive storage unit for storing at least one of the image data or the control information.
5. A heterogeneous image processing system as claimed in claim 1 or claim 2, further comprising an input/output (I/O) processor, the I/O processor comprising: a set of express peripheral component interconnects (PCIs); a pure load balancer (PLB) coupled to the set of express PCIs; and a network interface coupled to the PLB for interfacing with at least one legacy application.
6. The heterogeneous image processing system as claimed in claim 5, the I/O processor receiving the plurality of images from an image grabber, and providing the plurality of images to the image co-processor.
7. The heterogeneous image processing system as claimed in claim 1 or claim 2, the image co-processor further comprising: a power processing element (PPE); and an element interconnect bus (EIB) coupled to the PPE and the plurality of SPEs.
8. A heterogeneous image processing system as claimed in claim 1 or claim 2, the SPEs comprising accelerators that increase a rate of processing of the plurality of images.
9. A heterogeneous image processing system as claimed in claim 1 or claim 2, the heterogeneous image processing system being operable with multiple different computing platforms.
10. A heterogeneous image processing method, comprising: receiving a plurality of images in an image co-processor, the image co-processor comprising a plurality of special purpose engines (SPEs); processing the plurality of images with the plurality of SPEs to determine image data associated with the plurality of images; and providing at least one of the image data or the plurality of images to a control processor to determine control information associated with the plurality of images.
11. A heterogeneous image processing method as claimed in claim 10, wherein the special purpose engines (SPE) are multi-core processors (MCP).
12. A heterogeneous image processing method as claimed in claim 10 or claim 11, further comprising: determining the control information with the control processor.
13. A heterogeneous image processing method as claimed in claim 10 or claim 11, further comprising: storing the image data in a staging storage unit for processing by the control processor to determine control information; and storing at least one of the image data or the control information in an archive storage unit.
14. A heterogeneous image processing method as claimed in claim 10 or claim 11, further comprising: receiving the plurality of images in an input/output (I/O) processor, the I/O processor comprising: a set of express peripheral component interconnects (PCIs); a pure load balancer (PLB) coupled to the set of express PCIs; and a network interface coupled to the PLB for interfacing with at least one legacy application.
15. A heterogeneous image processing method as claimed in claim 14, the I/O processor receiving the plurality of images from an image grabber, and providing the plurality of images to the plurality of SPEs.
16. A heterogeneous image processing method as claimed in claim 10 or claim 11, the image co-processor further comprising: a power processing element (PPE); and an element interconnect bus (EIB) coupled to the PPE and the plurality of SPEs.
17. A heterogeneous image processing method as claimed in claim 10 or claim 11, the plurality of SPEs comprising accelerators that increase a rate of processing of the plurality of images.
18. A heterogeneous image processing method as claimed in claim 10 or claim 11, further comprising receiving the plurality of images in an image grabber from at least one image recording device.
19. A computer program comprising instructions for carrying out all the steps of the method according to any of claim 10 to claim 18, when said computer program is executed on a computer system.
20. A heterogeneous image processing system, comprising: an image co-processor comprising: a set of input/output (I/O) processors; a set of multi-core processors (MCPs) coupled to the set of I/O processors; a set of external data representation (XDR) modules coupled to the set of MCPs; and a set of express peripheral component interconnects (PCIs) coupled to the set of I/O processors, at least one of the set of express PCIs receiving image data from a PC system external to the image co-processor.
21. A heterogeneous image processing system as claimed in claim 20, comprising: a plurality of said image co-processors and wherein each of said plurality of image co-processors further comprises a set of frame grabbers for receiving images from a set of image recording devices.
22. A heterogeneous image processing system as claimed in claim 20 or claim 21, the PC system being an IA-based PC system that is coupled to the image co-processor via a pure load balancer (PLB) switch.
23. A heterogeneous image processing system as claimed in claim 20 or claim 21, the PC system and the image co-processor being coupled to a control processor via a gigabit Ethernet (GbE) switch.
24. A heterogeneous image processing system of claim 23, the control processor writing a library for the set of MCPs, and off-loading the library to the image co-processor at runtime.
PCT/EP2008/054331 2007-04-23 2008-04-10 Heterogeneous image processing system Ceased WO2008128901A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US11/738,723 2007-04-23
US11/738,711 US8331737B2 (en) 2007-04-23 2007-04-23 Heterogeneous image processing system
US11/738,723 US8326092B2 (en) 2007-04-23 2007-04-23 Heterogeneous image processing system
US11/738,711 2007-04-23

Publications (1)

Publication Number Publication Date
WO2008128901A1 true WO2008128901A1 (en) 2008-10-30

Family

ID=39611624

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2008/054331 Ceased WO2008128901A1 (en) 2007-04-23 2008-04-10 Heterogeneous image processing system

Country Status (1)

Country Link
WO (1) WO2008128901A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114007037A (en) * 2021-09-18 2022-02-01 华中科技大学 Video front-end intelligent monitoring system and method, computer equipment and terminal

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5506999A (en) * 1992-01-22 1996-04-09 The Boeing Company Event driven blackboard processing system that provides dynamic load balancing and shared data between knowledge source processors
WO2000068884A1 (en) * 1999-05-05 2000-11-16 Kla-Tencor Corporation Method and apparatus for inspecting reticles implementing parallel processing
EP1345120A2 (en) * 2002-03-14 2003-09-17 Fuji Photo Film Co., Ltd. Method and apparatus for distributed processing control
US20040170313A1 (en) * 2003-02-28 2004-09-02 Michio Nakano Image processing unit for wafer inspection tool
US20040228515A1 (en) * 2003-03-28 2004-11-18 Takafumi Okabe Method of inspecting defects
US20050022038A1 (en) * 2003-07-23 2005-01-27 Kaushik Shivnandan D. Determining target operating frequencies for a multiprocessor system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114007037A (en) * 2021-09-18 2022-02-01 华中科技大学 Video front-end intelligent monitoring system and method, computer equipment and terminal
CN114007037B (en) * 2021-09-18 2023-03-07 华中科技大学 A video front-end intelligent monitoring system, method, computer equipment, and terminal

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08749533

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08749533

Country of ref document: EP

Kind code of ref document: A1