[go: up one dir, main page]

WO2000004502A1 - Graphics processing for efficient polygon handling - Google Patents

Graphics processing for efficient polygon handling Download PDF

Info

Publication number
WO2000004502A1
WO2000004502A1 PCT/US1999/016080 US9916080W WO0004502A1 WO 2000004502 A1 WO2000004502 A1 WO 2000004502A1 US 9916080 W US9916080 W US 9916080W WO 0004502 A1 WO0004502 A1 WO 0004502A1
Authority
WO
WIPO (PCT)
Prior art keywords
vertex
vertices
locations
base
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US1999/016080
Other languages
French (fr)
Inventor
William Lazenby
Dale Kirkland
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intergraph Corp
Original Assignee
Intergraph Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intergraph Corp filed Critical Intergraph Corp
Publication of WO2000004502A1 publication Critical patent/WO2000004502A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/50Lighting effects
    • G06T15/80Shading

Definitions

  • the invention generally relates to computer systems and, more particularly, the invention relates to processing graphics request data for display on a computer display device.
  • Three dimensional graphics request data commonly is processed in a computer system as a plurality of polygons having vertices.
  • Each of the vertices has associated attribute data (e.g., color, transparency, depth, etc.) that is utilized to rasterize pixels on a computer display device.
  • Items to be displayed commonly are in the form of a "T-fan.”
  • a T-fan is a polygon having more than three vertices such as, for example, five vertices.
  • Various techniques have been utilized to process T-fan shaped items.
  • a method and graphics accelerator that processes a polygon having N vertices utilizes a single vertex buffer location for storing data for a single base vertex of the polygon while the polygon is processed.
  • the method and graphics processor first designates one of the vertices of the polygon to be the base vertex, and then defines N-2 different triangles by each of the vertices of the polygon. Each of the N-2 triangles is defined by the base vertex and two of the N vertices other than the base vertex.
  • a buffer is provided for storing the vertex data while the polygon is being processed.
  • the buffer includes a selected number of vertex locations each for storing vertex data for one vertex. The selected number is less than N.
  • the base vertex data is stored in the first vertex location in the buffer.
  • the base vertex data preferably is retrieved from the first vertex location each time one of the N-2 triangles is processed.
  • the base vertex data is stored in the first vertex location until each of the N-2 triangles is processed.
  • the first vertex location then may be utilized for storing other data after each of the triangles is processed.
  • each of the N vertices except the base vertex defines a set of vertices
  • the buffer includes a set of vertex locations other than the first vertex location.
  • the graphics accelerator and method further process each of the vertices in the set of vertices in the set of vertex locations.
  • the set of vertices is processed in the set of vertex locations in a round robin manner.
  • the set of vertices includes N-1 vertices and the set of vertex locations includes fewer than N-1 vertex locations. In such embodiments, selected vertices in the set of vertex locations are overwritten when each of the other vertex locations in the set of vertex locations is filled with vertex data for other vertices.
  • a modulo counter is utilized to perform a round-robin method of processing and storing vertex data.
  • the polygon is a T-fan.
  • Figure 1 schematically shows a portion of an exemplary computer system on which preferred embodiments of the invention may be implemented.
  • Figure 2 schematically shows a preferred graphics accelerator that may be utilized in accordance with preferred embodiments of the invention.
  • Figure 3A schematically shows a preferred embodiment of a geometry accelerator stage shown in Figure 2.
  • Figure 3B schematically shows an alternate embodiment of a geometry accelerator stage shown in Figure 2.
  • Figure 4 shows a preferred header for sending data to the rasterization stage from the geometry accelerator stage in the graphics accelerator shown in Figure 2.
  • Figure 5 A shows an exemplary T-fan.
  • Figure 5B shows a processed exemplary T-fan that may be processed in accordance with preferred embodiments of the invention.
  • Figures 5C and 5D show exemplary primitives.
  • Figure 1 shows a portion of an exemplary computer system 100 on which preferred embodiments of the invention may be implemented.
  • the computer system 100 includes a host processor 104 (i.e., a central processing unit) for executing application level programs and system functions, volatile host memory 102 for short term data storage (i.e., random access memory), a graphics accelerator 106 for processing graphics request code in accord with preferred embodiments of the invention (see Figure 4), a display device 108 for displaying the graphics request code processed by the accelerator 106, and a bus 110 coupling all of the prior noted elements of the system 100.
  • a host processor 104 i.e., a central processing unit
  • volatile host memory 102 for short term data storage (i.e., random access memory)
  • graphics accelerator 106 for processing graphics request code in accord with preferred embodiments of the invention (see Figure 4)
  • a display device 108 for displaying the graphics request code processed by the accelerator 106
  • a bus 110 coupling all of the prior noted elements of the system 100.
  • the graphics accelerator 106 preferably utilizes any well known graphics processing application program interface such as, for example, the OPENGLTM application program interface (available from Silicon Graphics, Inc. of Mountain View, California) for processing three dimensional ("3D") and two dimensional ("2D") graphical request code.
  • the host processor 104 executes a graphical drawing application program such as, for example, the PLANT DESIGN SYSTEMTM drawing program, available from Intergraph Corporation of Huntsville, Alabama.
  • the graphics accelerator 106 includes a double buffered frame buffer 200 (i.e., having a back buffer and a front buffer) for storing the processed graphics request code in accordance with the OPENGLTM interface.
  • the graphics accelerator 106 also preferably includes a geometry accelerator 202 for performing geometry operations that commonly are executed in graphics processing, a rasterizer 204 for rasterizing pixels on the display device 108, and a resolver 206 for storing data in the frame buffer 200 and transmitting data from the frame buffer 200 to the display device
  • the graphics accelerator 106 preferably is adapted to process both 2D and 3D graphical data.
  • the graphics accelerator 106 see, for example, copending patent application entitled “Wide Instruction Word Graphics Processor”, filed on even date herewith and naming Vernon Brethour, Gary Shelton, William Lazenby, and Dale Kirkland as inventors, the disclosure of which is incorporated herein, in its entirety, by reference.
  • Figure 3 A schematically shows a preferred embodiment of the geometry accelerator stage 202 shown in Figure 2.
  • the geometry accelerator stage 202 processes graphics primitives, which are two-dimensional surfaces such as points, lines, and polygons that are combined in 2D and 3D space to create 2D and 3D objects.
  • Each endpoint (i.e., comer) of a primitive is called a vertex.
  • Figure 5C shows some exemplary primitives including a line, a triangle, and a quadrilateral.
  • a line has two vertices
  • a triangle has three vertices
  • a quadrilateral has four vertices.
  • More complex primitives such as strip primitives may be formed by using primitives as sub-elements, or sub- primitives.
  • Figure 5D shows exemplary strip primitives, including a line strip (Lstrip), a triangle strip (Tstrip), and a quadrilateral strip (Qstrip).
  • Lstrips are composed of sub- primitives that are Lines
  • Tstrips are composed of sub-primitives that are Triangles
  • Qstrips are composed of sub-primitives that are Quadrilaterals.
  • the geometry accelerator 202 processes these and other primitives. More particularly, selected elements of the geometry accelerator 202 are shown for processing a polygon having more than three vertices.
  • the geometry accelerator 202 includes an input buffer 320 for storing the input vertex data.
  • the input buffer 320 includes four vertex locations for storing all of the necessary vertex data for four vertices.
  • data is color data, transparency data, depth data, and other commonly known 2D or 3D graphical data.
  • such data may be represented by a plurality of numbers (e.g., thirty-two bit words) such as, for example, forty-eight numbers.
  • each of the vertex locations in the input buffer preferably includes an array of forty-eight thirty-two bit words.
  • the geometry accelerator 202 also includes a geometry accelerator processor 330 ("processor") for executing geometry and other known math functions on the vertex data.
  • the math functions may include multiplication operations, addition operations, and reciprocal operations.
  • the processor 330 produces processed vertex data that is stored in an output buffer 340 that also is a part of the geometry accelerator 202.
  • data in the output buffer 340 is retrieved by the processor for further processing such as, for example, clipping operations.
  • the output buffer 340 preferably includes four vertex locations for storing all of the necessary vertex data for the vertices being processed. Accordingly, the output buffer 340 may include four vertex locations for storing all of the necessary vertex data. For some graphics operations, additional vertex locations may be necessary. These may be called “reserved vertex locations.” In preferred embodiments, each of the vertex locations in the output buffer 340 includes an array of thirty-two separate thirty-two bit words.
  • the geometry accelerator also includes a vertex handler 350 for retrieving the vertex data from the output buffer and transmitting such data to the rasterizer stage 204.
  • the vertex handler 350 is controlled by the processor 330 to retrieve the output vertex data.
  • the processor 330 signals the vertex handler when vertex data for one or more vertices has been processed and placed in the output buffer 340.
  • vertex data is partially processed and is sent to the output buffer 340 temporarily.
  • the partially processed vertex data is then retrieved by the processor 330 for further processing before rasterization. In such cases, the processor 330 signals the vertex handler 350 not to retrieve vertex data from the output buffer 340 until the processing is completed.
  • the vertex handler 350 also signals the processor 330 when the vertex handler 350 cannot retrieve vertex data from the output buffer 340, for example, when the rasterizer 204 cannot handle additional rasterization requests
  • two bits of the address are required to select which of the four individual vertex locations to access within the input buffer 320 and the output buffer 340.
  • two separate pointers are maintained for addressing the input buffer 320 and the output buffer 340. The addressing (i.e., indexing) of the input and output buffers is the same so that an input buffer vertex location and corresponding output buffer vertex location correspond to the same vertex. Keeping the two buffers in sync in this manner simplifies buffer addressing.
  • a specialized arithmetic logic unit (“ALU") support is provided in the form of increment and decrement instructions to traverse the input and output vertex buffers using modulo arithmetic. If an increment operation (i.e., an operation that adds one to a given pointer) results in a value that is greater than a predetermined maximum, then a "0" value is returned. If a decrement operation (i.e., an operation that subtracts one from a given pointer) results in a value that is less than "0", then the maximum value is returned.
  • the modulo counter With four vertex locations V0, VI, V2, and V3 in each of the input and output buffers, the modulo counter is configured with a maximum value of 3. The counter counts 0, 1, 2, 3, 0, 1, 2, 3, etc. to address these locations.
  • Performance may be improved by dynamically configuring the buffer addressing. More particularly, the buffer addressing is configured by setting the maximum value on the modulo counter so that fewer than all of the vertex locations may be accessed depending on the processing required. The vertex locations which are accessed may be called "addressable vertex locations.” The maximum value on the modulo counter, therefore, may be adjusted according to the graphics primitive being processed. In some cases involving a graphics primitive composed of sub-primitives, processing may be simplified by not using all of the available vertex locations. By setting the maximum value on the modulo counter so that the number of addressable vertex locations is equal to the sub-primitive size, simplified buffer addressing can be achieved.
  • the buffer addressing can be dynamically configured based upon the sub-primitive size (e.g., 1, 2, 3, or 4). For example, consider the Lstrip in Figure 5D.
  • Lstrip can be used as an approximation of a smooth curve drawn by connecting a series of consecutive vertices with lines, where each Line is defined by two vertices. Each subsequent line shares a vertex with the previous line.
  • the sub-primitive size is two.
  • the buffer addressing could be configured to use only two of the vertex locations. Vertex data for A is stored in vertex location V0 and vertex data for B is stored in vertex location
  • vertex data for B is stored in vertex location V0
  • vertex data for C is stored in vertex location VI.
  • vertex data for C is stored in vertex location V0 and so on.
  • the vertex data for a first vertex of a line is always accessed from vertex location V0
  • the vertex data for a second vertex of the line is always accessed from vertex location VI.
  • the modulo counter thus uses a maximum value of one, and counts 0, 1, 0, 1, etc.
  • vertex data for each odd-numbered vertex in the line request is accessed from vertex location V0, and when the modulo counter counts 1 , vertex data for each even-numbered vertex in the line request is accessed from vertex location VI. This simplifies the processing code used to process the primitive, which should access vertex data for corresponding vertices of each sub-primitive from the same vertex locations each time.
  • performance is improved by utilizing all of the vertex locations.
  • performance may be improved by using a maximum value of three, so that all four vertex locations are used and the modulo counter would count 0, 1, 2, 3, 0, 1,
  • the performance may be improved by using all four vertex locations, in this case, because vertex data for vertex C is already stored in the input buffer when the first line (vertices A and B) is being processed.
  • the next line (vertices B and C) can then processed without the need for the processor 330 to wait for vertex data for vertex B and vertex C to be stored in input vertex locations.
  • strip primitives preferably are processed in this same way, except that sub- primitives may share up to two vertices.
  • Tstrip and Qstrip shown in Figure 5D In the case of a Tstrip, each new vertex in the request stream
  • every second vertex in the request stream (starting with the fourth vertex) forms a sub-quad with the previous three vertices: ABCD, CDEF, EFGH, etc.
  • This processing can be performed in the same manner as described above for consecutive vertices, but must be done every time a new sub-primitive is formed
  • the buffer addressing can be dynamically configured to one less than the number of vertices (i.e., N-1). This leaves one vertex location out of the modulo arithmetic (buffer N, or V3 in a preferred embodiments), which can be used to hold the base vertex data that is a part of each sub-primitive.
  • vertex location V3 is not an addressable vertex location, because the modulo counter cannot generate an integer that designates vertex location V3.
  • the processor can subsequently access the base vertex data as needed by directly addressing vertex location V3 without using the modulo counter. Otherwise, the two most recent vertices can be accessed by using the modulo arithmetic.
  • the output buffer includes additional reserved vertex locations necessary for certain graphics operations. These reserved vertex locations are not addressable by the modulo arithmetic. Instead, these reserved vertex locations may be addressed in a manner similar to the addressing of the vertex location holding the base vertex data.
  • FIG 3B schematically shows an alternate embodiment of the geometry accelerator stage 202 shown in Figure 2.
  • the geometry accelerator further includes a vertex assembler 310 which receives input vertex data and passes it to the input buffer.
  • the vertex assembler 310 communicates with the processor and receives an address which designates a vertex location in the input buffer 320 in which to store vertex data for a single vertex.
  • the vertex assembler 310 also signals the processor 330 when vertex data is ready to be passed to the input buffer.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Generation (AREA)

Abstract

A method and graphic accelerator that process a polygon having N vertices utilizes a single vertex buffer location for storing data for a single base vertex of the polygon while the polygon is processed. To that end, the method and graphics processor first designates one of the vertices of the polygon to be the base vertex, and then defines N-2 different triangles by each of the vertices of the polygon. Each of the N-2 triangles is defined by the base vertex and two other of the N-1 vertices. A buffer is provided for storing the vertex data while it is being processed. The buffer includes a selected number of vertex locations for storing vertex data for one vertex. The selected number is less than N. In preferred embodiments, the base vertex is stored in the first vertex location in the buffer. The base vertex may be retrieved from the first vertex location each time one of the N-2 triangles is processed.

Description

GRAPHICS PROCESSING FOR EFFICIENT POLYGON HANDLING
FIELD OF THE INVENTION
The invention generally relates to computer systems and, more particularly, the invention relates to processing graphics request data for display on a computer display device.
BACKGROUND OF THE INVENTION
Three dimensional graphics request data commonly is processed in a computer system as a plurality of polygons having vertices. Each of the vertices has associated attribute data (e.g., color, transparency, depth, etc.) that is utilized to rasterize pixels on a computer display device. Items to be displayed commonly are in the form of a "T-fan." As known by those skilled in the art, a T-fan is a polygon having more than three vertices such as, for example, five vertices. Various techniques have been utilized to process T-fan shaped items.
SUMMARY OF THE INVENTION
In accordance with one aspect of the invention, a method and graphics accelerator that processes a polygon having N vertices utilizes a single vertex buffer location for storing data for a single base vertex of the polygon while the polygon is processed. To that end, the method and graphics processor first designates one of the vertices of the polygon to be the base vertex, and then defines N-2 different triangles by each of the vertices of the polygon. Each of the N-2 triangles is defined by the base vertex and two of the N vertices other than the base vertex. A buffer is provided for storing the vertex data while the polygon is being processed. The buffer includes a selected number of vertex locations each for storing vertex data for one vertex. The selected number is less than N. In preferred embodiments, the base vertex data is stored in the first vertex location in the buffer. The base vertex data preferably is retrieved from the first vertex location each time one of the N-2 triangles is processed.
In accordance with other embodiments of the invention, the base vertex data is stored in the first vertex location until each of the N-2 triangles is processed. The first vertex location then may be utilized for storing other data after each of the triangles is processed.
In accordance with another aspect of the invention, each of the N vertices except the base vertex defines a set of vertices, while the buffer includes a set of vertex locations other than the first vertex location. The graphics accelerator and method further process each of the vertices in the set of vertices in the set of vertex locations. In some embodiments, the set of vertices is processed in the set of vertex locations in a round robin manner. In other embodiments, the set of vertices includes N-1 vertices and the set of vertex locations includes fewer than N-1 vertex locations. In such embodiments, selected vertices in the set of vertex locations are overwritten when each of the other vertex locations in the set of vertex locations is filled with vertex data for other vertices.
In accordance with still other aspects of the invention, a modulo counter is utilized to perform a round-robin method of processing and storing vertex data. In preferred embodiments, the polygon is a T-fan.
BRIEF DESCRIPTION OF THE DRAWINGS The foregoing and other objects and advantages of the invention will be appreciated more fully from the following further description thereof with reference to the accompanying drawings wherein: Figure 1 schematically shows a portion of an exemplary computer system on which preferred embodiments of the invention may be implemented.
Figure 2 schematically shows a preferred graphics accelerator that may be utilized in accordance with preferred embodiments of the invention.
Figure 3A schematically shows a preferred embodiment of a geometry accelerator stage shown in Figure 2.
Figure 3B schematically shows an alternate embodiment of a geometry accelerator stage shown in Figure 2.
Figure 4 shows a preferred header for sending data to the rasterization stage from the geometry accelerator stage in the graphics accelerator shown in Figure 2. Figure 5 A shows an exemplary T-fan.
Figure 5B shows a processed exemplary T-fan that may be processed in accordance with preferred embodiments of the invention. Figures 5C and 5D show exemplary primitives.
DESCRIPTION OF SPECIFIC EMBODIMENTS
Figure 1 shows a portion of an exemplary computer system 100 on which preferred embodiments of the invention may be implemented. More particularly, the computer system 100 includes a host processor 104 (i.e., a central processing unit) for executing application level programs and system functions, volatile host memory 102 for short term data storage (i.e., random access memory), a graphics accelerator 106 for processing graphics request code in accord with preferred embodiments of the invention (see Figure 4), a display device 108 for displaying the graphics request code processed by the accelerator 106, and a bus 110 coupling all of the prior noted elements of the system 100.
The graphics accelerator 106 preferably utilizes any well known graphics processing application program interface such as, for example, the OPENGL™ application program interface (available from Silicon Graphics, Inc. of Mountain View, California) for processing three dimensional ("3D") and two dimensional ("2D") graphical request code. In preferred embodiments, the host processor 104 executes a graphical drawing application program such as, for example, the PLANT DESIGN SYSTEM™ drawing program, available from Intergraph Corporation of Huntsville, Alabama.
Figure 2 shows several elements of the graphics accelerator 106. In preferred embodiments, the graphics accelerator 106 includes a double buffered frame buffer 200 (i.e., having a back buffer and a front buffer) for storing the processed graphics request code in accordance with the OPENGL™ interface. Among other things, the graphics accelerator 106 also preferably includes a geometry accelerator 202 for performing geometry operations that commonly are executed in graphics processing, a rasterizer 204 for rasterizing pixels on the display device 108, and a resolver 206 for storing data in the frame buffer 200 and transmitting data from the frame buffer 200 to the display device
108. As noted above, the graphics accelerator 106 preferably is adapted to process both 2D and 3D graphical data. For more information relating to preferred embodiments of the graphics accelerator 106, see, for example, copending patent application entitled "Wide Instruction Word Graphics Processor", filed on even date herewith and naming Vernon Brethour, Gary Shelton, William Lazenby, and Dale Kirkland as inventors, the disclosure of which is incorporated herein, in its entirety, by reference.
Figure 3 A schematically shows a preferred embodiment of the geometry accelerator stage 202 shown in Figure 2. The geometry accelerator stage 202 processes graphics primitives, which are two-dimensional surfaces such as points, lines, and polygons that are combined in 2D and 3D space to create 2D and 3D objects. Each endpoint (i.e., comer) of a primitive is called a vertex. Figure 5C shows some exemplary primitives including a line, a triangle, and a quadrilateral. A line has two vertices, a triangle has three vertices, and a quadrilateral has four vertices. More complex primitives such as strip primitives may be formed by using primitives as sub-elements, or sub- primitives. Figure 5D shows exemplary strip primitives, including a line strip (Lstrip), a triangle strip (Tstrip), and a quadrilateral strip (Qstrip). Lstrips are composed of sub- primitives that are Lines, Tstrips are composed of sub-primitives that are Triangles, and Qstrips are composed of sub-primitives that are Quadrilaterals. The geometry accelerator
202 processes these and other primitives. More particularly, selected elements of the geometry accelerator 202 are shown for processing a polygon having more than three vertices. The geometry accelerator 202 includes an input buffer 320 for storing the input vertex data. In preferred embodiments, the input buffer 320 includes four vertex locations for storing all of the necessary vertex data for four vertices. Among that data is color data, transparency data, depth data, and other commonly known 2D or 3D graphical data. As is known in the art, such data may be represented by a plurality of numbers (e.g., thirty-two bit words) such as, for example, forty-eight numbers. Accordingly, each of the vertex locations in the input buffer preferably includes an array of forty-eight thirty-two bit words.
The geometry accelerator 202 also includes a geometry accelerator processor 330 ("processor") for executing geometry and other known math functions on the vertex data. Among other functions, the math functions may include multiplication operations, addition operations, and reciprocal operations. The processor 330 produces processed vertex data that is stored in an output buffer 340 that also is a part of the geometry accelerator 202. In some embodiments, data in the output buffer 340 is retrieved by the processor for further processing such as, for example, clipping operations.
In a manner similar to the input buffer 320, the output buffer 340 preferably includes four vertex locations for storing all of the necessary vertex data for the vertices being processed. Accordingly, the output buffer 340 may include four vertex locations for storing all of the necessary vertex data. For some graphics operations, additional vertex locations may be necessary. These may be called "reserved vertex locations." In preferred embodiments, each of the vertex locations in the output buffer 340 includes an array of thirty-two separate thirty-two bit words.
The geometry accelerator also includes a vertex handler 350 for retrieving the vertex data from the output buffer and transmitting such data to the rasterizer stage 204. In some embodiments, the vertex handler 350 is controlled by the processor 330 to retrieve the output vertex data. The processor 330 signals the vertex handler when vertex data for one or more vertices has been processed and placed in the output buffer 340. For some graphics operations, such as clipping, vertex data is partially processed and is sent to the output buffer 340 temporarily. The partially processed vertex data is then retrieved by the processor 330 for further processing before rasterization. In such cases, the processor 330 signals the vertex handler 350 not to retrieve vertex data from the output buffer 340 until the processing is completed. In a like manner, the vertex handler 350 also signals the processor 330 when the vertex handler 350 cannot retrieve vertex data from the output buffer 340, for example, when the rasterizer 204 cannot handle additional rasterization requests In a preferred embodiment, two bits of the address are required to select which of the four individual vertex locations to access within the input buffer 320 and the output buffer 340. In addition, two separate pointers are maintained for addressing the input buffer 320 and the output buffer 340. The addressing (i.e., indexing) of the input and output buffers is the same so that an input buffer vertex location and corresponding output buffer vertex location correspond to the same vertex. Keeping the two buffers in sync in this manner simplifies buffer addressing. In preferred embodiments, a specialized arithmetic logic unit ("ALU") support is provided in the form of increment and decrement instructions to traverse the input and output vertex buffers using modulo arithmetic. If an increment operation (i.e., an operation that adds one to a given pointer) results in a value that is greater than a predetermined maximum, then a "0" value is returned. If a decrement operation (i.e., an operation that subtracts one from a given pointer) results in a value that is less than "0", then the maximum value is returned. With four vertex locations V0, VI, V2, and V3 in each of the input and output buffers, the modulo counter is configured with a maximum value of 3. The counter counts 0, 1, 2, 3, 0, 1, 2, 3, etc. to address these locations.
Performance may be improved by dynamically configuring the buffer addressing. More particularly, the buffer addressing is configured by setting the maximum value on the modulo counter so that fewer than all of the vertex locations may be accessed depending on the processing required. The vertex locations which are accessed may be called "addressable vertex locations." The maximum value on the modulo counter, therefore, may be adjusted according to the graphics primitive being processed. In some cases involving a graphics primitive composed of sub-primitives, processing may be simplified by not using all of the available vertex locations. By setting the maximum value on the modulo counter so that the number of addressable vertex locations is equal to the sub-primitive size, simplified buffer addressing can be achieved.
When processing primitives having sub-primitives that are comprised of consecutive vertices, the buffer addressing can be dynamically configured based upon the sub-primitive size (e.g., 1, 2, 3, or 4). For example, consider the Lstrip in Figure 5D. The
Lstrip can be used as an approximation of a smooth curve drawn by connecting a series of consecutive vertices with lines, where each Line is defined by two vertices. Each subsequent line shares a vertex with the previous line. Here, the sub-primitive size is two. The buffer addressing could be configured to use only two of the vertex locations. Vertex data for A is stored in vertex location V0 and vertex data for B is stored in vertex location
VI when processing the first line. When the next line is processed, vertex data for B is stored in vertex location V0, and vertex data for C is stored in vertex location VI. When the next line is processed, vertex data for C is stored in vertex location V0 and so on. The vertex data for a first vertex of a line is always accessed from vertex location V0, and the vertex data for a second vertex of the line is always accessed from vertex location VI. The modulo counter thus uses a maximum value of one, and counts 0, 1, 0, 1, etc. When the modulo counter counts 0, vertex data for each odd-numbered vertex in the line request is accessed from vertex location V0, and when the modulo counter counts 1 , vertex data for each even-numbered vertex in the line request is accessed from vertex location VI. This simplifies the processing code used to process the primitive, which should access vertex data for corresponding vertices of each sub-primitive from the same vertex locations each time.
In other cases, performance is improved by utilizing all of the vertex locations. In the Lstrip example, performance may be improved by using a maximum value of three, so that all four vertex locations are used and the modulo counter would count 0, 1, 2, 3, 0, 1,
2, 3, etc. The performance may be improved by using all four vertex locations, in this case, because vertex data for vertex C is already stored in the input buffer when the first line (vertices A and B) is being processed. The next line (vertices B and C) can then processed without the need for the processor 330 to wait for vertex data for vertex B and vertex C to be stored in input vertex locations.
Other strip primitives preferably are processed in this same way, except that sub- primitives may share up to two vertices. Consider, for example, the Tstrip and the Qstrip shown in Figure 5D. In the case of a Tstrip, each new vertex in the request stream
(starting with the third vertex) forms a sub-triangle with the previous two vertices: ABC, BCD, CDE, etc. . . . In the case of a Qstrip, every second vertex in the request stream (starting with the fourth vertex) forms a sub-quad with the previous three vertices: ABCD, CDEF, EFGH, etc. This processing can be performed in the same manner as described above for consecutive vertices, but must be done every time a new sub-primitive is formed
(i.e., every vertex for a Tstrip, every other vertex for a Qstrip).
In processing primitives such as triangle fans and polygons that require a base vertex, the buffer addressing can be dynamically configured to one less than the number of vertices (i.e., N-1). This leaves one vertex location out of the modulo arithmetic (buffer N, or V3 in a preferred embodiments), which can be used to hold the base vertex data that is a part of each sub-primitive. In OPENGL™ triangle fan and polygon requests, the buffer pointers are set to address vertex location V3, and the first (base) vertex data is then read into vertex location V3 of the input buffer. Using the increment operation with a maximum value of two to advance to the next location results in 3+1=4, which is greater than the maximum of two. V0 thus becomes the next vertex location. In this manner, vertex location V3 is not an addressable vertex location, because the modulo counter cannot generate an integer that designates vertex location V3. The processor can subsequently access the base vertex data as needed by directly addressing vertex location V3 without using the modulo counter. Otherwise, the two most recent vertices can be accessed by using the modulo arithmetic.
In another embodiment, the output buffer includes additional reserved vertex locations necessary for certain graphics operations. These reserved vertex locations are not addressable by the modulo arithmetic. Instead, these reserved vertex locations may be addressed in a manner similar to the addressing of the vertex location holding the base vertex data.
Figure 3B schematically shows an alternate embodiment of the geometry accelerator stage 202 shown in Figure 2. The geometry accelerator further includes a vertex assembler 310 which receives input vertex data and passes it to the input buffer. The vertex assembler 310 communicates with the processor and receives an address which designates a vertex location in the input buffer 320 in which to store vertex data for a single vertex. The vertex assembler 310 also signals the processor 330 when vertex data is ready to be passed to the input buffer. Although various exemplary embodiments of the invention have been disclosed, it should be apparent to those skilled in the art that various changes and modifications can be made which will achieve some of the advantages of the invention without departing from the true scope of the invention. These and other obvious modifications are intended to be covered by the appended claims.

Claims

We claim:
1. A method of processing a polygon having N vertices, where N is at least four, wherein one of the vertices is designated as the base vertex and the vertices other than the base vertex comprise a set of non-base vertices, and N-2 different triangles are defined by the base vertex and two non-base vertices, the method comprising:
A. providing a buffer for storing vertex data, the buffer including a selected number of vertex locations each for storing vertex data for one vertex; and
B. storing the base vertex data in a first vertex location in the buffer, the base vertex data being retrieved from the first vertex location each time one of the N-2 triangles is processed.
2. The method as defined by claim 1 wherein the base vertex data is stored in the first vertex location until each of the N-2 triangles is processed.
3. The method as defined by claim 1 wherein the buffer includes a set of other vertex locations other than the first vertex location, the method further comprising:
C. processing the vertices in the set of non-base vertices in the set of other vertex locations.
4. The method as defined by claim 3 wherein the set of non-base vertices is processed in the set of other vertex locations in a round robin manner.
5. The method as defined by claim 3 wherein the set of non-base vertices includes N-1 vertices and wherein the set of other vertex locations includes fewer than N-1 vertex locations, step C comprising:
C 1. overwriting vertex data for selected vertices in the set of other vertex locations when each of the vertex locations in the set of other vertex locations is filled with vertex data for other vertices.
6. The method as defined by claim 3 wherein a modulo counter generates integers appropriate for selecting vertex locations storing vertex data.
7. The method as defined by claim 1 wherein the polygon is a T-Fan.
8. A graphics accelerator for processing a polygon having N vertices, the graphics accelerator comprising: means for designating one of the vertices to be the base vertex and the vertices other than the base vertex as a set of non-base vertices; means for defining N-2 different triangles, each triangle being defined by the base vertex and two non-base vertices; and a buffer for storing vertex data, the buffer including a selected number of vertex locations each for storing vertex data for one vertex, the select number being less than N, the base vertex data being stored in a first vertex location in the buffer, the base vertex data being retrieved from the first vertex location each time one of the N-2 triangles is processed.
9. The graphics accelerator as defined by claim 8 wherein the base vertex data is stored in the first vertex location until each of the N-2 triangles is processed.
10. The graphics accelerator as defined by claim 8 wherein the buffer includes a set of other vertex locations other than the first vertex location, the graphics accelerator further comprising: a vertex processor that processes each of the vertices in the set of non-base vertices in the set of other vertex locations.
11. The graphics accelerator as defined by claim 10 wherein the set of non-base vertices is processed in the set of other vertex locations in a round robin manner.
12. The graphics accelerator as defined by claim 10 wherein the set of non-base vertices includes N-1 vertices and wherein the set of other vertex locations includes fewer than N-1 vertex locations, the graphics accelerator further comprising: means for overwriting selected vertices in the set of other vertex locations when each of the vertex locations in the set of other vertex locations is filled with vertex data for other vertices.
13. The graphics accelerator as defined by claim 10 further comprising a modulo counter for incrementing use of successive vertex locations in the set of other vertex locations.
14. The graphics accelerator as defined by claim 8 wherein the polygon is a T-Fan.
15. An apparatus for processing computer graphics requests, the apparatus comprising: a set of input vertex registers for storing input vertex data, the set having at least one reserved register and at least one addressable register; and a processor, coupled to the input vertex registers, having a clock capable of a single clock modulo increment operation yielding an integer appropriate for selecting at least one of the addressable registers, the processor actively accessing at least one of the input vertex registers that is either an addressable register designated by an integer generated by the single clock modulo increment operation or that is a reserved register.
16. An apparatus for processing computer graphics requests according to Claim 15, further comprising: a set of output vertex registers for storing processed output vertex data, at least one of the output vertex registers being actively accessed by the processor
17. An apparatus for processing computer graphics requests according to Claim 16, wherein for a number of input vertex registers there are at least an equivalent number of output vertex registers.
18. An apparatus for processing computer graphics requests according to Claim 17, wherein each input vertex register has an associated output vertex register, and vertex data which is removed from an input vertex register for processing is placed in the associated output vertex register after processing.
19. An apparatus for processing computer graphics requests according to Claim 16 further comprising: a vertex assembler capable of receiving the computer graphics requests in the form of input vertex data and passing the input vertex data to the input vertex registers.
20. An apparatus for processing computer graphics requests according to Claim 19, wherein the vertex assembler communicates with the processor and the processor designates an input vertex register in which to store input vertex data for a single vertex.
21. An apparatus for processing computer graphics requests according to Claim 15 further comprising: a vertex assembler capable of receiving the computer graphics requests in the form of input vertex data and passing the input vertex data to the input vertex registers.
22. An apparatus for processing computer graphics requests according to Claim 21, wherein the vertex assembler communicates with the processor and the processor designates an input vertex register in which to store input vertex data for a single vertex.
93249
PCT/US1999/016080 1998-07-17 1999-07-15 Graphics processing for efficient polygon handling Ceased WO2000004502A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US9318398P 1998-07-17 1998-07-17
US60/093,183 1998-07-17

Publications (1)

Publication Number Publication Date
WO2000004502A1 true WO2000004502A1 (en) 2000-01-27

Family

ID=22237616

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/016080 Ceased WO2000004502A1 (en) 1998-07-17 1999-07-15 Graphics processing for efficient polygon handling

Country Status (1)

Country Link
WO (1) WO2000004502A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8976188B1 (en) 2012-04-20 2015-03-10 Google Inc. Optimized data communication system and method for an image rendering system
US9721363B2 (en) 2014-05-19 2017-08-01 Google Inc. Encoding polygon data for fast retrieval and rendering

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0448286A2 (en) * 1990-03-16 1991-09-25 Hewlett-Packard Company Methods and apparatus for generating arbitrarily addressed, arbitrarily shaped tiles in computer graphics systems
EP0627682A1 (en) * 1993-06-04 1994-12-07 Sun Microsystems, Inc. Floating-point processor for a high performance three dimensional graphics accelerator

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0448286A2 (en) * 1990-03-16 1991-09-25 Hewlett-Packard Company Methods and apparatus for generating arbitrarily addressed, arbitrarily shaped tiles in computer graphics systems
EP0627682A1 (en) * 1993-06-04 1994-12-07 Sun Microsystems, Inc. Floating-point processor for a high performance three dimensional graphics accelerator

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8976188B1 (en) 2012-04-20 2015-03-10 Google Inc. Optimized data communication system and method for an image rendering system
US9721363B2 (en) 2014-05-19 2017-08-01 Google Inc. Encoding polygon data for fast retrieval and rendering

Similar Documents

Publication Publication Date Title
US6552723B1 (en) System, apparatus and method for spatially sorting image data in a three-dimensional graphics pipeline
US4679041A (en) High speed Z-buffer with dynamic random access memory
EP1066600B1 (en) Block- and band-oriented traversal in three-dimensional triangle rendering
JP2637931B2 (en) Computer system for texture mapping
US6476816B1 (en) Multi-processor graphics accelerator
US7042462B2 (en) Pixel cache, 3D graphics accelerator using the same, and method therefor
US6157743A (en) Method for retrieving compressed texture data from a memory system
US6243081B1 (en) Data structure for efficient retrieval of compressed texture data from a memory system
US20030030643A1 (en) Method and apparatus for updating state data
US6738069B2 (en) Efficient graphics state management for zone rendering
US6518971B1 (en) Graphics processing system with multiple strip breakers
US5877773A (en) Multi-pass clipping in a geometry accelerator
EP0489594B1 (en) Computer graphics system
US20080150951A1 (en) 3-d rendering engine with embedded memory
WO1995030969A1 (en) Computer graphics system having high performance multiple layer z-buffer
US5878216A (en) System and method for controlling a slave processor
JPH09500462A (en) Computer graphics system with high performance multi-layer z-buffer
EP0631252B1 (en) Draw processor for a high performance three dimensional graphics accelerator
US6847369B2 (en) Optimized packing of loose data in a graphics queue
JP2004348169A (en) Dynamic adjustment of sample density and / or several rendering passes in a graphics system
US6731303B1 (en) Hardware perspective correction of pixel coordinates and texture coordinates
US7490208B1 (en) Architecture for compact multi-ported register file
WO2000004502A1 (en) Graphics processing for efficient polygon handling
US20210398241A1 (en) Techniques for performing accelerated point sampling in a texture processing pipeline
US5943066A (en) Programmable retargeter method and apparatus

Legal Events

Date Code Title Description
AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
122 Ep: pct application non-entry in european phase