
WO2008053597A1 - Device for accelerating vertex cache processing of extended primitives - Google Patents

Device for accelerating vertex cache processing of extended primitives

Info

Publication number
WO2008053597A1
Authority
WO
WIPO (PCT)
Prior art keywords
vertex
primitive
size
extended
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2007/001196
Other languages
English (en)
Japanese (ja)
Inventor
Kozakov Maxim
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Digital Media Professionals Inc
Original Assignee
Digital Media Professionals Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Digital Media Professionals Inc
Priority to JP2008541994A (patent JP4913823B2)
Publication of WO2008053597A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00: 3D [Three Dimensional] image rendering
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00: General purpose image data processing
    • G06T1/60: Memory management

Definitions

  • The present invention relates to the field of three-dimensional computer graphics. More specifically, it relates to a method and a system for processing in hardware, at high speed, geometric primitives that have many vertices, referred to in this specification as extended primitives.
  • Another set of algorithms requires access to a more complex part of the processed shape than just a single triangle of the approximation at a time.
  • For example, access to a limited set of neighboring vertices of a triangle is necessary to detect silhouette edges, to compute the curvature at mesh vertices, and so on.
  • FIG. 1 shows a portion of a typical 3D graphics hardware asset pipeline.
  • The list of primitives is described by an index buffer 1100 and a vertex buffer 1200.
  • These buffers are usually stored in the host machine memory 1000. The contents of the index and vertex buffers are illustrated in FIG. 2A and FIG. 2B.
  • A set of triangles 101, possibly sharing vertices, is represented by a set of vertex data packed successively into the vertex buffer 1200 and by the index buffer 1100.
  • Each group of three consecutive positions in the index buffer 1100 defines one triangle of the set 101.
  • If the majority of the vertices in the vertex buffer 1200 are reused, the triangle set representation can be quite compact, because vertex data (marked with a hatched fill pattern) usually requires much more storage space than an index in the index buffer. Referring to FIG. 2B, the representation is even more concise in the case of a triangle strip 102. In this case, each vertex, taken together with the previous two vertices referenced in the index buffer, forms a triangle. As a result, only one index is needed to define each further triangle after the first has been processed.
  • The contents of the index buffer in this case describe the triangle set more efficiently in terms of the number of indices per triangle. In the same manner, a set of line segments, defined by two vertices per segment, and a set of points, defined by one vertex per point, can be described using an index buffer and a vertex buffer.
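  • The difference between the two index buffer layouts described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and winding order (which real hardware alternates for strips) is deliberately ignored.

```python
def triangles_from_list(indices):
    """Triangle list: each group of three consecutive indices is one triangle."""
    return [tuple(indices[i:i + 3]) for i in range(0, len(indices) - 2, 3)]


def triangles_from_strip(indices):
    """Triangle strip: each index after the first two forms a triangle
    with the two indices that precede it (winding order is ignored here)."""
    return [tuple(indices[i:i + 3]) for i in range(len(indices) - 2)]


# Two triangles sharing an edge, described both ways.
list_indices = [0, 1, 2, 2, 1, 3]   # 3 indices per triangle
strip_indices = [0, 1, 2, 3]        # 2 indices, then 1 per triangle

tri_list = triangles_from_list(list_indices)
tri_strip = triangles_from_strip(strip_indices)
print(tri_list)
print(tri_strip)
```

  • For n triangles the list layout needs 3n indices while the strip layout needs n + 2, which is the compactness gain the text refers to.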
  • A vertex cache device is used to accelerate the processing of a set of primitives defined using index and vertex buffers.
  • The contents of the index buffer 1100 are used not only to fetch data from the vertex buffer 1200, but also to detect whether a vertex with the same index has been processed recently and may still be available in the cache.
  • The vertex cache controller 2000 fetches the contents of the index buffer 1100 and analyzes them. Initially the cache is empty, so the vertex cache controller initiates delivery of the vertex buffer contents containing the vertex data for the indices obtained from the index buffer 1100 to the first vertex cache 3000.
  • To minimize potential memory access penalties, data is fetched from the vertex buffer in relatively large contiguous memory blocks, which may contain vertex data for indices that the vertex cache controller is not currently processing. Such data is nevertheless retained in the first vertex cache 3000 because it may be used by later indices. If the vertex data for an index is already in the first vertex cache 3000, no host memory access is performed.
  • Vertex positions may need to be converted from one coordinate system to another.
  • Vertex color is calculated based on the normal, the position, and so on.
  • The vertex cache controller 2000 controls the delivery of vertex data from the first vertex cache to the vertex processor 4000.
  • The converted vertex data is sent to the secondary vertex cache 5000.
  • If an index is present in the secondary vertex cache 5000, the already transformed vertex data is delivered to the primitive assembly 6000 without any further overhead.
  • Patent Document 1: WO03/081528 pamphlet
  • Patent Document 2: US Patent Application Specification 2005/0012750
  • The present invention relates to algorithms used in three-dimensional computer graphics.
  • Its purpose is to solve the problems of on-chip processing of extended geometric primitives, such as subdivision surface patches, NURBS patches, and triangles with adjacency, which such algorithms use as input information.
  • Such algorithms include subdivision schemes such as Catmull-Clark, Loop, and √3 subdivision, NURBS surface tessellation, silhouette detection, and various other schemes known in computer graphics that require more vertices than the three that make up a triangle, the simplest geometric primitive.
  • The size of primitives in current 3D computer graphics is fixed. For example, a triangle has three vertices, a straight line has two vertices, and a point has one vertex.
  • In contrast, the size of an extended primitive can be arbitrary, especially for extended primitives used in the subdivision of surface patches. To support extended primitives of any reasonable size, the maximum number of vertices must be accommodated on the chip.
  • The next problem is the difficulty of random memory access, especially when processing extended primitives.
  • The vertex data composing an extended primitive can be scattered throughout the storage unit, just as with simple primitives.
  • The problem here is that the number of vertices in an extended primitive is several times larger than usual, so many random memory accesses are needed to fetch the corresponding vertex data. Random access in graphics devices causes serious performance penalties if it is not properly cached, and it is normally confined to the vertex cache and to texture sampling units.
  • The present invention assigns an index to each vertex, stores the indices in the index buffer, and expresses the vertices of a primitive using the stored indices.
  • The vertex information associated with each index is then read from the vertex buffer and used for primitive processing.
  • Ordinary primitives such as triangles and quadrilaterals are used as before, but the present invention additionally uses primitives that have four or more vertices, more than usual.
  • Such primitives, with a variable and larger-than-usual number of vertices, are called variable-size extended primitives.
  • When vertex information about a primitive is input, it is determined whether the primitive is an ordinary primitive or a variable-size extended primitive.
  • The system of the present invention includes a vertex engine that receives various information from a computer and transforms vertex-related information, and a primitive engine that receives the transformed vertex information from the vertex engine and assembles primitives.
  • The primitives assembled by the primitive engine are rasterized by the rasterizer, stored as 3D computer graphics in a frame buffer, and rendered on the monitor.
  • Vertex transformation means arithmetic processing of vertices, such as view transformation. Regardless of whether the primitive is a point, a line, a triangle, or a polygon with four or more vertices, the necessary arithmetic processing is performed on the vertex information.
  • Primitive assembly means that the transformed individual vertices are assembled into a primitive. After primitive assembly, processing is performed individually for each primitive.
  • A primitive may have a variable size. If the primitive size is fixed at 3, as when the primitives are triangles, the index table is simply read in groups of 3.
  • In that case the device only needs to remember the number 3, and the size does not need to be described in the table.
  • If, however, the primitive size can change from primitive to primitive, it is difficult to know where the next primitive starts. Therefore, in the present invention, the size is preferably always described so that it is clear which of the following elements constitute the primitive.
  • The primitive size is described in the index table.
  • For example, a triangle consists of the vertices v1, v2, v3 (where vm is (xm, ym, zm)), and the colors of the vertices are red, white, and yellow.
  • the second triangle adjacent to the first triangle is
  • The vertices other than the first vertex become two vertices of the next triangle, and the next triangle is expressed by adding one further vertex.
  • This method can significantly reduce memory costs. However, it is still necessary to describe at least n + 2 vertices to draw n triangles.
  • The indices are stored so that v1, v2, v3 form one triangle and v2, v3, v4 form the next triangle.
  • An index table is used, through which the vertex engine can read vertex data from the vertex buffer.
  • The width of the index table depends on the primitive and is 3 if the primitive is a triangle. In this method, since one vertex is shared by multiple triangles, it is necessary to repeatedly deliver information about one vertex. There is usually a single index table, which is sufficient for most situations. In fact, the OpenGL ES interface has only one index table. However, some interfaces have independent index tables for vertex attributes such as XYZ coordinates (position attribute), colors (vertex color attribute), and texture coordinates (texture attribute). A famous example of such an interface is Direct3D.
  • The gist of the present invention is to use the vertex cache facilities of a suitable graphics accelerator to enable processing of extended geometric primitives.
  • This is achieved by using a device added to the conventional technology, a primitive engine for assembling and processing the extended primitives proposed in the present invention, in combination with an extended index/vertex buffer representation of the polygon mesh used to express the extended primitives.
  • The extended primitives are expressed on the basis of the extended index/vertex buffer representation of the polygon mesh.
  • There are 3D graphics libraries such as Direct3D and OpenGL, and they provide a method for quickly processing polygon meshes using such hardware.
  • The vertex buffer stores the vertex attributes of each vertex of the polygon mesh to be processed, while the index buffer contains the mesh connectivity information.
  • The index buffer describes an array of polygons of the same size as a sequence of indices, each index being the number of a vertex in the vertex buffer.
  • The index sequence belonging to a polygon describes, by reference to the vertex data, the vertex sequence of the polygon, and thereby the sequence of polygon edges formed by connecting consecutive vertices of the polygon.
  • A sequence of extended primitives of a fixed size can use the index buffer in the same way.
  • a buffer is used.
  • To represent fixed-size extended primitives of different sizes and variable-size extended primitives, it is sufficient to specify the primitive size, that is, the number of vertices, as an additional entry stored only in the index buffer, followed by the vertex list given through indices.
  • The interpretation of the size, the vertex list, and the particular type of primitive is performed by the primitive processing algorithm itself, so the representation does not depend on the primitive size. In this respect it is the same as the representation of simple primitives. From the point of view of the graphics library, the representation is the same as for simple primitives: even though the primitive sequence is extended compared with a simple primitive sequence, no significant API change is required, and the required memory can be reduced.
  • A combination of a vertex cache and a primitive engine is used in order to process extended primitives quickly in hardware.
  • The vertex cache allows low-latency access to vertices that have recently been processed.
  • The vertex index can be used as a cache tag. Therefore, if a requested vertex index matches one in the cache, the vertex stored in the cache is used for further processing.
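  • As a rough software model of this idea, the sketch below tags cache entries with the vertex index and only calls a fetch function on a miss. The class name, the FIFO replacement policy, and the capacity are assumptions made for illustration; the source does not specify them.

```python
from collections import OrderedDict


class VertexCache:
    """Toy vertex cache tagged by vertex index (FIFO replacement is an assumption)."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch            # called only on a miss, e.g. a host memory read
        self.entries = OrderedDict()  # index (tag) -> vertex data
        self.hits = 0
        self.misses = 0

    def get(self, index):
        if index in self.entries:     # tag match: reuse the cached vertex
            self.hits += 1
            return self.entries[index]
        self.misses += 1
        data = self.fetch(index)
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry
        self.entries[index] = data
        return data


# Hypothetical vertex buffer: eight positions.
vertex_buffer = [(float(i), 0.0, 0.0) for i in range(8)]
cache = VertexCache(capacity=6, fetch=lambda i: vertex_buffer[i])

# Index sequence of two six-vertex extended primitives that share vertices.
for idx in [2, 1, 0, 3, 4, 5, 3, 1, 2, 5, 6, 7]:
    cache.get(idx)
print(cache.hits, cache.misses)
```

  • In this run four of the twelve lookups are served from the cache, which is the kind of reuse that makes large primitives with shared vertices affordable.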
  • The present invention preferably uses the vertex cache for arithmetic processing of extended primitives.
  • In this way, extended primitives, like simple primitives, have low-latency access to recently processed vertices. This eliminates the performance degradation caused by the many random accesses required when fetching the vertex data of an extended primitive.
  • Since the same vertex cache hardware is used for processing simple primitives and extended primitives, the hardware size can be reduced. Also, since the vertex cache is used, there is no upper limit on the size of an extended primitive.
  • The vertex cache is used for vertices in fixed-size extended primitives as well as vertices in variable-size extended primitives.
  • Extended primitive assembly and processing are performed by the primitive engine, a module added to the conventional hardware.
  • This module receives as input, from the vertex cache, the transformed vertex data for each vertex of an extended primitive. In addition, it receives size information about variable-size extended primitives from the cache controller.
  • The primitive engine executes an extended primitive processing algorithm.
  • The algorithm interprets the vertex sequence of the extended primitive and outputs the computed result as a simple primitive sequence.
  • The first advantage is versatility in processing extended primitives. If the primitive engine is programmable, any extended primitive can be implemented using the extended index buffer and vertex buffer representation proposed by the present invention. The second is quick access to vertex data. Since the primitive engine is directly connected to the vertex cache, the latency of acquiring vertex data for assembling and processing extended primitives is greatly reduced. Reuse of the processing pipeline and on-chip processing of extended primitives also contribute to this.
  • Since the result of processing an extended primitive is a simple primitive sequence, which is directly supported by the rest of the processing pipeline, the extended primitive sequence is converted directly into a simple primitive sequence without returning the result to the host computer. This solves the problem of performing the processing entirely on the chip.
  • By using the extended index/vertex buffer representation, a minor modification to the vertex cache logic that handles simple primitives, and the primitive engine, it becomes possible to perform on the chip processing that does not exist in current 3D graphics hardware, such as subdivision surface rendering and NURBS tessellation.
  • FIG. 1 shows the conventional hardware architecture of a vertex cache device.
  • FIG. 2A shows the index/vertex buffer layout for representing a sequence of triangles.
  • FIG. 2B shows the index/vertex buffer layout for representing a triangle strip sequence.
  • FIG. 3 shows the vertex cache architecture of the present invention.
  • FIG. 4A shows fixed-size extended primitives, each a triangle and its neighborhood.
  • FIG. 4B shows a strip sequence of fixed-size extended primitives, each a triangle and its neighborhood.
  • FIG. 4C shows the index/vertex buffer layout of fixed-size extended primitives, each a triangle and its neighborhood.
  • FIG. 4D shows the index/vertex buffer layout of a strip sequence of fixed-size extended primitives, each a triangle and its neighborhood.
  • FIG. 5A shows a fan sequence of fixed-size extended primitives, each a triangle and its neighborhood.
  • FIG. 5B shows the structure of the edge-based flap silhouette.
  • FIG. 5C shows the index/vertex buffer layout of a fan sequence of fixed-size extended primitives, each a triangle and its neighborhood.
  • FIG. 6A shows a variable-size extended primitive.
  • FIG. 6B shows a Catmull-Clark subdivision patch as a variable-size extended primitive.
  • FIG. 6C shows the index/vertex buffer layout of Catmull-Clark subdivision patches as variable-size extended primitives.
  • FIG. 7 shows the communication paths between the vertex cache control unit, the primitive engine, and the fixed-size primitive integrated circuit introduced by the present invention.
  • FIG. 8B shows the rendering results when silhouette detection and visualization are performed.
  • FIG. 9A shows the rendering results of the wireframe shape without subdivision.
  • FIG. 9B shows the rendering results of the wireframe shape when subdivision is performed.
  • FIG. 10A shows the wireframe rendering result without subdivision; the box marks the coarse elements of the mesh.
  • FIG. 10B shows the wireframe rendering result when subdivision is performed; the box marks the coarse mesh elements that are smoothed during subdivision.
  • FIG. 11A shows the rendered image without subdivision.
  • FIG. 11B shows the rendered image when subdivision is performed.
  • The present invention basically relates to on-chip processing of complex geometric primitives (also called extended primitives) formed by a fixed or variable number of vertices.
  • The first aspect of the present invention is a method that describes simple primitives and also describes variable-size extended primitive sequences using four or more vertex data for each variable-size extended primitive.
  • It is a method for three-dimensional computer graphics that can store a vertex group including a plurality of attributes.
  • The method uses a vertex buffer in which the position of an attribute in buffer memory is obtained by multiplying the index, which is the vertex number in the vertex sequence, by an integer and offsetting the result by a number that depends on the attribute type. Using the index and the number indicating the attribute type, the memory position of a vertex attribute in the vertex buffer is specified, and the sequence of vertex position attribute values in the vertex buffer is specified in relation to the vertex sequence.
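  • The address computation described above (index times an integer, plus an offset selected by the attribute type) might look like the following sketch. The interleaved layout, the attribute names, and the byte sizes are assumptions chosen for illustration and are not taken from the source.

```python
# Hypothetical interleaved layout: attribute name -> (byte offset, byte size).
ATTRIBUTE_LAYOUT = {
    "position": (0, 12),   # three 4-byte floats
    "color": (12, 4),      # four 1-byte integers
    "texcoord": (16, 8),   # two 4-byte floats
}
VERTEX_STRIDE = 24         # bytes per vertex: the integer the index is multiplied by


def attribute_address(base, index, attribute):
    """Byte address of one attribute of the vertex with the given index."""
    offset, _size = ATTRIBUTE_LAYOUT[attribute]
    return base + index * VERTEX_STRIDE + offset


print(attribute_address(0, 3, "color"))        # 3 * 24 + 12
print(attribute_address(1000, 0, "texcoord"))  # 1000 + 16
```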
  • The size of each variable-size extended primitive is stored as an index entry. A fixed-size primitive sequence can be reconstructed from the vertex sequence alone, whereas a variable-size extended primitive sequence is stored together with its sizes.
  • Extended primitives may contain four or more vertices.
  • The number of vertices is, for example, 4, 5, 6, 7, 8, 9, or 10.
  • In the above-described method, a fixed-size primitive sequence can be reconstructed from the vertex sequence, and a variable-size extended primitive sequence can be reconstructed from the vertex sequence together with the primitive sizes.
  • The method can thus describe any variable-size extended primitive that can be reconstructed from its size and the vertex sequence that forms it.
  • The method may further include a step of specifying multiple index buffers for the remaining vertex attributes, so that each attribute is addressed by its own index. Thus, if separate index buffers are required, all required vertex attributes can be specified for the vertices of extended primitives, such as neighboring points.
  • The steps of the method for specifying an extended primitive sequence introduced by the present invention have the following merits.
  • Various types of extended primitives can be specified, for example, types that can be reconstructed from fixed-size or variable-size vertex sequences. The representation is compact because vertices shared by the primitives of a sequence are referenced through compact indices. The same approach is used to specify triangle and quadrilateral meshes.
  • The method extends the index/vertex buffer representation of simple primitive sequences to describe fixed-size or variable-size extended primitive sequences.
  • Simple primitives refer to basic shapes used in computer graphics, such as triangles, rectangles, lines, and points.
  • Processing such primitive sequences is usually the main task of 3D graphics libraries such as OpenGL and Direct3D.
  • The extended primitive means a geometric primitive formed by a fixed-length or variable-length sequence of four or more vertices.
  • An index/vertex buffer representation of simple primitives means that a simple primitive sequence is represented as the sequence of vertices constituting the primitives.
  • Vertex buffer means a storage device for vertex attributes used in computer graphics.
  • The vertex attribute means an attribute associated with a point in a four-dimensional homogeneous space used as a vertex of a primitive.
  • Preferred examples of vertex attributes include position in space, color, texture coordinates, normal vector, tangent vector, and so on. Attributes can have various dimensions, such as scalars, two-component vectors, and three-component vectors, and various value types, such as 1-byte integer, 2-byte integer, 4-byte integer, and 4-byte floating point.
  • Each attribute instance requires a fixed amount of memory determined by its dimension and value type.
  • The step of specifying the vertex buffer includes placing attribute arrays in memory in such a way that all values of the same attribute type are laid out in the same manner. Therefore, the position of an attribute value in the vertex buffer can easily be recovered from its position in the array and the layout associated with the attribute type.
  • Attribute values can be referred to by their integer position, or index, in the array. If all attribute arrays have the same size and the index of each vertex simply increases with the number of the vertex in the sequence, a vertex sequence, and therefore a simple primitive sequence, can be specified without using an index buffer.
  • This is called the vertex buffer-only representation.
  • An index buffer is an array in memory that contains the sequence of integer index values to be specified. If each vertex of a vertex sequence uses a single index for all of its associated vertex attributes, specifying one index buffer is enough to fully describe the vertex sequence for all vertex attributes. Conversely, there may be cases where each vertex attribute type has its own index for a vertex of the vertex sequence. In such cases, the number of index buffers to specify equals the number of attribute types. Combining a vertex buffer with an index buffer that holds all the required index arrays forms the index/vertex buffer representation of a simple primitive sequence.
  • The step of specifying the vertex buffer specifies the vertex attributes associated with each vertex, in the same way as described above for the index buffer and/or vertex buffer of a simple primitive sequence.
  • The step of specifying the index buffer specifies the indices of the vertex position attribute for the vertices that form the extended primitives.
  • This step includes specifying the primitive size as an index value in the index buffer for the vertex position. If variable-size primitives are used, so that the vertex sequence constituting each primitive has its own length, this step may include specifying an extended primitive sequence by specifying the primitive size for each primitive.
  • The present invention can be used for any extended primitive for which all necessary information is conveyed by the size of the primitive, the vertex sequence that forms the primitive, and the particular order of the vertex sequence.
  • An example of a fixed-size extended primitive is the Triangle with Neighborhood (TWN) primitive.
  • Such a primitive consists of a triangle together with the three triangles adjacent to its sides. It can be formed by examining each triangle in a continuous mesh.
  • FIG. 4A is a conceptual diagram of a triangular mesh. As shown in FIG. 4A, a fragment 103 of a triangular mesh is drawn.
  • The vertex sequence defines a mapping between positions in the sequence and the connectivity between each vertex of the primitive and the other vertices.
  • The vertex sequence for expressing the TWN primitive is {v0, v1, v01, v2, v20, v12}.
  • In the fragment, the vertices for expressing the TWN primitive are {v2, v1, v0, v3, v4, v5}.
  • Here vj is the vertex position having the attribute index j.
  • The index sequence of vertex positions specifying the TWN primitive of the central triangle {v2, v1, v3} is {2, 1, 0, 3, 4, 5}.
  • For an open edge, which has no adjacent triangle, one way is to create an artificial degenerate triangle by using a vertex of the open edge twice.
  • Another way is to use the vertex of the central triangle opposite the open edge. In this case, the neighboring triangle does not degenerate and coincides with the central triangle.
  • A fixed-size extended primitive sequence can be formed by concatenating the vertex sequences of the primitives in the sequence.
  • The index buffer for the vertex position attribute is then the sequence of vertex position indices of the vertices in the concatenated sequence.
  • For TWN primitives there are three preferred ways of representing primitive sequences: as a separate TWN primitive sequence, as a TWN strip, and as a TWN fan. They are based on a list of separate central triangles, on a strip of central triangles, and on a fan of central triangles, respectively.
  • A separate TWN primitive sequence is formed by concatenating the vertices of the TWN primitives generated by taking each triangle in turn as the central triangle.
  • FIG. 4A depicts a fragment of a TWN primitive sequence formed by the central triangles {v2, v1, v3} and {v3, v1, v5}.
  • The corresponding vertices are {v2, v1, v0, v3, v4, v5, v3, v1, v2, v5, v6, v7}, and the index sequence for the corresponding vertex position attribute is {2, 1, 0, 3, 4, 5, 3, 1, 2, 5, 6, 7}, as shown in the figure.
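  • To make the fixed-size layout concrete, the sketch below splits the index sequence of the two separate TWN primitives quoted above into groups of six. The slot assignment (central triangle in slots 0, 1 and 3, neighbor vertices in slots 2, 4 and 5) is inferred from the two central triangles {v2, v1, v3} and {v3, v1, v5} named in the text and is an assumption; the function and constant names are hypothetical.

```python
TWN_SIZE = 6                # fixed primitive size: triangle plus three neighbor vertices
CENTRAL_SLOTS = (0, 1, 3)   # inferred from the example: central triangle vertices
NEIGHBOR_SLOTS = (2, 4, 5)  # inferred from the example: neighbor vertices


def split_twn_list(indices):
    """Split a separate-TWN index sequence into (central, neighbors) pairs."""
    if len(indices) % TWN_SIZE != 0:
        raise ValueError("index count is not a multiple of the TWN size")
    primitives = []
    for start in range(0, len(indices), TWN_SIZE):
        group = indices[start:start + TWN_SIZE]
        central = tuple(group[s] for s in CENTRAL_SLOTS)
        neighbors = tuple(group[s] for s in NEIGHBOR_SLOTS)
        primitives.append((central, neighbors))
    return primitives


twn = split_twn_list([2, 1, 0, 3, 4, 5, 3, 1, 2, 5, 6, 7])
for central, neighbors in twn:
    print("central", central, "neighbors", neighbors)
```

  • Because the size is fixed, no per-primitive size entry is needed: the decoder only has to know the constant 6.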
  • A TWN strip can be constructed from a triangle strip.
  • For a triangle strip formed by the vertex sequence {v0, v1, v2, v3, v4, v5, ...}, with the vertices {v01, v02, v13, v24, v35, ...} of the triangles adjacent to the strip across the edges {v0v1, v0v2, v1v3, v2v4, v3v5, ...}, the vertex sequence that defines the TWN strip is {v0, v1, v01,
  • A mesh fragment is represented by a triangle strip 104 consisting of two triangles {v2, v1, v3} and {v3, v1, v5}.
  • This strip can be formed by the vertex sequence {v2, v1, v3, v5} of length four.
  • The vertex sequence of the two TWN primitives that define the TWN strip is {v2, v1, v0, v3, v4, v5, v7, v6}.
  • The corresponding index sequence specifying the vertex positions is {2, 1, 0, 3, 4, 5, 7, 6}.
  • TWN fans can also be formed from triangle fans.
  • A TWN fan can also reduce the size of the index buffer required to represent the TWN primitive sequence whose central triangles form the corresponding triangle fan.
  • For a triangle fan with the vertex sequence {v0, v1, v2, v3, v4, v5, ...}, the TWN fan is formed by adding the vertices {v01, v12, v23, v34, v45, ...} of the triangles adjacent to the triangles of the fan across the edges {v0v1, v1v2, v2v3, v3v4, v4v5, ...}.
  • The vertex sequence that defines the TWN fan is {v0, v1, v01, v2, v12, v3, v23, v4, v34, v5, v45, ...}.
  • The first six vertices define the first TWN primitive, and every two subsequent vertices define a further TWN primitive. Note that the order of the first six vertices differs from that of the TWN strip.
  • FIG. 5A shows a mesh fragment formed by a triangle fan 105 consisting of two triangles {v2, v1, v3} and {v2, v3, v4}.
  • This fan can be expressed by the vertex sequence {v2, v1, v3, v4} of length 4.
  • The vertex sequence of the TWN fan formed by the two TWN primitives is {v2, v1, v0, v3, v5, v4, v6, v7}.
  • The index sequence for the corresponding vertex position attribute is {2, 1, 0, 3, 5, 4, 6, 7}, as shown in FIG. 5C.
  • Variable-size extended primitives include the Catmull-Clark subdivision patch (CCSP) primitive.
  • A CCSP is composed of polygons of a quadrilateral mesh in which one or more vertices may have a different number of adjacent edges. From the viewpoint of Catmull-Clark subdivision of a quadrilateral mesh, a vertex whose number of adjacent edges differs from four is called an irregular vertex. The number of edges adjacent to a vertex is called the valence of the vertex. In other words, vertices with a valence other than 4 are considered irregular vertices from the viewpoint of Catmull-Clark subdivision.
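  • The valence definition above is easy to check in code. The sketch below counts the distinct edges meeting at each vertex of a quadrilateral mesh and reports vertices whose valence is not 4. The function names and the two tiny test meshes are illustrative assumptions; note that on an open mesh, boundary vertices are also reported, which a real implementation would probably treat separately.

```python
from collections import defaultdict


def vertex_valences(quads):
    """Valence = number of distinct edges adjacent to each vertex."""
    neighbors = defaultdict(set)
    for quad in quads:
        for k in range(4):
            a, b = quad[k], quad[(k + 1) % 4]  # one edge of the quadrilateral
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {v: len(adjacent) for v, adjacent in neighbors.items()}


def irregular_vertices(quads):
    """Vertices whose valence differs from 4."""
    return sorted(v for v, val in vertex_valences(quads).items() if val != 4)


# 2 x 2 patch of quads on a 3 x 3 vertex grid: the center vertex 4 is regular.
grid = [(0, 1, 4, 3), (1, 2, 5, 4), (3, 4, 7, 6), (4, 5, 8, 7)]
# A cube: closed quad mesh where every vertex has valence 3.
cube = [(0, 1, 2, 3), (4, 5, 6, 7), (0, 1, 5, 4),
        (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7)]

print(vertex_valences(grid)[4])
print(irregular_vertices(cube))
```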
  • a GGSP primitive is formed by vertices in a rectangle and all adjacent vertices in a polygon that shares the vertices of the rectangle.
  • the rectangle located at the center is called the center square. If there are vertex values for irregular vertices, the center rectangle differs from other rectangles in the mesh, and the number of vertices in the GGS P primitive also change around the center rectangle.
  • a CCS P primitive can be represented by a sequence of vertices that are mapped between vertices in a sequence of vertices, thereby determining the position of the CCS P primitive.
  • the vertex sequence and mapping are formed as follows.
  • The CCSP primitive has vertices {v0, v1, v2, v3}, which form the edges of the center quadrilateral (v0v1, v1v2, v2v3,
  • Vertex sequences describing CCS P primitives are formed as follows: First
  • the vertices of adjacent sides that share v 0 and belong to the quadrangle that makes up the mesh are selected.
  • The sixth is on the same side as v1 and is a vertex of the same quadrilateral as v0
  • The fragment of the control mesh (106) consists of two adjacent CCSP primitives whose center quadrilaterals are {v9, v5, v6, v10} and {v9, v10, v16, v15}.
  • Vertex v9 is an irregular vertex with a valence of 5.
  • The vertex sequences describing the first and second primitives are {v9, v5, v6, v10, v8, v4, v0, v1, v2, v3, v7, v11, v17, v16, v15, v14, v13, v12} and {v9, v10
  • In the index buffer that holds the indices of the vertex position attribute, the indices of each CCSP primitive are preceded by the size of that primitive.
  • For the two CCSP primitives shown in FIG. 6(A), the contents of the index buffer 1200 (or the vertex buffer 1100) are, as shown in FIG. 6(C), {18, 9, 5, 6, 10, 8, 4, 0, 1, 2, 3, 7, 11, 17, 16, 15, 14, 13, 12, 18, 9, 10, 16, 15, 5, 6, 7, 11, 17, 21, 20, 19, 18, 14, 13, 12, 8, 4}.
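  • As an illustration of the size-prefixed layout, the following sketch splits the position index buffer quoted above into its primitives. The function name `split_primitives` is hypothetical, and the buffer is treated as a plain Python list; it is a reading aid under those assumptions, not the circuit described in this document.

```python
def split_primitives(position_indices):
    """Split a size-prefixed position index buffer into per-primitive index lists."""
    primitives = []
    offset = 0
    while offset < len(position_indices):
        size = position_indices[offset]          # primitive size precedes its indices
        start, end = offset + 1, offset + 1 + size
        if end > len(position_indices):
            raise ValueError("primitive runs past the end of the index buffer")
        primitives.append(position_indices[start:end])
        offset = end                             # next entry is the next primitive's size
    return primitives


ccsp_buffer = [18, 9, 5, 6, 10, 8, 4, 0, 1, 2, 3, 7, 11, 17, 16, 15, 14, 13, 12,
               18, 9, 10, 16, 15, 5, 6, 7, 11, 17, 21, 20, 19, 18, 14, 13, 12, 8, 4]
patches = split_primitives(ccsp_buffer)
print(len(patches), [len(p) for p in patches])  # 2 [18, 18]
```

  • Reading the size first is what lets a consumer find the start of the next primitive without any separate table of offsets.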
  • Index buffer sequences for the remaining vertex attributes need to be specified only when different attributes of a vertex require different indices, that is, when one and the same index cannot be used to refer to all attributes of the vertex.
  • In that case a vertex is formed by a set of attribute indices, one index for each attribute of the vertex. If this is not the case, all vertex attributes can be specified by the single index corresponding to the vertex position attribute. Otherwise, a set of index buffers must be specified to specify all vertex attributes. Since the index buffer for the vertex position attribute includes the primitive size, special handling is needed for variable-size extended primitives, as described above.
  • The other index buffers carry no primitive size information, so when variable-size primitives are represented they can be shorter than the index buffer of the vertex position attribute.
  • The attribute indices of a vertex are determined as follows. For extended primitives of fixed size, the index set of the i-th vertex consists of the values at the i-th position of the index buffers of all the attributes concerned, where N is the length of the vertex sequence and i is greater than or equal to 0 and less than N. For extended primitives of variable size, the situation is more complex.
  • The index set of the i-th vertex is formed from the i-th value of each of the other index buffers and from the (i + N + 1)-th value of the index buffer of the vertex position attribute, where N is the number of primitives that precede the primitive containing the vertex.
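  • The offset between the position index buffer and the other index buffers can be made concrete with a sketch. It assumes, as the surrounding text indicates, that only the position index buffer carries one size entry in front of each primitive; the function name `vertex_index_sets` and the sample buffers are hypothetical.

```python
def vertex_index_sets(position_indices, other_buffers):
    """Return one tuple (position index, other attribute indices...) per vertex."""
    sets = []
    vertex = 0   # running vertex number i over all primitives
    offset = 0   # read position in the size-prefixed position index buffer
    while offset < len(position_indices):
        size = position_indices[offset]
        offset += 1                      # skip the size entry of this primitive
        for _ in range(size):
            others = tuple(buf[vertex] for buf in other_buffers)
            # Here offset == vertex + (number of size entries read so far).
            sets.append((position_indices[offset],) + others)
            offset += 1
            vertex += 1
    return sets


positions = [3, 7, 8, 9, 2, 4, 5]   # two primitives, of size 3 and size 2
normals = [0, 1, 2, 3, 4]           # no size entries, so two values shorter
print(vertex_index_sets(positions, [normals]))
# [(7, 0), (8, 1), (9, 2), (4, 3), (5, 4)]
```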
  • the second aspect of the present invention is a method for processing fixed-size or variable-size extended primitives at high speed using a vertex cache (vertex buffer).
  • The method includes: fetching the primitive size of a variable-size primitive from the index buffer and delivering the primitive size to a primitive engine, which is a dedicated circuit or a processor programmable to assemble and process extended primitives; fetching, transforming, and storing in the vertex cache the vertex data for the vertices of the extended primitive when that data is not already available there; and delivering the transformed vertices for assembly and processing of the extended primitive.
  • A vertex cache device means a system for achieving high-speed processing of a sequence of simple primitives specified using values from an index buffer or a vertex buffer.
  • The vertex cache device traverses (sweeps) the vertex sequence specified by the values from the index/vertex buffer, determines the attribute index set of each vertex to be processed, and determines whether attributes for the same index set are already available in the storage unit of the vertex cache device (the vertex cache store).
  • When only a vertex buffer is used, the indices may be generated sequentially.
  • A vertex assembled from its attribute values is sent for vertex transformation, and the transformation result is additionally stored in the vertex cache store.
  • The transformed vertices are delivered to the fixed-size primitive assembler to reconstruct the simple primitive sequence represented by the index/vertex buffer. When the vertex data is already in the vertex cache, vertex processing is accelerated: the transformed vertex is delivered to the fixed-size primitive assembler without sampling the vertex buffer and without transforming the vertex again.
  • the term fixed primitive assembler means a system that collects and reconstructs simple primitives from a sequence of vertices.
  • Collecting a primitive means gathering all the information needed to reconstruct it, that is, accumulating the information about all of its vertices for further processing. For separate triangles, the necessary information is the three vertices of each triangle. Similarly, for separate lines, it is the two consecutive vertices that make up the line. For a triangle strip, it is the three vertices of the first triangle and the additional vertex of each adjacent triangle.
  • The fixed primitive assembler processes the vertex sequence according to the type of primitive delivered and returns it in the form of simple primitives, so that it can be sent to a raster pipeline that only processes simple primitives.
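  • A minimal sketch of what such an assembler collects for the three cases named above is given below. The function name `assemble`, the string tags for primitive types, and the swap of the first two vertices in odd-numbered strip triangles (a common convention that keeps the winding consistent) are assumptions made for illustration.

```python
def assemble(kind, vertices):
    """Group a vertex sequence into simple primitives of the given kind."""
    if kind == "triangles":       # three vertices per separate triangle
        return [tuple(vertices[i:i + 3]) for i in range(0, len(vertices) - 2, 3)]
    if kind == "lines":           # two consecutive vertices per separate line
        return [tuple(vertices[i:i + 2]) for i in range(0, len(vertices) - 1, 2)]
    if kind == "triangle_strip":  # first triangle, then one new vertex per triangle
        out = []
        for i in range(len(vertices) - 2):
            a, b, c = vertices[i:i + 3]
            out.append((a, b, c) if i % 2 == 0 else (b, a, c))
        return out
    raise ValueError(f"unsupported primitive kind: {kind}")


print(assemble("triangle_strip", [0, 1, 2, 3]))  # [(0, 1, 2), (2, 1, 3)]
```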
  • the method according to the second aspect of the present invention extends the vertex cache device in various aspects.
  • New logic is required to fetch variable-size extended primitives and to exchange information with the primitive engine (a separate device for assembling and processing extended primitives), which operates between the vertex cache unit and the fixed primitive assembler.
  • Because the method includes fetching the primitive size from the index buffer of the vertex position attribute, it can be used to process vertex sequences of variable-size extended primitives expressed using the method according to the first aspect of the present invention.
  • The process of fetching, transforming, and storing extended primitive vertex data in the vertex cache assembles vertices from vertex attribute values, transforms them, and stores the transformed vertices in the vertex cache store so that they can be retrieved quickly when they are referenced again. This process is the same regardless of whether an extended primitive sequence or a simple primitive sequence is processed.
  • The method also involves delivering the transformed vertices to the primitive engine, processing the extended primitives in the primitive engine, and delivering the results, in the form of a simple primitive sequence, to the rest of the processing pipeline.
  • The process of fetching, transforming, and storing vertex data in the vertex cache is substantially the same whether simple primitive sequences or extended primitive sequences are processed.
  • The method allows random access to the vertex buffer, so that the vertices of extended primitives can be assembled from the attributes referenced by the attribute indices fetched from the index buffer without any special mechanism.
  • Most of the logic used for fetching, transforming, and storing vertex data can be shared between simple primitive processing and extended primitive processing. The hardware cost of the logic added to support extended primitive processing can therefore be reduced.
  • Using a vertex cache reduces the latency of delivering vertex data to the extended primitive assembly and processing algorithm in the primitive engine. As a result, the performance of extended primitive processing improves.
  • The step of fetching the primitive size of a variable-size primitive from the index buffer of the vertex position attribute and delivering it to the primitive engine (a dedicated circuit or a processor programmable for extended primitive assembly and processing)
  • is as follows. The primitive size is retrieved from the index buffer of the vertex position attribute,
  • where the variable-size extended primitive is located. The primitive size is delivered to the primitive engine because it may be needed in processing the extended primitive.
  • the information about the primitive size precedes the information about the vertex of the primitive in the index buffer of the vertex position attribute.
  • The primitive size is delivered to the primitive engine ahead of any other information about the primitive. Knowing the primitive size also determines the offset, within the index array of the vertex position attribute, of the beginning of the next variable-size primitive, that is, of the index entry that contains the size of the next primitive.
  • The primitive size is also needed to decide when processing of a variable-size primitive can begin: accumulation ends once the primitive engine has obtained all the vertices of the primitive, as given by the primitive size. This step is required only when variable-size extended primitives are implemented and can be omitted when other primitives are processed.
  • Next, the vertex data for the vertices of the extended primitive is fetched, transformed, and stored in the vertex cache. As described above, this is performed by the vertex cache device. Since the vertex cache device does not itself process extended primitives, this step is the same from its point of view whether it handles simple primitives or fixed-size extended primitive sequences. Circuitry can therefore be shared between simple primitive processing and extended primitive processing. This step can be omitted if the vertex information is already stored in the vertex cache.
  • The vertex attribute indices are sampled from the index buffer. When only a vertex buffer is used to represent fixed-size extended primitives, the vertex attribute indices are generated sequentially.
  • When the primitive size has to be fetched and separate index buffers are used for different vertex attributes, the process of fetching primitive vertex data needs to be modified. The reason is that the index buffer of the vertex position attribute, unlike the index buffers of all other attributes, contains the primitive size information, so it is longer than the other index buffers by the number of primitives.
  • Fetching the vertex data for a vertex requires forming the index set for the attributes of that vertex. When separate index arrays are used for different vertex attributes with fixed-size primitives or simple primitive sequences, the index set is formed by sampling all index buffers at the position corresponding to the vertex.
  • For variable-size extended primitives, this formation is modified as follows.
  • The index values for all attributes other than the vertex position attribute are obtained by sampling their index buffers at the position determined by the vertex, whereas the index of the vertex position attribute is obtained by sampling its index buffer at that position plus the number of primitive size entries that precede it.
  • This process uses the index buffer and vertex buffer layout for variable-size extended primitive sequences introduced by the first aspect of the present invention.
  • The step of delivering the transformed vertices to the primitive engine for assembly and processing of the extended primitive delivers the input primitive information to the algorithm implemented by the primitive engine. In the present invention, this step is accomplished by selecting the primitive engine, instead of the fixed primitive assembler, as the destination of the transformed vertices from the vertex cache.
  • The term “primitive engine” means a fixed-function circuit or programmable system that realizes the processing of extended primitives.
  • For extended primitives of variable size, the primitive engine receives and internally accumulates information on the primitive vertices, reconstructs the primitives from the accumulated information, and processes them according to an algorithm whose result is a simple primitive sequence that the fixed primitive assembler can accept.
  • The step of delivering the fixed-size simple primitives obtained by processing the extended primitive in the primitive engine to the fixed-size primitive assembler, and from there to the pipeline for primitive rasterization, is as follows. The results are delivered in a format that the fixed primitive assembler can process. In the present invention, this delivery is performed in the same way as when transformed vertices are delivered directly from the vertex cache to the fixed primitive assembler while a simple primitive sequence is processed by the vertex cache device. Therefore, neither the fixed primitive assembler nor the rasterization pipeline needs to be modified to support extended primitives.
  • the process of assembling and processing the extended primitive is a process for implementing a predetermined algorithm using the primitive engine and performing an arithmetic process on the extended primitive using the algorithm.
  • Preferred examples of algorithms for processing fixed-size extended primitives implemented by the primitive engine include, but are not limited to:
  • The TWN primitive sequence is processed to realize detection and visualization of mesh silhouettes.
  • A mesh silhouette means the set of triangle edges each shared by two triangles, one facing the view direction and the other facing away from it.
  • Visualizing the mesh silhouette means visually enhancing the silhouette edges when rendering a mesh.
  • The outline of the algorithm for detecting and visualizing silhouettes is as follows. First, vertex data for the TWN primitives is accumulated.
  • The first primitive and the subsequent primitives of a strip or fan are accumulated differently.
  • the first primitive requires six vertices before the operation on the primitive begins.
  • the second and subsequent primitives can use already stored vertices, so only two more vertices are required.
  • vertex data is accumulated in the same way as the first primitive even for the second and subsequent primitives.
  • As an example, consider a TWN primitive strip whose index buffer for the vertex position attribute contains {2, 1, 0, 3, 4, 5, 7, 6}.
  • The vertex cache device reads the vertices corresponding to the indices from the index buffer and transmits them to the primitive engine.
  • The sixth vertex transmitted from the vertex cache device to the primitive engine corresponds to an index value of 5.
  • Once the primitive engine has obtained the information about all vertices of the center triangle 111 and the adjacent triangles 110, 112, and 113, it can start detecting the silhouette edges of the first triangle in the TWN strip sequence.
  • The facing direction of a triangle is evaluated from a scalar triple product of the three three-component vectors formed by the x, y, and w components of the triangle's vertex positions in system coordinates.
  • The subscripts 0, 1, and 2 denote the first, second, and third vertices of the triangle, respectively. If the sign of this value differs between two triangles that share an edge, the edge is a silhouette edge. Therefore, four evaluations, one for the center triangle of the TWN primitive and one for each of the three adjacent triangles, are required to determine the silhouette edges.
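  • The sign test can be sketched as follows. The expression itself does not survive in this text, so the sketch assumes it is the scalar triple product of the three (x, y, w) vectors, written v0 · (v1 × v2); the function names `facing` and `silhouette_edges` are hypothetical. Only a difference in sign is used, so no assumption is made about which sign means "facing the viewer".

```python
def facing(v0, v1, v2):
    """Scalar triple product v0 . (v1 x v2) of the (x, y, w) vectors of a triangle."""
    (x0, y0, w0), (x1, y1, w1), (x2, y2, w2) = v0, v1, v2
    return (x0 * (y1 * w2 - w1 * y2)
            - y0 * (x1 * w2 - w1 * x2)
            + w0 * (x1 * y2 - y1 * x2))


def silhouette_edges(center, neighbors):
    """For each adjacent triangle, report whether the shared edge is a silhouette edge."""
    center_sign = facing(*center) > 0            # one evaluation for the center triangle
    return [(facing(*tri) > 0) != center_sign    # one evaluation per adjacent triangle
            for tri in neighbors]


front = ((0, 0, 1), (1, 0, 1), (0, 1, 1))
back = ((0, 0, 1), (0, 1, 1), (1, 0, 1))   # same points, opposite winding
print(silhouette_edges(front, [front, back, front]))  # [False, True, False]
```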
  • To visualize a silhouette edge, extra geometry forming a rectangular flap is generated. The flap extends outward from the object in a direction perpendicular to the observer's viewing direction, that is, to the direction of the eye axis in viewpoint coordinates.
  • The outward direction is given by another vertex attribute that becomes available once the vertices have been delivered from the vertex cache unit to the primitive engine: the normal vector at each vertex of the mesh, transformed into viewpoint coordinates.
  • Let the first and second vertices that make up the silhouette edge have positions $(x_0, y_0, z_0, w_0)$ and $(x_1, y_1, z_1, w_1)$ in the viewpoint coordinate system.
  • The generated flap is a rectangle whose vertices have the following coordinates in the viewpoint coordinate system: $(x_0, y_0, z_0, w_0)$, $(x_1, y_1, z_1, w_1)$, $(x_1 + \mathit{offset}_x\, n_{1x},\ y_1 + \mathit{offset}_y\, n_{1y},\ z_1,\ w_1)$, and $(x_0 + \mathit{offset}_x\, n_{0x},\ y_0 + \mathit{offset}_y\, n_{0y},\ z_0,\ w_0)$.
  • FIG. 5(B) shows the structure of such a flap when the coefficients $\mathit{offset}_x$ and $\mathit{offset}_y$ have the same value and $n_{0z}$ and $n_{1z}$ are 0.
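  • A sketch of the flap construction follows. The coordinate list above is damaged in this text, so the sketch assumes the symmetric form in which each edge vertex is displaced in x and y by its own normal scaled by the offset coefficients, with z and w left unchanged; the function name `build_flap` and the sample values are hypothetical.

```python
def build_flap(p0, p1, n0, n1, offset_x, offset_y):
    """Return the four flap vertices for the silhouette edge p0-p1.

    p0, p1: (x, y, z, w) positions of the edge vertices.
    n0, n1: (nx, ny, nz) normals at those vertices.
    """
    x0, y0, z0, w0 = p0
    x1, y1, z1, w1 = p1
    return [
        (x0, y0, z0, w0),
        (x1, y1, z1, w1),
        (x1 + offset_x * n1[0], y1 + offset_y * n1[1], z1, w1),  # p1 pushed outward
        (x0 + offset_x * n0[0], y0 + offset_y * n0[1], z0, w0),  # p0 pushed outward
    ]


flap = build_flap((0, 0, 0, 1), (1, 0, 0, 1), (0, 1, 0), (0, 1, 0), 0.5, 0.5)
print(flap[2], flap[3])
```

  • The rectangle can then be emitted as two triangles, which matches the later statement that the primitive engine outputs two triangles per flap.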
  • Next, the vertex cache unit delivers two more vertices, with index values 7 and 6, which define two more triangles, 115 and 114.
  • The silhouette edges of the second triangle in the strip can then be calculated.
  • Two triangles that were evaluated earlier are used again: one becomes the triangle in question and the other becomes an adjacent triangle. In the case shown in FIG. 4(B), these are triangles 113 and 111, respectively.
  • This reuse greatly reduces the computational cost of silhouette detection. To prevent overlapping flaps from being generated on the same silhouette edge, flaps are generated only for TWN primitives whose triangle has a particular orientation, for example facing the viewer; other primitives can be ignored.
  • the primitive engine can also output a simple fixed-size primitive.
  • The output includes the unchanged center triangle and, for each silhouette edge, the two triangles that form its geometric flap.
  • For the flaps, some attributes can be changed to implement the algorithm. For example, to give the silhouette a black appearance, the color is replaced with black.
  • the output triangle is delivered to the fixed primitive assembler and processed as if it had been delivered directly from the vertex cache unit.
  • The Catmull-Clark (CC) subdivision method consists of multiple steps that generate a smooth, fine mesh from a base mesh; each step subdivides the mesh obtained in the previous step. In other words, the rules are applied recursively, starting from the base mesh, to refine it.
  • The rules generate a new vertex for each face of the mesh obtained in the previous step, generate a new vertex for each edge of that mesh, and reposition the vertices of that mesh.
  • The position of each vertex of the mesh in the next step is a linear combination of the vertices adjacent to the edge or face to which it belongs. More specifically, a face point is located at the average of the positions of the original vertices of the face. The position of an edge point is calculated as the average of the midpoint of the original edge and the average of the two new adjacent face points.
  • The vertices from the previous step are repositioned according to the following equation.
  • $S' = \dfrac{Q + 2R + S(n-3)}{n}$
  • $Q$ is the average of the new face points located around the vertex,
  • $R$ is the average of the midpoints of the edges sharing the vertex,
  • $S$ is the vertex position in the previous step, and
  • $n$ is the number of edges sharing the vertex, that is, the valence of the vertex.
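  • The three rules (face point, edge point, repositioned vertex) can be written out directly. The sketch below uses plain tuples for positions; the helper names `average`, `face_point`, `edge_point`, and `vertex_point` are hypothetical, and the example values are chosen only to exercise the formula.

```python
def average(points):
    """Component-wise mean of a list of equally sized position tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def face_point(face_vertices):
    """New face point: average of the original vertices of the face."""
    return average(face_vertices)


def edge_point(end_a, end_b, face_point_a, face_point_b):
    """New edge point: average of the edge midpoint and the mean of two face points."""
    midpoint = average([end_a, end_b])
    return average([midpoint, average([face_point_a, face_point_b])])


def vertex_point(S, face_points, edge_midpoints):
    """Repositioned vertex S' = (Q + 2R + S(n - 3)) / n, with n the valence."""
    n = len(edge_midpoints)
    Q = average(face_points)
    R = average(edge_midpoints)
    return tuple((q + 2 * r + s * (n - 3)) / n for q, r, s in zip(Q, R, S))


# Valence-4 vertex raised to z = 1 above an otherwise flat unit grid.
faces = [(0.5, 0.5, 0.25), (-0.5, 0.5, 0.25), (-0.5, -0.5, 0.25), (0.5, -0.5, 0.25)]
mids = [(0.5, 0, 0.5), (0, 0.5, 0.5), (-0.5, 0, 0.5), (0, -0.5, 0.5)]
print(vertex_point((0, 0, 1), faces, mids))  # (0.0, 0.0, 0.5625)
```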
  • The number of vertices required to calculate a vertex position in the next subdivision step is fixed at 6 for an edge point and 4 for a face point.
  • To reposition a vertex, information on all vertices belonging to its one-ring neighborhood is required.
  • The one-ring neighborhood of a vertex is formed by all vertices that share a mesh face with it. In the case shown in FIG. 6(A),
  • the neighborhood of vertex v9 in the base mesh 106 is formed by vertices v13, v12, v8, v4, v5, v6, v10, v16, v15, and v14, and its valence is 5.
  • The valence of a vertex can be arbitrary, so executing a subdivision rule requires information about the number of vertices in the neighborhood. In practice, however, the maximum valence can be limited, so that the subdivision rules need to be implemented only for valences up to that limit.
  • The CC scheme can be applied to the whole mesh, but it can also be applied patch by patch,
  • using the following rules.
  • A subdivision surface patch can be formed from a face of the base mesh as the set of vertices belonging to the one-ring neighborhood of the face. Since the subdivision rules are limited to the one-ring neighborhood, the neighborhood of a face is the collection of the neighborhoods of the face's vertices, and it includes the vertices of the face itself. After two CC subdivisions, all faces of the subdivided mesh are quadrilaterals, and it is known that no face has more than one irregular vertex.
  • the basic surface can be designed that way, but only one CC subdivision process needs to be performed to achieve the same result.
  • FIG. 6 shows an irregular vertex v9 with a valence of 5 that belongs to the center quadrilaterals {v9, v5, v6, v10} and {v9, v10, v16, v15}.
  • FIG. 11 is a diagram showing a base control mesh 106 having two adjacent CCSP primitives that share this vertex.
  • the index buffer that contains the index column of the vertex position attribute is as follows.
  • the processing of the CCSP sequence is as follows.
  • The primitive size is fetched by the vertex cache device, which retrieves the primitive size value 18 from the first position of the vertex position attribute index sequence and delivers it to the primitive engine.
  • the vertex cache unit delivers 18 vertices to the primitive engine according to the vertex position attribute index.
  • The algorithm implemented by the primitive engine can reconstruct the CCSP primitive using the description method introduced by the first aspect of the present invention. Once the primitive has been rebuilt, all the information needed for subdivision is available, so the reconstructed CCSP primitive is processed as a Catmull-Clark subdivision surface (Jeffrey Bolz and Peter
  • The result of the subdivision scheme used is a set of quadrilaterals corresponding to a fine tessellation of the center quadrilateral of the processed CCSP primitive, which is rendered as a fragment of the tessellated mesh (107).
  • Each of the resulting quadrilaterals is further divided into two triangles for rasterization and delivered to the fixed primitive assembler. These steps are repeated for the next CCSP primitive.
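  • The final split of each tessellated quadrilateral into two triangles can be sketched as below. The text does not say which diagonal is used, so splitting along the diagonal from the first to the third vertex is an assumption, as is the function name `quads_to_triangles`.

```python
def quads_to_triangles(quads):
    """Split each quadrilateral (a, b, c, d) into triangles (a, b, c) and (a, c, d)."""
    triangles = []
    for a, b, c, d in quads:
        triangles.append((a, b, c))  # first half, keeps the quad's winding
        triangles.append((a, c, d))  # second half, shares the a-c diagonal
    return triangles


print(quads_to_triangles([(0, 1, 2, 3)]))  # [(0, 1, 2), (0, 2, 3)]
```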
  • A third aspect of the present invention relates to a system that processes sequences of fixed-size or variable-size extended primitives, described according to the first aspect of the present invention, using the method according to the second aspect of the present invention.
  • The system according to the third aspect of the present invention is obtained by modifying the vertex processing steps of the prior art so that extended primitives can be processed.
  • Vertex data is fetched from the vertex buffer 1200 according to the indices stored in the index buffer 1100, or according to generated indices, and the vertex data is transformed by the vertex processing unit 4000.
  • The transformed vertex data is delivered to the primitive engine 9000, where the extended primitive is assembled and processed.
  • Simple primitives generated by the extended primitive processing in the primitive engine 9000 are delivered to the rest of the processing pipeline, such as the fixed primitive assembler 6000. This step is needed only when extended primitives are processed.
  • Otherwise, the transformed vertices from the second step may be delivered directly to the fixed primitive assembler 6000.
  • As the fixed primitive assembler 6000, for example, a known circuit as shown in FIG.
  • A specific example of the system of the present invention includes the following modules: a vertex cache control unit (VCC) 2000, a first vertex cache storage unit (PVC) 3000, a second vertex cache storage unit (SVC) 5000, one or more vertex processing units (VPU) 4000, a primitive engine (PE) 9000, and a fixed-size primitive assembler (FPA) 6000.
  • The remaining pipeline comprises a fixed-size primitive setup unit 7000 and a rasterizer 8000, which implement the processing of sequences of simple primitives such as triangles.
  • FIG. 3 does not show other units that exchange information with the above system. They are omitted for brevity and to focus on what is characteristic of the present invention. Examples of omitted units include the host CPU, the host memory, and the triangle rasterization pipeline. Information exchange with them is explained to the extent necessary in relation to the present invention.
  • Apart from the primitive engine and the logic of the vertex cache control unit that handles variable-size extended primitives, which can be designed as appropriate, known designs can be used for the other parts.
  • Modules other than those mentioned above can be adopted as appropriate to implement simple fixed-size primitives such as points, lines, and triangles, as used in current 3D computer graphics.
  • The vertex cache control unit (VCC) 2000 performs the following processing. It analyzes the contents of the index buffer. Depending on the state of PVC 3000 and SVC 5000, it fetches vertex buffer contents into PVC 3000 according to the vertex indices fetched from index buffer 1100. It controls the transmission of vertex data from PVC 3000 to the VPU(s) 4000 for per-vertex processing. It controls the storage in SVC 5000 of the transformed vertex data produced by the VPU(s) 4000.
  • It sends the contents of SVC 5000 to PE 9000 or FPA 6000 according to the type of primitive (specifically, it determines the type of primitive and, if the primitive is a variable-size extended primitive, delivers the contents of SVC 5000 to PE 9000).
  • For extended primitives of variable size, it delivers the primitive size to PE 9000 and, after the primitive size and the vertex data for all vertices of the primitive have been delivered, informs PE 9000 that it may begin processing the variable-size extended primitive.
  • In a preferred embodiment of the present invention, primitive sizes are delivered only to PE 9000.
  • The contents of the SVC are delivered to the PE, and after the primitive size and the vertex data for all vertices of the primitive have been delivered to PE 9000, the PE is informed that it may start processing the variable-size extended primitive.
  • Analysis of the index buffer is also specific to the present invention; it relates to the extraction of the primitive size and to the processing of variable-size extended primitives. For the other processing steps, known steps performed on simple primitives can be adopted as appropriate.
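  • The delivery order described for the VCC can be sketched as a list of messages. Everything here is illustrative: the enum `PrimitiveKind`, the function `route`, and the message tuples are hypothetical names, and the sketch assumes that fixed-size extended primitives also go to the PE but need neither a size nor a start notification.

```python
from enum import Enum, auto


class PrimitiveKind(Enum):
    SIMPLE = auto()
    EXTENDED_FIXED = auto()
    EXTENDED_VARIABLE = auto()


def route(kind, vertices):
    """Return the (destination, message) pairs emitted for one primitive."""
    if kind is PrimitiveKind.SIMPLE:
        return [("FPA", ("vertex", v)) for v in vertices]
    messages = []
    if kind is PrimitiveKind.EXTENDED_VARIABLE:
        messages.append(("PE", ("size", len(vertices))))   # size goes first
    messages.extend(("PE", ("vertex", v)) for v in vertices)
    if kind is PrimitiveKind.EXTENDED_VARIABLE:
        messages.append(("PE", ("start",)))                # all vertices delivered
    return messages


for destination, message in route(PrimitiveKind.EXTENDED_VARIABLE, ["a", "b"]):
    print(destination, message)
```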
  • The first vertex cache storage unit (PVC) is where contiguous blocks of vertex buffer contents transferred from the host memory are cached.
  • The PVC stores untransformed vertex data, that is, vertex data that has not yet been processed by the VPU(s) in per-vertex processing.
  • Keeping the PVC filled with untransformed vertex data reduces the time spent waiting for vertex data.
  • The PVC and the VPU(s) are located physically or electrically close to each other, which reduces the transfer latency when untransformed vertex data in the PVC has to be processed.
  • The vertex processing unit(s) (VPU) is a module that implements certain fixed functions, e.g., those of OpenGL and/or Direct3D
  • The VPU receives various attributes of untransformed vertices, such as position, color, and texture coordinates, as input, and generates various attributes of transformed vertices, such as position, color, texture coordinates, and view vector.
  • the number and dimensions of inputs and outputs, that is, the format of input and output vertex information, may be different.
  • VPU (s) receives information from the PVC and passes the output to the SVC.
  • the primitive engine PE is a module that implements a fixed function for performing information processing related to an extended primitive, or a programmable module.
  • Implementing the PE as a fixed-function module is effective for applications that need high performance with limited functionality. Implementing the PE as a programmable module is preferable because it leaves freedom in the choice of algorithm for processing extended primitives.
  • the PE receives the output about the vertices transferred from the SVC, assembles the primitive, processes the primitive, and outputs it as a sequence of simple primitives that the FPA can understand.
  • PE is a module unique to the present invention.
  • When implemented as a programmable module, the PE only needs the same functions as the programmable VPU(s), plus a function that controls the output of the extended primitive processing results to the FPA in the form of a simple primitive sequence. It can therefore share much of its functional logic with the VPU(s).
  • PE is F
  • input information is received from S VC and VCC.
  • the main differences between PE and other devices are the information on the extended primitive size and the vertex data to deliver all the vertices of the extended primitive when processing variable size extended primitives. This is because it is necessary to send and receive a notification signal about delivery.
  • the VCC uses the same data channel to convey information about the primitive size to the PE for vertex data as described below.
  • the fixed-size primitive integration circuit from the vertex sequence delivered from the SVC is a module that has a fixed function for assembling simple primitives such as points, lines, and triangles.
  • SVC implements a set of primitives that can be used for modern 3D graphics APIs such as openGL and direct3D, such as points, lines, line loops, triangles, triangle strips, and triangle fans.
  • The FPA has an input for receiving information from the PE and the SVC, which use the same protocol for transferring information about transformed vertices to the FPA.
  • Because the same protocol is used for data transfer, the source of the information can be completely transparent to this module.
  • the vertex cache control unit controls all primitive processing on the chip.
  • the vertex cache control unit uses the information stored in the PVC and the SVC to indirectly reference the vertex data for the primitive through the information stored in the index buffer.
  • the point buffer processing can be accelerated.
  • the vertex cache control unit can accelerate processing by referencing only the vertex buffer when the vertex data for the primitive is formed by the vertex data string in the vertex buffer using PVC.
  • The VCC operates in one of two modes depending on the primitive: a mode for processing fixed-size primitives and a mode for processing variable-size extended primitives. For extended primitives of fixed size, the VCC performs the same processing as for simple primitives.
  • Vertex data for vertices that have not yet been transformed is loaded and sent to the VPU(s). If, for the vertex attribute indices currently being sampled, there is neither a transformed vertex in the SVC nor untransformed vertex data in the PVC, the VCC uploads a chunk of the vertex buffer contents from host memory into the PVC. The chunk contains the vertex data for the requested indices and is stored in free space in the PVC or overwrites unused data stored there earlier. The chunk may also contain vertex data for other indices, and as long as the chunk remains in the PVC that data can be used later.
  • When vertex data becomes available in the PVC, the VCC begins delivering it to the VPU(s), either round-robin or by another method. If the transformed vertex data does not exist in the SVC but the untransformed vertex data is present in the PVC, the VCC begins transferring the vertex data from the PVC to the VPU(s) without accessing host memory. This reduces host memory access time. In this way, the PVC speeds up the processing of vertex data that has not yet been transformed.
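  • The lookup order described above (SVC first, then PVC, then host memory) can be sketched as follows. This is only an illustrative model under stated assumptions: the function name `fetch_vertex`, the dictionary-based caches, and the aligned `chunk_size` are hypothetical and do not come from the patent, and eviction from the PVC is not modeled.

```python
def fetch_vertex(index, svc, pvc, host_vertex_buffer, chunk_size=4):
    """Return (source, data) for one vertex index.

    svc: dict mapping index -> transformed vertex (secondary cache).
    pvc: dict mapping index -> untransformed vertex data (primary cache).
    host_vertex_buffer: list standing in for the vertex buffer in host memory.
    chunk_size: hypothetical number of vertices fetched per host access.
    """
    if index in svc:
        # Already transformed: no VPU work and no host access needed.
        return "svc", svc[index]
    if index in pvc:
        # Untransformed data is on-chip: send it to a VPU without host access.
        return "pvc", pvc[index]
    # Miss in both caches: upload an aligned chunk that contains the index.
    start = (index // chunk_size) * chunk_size
    end = min(start + chunk_size, len(host_vertex_buffer))
    for i in range(start, end):
        pvc[i] = host_vertex_buffer[i]
    return "host", pvc[index]


host = ["v%d" % i for i in range(10)]
svc, pvc = {2: "t2"}, {}
first = fetch_vertex(5, svc, pvc, host)   # misses both caches, loads v4..v7
second = fetch_vertex(6, svc, pvc, host)  # served from the chunk just loaded
third = fetch_vertex(2, svc, pvc, host)   # already transformed
```

  • In this sketch the second request is satisfied by the chunk fetched for the first one, which is the effect the text attributes to the PVC.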
  • The storage locations of different vertex attributes in host memory are preferably distributed, with different strides and attribute sizes.
  • the VCC assembles input data to the VPU from such distributed vertex attributes.
  • The input to the VPU is a four-component floating-point vector, and the input data for each vertex can span several such vectors.
  • The VCC assembles the input data from the distributed vertex attributes by performing the necessary type conversions. For example, a 2-byte integer value is converted to a floating-point number, and attributes are packed into four-component vectors according to the configuration. For example, two sets of two-component texture coordinate attributes are converted into one four-component input vector. The assembled input data is transferred when a VPU is able to operate.
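  • A minimal sketch of this assembly step is given below. The function name `assemble_vpu_input`, the little-endian layout, the `scale` parameter, and padding the position with w = 1.0 are assumptions made for illustration; the patent does not specify them.

```python
import struct


def assemble_vpu_input(position_i16, uv0, uv1, scale=1.0):
    """Build the four-component float vectors for one vertex.

    position_i16: 6 bytes holding three little-endian 2-byte signed integers.
    uv0, uv1: two-component texture coordinates given as floats.
    scale: hypothetical factor applied after the integer-to-float conversion.
    """
    x, y, z = struct.unpack("<3h", position_i16)
    # Type conversion: 2-byte integers become floats; w = 1.0 is an assumed pad.
    position = (x * scale, y * scale, z * scale, 1.0)
    # Packing: two 2-component attributes share one four-component vector.
    texcoords = (uv0[0], uv0[1], uv1[0], uv1[1])
    return [position, texcoords]


vectors = assemble_vpu_input(struct.pack("<3h", 1, -2, 3), (0.25, 0.5), (0.75, 1.0))
```

  • Here one vertex spans two input vectors, which matches the statement that the input data for a vertex can span several four-component vectors.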
  • the input for the next vertex to be transferred is input to the VPU.
  • After receiving a certain number of vectors, the VPU starts processing the vertex. The number depends on the number of attributes per vertex and their packing, and is determined via the VPU.
  • When the VPU finishes the transformation, it outputs four-component floating-point vectors to the SVC.
  • The VCC manages the VPUs' access to the SVC so that the vertex sequence is preserved.
  • The format of the transformed vertices, i.e. the number of attributes and their packing into output vectors, may differ from that of the input.
  • The storage unit of the SVC is also a four-component floating-point vector. Each transformed vertex therefore occupies a sequence of entries in the SVC.
  • The SVC is implemented simply as a FIFO queue: if the FIFO overflows when the data for a newly processed vertex is written, the oldest data is evicted.
  • LRU least recently used
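  • The FIFO behavior of the SVC can be illustrated with a small simulation. This is a simplified sketch, not the patented circuit: the class name `FifoVertexCache` is hypothetical, and each vertex is assumed to occupy a single entry, whereas in the text a vertex may span several four-component vectors.

```python
from collections import OrderedDict


class FifoVertexCache:
    """Toy FIFO cache keyed by vertex index (one entry per vertex assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # insertion order doubles as FIFO order
        self.hits = 0
        self.misses = 0

    def lookup(self, index, transform):
        if index in self.entries:
            # FIFO policy: a hit does not move the entry to the back.
            self.hits += 1
            return self.entries[index]
        self.misses += 1
        if len(self.entries) >= self.capacity:
            # Overflow: drop the oldest entry.
            self.entries.popitem(last=False)
        self.entries[index] = transform(index)
        return self.entries[index]


cache = FifoVertexCache(capacity=4)
for i in [0, 1, 2, 2, 1, 3, 4, 0]:
    cache.lookup(i, transform=lambda v: ("transformed", v))
```

  • Under an LRU policy a hit would instead refresh the entry's position, so the choice of policy changes which vertices survive an overflow.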
  • the transferred vertex data is delivered from the SVC to the FPA according to the order of the vertex sequence.
  • the FPA starts the vertex operation when a predetermined number of four component vectors are received from the SVC.
  • the predetermined number is equal to the number of vectors output by the VPU for the transferred vertices.
  • the FPA assembles triangles from the transferred vertex sequence and delivers them to the rasterization pipeline for rasterization.
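  • As a sketch of what assembling triangles from a vertex sequence involves, the following function handles independent triangles, strips, and fans. The function name `assemble_triangles` and the mode strings are hypothetical, and the winding rule for strips follows the common OpenGL convention, which the patent text does not state.

```python
def assemble_triangles(vertices, mode):
    """Turn a vertex sequence into a list of triangles (3-tuples)."""
    if mode == "triangles":
        # Every three consecutive vertices form one independent triangle.
        return [tuple(vertices[i:i + 3]) for i in range(0, len(vertices) - 2, 3)]
    if mode == "strip":
        triangles = []
        for i in range(len(vertices) - 2):
            a, b, c = vertices[i], vertices[i + 1], vertices[i + 2]
            # Swap the first two vertices of odd triangles to keep the winding.
            triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
        return triangles
    if mode == "fan":
        # All triangles share the first vertex.
        return [(vertices[0], vertices[i], vertices[i + 1])
                for i in range(1, len(vertices) - 1)]
    raise ValueError("unknown mode: %s" % mode)


strip = assemble_triangles([0, 1, 2, 3], "strip")
fan = assemble_triangles([0, 1, 2, 3], "fan")
plain = assemble_triangles([0, 1, 2, 3, 4, 5], "triangles")
```

  • Strips and fans reuse vertices between neighboring triangles, which is why a vertex cache in front of the assembly stage pays off.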
  • the converted vertex data is first sent to the PE.
  • the PE also has two main operation modes: fixed-size primitive arithmetic processing and variable-size extended primitive arithmetic processing.
  • the PE starts vertex processing when it receives a predetermined number of four component vectors from the SVC, as in the FPA.
  • The result of the extended primitive processing is delivered from the PE to the FPA in the form of a sequence of simple primitives.
  • the sequence of simple primitives output from the PE is realized in a format that can be reproduced as the vertex sequence of simple primitives, and the primitive sequence is represented by four component vectors.
  • By adding the PE as the mechanism for extended primitive processing, most of the hardware used for processing simple primitives, such as the VCC, PVC, VPUs, SVC, and FPA, can be reused when processing extended primitives.
  • Because the vertex caches are used for extended primitives to the same extent as for simple primitives, extended primitives can be processed quickly on-chip.
  • The PE executes its processing algorithm each time it receives the data for one vertex, that is, after receiving a specific number of four-component vectors from the SVC.
  • the computation algorithm manages the accumulation of received vertex data, and detects the moment when the fixed-size extended primitive is reconstructed from the vertex sequence and the moment when the computation is simplified.
  • TWN primitive sequences, which are the above-mentioned fixed-size extended primitives, are processed according to this procedure.
  • A sequence of TWN primitives has the same spatial coherency as the corresponding sequence of central triangles.
  • To obtain the same cache hit rate as when processing the sequence of central triangles, the size of the secondary cache must be increased.
  • The TWN primitive sequence is processed in the same way as a simple primitive sequence with respect to vertex data fetching, transport, and caching. The difference is that the destination of the transformed vertex data is not the FPA but the PE.
  • The transformed vertex data is delivered from the SVC to the PE one vertex at a time, in the order of the input vertices.
  • In the present invention, the VCC logic circuit is modified to provide the following means: means for accurately parsing the contents of the index buffer and separating the primitive size from the vertex attribute indices of the vertices forming the primitive; means for delivering the primitive size to the PE; means for counting the vertices of a variable-size extended primitive that have not yet been delivered to the PE; and means for notifying the PE that delivery of the vertex data is complete so that primitive processing can start. This modification is unique to the present invention. The remaining functions of the VCC can be used as they are for processing fixed-size primitives.
  • In the VCC logic circuit's extended primitive processing mode, the first index of each primitive in the index buffer for the vertex position attribute is the primitive's size.
  • The VCC 2000 initializes its internal vertex counter 1200 and decrements it by one for each index processed thereafter, in order to detect the start of the next primitive. So that the size is available to the PE for primitive processing, it is delivered to the primitive size register 9100 in the PE over the same path that is used when the vertex attributes of the vertex sequence forming the extended primitive are delivered from the SVC.
  • the primitive size register 9100 is implemented so that it can be accessed by the primitive arithmetic processing algorithm in the PE, and the primitive can be reconstructed using the vertex sequence.
  • The VCC can then start sending transformed vertex data from the SVC to the PE.
  • the processing for vertex data fetching, transport, and delivery to the SVC is the same as for fixed primitives.
  • The order of vertex delivery is determined by the index sequence stored in the index buffer following the first index, which contains the size information.
  • The primitive processing algorithm in the PE accumulates vertex information, and the value of the VCC vertex counter 2100 is decremented after each vertex attribute sequence is delivered to the PE. When the counter reaches 0, all vertex data has been delivered to the PE, and a notification signal to that effect is delivered from the VCC to the PE via connection 2300. When the signal is received via connection 2300, the PE begins internal primitive processing.
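  • The size-prefixed index stream and the counter-driven notification can be sketched as follows. The function name `deliver_extended_primitives` and the event tuples are hypothetical stand-ins for the size register write, the vertex deliveries, and the notification signal; the sketch assumes a well-formed index buffer.

```python
def deliver_extended_primitives(index_buffer):
    """Emulate VCC-to-PE traffic for variable-size extended primitives."""
    events = []
    pos = 0
    while pos < len(index_buffer):
        size = index_buffer[pos]       # first index of a primitive is its size
        events.append(("size", size))  # stands in for the size register write
        counter = size                 # vertex counter initialized to the size
        pos += 1
        while counter > 0:
            events.append(("vertex", index_buffer[pos]))
            pos += 1
            counter -= 1               # one fewer vertex left to deliver
        events.append(("done",))       # counter is 0: notify the PE
    return events


events = deliver_extended_primitives([3, 10, 11, 12, 2, 20, 21])
```

  • Because the size travels in the same stream as the vertex indices, no separate side channel is needed to tell the PE where one primitive ends and the next begins.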
  • One preferred example of system operation is processing a CCSP primitive sequence to generate a subdivision surface from a control mesh described as a sequence of basic CCSP primitives.
  • Processing follows the procedure for variable-size extended primitives. The spatial coherency of a CCSP primitive sequence is extremely high. In fact, away from irregular vertices, adjacent CCSP primitives can share as many as 12 of their 16 vertices, so processing the next CCSP primitive requires processing only four additional vertices and delivering them to the SVC.
  • Using a vertex cache to process the extended primitives greatly reduces data access costs and makes it possible to process subdivision patches, together with their neighborhoods, on-chip.
  • The SVC hit rate is about 64% and is expected to be higher for longer sequences. If neither a vertex cache nor an index buffer and vertex buffer is used to process such large primitives, vertex processing incurs a very large memory cost.
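  • The effect of sequence length on the hit rate can be illustrated with an idealized count. The model below is an assumption made for illustration, not the measurement reported in the text: it supposes a single row of regular patches in which the first patch misses on all 16 vertices and every later patch misses on exactly 4, with nothing ever evicted. The function name `ideal_hit_rate` is hypothetical.

```python
def ideal_hit_rate(num_patches, verts_per_patch=16, new_verts_per_patch=4):
    """Fraction of vertex requests served from the cache in the ideal case."""
    total_requests = verts_per_patch * num_patches
    # The first patch loads every vertex; each later patch adds only the new ones.
    misses = verts_per_patch + new_verts_per_patch * (num_patches - 1)
    return 1.0 - misses / total_requests


rates = [ideal_hit_rate(n) for n in (1, 2, 4, 1000)]
```

  • In this model the rate rises with the number of patches and approaches, but never reaches, 12/16 = 75%, which is consistent with the remark that longer sequences give higher hit rates.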
  • The PE can start to create the subdivided mesh contained in the central quadrilateral of each CCSP primitive, patch by patch, according to the algorithm published by ACM Press or another subdivision algorithm.
  • The result of the subdivision patch processing is delivered to the FPA as triangles, triangle strips, and so on, depending on the input information.
  • Another aspect of the present invention is to realize the above-described system in an integrated circuit that can process fixed-size or variable-size extended primitives on-chip.
  • The present invention also relates to a system that uses an integrated circuit as described above to realize high-speed processing of subdivision surfaces generated by the Catmull-Clark subdivision scheme.
  • the apparatus for processing the extended primitive of the present invention can be used for interactive 3D computer graphics for real-time rendering.
  • It relates to implementing algorithms that access multiple vertices of a polygon mesh at the same time, for real-time rendering of complex geometric shapes represented by meshes, subdivision surfaces, and NURBS patches.
  • Such a device can be used in hardware accelerator cards for 3D computer graphics that visualize 3D images, personal digital assistants, video game devices, car navigation systems, and the like.
  • FIG. 8A, FIG. 8B, FIG. 9A, FIG. 9B, FIG. 10A, FIG. 10B, FIG. 11A, and FIG. 11B show images obtained using the apparatus of the present invention. Figures 8A and 8B compare rendering without and with silhouette visualization: the silhouette is not visualized in Figure 8A and is visualized in Figure 8B. For example, in animation applications where the outline of a cartoon character must be emphasized, silhouette visualization as shown in Fig. 8B can be used.
  • FIGS. 9A and 9B show a rendering of a simple disk-like shape before and after subdivision in the present invention.
  • Figure 9A shows the rendering that uses the original coarse mesh.
  • Figure 9B shows the rendering that uses the mesh refined by adding polygons in the subdivision process.
  • Figures 10A and 10B show subdivision of a more complicated shape than that of Figures 9A and 9B.
  • Figures 11A and 11B show shaded renderings of Figures 10A and 10B, respectively, without the wireframe.
  • Because the present invention is accelerated by the vertex cache, the subdivision algorithm can be implemented on-chip. The present invention can therefore generate and visualize, in real time, complex shapes obtained by subdividing simple meshes, which is an important feature in video games.
  • the extended primitive processing apparatus of the present invention can be used in the field of real-time 3D computer graphics.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Image Generation (AREA)

Abstract

To solve the problems inherent in on-chip processing of extended geometric primitives, such as subdivision surface patches, NURBS patches, and adjacent triangles, that are used as input to algorithms in three-dimensional computer graphics. The problems are solved by a system for use in three-dimensional computer graphics that comprises a first vertex cache store (PVC), a vertex processing unit (VPU), a second vertex cache store (SVC), a primitive engine (PE), a fixed primitive assembly (FPA), and a vertex cache control (VCC).
PCT/JP2007/001196 2006-11-01 2007-10-31 Dispositif pour accélérer le traitement d'un cache de vertex de primitives étendues Ceased WO2008053597A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2008541994A JP4913823B2 (ja) 2006-11-01 2007-10-31 拡張されたプリミティブの頂点キャッシュの処理を加速する装置

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2006-297590 2006-11-01
JP2006297590 2006-11-01

Publications (1)

Publication Number Publication Date
WO2008053597A1 true WO2008053597A1 (fr) 2008-05-08

Family

ID=39343949

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2007/001196 Ceased WO2008053597A1 (fr) 2006-11-01 2007-10-31 Dispositif pour accélérer le traitement d'un cache de vertex de primitives étendues

Country Status (2)

Country Link
JP (2) JP4913823B2 (fr)
WO (1) WO2008053597A1 (fr)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011522322A (ja) * 2008-05-29 2011-07-28 アドバンスト・マイクロ・ディバイシズ・インコーポレイテッド ジオメトリシェーダを用いる平面充填エンジンのためのシステム、方法及びコンピュータプログラム製品
JP2012137984A (ja) * 2010-12-27 2012-07-19 Digital Media Professional:Kk 画像処理装置
JP2012234533A (ja) * 2011-04-29 2012-11-29 Harman Becker Automotive Systems Gmbh ナビゲーション装置用データベース、地形の三次元表示を出力する方法、およびデータベースを生成する方法
CN109147021A (zh) * 2017-06-27 2019-01-04 三星电子株式会社 使用高速缓存状态表的用于高速缓存管理的系统和方法
CN111047503A (zh) * 2019-11-21 2020-04-21 中国航空工业集团公司西安航空计算技术研究所 一种顶点数组类命令的属性存储与组装优化电路
CN113868280A (zh) * 2021-11-25 2021-12-31 芯和半导体科技(上海)有限公司 参数化单元数据更新方法、装置、计算机设备和存储介质
CN115599491A (zh) * 2022-12-14 2023-01-13 西安纽扣软件科技有限公司(Cn) Svg矢量图展现方法、装置、设备及存储介质
CN115829825A (zh) * 2023-01-10 2023-03-21 南京砺算科技有限公司 图元数据的装载控制方法、图形处理器、设备及存储介质

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0649797B2 (ja) 1988-05-13 1994-06-29 第一工業製薬株式会社 熱硬化性樹脂用難燃剤及びその製法
WO2007067249A1 (fr) 2005-12-09 2007-06-14 Fallbrook Technologies Inc. Transmission a variation continue
EP1811202A1 (fr) 2005-12-30 2007-07-25 Fallbrook Technologies, Inc. Transmission à variation continue
WO2009065055A2 (fr) 2007-11-16 2009-05-22 Fallbrook Technologies Inc. Unité de commande pour transmission variable
US10047861B2 (en) 2016-01-15 2018-08-14 Fallbrook Intellectual Property Company Llc Systems and methods for controlling rollback in continuously variable transmissions
US10023266B2 (en) 2016-05-11 2018-07-17 Fallbrook Intellectual Property Company Llc Systems and methods for automatic configuration and automatic calibration of continuously variable transmissions and bicycles having continuously variable transmissions
JP7182863B2 (ja) * 2017-11-02 2022-12-05 キヤノン株式会社 情報生成装置、情報処理装置、制御方法、プログラム、及びデータ構造
US11215268B2 (en) 2018-11-06 2022-01-04 Fallbrook Intellectual Property Company Llc Continuously variable transmissions, synchronous shifting, twin countershafts and methods for control of same
US11174922B2 (en) 2019-02-26 2021-11-16 Fallbrook Intellectual Property Company Llc Reversible variable drives and systems and methods for control in forward and reverse directions

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001075804A1 (fr) * 2000-03-31 2001-10-11 Intel Corporation Architecture graphique en mosaiques
JP2004103021A (ja) * 2002-09-12 2004-04-02 Internatl Business Mach Corp <Ibm> 三角形メッシュをレンダリングする方法

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5842004A (en) * 1995-08-04 1998-11-24 Sun Microsystems, Inc. Method and apparatus for decompression of compressed geometric three-dimensional graphics data
US6741249B1 (en) * 2002-03-20 2004-05-25 Electronics Arts, Inc. Method and system for generating subdivision surfaces in real-time
JP4479957B2 (ja) * 2003-07-18 2010-06-09 パナソニック株式会社 曲面細分割装置
US7439983B2 (en) * 2005-02-10 2008-10-21 Sony Computer Entertainment Inc. Method and apparatus for de-indexing geometry

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001075804A1 (fr) * 2000-03-31 2001-10-11 Intel Corporation Architecture graphique en mosaiques
JP2004103021A (ja) * 2002-09-12 2004-04-02 Internatl Business Mach Corp <Ibm> 三角形メッシュをレンダリングする方法

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011522322A (ja) * 2008-05-29 2011-07-28 アドバンスト・マイクロ・ディバイシズ・インコーポレイテッド ジオメトリシェーダを用いる平面充填エンジンのためのシステム、方法及びコンピュータプログラム製品
US8836700B2 (en) 2008-05-29 2014-09-16 Advanced Micro Devices, Inc. System, method, and computer program product for a tessellation engine using a geometry shader
JP2012137984A (ja) * 2010-12-27 2012-07-19 Digital Media Professional:Kk 画像処理装置
JP2012234533A (ja) * 2011-04-29 2012-11-29 Harman Becker Automotive Systems Gmbh ナビゲーション装置用データベース、地形の三次元表示を出力する方法、およびデータベースを生成する方法
US9441978B2 (en) 2011-04-29 2016-09-13 Harman Becker Automotive Systems Gmbh System for outputting a three-dimensional representation of a terrain
CN109147021A (zh) * 2017-06-27 2019-01-04 三星电子株式会社 使用高速缓存状态表的用于高速缓存管理的系统和方法
CN109147021B (zh) * 2017-06-27 2023-08-04 三星电子株式会社 使用高速缓存状态表的用于高速缓存管理的系统和方法
CN111047503A (zh) * 2019-11-21 2020-04-21 中国航空工业集团公司西安航空计算技术研究所 一种顶点数组类命令的属性存储与组装优化电路
CN111047503B (zh) * 2019-11-21 2023-06-13 中国航空工业集团公司西安航空计算技术研究所 一种顶点数组类命令的属性存储与组装优化电路
CN113868280A (zh) * 2021-11-25 2021-12-31 芯和半导体科技(上海)有限公司 参数化单元数据更新方法、装置、计算机设备和存储介质
CN115599491A (zh) * 2022-12-14 2023-01-13 西安纽扣软件科技有限公司(Cn) Svg矢量图展现方法、装置、设备及存储介质
CN115829825A (zh) * 2023-01-10 2023-03-21 南京砺算科技有限公司 图元数据的装载控制方法、图形处理器、设备及存储介质

Also Published As

Publication number Publication date
JP2012014744A (ja) 2012-01-19
JPWO2008053597A1 (ja) 2010-02-25
JP4913823B2 (ja) 2012-04-11
JP5216130B2 (ja) 2013-06-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07827975

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2008541994

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 07827975

Country of ref document: EP

Kind code of ref document: A1