Intel's KNI - "Kill Non-Intel"?
Intel, of course, released information about MMX2 (as people used to call it), or KNI
(Katmai New Instructions) as it is now codenamed, at its developer conference this
year. It should be shipping in Pentium IIs in the first half of '99. Little is known
officially about the instruction set, although some general details have been released.
They are as follows:
- KNI will use 8 new 128 bit registers
- KNI will support pre-fetching of data
- KNI will 'extend existing MMX instructions'
This is obviously not very interesting, except maybe for the bit about the 128 bit
registers. These will hold 4 single precision floating point values - Here's the diagram
again :)
| 32 bit float | 32 bit float | 32 bit float | 32 bit float |
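To make the diagram a little more concrete, here is a minimal sketch in plain C of what "four floats in one register" means for a programmer. The names vec4f and vec4f_add are made up for illustration and are not part of any Intel documentation; the loop only models the idea that one packed instruction would apply the same operation to all four values at once.

```c
#include <stddef.h>

/* Hypothetical model of one 128-bit register: four 32-bit floats.
   This is an illustration only, not Intel's definition. */
typedef struct {
    float lane[4];
} vec4f;

/* Models a single "packed add": the same addition is applied to
   every lane. Real hardware would do this in one instruction. */
static vec4f vec4f_add(vec4f a, vec4f b)
{
    vec4f r;
    for (size_t i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}
```

Adding {1, 2, 3, 4} to {10, 20, 30, 40} this way gives {11, 22, 33, 44} in what would be a single operation on such a chip.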
One drawback of using extra registers is that new instructions are required to save them
to allow multitasking in an operating system. Apparently Windows 98 and NT5 (or should I
say Windows 2000..) support this, and no doubt Linux will support it the day after the
chip is released. In terms of register width and the number of values it can process at
once, KNI appears to have an advantage over 3DNow!, and with Intel's marketing strength
and brand name it looks strong, but what's behind this brazen image?
To find this out I emailed Clive Turvey, who discovered evidence of the new instruction
set on an Intel FTP server and through his research into new and poorly documented
instructions in the Pentium II. (His page is here)
His findings suggest that KNI will support not only 4 single precision floating point numbers
per register but also 2 double precision floating point numbers. This is a little unusual,
as double precision (64-bit instead of 32-bit floating point) is not really necessary for
most 3D games and applications. In scientific calculations, engineering applications, high
quality raytracing and the like, it would be important. One answer might be that Intel
intends KNI to replace the old and, to be honest, hideous design of the x86 FPU. This late
in the lifespan of the x86 architecture, and just before the introduction of what Intel
expects to be its newer and supposedly better 64-bit processors, that seems unlikely,
but we shall just have to wait and see.
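As a rough illustration of how the same 128 bits could be read either as 4 single precision or as 2 double precision values, here is a small C sketch. The reg128 type is hypothetical, and it assumes the usual 32-bit float and 64-bit double found on x86 compilers.

```c
/* Hypothetical view of one 128-bit register (illustration only).
   Assumes float is 32 bits and double is 64 bits, as on x86. */
typedef union {
    float  single[4];  /* four 32-bit single precision values */
    double dbl[2];     /* two 64-bit double precision values  */
} reg128;
```

Both views occupy the same 16 bytes, so moving to double precision halves the number of values handled per instruction.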
The performance of KNI is likely to be quite good, possibly faster than 3DNow!, but not
necessarily so. The KNI instructions look more general purpose, and for speed that's
not always the best thing (think 3D accelerators versus software rendering).
Divides and square roots are each performed by a single instruction, and although this
could run faster than a sequence of separate partial operations, it doesn't give the
programmer the power to decide when to trade off accuracy for speed.
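To show what trading accuracy for speed with "partial operations" can look like, here is a minimal sketch in portable C. It is not KNI or 3DNow! code: approx_recip is a hypothetical helper that starts from a cheap linear estimate of 1/a and then applies as many Newton-Raphson refinement steps as the caller asks for. Each step roughly doubles the number of correct bits, so a game could stop early where a raytracer would not.

```c
#include <math.h>

/* Hypothetical helper (illustration only): approximate 1/a for a > 0.
   'steps' is the number of refinement steps the caller is willing
   to pay for; 0 gives only the rough estimate. */
static float approx_recip(float a, int steps)
{
    int e;
    float m = frexpf(a, &e);   /* a = m * 2^e with 0.5 <= m < 1 */

    /* Cheap linear estimate of 1/m, within about 6% on [0.5, 1). */
    float x = 48.0f / 17.0f - (32.0f / 17.0f) * m;

    /* Newton-Raphson: each step roughly squares the relative error. */
    for (int i = 0; i < steps; i++)
        x = x * (2.0f - m * x);

    return ldexpf(x, -e);      /* undo the scaling: 1/a = (1/m) * 2^-e */
}
```

With steps set to 0 the result is only a rough guess, while 3 steps bring it to about single precision accuracy. A single all-in-one divide instruction makes that choice for the programmer.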
A recent rumour suggests that Katmai, the first processor to implement KNI, will have
only one KNI unit. If so, its possible peak floating point rate would only equal that of
a similarly clocked AMD K6-2: Katmai processes 4 rather than 2 values per register, but
the K6-2 can issue up to two 3DNow! instructions per clock. This is good but not
spectacular. Of course, given Intel's huge resources, many developers may find themselves
putting more effort into KNI optimisation than into 3DNow! or AltiVec.
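The peak-rate comparison above is simple arithmetic, sketched here in C. The function name is made up, and the figures are assumptions: one KNI unit handling 4 values per instruction is only the rumour mentioned above, and the K6-2 figure assumes it can issue two 3DNow! instructions of 2 values each per clock.

```c
/* Peak values processed per clock = instructions issued per clock
   multiplied by values handled per instruction.
   Hypothetical helper, for illustration only. */
static unsigned peak_values_per_clock(unsigned instrs_per_clock,
                                      unsigned values_per_instr)
{
    return instrs_per_clock * values_per_instr;
}
```

Under those assumptions, peak_values_per_clock(1, 4) for Katmai and peak_values_per_clock(2, 2) for the K6-2 both come to 4, which is why the wider registers alone do not give Katmai a peak advantage at the same clock speed.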