Trilinear filtering is an extension of the bilinear texture filtering method that additionally performs linear interpolation between mipmaps.
Bilinear filtering has several weaknesses that make it an unattractive choice in many cases. Applying it to a full-detail texture that is scaled down to a very small size causes accuracy problems from missed texels, and compensating for this by switching between multiple mipmaps across the polygon leads to abrupt changes in blurriness, which are most pronounced in polygons that are steeply angled relative to the camera.
To solve this problem, trilinear filtering interpolates between the results of bilinear filtering on the two mipmaps nearest to the level of detail required at the pixel. For example, if the pixel covers 1/100 of the texture in one direction, the ideal mipmap would be 100 texels across; since no such mipmap exists, trilinear filtering takes the result of bilinear filtering on the 128×128 mipmap as y1 (with x1 = 128) and the result on the 64×64 mipmap as y2 (with x2 = 64), and then linearly interpolates to x = 100.
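A minimal sketch of that interpolation step, using the numbers from the example above; the two bilinear results are placeholder values standing in for actual texture lookups.

    #include <stdio.h>

    /* Linear interpolation through the points (x1, y1) and (x2, y2),
       evaluated at x. */
    static double lerp2(double x, double x1, double y1, double x2, double y2)
    {
        return y1 + (y2 - y1) * (x - x1) / (x2 - x1);
    }

    int main(void)
    {
        double y1 = 0.80;  /* placeholder: bilinear result on the 128x128 mipmap */
        double y2 = 0.60;  /* placeholder: bilinear result on the 64x64 mipmap   */

        /* The pixel needs a resolution of 100 texels, between 128 and 64. */
        double result = lerp2(100.0, 128.0, y1, 64.0, y2);
        printf("trilinear result: %f\n", result); /* prints 0.712500 */
        return 0;
    }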
The first step in this process is to determine how large the pixel is in terms of the texture. There are several ways to do this, and the ones described here are not necessarily representative of all of them.
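One common approach, used here as an illustration, estimates the footprint from the screen-space derivatives of the texture coordinates, that is, how many texels the texture coordinates move per one-pixel step on screen. The function and variable names in this sketch are illustrative, not part of any standard API.

    #include <math.h>
    #include <stdio.h>

    /* Footprint of the pixel in texels: length of the texture-coordinate
       step (in texels) per one-pixel step on screen, for both screen axes;
       the larger of the two is used, a conservative choice. */
    static double footprint_in_texels(double du_dx, double dv_dx,
                                      double du_dy, double dv_dy)
    {
        double len_x = sqrt(du_dx * du_dx + dv_dx * dv_dx);
        double len_y = sqrt(du_dy * du_dy + dv_dy * dv_dy);
        return len_x > len_y ? len_x : len_y;
    }

    /* Level of detail: 0 selects the full-detail texture, 1 the half-size
       mipmap, 2 the quarter-size mipmap, and so on. */
    static double lod_from_footprint(double footprint)
    {
        return log2(footprint);
    }

    int main(void)
    {
        /* Example: texture coordinates advance 2.56 texels per pixel,
           purely along one screen axis. */
        double fp = footprint_in_texels(2.56, 0.0, 0.0, 0.0);
        printf("footprint %.2f texels, lod %.2f\n", fp, lod_from_footprint(fp));
        return 0;
    }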
Once this is done, the rest is straightforward: perform bilinear filtering on the two mipmaps whose texel sizes are immediately larger and smaller than the calculated size of the pixel, and then interpolate between the two results as usual.
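The following is a self-contained sketch of that procedure for a single-channel square texture. The MipLevel layout and the function names are assumptions for illustration; the lod parameter is the base-2 logarithm of the pixel's size in level-0 texels, as computed in the previous sketch.

    #include <math.h>

    typedef struct {
        const float *texels; /* size x size single-channel image */
        int size;
    } MipLevel;

    /* Fetch one texel, clamping coordinates to the edge of the level. */
    static float texel(const MipLevel *m, int x, int y)
    {
        if (x < 0) x = 0; else if (x >= m->size) x = m->size - 1;
        if (y < 0) y = 0; else if (y >= m->size) y = m->size - 1;
        return m->texels[y * m->size + x];
    }

    /* Bilinear filtering of one level at normalized coordinates (u, v). */
    static float bilinear(const MipLevel *m, float u, float v)
    {
        float x = u * m->size - 0.5f, y = v * m->size - 0.5f;
        int   x0 = (int)floorf(x),   y0 = (int)floorf(y);
        float fx = x - x0,           fy = y - y0;
        float top = texel(m, x0, y0)     * (1 - fx) + texel(m, x0 + 1, y0)     * fx;
        float bot = texel(m, x0, y0 + 1) * (1 - fx) + texel(m, x0 + 1, y0 + 1) * fx;
        return top * (1 - fy) + bot * fy;
    }

    /* Trilinear sample: levels[0] is the full-detail texture and each
       following level halves the resolution. */
    static float trilinear(const MipLevel *levels, int level_count,
                           float u, float v, float lod)
    {
        if (lod < 0.0f)                /* magnification; discussed below */
            lod = 0.0f;
        if (lod >= level_count - 1)    /* coarser than the smallest mipmap */
            return bilinear(&levels[level_count - 1], u, v);
        int   lo = (int)lod;           /* the finer of the bracketing levels */
        float f  = lod - lo;           /* blend weight toward the coarser level */
        float fine   = bilinear(&levels[lo],     u, v);
        float coarse = bilinear(&levels[lo + 1], u, v);
        return fine * (1 - f) + coarse * f;
    }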
Since it uses both a larger and a smaller mipmap, trilinear filtering cannot be used where the pixel is smaller than a texel on the original texture, because mipmaps larger than the original texture are not defined. Fortunately, bilinear filtering still works and can be used in these situations without worrying too much about abruptness, because bilinear and trilinear filtering produce the same result when the pixel size is exactly the size of a texel on the appropriate mipmap.
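That continuity can be checked directly with the sketch above: at a level of detail of exactly 0, the blend weight is 0 and the trilinear result reduces to plain bilinear filtering on the full-detail texture, so switching to bilinear filtering for magnification introduces no seam. The 2×2 and 1×1 texture data below are made up for the demonstration.

    #include <stdio.h>

    int main(void)
    {
        const float level0_data[] = { 0.0f, 1.0f,
                                      1.0f, 0.0f };
        const float level1_data[] = { 0.5f };
        MipLevel levels[2] = { { level0_data, 2 }, { level1_data, 1 } };

        float u = 0.3f, v = 0.7f;
        /* lod of exactly 0: the pixel is the size of one level-0 texel. */
        float tri = trilinear(levels, 2, u, v, 0.0f);
        float bi  = bilinear(&levels[0], u, v);
        printf("trilinear(lod=0) = %f, bilinear(level 0) = %f\n", tri, bi);
        return 0; /* both values print as 0.820000 */
    }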
Trilinear filtering still has weaknesses because the pixel is still assumed to take up a square area on the texture. In particular, when a texture is at a steep angle relative to the camera, detail can be lost because the pixel actually covers a narrow but long trapezoid: in the narrow direction, the pixel gets information from more texels than it actually covers (so details are smeared), and in the long direction it gets information from fewer texels than it actually covers (so details fall between pixels). To alleviate this, anisotropic ("direction dependent") filtering can be used.
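As a rough illustration of that idea, building on the trilinear sketch above, anisotropic filtering can be approximated by averaging several trilinear samples spaced along the longer axis of the pixel's footprint, with the level of detail chosen from the shorter axis so that detail along the long axis is preserved. The names and the simplistic sampling scheme here are illustrative; real hardware assembles the footprint in more sophisticated ways.

    /* Approximate anisotropic filtering: average several trilinear samples
       along the major (longer) axis of the pixel's footprint. du_major and
       dv_major span that axis in normalized texture coordinates; lod_minor
       is the level of detail computed from the shorter axis. */
    static float anisotropic(const MipLevel *levels, int level_count,
                             float u, float v,
                             float du_major, float dv_major,
                             float lod_minor, int samples)
    {
        float sum = 0.0f;
        for (int i = 0; i < samples; i++) {
            /* Evenly spaced offsets in [-0.5, +0.5] along the major axis. */
            float t = (i + 0.5f) / samples - 0.5f;
            sum += trilinear(levels, level_count,
                             u + t * du_major, v + t * dv_major, lod_minor);
        }
        return sum / samples;
    }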