The values of the kernel evaluated between a test set vector, x, and each of the training set vectors should be used as the feature vector for x.
Here are the pertinent lines from the libsvm readme:
New training instance for xi:
<label> 0:i 1:K(xi,x1) ... L:K(xi,xL)
New testing instance for any x:
<label> 0:? 1:K(x,x1) ... L:K(x,xL)
The libsvm readme is saying: if you have L training set vectors xi, with i in [1..L], and a test set vector, x, then the feature vector for x should be
<label of x> 0:<any number> 1:K(x^{test},x1^{train}) 2:K(x^{test},x2^{train}) ... L:K(x^{test},xL^{train})
where K(u,v) denotes the output of the kernel function with vectors u and v as its arguments.
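For concreteness, here is a minimal sketch (with made-up vectors, and assuming a linear kernel, K(u,v) = dot(u,v)) of what one precomputed test instance looks like in the sparse dict format that svmutil uses:

import numpy as np

x1, x2 = np.array([1., 2.]), np.array([3., 4.])   #L = 2 training vectors
x = np.array([5., 6.])                            #test vector
#index 0 is a placeholder for test instances; indices 1..L hold K(x,xi)
test_instance = {0: 0, 1: float(np.dot(x, x1)), 2: float(np.dot(x, x2))}
#-> {0: 0, 1: 17.0, 2: 39.0}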
I have included some example Python code below.
The results from the original feature vector representation and the precomputed (linear) kernel are not exactly the same, but this is probably due to differences in the optimization algorithm.
from svmutil import *
import numpy as np
#original example
y, x = svm_read_problem('.../heart_scale')
m = svm_train(y[:200], x[:200], '-c 4')
p_label, p_acc, p_val = svm_predict(y[200:], x[200:], m)
##############
#train the SVM using a precomputed linear kernel
#create dense data
max_key = max(max(v) for v in x)    #highest 1-based feature index in the sparse dicts
arr = np.zeros((len(x), max_key))
for row, vec in enumerate(x):
    for k, v in vec.items():        #use vec.iteritems() on Python 2
        arr[row, k-1] = v
x = arr
#create a linear kernel matrix with the training data
#column 0 holds the required instance serial numbers 1..200;
#columns 1..200 hold K(xi,xj)
K_train = np.zeros((200, 201))
K_train[:, 1:] = np.dot(x[:200], x[:200].T)
K_train[:, :1] = np.arange(200)[:, np.newaxis] + 1
m = svm_train(y[:200], [list(row) for row in K_train], '-c 4 -t 4')  #-t 4: precomputed kernel
#create a linear kernel matrix for the test data
#column 0 can be any value for test instances; serial numbers are used here
K_test = np.zeros((len(x) - 200, 201))
K_test[:, 1:] = np.dot(x[200:], x[:200].T)
K_test[:, :1] = np.arange(len(x) - 200)[:, np.newaxis] + 1
p_label, p_acc, p_val = svm_predict(y[200:], [list(row) for row in K_test], m)
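The same recipe works for any kernel; only the way the matrix entries are computed changes. For example, the two np.dot(...) lines above could be swapped out for an RBF kernel, K(u,v) = exp(-gamma*||u-v||^2). This is just a sketch, and gamma=0.5 is an arbitrary choice here, not a tuned value:

import numpy as np

def rbf_kernel_matrix(A, B, gamma):
    #K[i,j] = exp(-gamma * ||A[i]-B[j]||^2), computed via the expansion
    #||a-b||^2 = ||a||^2 + ||b||^2 - 2*a.b
    sq_dists = (np.sum(A**2, axis=1)[:, np.newaxis]
                + np.sum(B**2, axis=1)[np.newaxis, :]
                - 2.0 * np.dot(A, B.T))
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))  #clamp float-rounding error

#K_train[:, 1:] = rbf_kernel_matrix(x[:200], x[:200], gamma=0.5)
#K_test[:, 1:] = rbf_kernel_matrix(x[200:], x[:200], gamma=0.5)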