The solution found by the SVM is automatically restricted to the space spanned by the samples, so using PCA merely to get rid of dimensions with zero variance won't change the solution. And as damienfrancois wrote, if you reduce beyond that you risk destroying relevant information. To avoid that, you have two options:
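To see the first point numerically, here is a minimal sketch (assuming scikit-learn and a synthetic data set, neither of which is in the question): keeping all principal components with nonzero variance is just centering plus a rotation, which leaves the pairwise distances, and hence an RBF kernel matrix, unchanged, so the fitted SVM is the same.

```python
# Sketch (assumed setup: scikit-learn, synthetic data): keeping every principal
# component with nonzero variance does not change the SVM solution.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.svm import SVC

X, y = make_classification(n_samples=50, n_features=1000, random_state=0)

# After centering, 50 samples span at most 49 directions; keeping all of them
# discards only the zero-variance dimensions.
X_pca = PCA(n_components=X.shape[0] - 1).fit_transform(X)

# The RBF kernel depends only on pairwise distances, which centering plus a
# rotation preserves, so both fits give the same decision values.
raw = SVC(kernel="rbf", gamma=1e-3).fit(X, y)
red = SVC(kernel="rbf", gamma=1e-3).fit(X_pca, y)
print(np.allclose(raw.decision_function(X),
                  red.decision_function(X_pca), atol=1e-6))  # True
```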
1) Trust that structural risk minimization is not just an interesting theoretical concept, but also does the right thing for your application, and just use the data as they are.
2) Use a feature selection algorithm to find out which of the features / combinations of features are actually informative. However, finding the optimal combination of features is clearly not feasible for so many features, so you can probably just sort the features by individual performance (in the linear case: a t-score) and then test how many of the best features you need in order to get a good result (see the sketch below).
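As a sketch of that second option (assuming scikit-learn, a synthetic data set, and ANOVA F-scores, which for two classes are equivalent to a t-score ranking), you can simply cross-validate over the number of retained top features:

```python
# Sketch of option 2 (assumed setup: scikit-learn, synthetic data): rank the
# features by a univariate score and check how many of the best ones you need.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=100, n_features=2000,
                           n_informative=20, random_state=0)

for k in (5, 10, 20, 50, 200, 1000):
    # The selection step lives inside the pipeline, so within each CV fold the
    # ranking never sees the held-out data.
    model = make_pipeline(SelectKBest(f_classif, k=k),
                          LinearSVC(max_iter=10000))
    scores = cross_val_score(model, X, y, cv=5)
    print(f"top {k:4d} features: CV accuracy {scores.mean():.2f}")
```

Doing the selection inside the pipeline matters: ranking the features on the full data set before cross-validating would leak information from the test folds and give optimistic scores.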
Zaw Lin's comment is of course correct: you can always separate classes in such a high-dimensional space. But, equally of course, classifier performance should not be assessed on the training data but e.g. by cross-validation.
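For instance (same assumed scikit-learn setup as above), the training accuracy of an SVM on such data is typically perfect and therefore tells you nothing, while the cross-validated accuracy is the number to report:

```python
# Assumed setup (scikit-learn, synthetic data): in high dimensions the training
# accuracy is essentially meaningless; compare it with cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=100, n_features=5000,
                           n_informative=10, random_state=0)

clf = SVC(kernel="linear").fit(X, y)
print("training accuracy:", clf.score(X, y))              # typically close to 1.0
print("cross-validated accuracy:",
      cross_val_score(clf, X, y, cv=5).mean())            # noticeably lower
```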
Cover has shown that if the total number of training samples is less than twice the number of features, then there exists a hyperplane which can separate the training data perfectly, even if the two classes are generated by the same distribution (dtic.mil/dtic/tr/fulltext/u2/a229035.pdf, pg 46). – Java
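A quick way to see Cover's result in action (assuming scikit-learn, with both "classes" drawn from the same standard normal distribution) is to fix the number of samples, sweep the number of features, and watch an effectively hard-margin linear SVM start separating the training data perfectly once the dimension is large enough:

```python
# Assumed setup (numpy, scikit-learn): both classes come from the same
# distribution, yet the training data become perfectly separable once the
# number of features is large enough relative to the number of samples.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_samples = 40
y = np.repeat([0, 1], n_samples // 2)            # labels carry no information

for n_features in (5, 10, 20, 40, 80):
    X = rng.standard_normal((n_samples, n_features))
    clf = SVC(kernel="linear", C=1e6).fit(X, y)  # very large C ~ hard margin
    print(f"{n_features:3d} features: training accuracy {clf.score(X, y):.2f}")
```

Cross-validating any of these fits would of course give chance-level accuracy, which is exactly why the training score must not be used for assessment.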