The only difference between Normalized Device Coordinates (NDCS) and Clip Space (CCS) is that CCS is before the perspective divide and NDCS is after it. Clipping doesn't work well in NDCS because the perspective divide moves points behind the viewer to the front (w contains -z, so w is negative for such points), so triangles reaching behind the viewer would not be clipped correctly at the near plane.
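As a small numeric illustration of that sign flip, here is a sketch that assumes an OpenGL-style projection matrix with the camera looking down -z. The helper names `perspective`, `transform` and `perspective_divide` are made up for this example.

```python
def perspective(near, far, f=1.0, aspect=1.0):
    """OpenGL-style projection matrix (assumed convention); f = cot(fov_y / 2)."""
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],  # last row copies -z_view into w_clip
    ]

def transform(m, v):
    """Multiply a 4x4 row-major matrix with a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def perspective_divide(clip):
    """Clip space -> NDC: divide x, y, z by w."""
    x, y, z, w = clip
    return [x / w, y / w, z / w]

P = perspective(near=1.0, far=10.0)
in_front = transform(P, [1.0, 1.0, -5.0, 1.0])  # visible point, z_view < 0
behind = transform(P, [1.0, 1.0, 5.0, 1.0])     # behind the viewer, z_view > 0

print("in front: w =", in_front[3], "ndc =", perspective_divide(in_front))
print("behind:   w =", behind[3], "ndc =", perspective_divide(behind))
```

With these values the point in front gets a positive w and lands inside the cube. The point behind the viewer gets a negative w, so the divide flips the signs of its x and y and pushes its z beyond 1, i.e. past the far plane.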
Q: Where is the viewer in NDCS? In VCS, the viewer's location is the origin [0,0,0,1]. However, if I transform the origin with the perspective matrix, the result is weird: the homogeneous coordinate is not 1 but 0. How can we define the viewer's position in NDCS?
In NDCS and CCS there is no finite viewing point (and I'm not sure what the viewer has to do with clipping). Think of both systems as the view frustum warped into a cube (near and far plane having the same size). In NDCS, the visible area is [-1, 1] along each axis, whereas in CCS it is [-w, w]. Now think about the viewer: in view space, the viewer (the projection center) is the point where all rays going from a corner of the near plane to the corresponding corner of the far plane intersect. Once the frustum is warped into a cube, these rays are parallel and there is no intersection point anymore. This means the projection center is infinitely far away, which is described in projective space by vectors that have a homogeneous coordinate of 0.
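Both observations can be checked numerically. The sketch below assumes an OpenGL-style projection matrix (camera looking down -z, near = 1, far = 10, 90 degree field of view); `perspective` and `transform` are hypothetical helpers written for this example.

```python
def perspective(near, far, f=1.0, aspect=1.0):
    """OpenGL-style projection matrix (assumed convention); f = cot(fov_y / 2)."""
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def transform(m, v):
    """Multiply a 4x4 row-major matrix with a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

P = perspective(near=1.0, far=10.0)

# The viewer (origin of view space) ends up with w == 0: a point at infinity.
viewer = transform(P, [0.0, 0.0, 0.0, 1.0])
print("viewer in clip space:", viewer)

# Two points on one ray through the viewer: a near-plane corner and the
# matching far-plane corner of the frustum.
near_corner = transform(P, [1.0, 1.0, -1.0, 1.0])
far_corner = transform(P, [10.0, 10.0, -10.0, 1.0])
near_ndc = [c / near_corner[3] for c in near_corner[:3]]
far_ndc = [c / far_corner[3] for c in far_corner[:3]]
print("near corner in NDC:", near_ndc)
print("far corner in NDC: ", far_ndc)
```

The two corners share the same x and y in NDC and differ only in z, so the ray that passed through the viewer in view space runs parallel to the z axis of the cube.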
Q: However, a point with z > 0 always ends up with a z larger than 1 after the conversion, so it is also cut in NDCS. Am I wrong? If I'm wrong, can you give me an example?
You are basically right. But clipping doesn't happen at single points; it happens on the edges spanned between these points.
Let's assume we have a line going from a point inside the frustum (A) to a point behind the viewer (B). In this case clipping should happen at the near plane and the line should go from A to B' (the intersection of the line with the near plane).
If we performed the perspective divide first, then (as you noted) A would still be inside the frustum, but B would be mapped to a point behind the far plane. Clipping the line between those two points would give a line going from A to a point B'' on the far plane. Obviously we don't want a line leading away from the viewer when the original line ran toward the viewer and past it.
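The two outcomes can be compared with a few lines of code. This is only a sketch under assumed conventions (OpenGL-style projection, near = 1, far = 10, near-plane test z + w >= 0 in clip space); the helper names and the sample points A and B are made up for this example.

```python
def perspective(near, far, f=1.0, aspect=1.0):
    """OpenGL-style projection matrix (assumed convention); f = cot(fov_y / 2)."""
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]

def transform(m, v):
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def lerp(a, b, t):
    return [ai + t * (bi - ai) for ai, bi in zip(a, b)]

def divide(clip):
    return [c / clip[3] for c in clip[:3]]

P = perspective(near=1.0, far=10.0)
A = transform(P, [1.0, 1.0, -5.0, 1.0])  # inside the frustum
B = transform(P, [1.0, 1.0, 5.0, 1.0])   # behind the viewer

# Clipping in clip space against the near plane (z + w >= 0),
# assuming A is inside and B is outside.
da, db = A[2] + A[3], B[2] + B[3]
clipped_clip_space = divide(lerp(A, B, da / (da - db)))

# Naive variant: divide first, then clip against the far plane z_ndc <= 1.
a_ndc, b_ndc = divide(A), divide(B)
t = (1.0 - a_ndc[2]) / (b_ndc[2] - a_ndc[2])
clipped_after_divide = lerp(a_ndc, b_ndc, t)

print("clip-space clipping ends at z_ndc =", clipped_clip_space[2])
print("divide-first clipping ends at z_ndc =", clipped_after_divide[2])
```

Clipping in clip space ends the line on the near plane (z of about -1 in NDC), while dividing first ends it on the far plane (z of about 1), which is the wrong side of the frustum.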