I had a similar problem and could not solve it with wireshark/tshark options alone.
Below is my workaround for extracting raw JSON and XML from cap files.
# 1. Convert to PDML with the json and xml dissectors DISABLED, so the HTTP bodies stay as undissected raw data
tshark -r "wireshark.cap" -2 -R "http" --disable-protocol json --disable-protocol xml -V -T pdml > "wireshark.cap.pdml.xml"
# 2. Get the hex-encoded raw data from the media.type PDML element
# 3. Hex-decode it
I used a Groovy script for steps 2 and 3:
import groovy.xml.*
...
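// Decodes a hex string such as "7b7d" into text; returns null for null or empty input.
String hexDecode(String s) {
    if (!s) {
        return null
    }
    def res = new StringBuilder()
    // Two hex digits make one byte; a dangling odd digit at the end is ignored.
    for (int i = 0; i + 1 < s.length(); i += 2) {
        int hi = Character.digit(s.charAt(i), 16)
        int lo = Character.digit(s.charAt(i + 1), 16)
        // Each byte becomes one char, which is fine for ASCII/Latin-1 bodies.
        res.append((char) ((hi << 4) + lo))
    }
    return res.toString()
}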
...
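def xmlFile = new File("wireshark.cap.pdml.xml")
def pdml = new XmlParser().parse(xmlFile)
pdml.packet.each { packet ->
    // The undissected HTTP body lives in the "media" proto element, if the packet has one.
    def media = packet.proto.find { "media" == it.@name }
    def hex = media?.field?.find { "media.type" == it.@name }?.@value
    // raw is the decoded JSON/XML body, or null when the packet carries no media data.
    def raw = hexDecode(hex)
}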
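For trying out steps 2 and 3 without a capture file, here is a minimal self-contained sketch. It is not the script above, just an illustration: the PDML fragment is hand-written and much smaller than real tshark output, `hexToText` is a hypothetical helper name, and it assumes the bodies are UTF-8 so that Groovy's built-in `decodeHex()` can replace the manual loop.

```groovy
import groovy.xml.*

// Hypothetical, hand-written PDML fragment; real tshark output has many more elements.
def sample = '''<pdml>
  <packet>
    <proto name="http"/>
    <proto name="media">
      <field name="media.type" value="7b2261223a317d"/>
    </proto>
  </packet>
  <packet>
    <proto name="http"/>
  </packet>
</pdml>'''

// Hex string -> text, assuming the body is UTF-8 encoded (an assumption, not from the capture).
String hexToText(String hex) {
    hex ? new String(hex.decodeHex(), 'UTF-8') : null
}

// Step 2: pick the media.type value of each packet; step 3: hex-decode it.
def bodies = new XmlParser().parseText(sample).packet.collect { packet ->
    def media = packet.proto.find { it.@name == 'media' }
    hexToText(media?.field?.find { it.@name == 'media.type' }?.@value)
}.findAll { it != null }   // drop packets without a media element

bodies.each { println it }
```

Packets without a media element are simply skipped, so `bodies` ends up holding only the decoded payloads.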