I'm rewriting my P5 socket server in P6 using IO::Socket::Async, but the data received is truncated by 1 character at the end, and that 1 character is then received on the next connection. Someone from the Perl6 Facebook group (Jonathan Worthington) pointed out that this might be because strings and bytes are handled very differently in P6. Quoted:
In Perl 6, strings and bytes are handled very differently. Of note, strings work at grapheme level. When receiving Unicode data, it's not only possible that a multi-byte sequence will be split over packets, but also a multi-codepoint sequence. For example, one packet might have the letter "a" at the end, and the next one would be a combining acute accent. Therefore, it can't safely pass on the "a" until it's seen how the next packet starts.
My P6 is running on MoarVM.
    use Data::Dump;
    use experimental :pack;

    my $socket = IO::Socket::Async.listen('0.0.0.0', 7000);

    react {
        whenever $socket -> $conn {
            whenever $conn {
                say "Received --> " ~ $_;
                # reply with the checksum only for messages of at least 100 characters
                $conn.print: translate($_) if $_.chars >= 100;
                $conn.close;
            }
        }
        CATCH {
            default {
                say .^name, ': ', .Str;
                say "handled in $?LINE";
            }
        }
    }

    sub translate($raw) {
        my $data = $raw.trim;                         # remove leading/trailing whitespace
        my $minus_checksum = substr($data, 0, *-2);   # payload without the checksum
        my $our_checksum   = generateChecksum($minus_checksum);
        my $data_checksum  = substr($data, *-2);      # checksum sent with the data
        return $our_checksum;
    }

    sub generateChecksum($minus_checksum) {
        # turn the string into a Blob
        my Blob $blob = $minus_checksum.encode('utf-8');
        # XOR all byte values together
        my $dec = [+^] $blob.unpack("C*");
        # format as two uppercase hex digits
        return sprintf('%02X', $dec);
    }
Result:
Received --> $$0116AA861013034151986|10001000181123062657411200000000000010235444112500000000.600000000345.4335N10058.8249E00015
Received --> 0
Received --> $$0116AA861013037849727|1080100018112114435541120000000000000FBA00D5122500000000.600000000623.9080N10007.8627E00075
Received --> D
Received --> $$0108AA863835028447675|18804000181121183810421100002A300000100900000000.700000000314.8717N10125.6499E00022
Received --> 7
Received --> $$0108AA863835028447675|18804000181121183810421100002A300000100900000000.700000000314.8717N10125.6499E00022
Received --> 7
Received --> $$0108AA863835028447675|18804000181121183810421100002A300000100900000000.700000000314.8717N10125.6499E00022
Received --> 7
Received --> $$0108AA863835028447675|18804000181121183810421100002A300000100900000000.700000000314.8717N10125.6499E00022
Received --> 7
"IO::Socket::Async truncates data". Actually, P6 is helping you avoid corrupting data. P6 keeps the dev's choice clear: bytes or characters. EITHER you use :bin, so the data is a sequence of bytes and the unit of transfer is a byte, OR the data is text, a sequence of "what a user thinks of as a character", so the logical unit of transfer is one character at a time. Thus P6 buffers bytes to ensure it only delivers a whole character when it's known to be complete. This buffering is a consequence of Unicode's design. – Joktan
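To make the :bin option concrete, here is a minimal sketch of the receiving side reading bytes instead of characters. The names `bytes-to-text` and `serve` are hypothetical helpers introduced only for illustration, and decoding as latin-1 is an assumption that fits the ASCII-looking payloads shown in the question; it is not necessarily how the original server should handle its data.

```raku
# Hypothetical helper: turn one received chunk of bytes into text.
# Assumption: the payload is single-byte text, so latin-1 maps each
# byte to exactly one character and the whole chunk is decoded at once.
sub bytes-to-text(Blob $chunk --> Str) {
    $chunk.decode('latin-1')
}

# Hypothetical wrapper; the host and port defaults are taken from the question.
sub serve(Str $host = '0.0.0.0', Int $port = 7000) {
    react {
        whenever IO::Socket::Async.listen($host, $port) -> $conn {
            # :bin makes the supply emit Blobs rather than decoded strings
            whenever $conn.Supply(:bin) -> $chunk {
                say "Received --> " ~ bytes-to-text($chunk);
                $conn.close;
            }
        }
    }
}
```

Calling `serve()` starts the listener. Note that with bytes a single message can still arrive split across several chunks, so a real server would probably need to accumulate chunks until it sees a complete message before computing a checksum.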