set of WideChar: Sets may have at most 256 elements

I have this line:

const
  MY_SET: set of WideChar = [WideChar('A')..WideChar('Z')];

The above does not compile, with error:

[Error] Sets may have at most 256 elements

But this line does compile ok:

var WS: WideString;
if WS[1] in [WideChar('A')..WideChar('Z')] then...

And this also compiles ok:

const
  MY_SET = [WideChar('A')..WideChar('Z'), WideChar('a')..WideChar('z')];
  ...
  if WS[1] in MY_SET then...

Why is that?

EDIT: My question is: why does if WS[1] in [WideChar('A')..WideChar('Z')] compile, and why does MY_SET = [WideChar('A')..WideChar('Z'), WideChar('a')..WideChar('z')]; compile? Don't the set rules apply to them as well?

Lamplighter answered 21/6, 2016 at 5:31 Comment(4)
The second code has only 26 elements. It is much simpler to use >= and <= here. Do note that your code does not acknowledge non-English characters. - Pyxidium
@David, doesn't the first code also have 26 elements? "Do note that your code does not acknowledge non English characters." I need to check for valid ISO chars; only English characters are valid. - Lamplighter
As long as the elements themselves are below 256, the second expression is valid. The first expression declares a set type with more than 256 possible elements (set of WideChar). - Isomeric
The value in the first example has 26 elements, but the type has more. Don't use sets here. Use inequality operators. - Pyxidium

A valid set has to obey two rules:

  1. Each element in a set must have an ordinal value less than 256.
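  2. The set must not have more than 256 elements.

MY_SET: set of WideChar = [WideChar('A')..WideChar('Z')];

Here you declare a set type (set of WideChar) whose base type has 65536 possible values, far more than 256, so the compiler reports an error regardless of which elements the constant actually contains.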

if WS[1] in [WideChar('A')..WideChar('Z')]

Here, the compiler sees WideChar('A') as an ordinal value. This value and all other values in the set are below 256. This is ok with rule 1.

The number of unique elements is also within the limit (Ord('Z') - Ord('A') + 1 = 26), so the second rule is satisfied as well.

MY_SET = [WideChar('A')..WideChar('Z'), WideChar('a')..WideChar('z')];

Here you declare a set that also fulfills the requirements as above. Note that the compiler sees this as a set of ordinal values, not as a set of WideChar.
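Rule 1 also suggests a way to test an arbitrary WideChar against a byte-sized set without relying on how the compiler narrows the value. The following is only a sketch: the names TByteCharSet, ASCII_UPPER and WideCharInSet are hypothetical, and it assumes the default short-circuit Boolean evaluation, so the narrowing cast is never reached for characters above #$00FF.

```pascal
program NarrowBeforeSetTest;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical alias: a set whose base type has exactly 256 values.
  TByteCharSet = set of AnsiChar;

const
  ASCII_UPPER: TByteCharSet = ['A'..'Z'];

// Returns True only when C fits in a single byte and that byte is in S.
// Characters above #$00FF are rejected before any narrowing takes place.
function WideCharInSet(C: WideChar; const S: TByteCharSet): Boolean;
begin
  Result := (Ord(C) <= 255) and (AnsiChar(Ord(C)) in S);
end;

begin
  WriteLn(WideCharInSet(WideChar('Q'), ASCII_UPPER));   // TRUE
  // #$0141 has the low byte $41 ('A'), but it is not an ASCII capital.
  WriteLn(WideCharInSet(WideChar($0141), ASCII_UPPER)); // FALSE
end.
```

The explicit range check keeps a character such as #$0141 from being mistaken for 'A' if only its low byte were compared.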

Isomeric answered 21/6, 2016 at 8:21 Comment(5)
Interesting. In the second case (if WS[1] in ...), how does it do the comparison? - Garboard
@TomBrunberg, there is no difference, except that the compiler uses the size of the left operand when comparing it with a byte-sized set element. - Isomeric
@Lamplighter So the error is in the MY_SET: set of WideChar part, not in the [WideChar('A')..WideChar('Z')] part. - Mcfarlin
@JanDoggen, yes. I realize that now. - Lamplighter
I must have misread your initial answer as saying that any WideChar would be converted to a valid range for a set, but of course you spoke only about WideChars up to #$00FF. Sorry for the noise. Great summary, by the way. - Garboard

A set can have no more than 256 elements.
Even with so few elements the set already uses 32 bytes.

From the documentation:

A set is a bit array where each bit indicates whether an element is in the set or not. The maximum number of elements in a set is 256, so a set never occupies more than 32 bytes. The number of bytes occupied by a particular set is equal to

(Max div 8) - (Min div 8) + 1

For this reason, only sets of Byte, (Ansi)Char, Boolean and enumerations with at most 256 elements are possible.
Because a WideChar occupies 2 bytes, it has 65536 possible values.
A set of WideChar would take up 8 KB (65536 bits), too large to be practical.
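
The documentation formula can be checked with a little arithmetic. The sketch below only evaluates the formula for three base-type ranges; the function name SetByteSize is made up for illustration, and the last call is a hypothetical case, since a set of WideChar cannot actually be declared.

```pascal
program SetSizeFormula;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Bytes needed for a set whose base type runs from MinOrd to MaxOrd,
// using the documentation formula (Max div 8) - (Min div 8) + 1.
function SetByteSize(MinOrd, MaxOrd: Integer): Integer;
begin
  Result := (MaxOrd div 8) - (MinOrd div 8) + 1;
end;

begin
  WriteLn(SetByteSize(Ord('A'), Ord('Z'))); // 'A'..'Z' is 65..90: 11 - 8 + 1 = 4
  WriteLn(SetByteSize(0, 255));             // Byte or AnsiChar: 31 - 0 + 1 = 32
  WriteLn(SetByteSize(0, 65535));           // hypothetical set of WideChar: 8192
end.
```

The three results, 4, 32 and 8192 bytes, match the 32-byte maximum and the 8 KB figure mentioned above.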

type
  Capitals = 'A'..'Z';

const
  MY_SET: set of Capitals = [WideChar('A')..WideChar('Z')];

This compiles and works the same way.

It does seem a bit silly to use WideChar if your code ignores Unicode.
As written, only the English capitals are recognized; other locales are not taken into account.

In this case it would be better to use code like

if (AWideChar >= 'A') and (AWideChar <= 'Z') ....

That will work no matter how many chars fall in between.
Obviously you can encapsulate this in a function to save on typing.
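
A minimal sketch of such a function follows. The name IsAsciiUpper and the sample string are illustrative assumptions, not part of the question.

```pascal
program AsciiUpperCheck;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: True when C is one of the 26 letters 'A'..'Z'.
// Plain comparisons work for any WideChar value, so no set is involved.
function IsAsciiUpper(C: WideChar): Boolean;
begin
  Result := (C >= 'A') and (C <= 'Z');
end;

var
  WS: WideString;
begin
  WS := 'Hello';
  // Guard against an empty string before indexing WS[1].
  if (Length(WS) > 0) and IsAsciiUpper(WS[1]) then
    WriteLn('first character is an ASCII capital')
  else
    WriteLn('first character is something else');
end.
```

A second range for 'a'..'z' can be added with an or clause if lowercase letters should be accepted as well.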

If you insist on having large sets, see this answer: https://mcmap.net/q/1251560/-working-with-unicode-strings-in-delphi-7

Urinary answered 21/6, 2016 at 6:34 Comment(2)
But why does if WS[1] in [WideChar('A')..WideChar('Z')] compile? And why does MY_SET = [WideChar('A')..WideChar('Z'), WideChar('a')..WideChar('z')]; compile? Don't the set rules apply to them as well? What is the difference? - Lamplighter
The compiler converts WideChar('A') to an ordinal value, which is less than 256. If you put in a value larger than that, the compiler will complain. So the compiler has a valid set with Ord('Z') - Ord('A') + 1 elements. - Isomeric
