I need to extract numbers from a string and put them into a list. There are some rules to this, however, such as identifying whether each extracted number is an Integer or a Float.
The task sounds simple enough, but I am finding myself more and more confused as time goes by and could really do with some guidance.
Take the following test string as an example:
There are test values: P7 45.826.53.91.7, .5, 66.. 4 and 5.40.3.
The rules to follow when parsing the string are as follows:
- Numbers cannot be preceded by a letter.
- If a number is found and is not followed by a decimal point, then the number is an Integer.
- If a number is found and is followed by a decimal point, then the number is a Float, e.g. 5.
  - If more digits follow the decimal point, then the number is still a Float, e.g. 5.40
  - A further decimal point should then break up the number, e.g. 5.40.3. becomes (5.40 Float) and (3. Float)
- If something other than a digit, for example a letter, follows the decimal point, e.g. 3.H, then still add 3. as a Float to the list (even if technically it is not valid).
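For reference, the rules above can be read as a single left-to-right scan over the string. The following is only a rough sketch under some assumptions, not a definitive solution: the names `ExtractNumbers`, `IsDigit` and `IsLetter` are made up for illustration, lowercase letters are treated as letters too (Example 2 seems to require this), and when a digit run is preceded by a letter the whole run is dropped. It is written as a console program so that it avoids form code and should build in both Lazarus and Delphi.

```pascal
program ExtractNumbersDemo;

{$IFDEF FPC}{$mode objfpc}{$H+}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

// Hypothetical helper: True for the characters '0'..'9'.
function IsDigit(C: Char): Boolean;
begin
  Result := (C >= '0') and (C <= '9');
end;

// Hypothetical helper: True for ASCII letters (assumption: both cases count).
function IsLetter(C: Char): Boolean;
begin
  Result := ((C >= 'A') and (C <= 'Z')) or ((C >= 'a') and (C <= 'z'));
end;

// Hypothetical routine: adds one 'value (Integer)' or 'value (Float)' line per number.
procedure ExtractNumbers(const S: string; Items: TStrings);
var
  I, Start, Len: Integer;
  Kind: string;
begin
  Len := Length(S);
  I := 1;
  while I <= Len do
  begin
    // Anything that is not a digit (spaces, commas, separator dots) is skipped.
    if not IsDigit(S[I]) then
    begin
      Inc(I);
      Continue;
    end;

    // Consume the whole run of digits.
    Start := I;
    while (I <= Len) and IsDigit(S[I]) do
      Inc(I);

    // Rule: numbers cannot be preceded by a letter
    // (assumption: the entire digit run is discarded).
    if (Start > 1) and IsLetter(S[Start - 1]) then
      Continue;

    Kind := 'Integer';
    // Rule: a decimal point makes it a Float, with or without digits after it.
    if (I <= Len) and (S[I] = '.') then
    begin
      Kind := 'Float';
      Inc(I);
      while (I <= Len) and IsDigit(S[I]) do
        Inc(I);
    end;

    // A further '.' is left in place; the next pass skips it as a separator.
    Items.Add(Copy(S, Start, I - Start) + ' (' + Kind + ')');
  end;
end;

var
  L: TStringList;
  I: Integer;
begin
  L := TStringList.Create;
  try
    ExtractNumbers('P7 45.826.53.91.7, .5, 66.. 4 and 5.40.3.', L);
    for I := 0 to L.Count - 1 do
      Writeln(L[I]);
  finally
    L.Free;
  end;
end.
```

Traced by hand against the test string, this should list 45.826 (Float), 53.91 (Float), 7 (Integer), 5 (Integer), 66. (Float), 4 (Integer), 5.40 (Float) and 3. (Float). The trailing point is kept in the value with no space before it. In a form, `lstActualOutput.Items` could be passed as the `Items` argument instead of a `TStringList`.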
Example 1:
To make this a little clearer, taking the test string quoted above, the desired output should be as follows:
In the image above, light blue illustrates Float numbers and pale red illustrates single Integers (but note also how Floats joined together are split into separate Floats).
- 45.826 (Float)
- 53.91 (Float)
- 7 (Integer)
- 5 (Integer)
- 66 . (Float)
- 4 (Integer)
- 5.40 (Float)
- 3 . (Float)
Note there are deliberate spaces between 66 . and 3 . above due to the way the numbers were formatted.
Example 2:
Anoth3r Te5.t string .4 abc 8.1Q 123.45.67.8.9
- 4 (Integer)
- 8.1 (Float)
- 123.45 (Float)
- 67.8 (Float)
- 9 (Integer)
To give a better idea, I created a new project whilst testing which looks like this:
Now onto the actual task. I thought maybe I could read each character from the string and identify what are valid numbers as per the rules above, and then pull them into a list.
To the best of my ability, this was all I could manage:
The code is as follows:
unit Unit1;

{$mode objfpc}{$H+}

interface

uses
  Classes, SysUtils, FileUtil, Forms, Controls, Graphics, Dialogs, StdCtrls;

type
  TForm1 = class(TForm)
    btnParseString: TButton;
    edtTestString: TEdit;
    Label1: TLabel;
    Label2: TLabel;
    Label3: TLabel;
    lstDesiredOutput: TListBox;
    lstActualOutput: TListBox;
    procedure btnParseStringClick(Sender: TObject);
  private
    FDone: Boolean;
    FIdx: Integer;
    procedure ParseString(const Str: string; var OutValue, OutKind: string);
  public
    { public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.lfm}

{ TForm1 }

procedure TForm1.ParseString(const Str: string; var OutValue, OutKind: string);
var
  CH1, CH2: Char;
begin
  Inc(FIdx);
  CH1 := Str[FIdx];

  case CH1 of
    '0'..'9': // Found a number
    begin
      CH2 := Str[FIdx - 1];
      if not (CH2 in ['A'..'Z']) then
      begin
        OutKind := 'Integer';

        // Try to determine float...
        //while (CH1 in ['0'..'9', '.']) do
        //begin
        //  case Str[FIdx] of
        //    '.':
        //    begin
        //      CH2 := Str[FIdx + 1];
        //      if not (CH2 in ['0'..'9']) then
        //      begin
        //        OutKind := 'Float';
        //        //Inc(FIdx);
        //      end;
        //    end;
        //  end;
        //end;
      end;
      OutValue := Str[FIdx];
    end;
  end;

  FDone := FIdx = Length(Str);
end;

procedure TForm1.btnParseStringClick(Sender: TObject);
var
  S, SKind: string;
begin
  lstActualOutput.Items.Clear;
  FDone := False;
  FIdx := 0;

  repeat
    ParseString(edtTestString.Text, S, SKind);
    if (S <> '') and (SKind <> '') then
    begin
      lstActualOutput.Items.Add(S + ' (' + SKind + ')');
    end;
  until
    FDone = True;
end;

end.
It clearly doesn't give the desired output (failed code has been commented out), and my approach is likely wrong, but I feel I only need to make a few changes here and there for a working solution.
At this point I am rather confused and quite lost, despite thinking the answer is close. The task is becoming increasingly infuriating and I would really appreciate some help.
EDIT 1
Here I got a little closer, as there are no longer duplicate numbers, but the result is still clearly wrong.
unit Unit1;

{$mode objfpc}{$H+}

interface

uses
  Classes, SysUtils, FileUtil, Forms, Controls, Graphics, Dialogs, StdCtrls;

type
  TForm1 = class(TForm)
    btnParseString: TButton;
    edtTestString: TEdit;
    Label1: TLabel;
    Label2: TLabel;
    Label3: TLabel;
    lstDesiredOutput: TListBox;
    lstActualOutput: TListBox;
    procedure btnParseStringClick(Sender: TObject);
  private
    FDone: Boolean;
    FIdx: Integer;
    procedure ParseString(const Str: string; var OutValue, OutKind: string);
  public
    { public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.lfm}

{ TForm1 }

// Prepare to pull hair out!
procedure TForm1.ParseString(const Str: string; var OutValue, OutKind: string);
var
  CH1, CH2: Char;
begin
  Inc(FIdx);
  CH1 := Str[FIdx];

  case CH1 of
    '0'..'9': // Found the start of a new number
    begin
      CH1 := Str[FIdx];

      // make sure previous character is not a letter
      CH2 := Str[FIdx - 1];
      if not (CH2 in ['A'..'Z']) then
      begin
        OutKind := 'Integer';

        // Try to determine float...
        //while (CH1 in ['0'..'9', '.']) do
        //begin
        //  OutKind := 'Float';
        //  case Str[FIdx] of
        //    '.':
        //    begin
        //      CH2 := Str[FIdx + 1];
        //      if not (CH2 in ['0'..'9']) then
        //      begin
        //        OutKind := 'Float';
        //        Break;
        //      end;
        //    end;
        //  end;
        //  Inc(FIdx);
        //  CH1 := Str[FIdx];
        //end;
      end;
      OutValue := Str[FIdx];
    end;
  end;

  OutValue := Str[FIdx];
  FDone := Str[FIdx] = #0;
end;

procedure TForm1.btnParseStringClick(Sender: TObject);
var
  S, SKind: string;
begin
  lstActualOutput.Items.Clear;
  FDone := False;
  FIdx := 0;

  repeat
    ParseString(edtTestString.Text, S, SKind);
    if (S <> '') and (SKind <> '') then
    begin
      lstActualOutput.Items.Add(S + ' (' + SKind + ')');
    end;
  until
    FDone = True;
end;

end.
My question is: how can I extract numbers from a string, add them to a list, and determine whether each number is an Integer or a Float?
The left pale green listbox (desired output) shows what the results should be; the right pale blue listbox (actual output) shows what we actually got.
Please advise. Thanks.
Note: I re-added the Delphi tag as I do use XE7, so please don't remove it. Although this particular problem is in Lazarus, my eventual solution should work for both XE7 and Lazarus.
System.Masks.MatchesMask function. I didn't try, but this could maybe help you. – Abbieabbot
123.45.6 should be split into two results: the first is a Float (123.45) and the second is an Integer (6). If the example, however, is 123.45.6.7, then the split would be Float (123.45) and Float (6.7). – Lobito
45.826.53.91.7 would be 45.826 and 53.91 (a float cannot have more than one decimal). So you should be visualising the numbers like so: |45.826|53.91|7| with the first two broken-down numbers being identified as floats and the remaining 7 a single Integer. This is because that particular number is continuous without spaces or letters, just numbers and decimal points. – Lobito
12.34 if there is a space after the 2 (12 .34) then you have two integers (12 and 34). The decimal points keep it continuous. – Lobito
. to perform double-duty as both decimal-point and item separator you turn what could have been an interesting parsing exercise into an unrealistic problem that you're unlikely to learn anything useful from. Most important lesson: Don't overcomplicate things. – Lessee