RegExpException

Developer
Dec 10, 2012 at 7:14 PM

I've noticed that on pretty much all evaluations the system is spitting back a RegExpException "invalid repeat count" when adding patterns to the Expression Tokenizer.

This is occurring in the sample app as well.

I checked a couple of the expressions against the VB equivalents in the original "Flee" lib and those errors don't occur there.

Developer
Dec 10, 2012 at 7:59 PM

It looks like the following patterns aren't valid:

String Literal

Char Literal

Timespan

Real Numbers

The conversion from the VB version of Flee looks correct, but the patterns won't parse. I'm still investigating the correct translations (I'm horrible at regex).

Coordinator
Dec 11, 2012 at 1:29 PM

Hey,

 

Yes, I am aware of that. Flee VB used an old Grammatica version, which only had a slow regex parser. This new version has two implementations: a fast one and a slow one.

The fast one fails to parse these expressions, but the exception is caught and it falls back to the slow one.

I didn't get time to translate the expressions so that they would work with the fast parser, as I am quite horrible at regex as well.

You are more than welcome to submit a fix.
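
The fallback behaviour described above can be sketched roughly as follows. This is only an illustration, not Grammatica's actual code: the class name, the `CreateFast` / `CreateSlow` / `Create` helpers, and the use of `FormatException` are all hypothetical, and both "engines" are simulated with .NET's `System.Text.RegularExpressions.Regex`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatcherFallbackSketch
{
    // Hypothetical "fast" engine: refuses counted repeats such as {4} or {1,7}.
    public static Func<string, bool> CreateFast(string pattern)
    {
        if (Regex.IsMatch(pattern, @"\{\d+(,\d*)?\}"))
        {
            throw new FormatException("invalid repeat count");
        }
        var regex = new Regex(@"\A(?:" + pattern + @")\z");
        return input => regex.IsMatch(input);
    }

    // Hypothetical "slow" engine: accepts everything .NET Regex accepts.
    public static Func<string, bool> CreateSlow(string pattern)
    {
        var regex = new Regex(@"\A(?:" + pattern + @")\z");
        return input => regex.IsMatch(input);
    }

    // Try the fast engine first; if it rejects the pattern, fall back to the slow one.
    public static Func<string, bool> Create(string pattern, out bool usedFallback)
    {
        try
        {
            usedFallback = false;
            return CreateFast(pattern);
        }
        catch (FormatException)
        {
            usedFallback = true;
            return CreateSlow(pattern);
        }
    }

    public static void Main()
    {
        bool fellBack;
        var matcher = Create("[0-9a-f]{4}", out fellBack);
        Console.WriteLine("fell back: " + fellBack + ", matches beef: " + matcher("beef"));
    }
}
```

The point of the sketch is that the exception is not fatal: the pattern still works, it just runs on the slower path until it is rewritten in a form the fast engine accepts.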

Developer
Dec 14, 2012 at 5:48 PM
Edited Dec 14, 2012 at 6:00 PM

The server won't let me connect to submit a fix, but the following works for REAL / STRING / CHAR.

Still looking at the timespan.  

The new engine apparently only supports repeat patterns of 1 or all. Ranges and fixed counts of more than one ({1,7} / {2}) are not supported.

Expression.grammar:

REAL                 = <<(\d+)?[.]\d+([eE][+-]\d+)?(d|f|m)?>>
STRING_LITERAL         = <<"([^"\r\n\\]|\\u[0-9a-f][0-9a-f][0-9a-f][0-9a-f]|\\[\\"'trn])*">>
CHAR_LITERAL         = <<'([^'\r\n\\]|\\u[0-9a-f][0-9a-f][0-9a-f][0-9a-f]|\\[\\"'trn])'>>

 

ExpressionTokenizer.cs:

 

        var customRealPattern = new RealPattern();
        customRealPattern.Initialize((int)ExpressionConstants.REAL,
                                     "REAL",
                                     TokenPattern.PatternType.REGEXP,
                                     "(\\d+)?[.]\\d+([eE][+-]\\d+)?(d|f|m)?",
                                     _expressionContext);
        AddPattern(customRealPattern);

        pattern = new TokenPattern((int) ExpressionConstants.STRING_LITERAL,
                                   "STRING_LITERAL",
                                   TokenPattern.PatternType.REGEXP,
                                   "\"([^\"\\r\\n\\\\]|\\\\u[0-9a-f][0-9a-f][0-9a-f][0-9a-f]|\\\\[\\\\\"'trn])*\"");
        AddPattern(pattern);

        pattern = new TokenPattern((int) ExpressionConstants.CHAR_LITERAL,
                                   "CHAR_LITERAL",
                                   TokenPattern.PatternType.REGEXP,
                                   "'([^\"\\r\\n\\\\]|\\\\u[0-9a-f][0-9a-f][0-9a-f][0-9a-f]|\\\\[\\\\\"'trn])*'");
        AddPattern(pattern);
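
The rewrite used above (spelling out a counted repeat as literal copies) can be sketched as a small helper. This is a hedged illustration only: `Repeat`, `RepeatRange`, and `FullMatch` are hypothetical names, and .NET's `Regex` is used here merely as a reference engine to check that the counted and expanded forms accept the same input. Whether the range expansion is accepted by Grammatica's fast engine (for example for the timespan pattern) was not confirmed in this thread.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RepeatExpansionSketch
{
    // Expands atom{count} into count literal copies of the atom.
    // The atom must be a single unit, such as a character class or an escape.
    public static string Repeat(string atom, int count)
    {
        return string.Concat(Enumerable.Repeat(atom, count));
    }

    // Expands atom{min,max} into min required copies followed by
    // (max - min) optional copies, e.g. \d{1,3} becomes \d\d?\d?
    public static string RepeatRange(string atom, int min, int max)
    {
        return Repeat(atom, min) + Repeat(atom + "?", max - min);
    }

    // True only when the whole input matches the pattern.
    public static bool FullMatch(string pattern, string input)
    {
        return Regex.IsMatch(input, @"\A(?:" + pattern + @")\z");
    }

    public static void Main()
    {
        string counted = @"\\u[0-9a-f]{4}";
        string expanded = @"\\u" + Repeat("[0-9a-f]", 4);

        // Both forms should agree on every sample.
        foreach (var sample in new[] { @"\u00e9", @"\u00", @"\u00e9f" })
        {
            Console.WriteLine(sample + ": " + FullMatch(counted, sample)
                              + " / " + FullMatch(expanded, sample));
        }

        Console.WriteLine(RepeatRange(@"\d", 1, 3));
    }
}
```

Expanding by hand keeps the pattern within the "1 or all" style of repeat, at the cost of longer and less readable expressions.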

 

Coordinator
Dec 16, 2012 at 9:26 AM
Edited Dec 16, 2012 at 12:12 PM

Hey! That's great! :) Thanks for the fix.

I have added you as a developer, so now you can connect to the server and submit fixes.

Feb 14, 2013 at 4:31 PM
I have downloaded the latest source code (version 16329) and one of the unit tests is failing: TestValidExpressions.
Console Output:"
Testing: ValidExpressions.txt
Failed line: char;'"';"

I have checked the regular expressions used in ExpressionTokenizer and this code seems to solve the problem:
pattern = new TokenPattern((int)ExpressionConstants.CHAR_LITERAL,
                           "CHAR_LITERAL",
                           TokenPattern.PatternType.REGEXP,
                           "'([^'\\r\\n\\\\]|\\\\u[0-9a-f][0-9a-f][0-9a-f][0-9a-f]|\\\\[\\\\\"'trn])*'");
Is anybody else having a similar experience with that version?
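
To see why the failing test line `'"'` is affected, the two CHAR_LITERAL strings from this thread can be compared outside the tokenizer. The following is a minimal sketch that assumes .NET's `Regex` treats these particular patterns the same way Grammatica does, which is an assumption; the class and helper names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharLiteralSketch
{
    // Pattern as posted earlier in the thread: the class excludes the double quote.
    public const string OldPattern =
        "'([^\"\\r\\n\\\\]|\\\\u[0-9a-f][0-9a-f][0-9a-f][0-9a-f]|\\\\[\\\\\"'trn])*'";

    // Pattern from this post: the class excludes the single quote instead.
    public const string NewPattern =
        "'([^'\\r\\n\\\\]|\\\\u[0-9a-f][0-9a-f][0-9a-f][0-9a-f]|\\\\[\\\\\"'trn])*'";

    // True only when the whole input matches the pattern.
    public static bool FullMatch(string pattern, string input)
    {
        return Regex.IsMatch(input, "\\A(?:" + pattern + ")\\z");
    }

    public static void Main()
    {
        string literal = "'\"'"; // the three characters: ' " '

        Console.WriteLine("old pattern: " + FullMatch(OldPattern, literal));
        Console.WriteLine("new pattern: " + FullMatch(NewPattern, literal));
    }
}
```

With the old pattern the double quote inside the literal is excluded by the character class, so nothing can be consumed between the two single quotes and the match fails. Excluding the single quote instead lets the double quote through.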
Coordinator
Feb 14, 2013 at 7:51 PM
You are absolutely right!
There was a mistake in the char pattern causing the test to fail.
I have replaced the code and tested it. It doesn't seem to break anything and it is working correctly. Checked in.

Thank you! :)