Doing RegEx in D3

comp.databases.pick


Tony Gravagno
 

Re: Doing RegEx in D3 - 12-11-2008, 02:49 PM






Marshall wrote:
Quote:
The only thing with using a C/C++ or any other external method is
portability. It loses the ability to be used across multiple
platforms. I am currently on D3, but use Universe, jBase, etc. and
would need a way to make it work on all.

You're missing the bridge concept. The C code only links into D3.
It's tiny and has no utility of its own. Its only purpose is to
bridge to your real code, which lives in some external cross-platform
module. Your common, cross-platform RegEx code can be linked into
other MV platforms through the same sort of bridge, but obviously not
exactly the same bridge code. For example, with jBASE you might use
CALLC, CALLDOTNET, or CALLJAVA, which dynamically link external code.
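As a rough illustration of the bridge idea above, here is a minimal sketch in Python, which stands in for whatever language the shared code is really written in. Everything in it is hypothetical: `core_match`, the two bridge functions, and the calling conventions they assume are invented for illustration and are not D3 or jBASE APIs. The point is only that each bridge does argument marshalling and nothing else.

```python
import re

AM = "\xfe"  # attribute mark (character 254) used in MV dynamic arrays


def core_match(pattern, subject):
    """Shared cross-platform logic; stands in for the real RegEx code."""
    return 1 if re.search(pattern, subject) else 0


def bridge_packed(packed):
    """Hypothetical bridge for a platform that passes one packed string.

    Assumes the caller sends pattern and subject separated by an
    attribute mark and expects a string result.
    """
    pattern, subject = packed.split(AM, 1)
    return str(core_match(pattern, subject))


def bridge_two_args(pattern, subject):
    """Hypothetical bridge for a platform that passes arguments separately."""
    return core_match(pattern, subject)


print(bridge_packed("[0-9]+" + AM + "invoice 42"))
print(bridge_two_args("[0-9]+", "no digits here"))
```

Only the bridge functions would differ per platform; `core_match` stays the same everywhere.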

Now that I'm thinking about this, there is a way to call common
ActiveX code from D3 these days. It's too proprietary, so I've never
used it, and of course it's D3NT-only, but I recommend you look into
it and maybe check with TL Support to see if there is an AIX
equivalent. You might be a few lines of code away from your goal,
and not C code either.

Quote:
Plus, the real gains in
RegEx speed are in the RegEx compiler (compile it once, run it many
times). I would have to have a way to compile the RegEx and return it
to D3/Universe/etc. for use later otherwise it has to recompile each
time and that would really slow some of them down.

I understand the requirement for some sort of persistence. As a
linked module, your external code should be able to store some sort of
state, or you can return state info back to your BASIC call as part of
the result, and pass that state back into the next call rather than a
fresh pattern. It's the same thing we'd do with any compiled regex.

Interesting stuff. Good luck.
T

Quote:
I have a very basic Thompson RegEx engine working now, written in
BASIC. I can do pretty much all of the simple stuff except {m,n} and
I haven't implemented any POSIX or Perl classes yet like [:digit:]
and such. I can do:
ABC
.*ABC
ABC.*
A+B*
[0-9]+
[\d]+
etc.

If anyone is interested I would share the code. It's pretty
straightforward: it byte-compiles the regex, then runs the bytecode
against a string for matching. That is all. I have implemented: .,
\d, \w and \s, +, *, [], (), | but that is all. The () are not a real
group either, just a way to keep an alternation (|) together; they
will not return the text matched within the () or anything.
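For readers who have not seen the compile-then-run approach described above, here is a minimal sketch in Python. It is not Marshall's BASIC code: the instruction names (`char`, `split`, `jmp`, `match`), the relative jump offsets, and the function names are assumptions made for illustration. It covers literals, `.`, `\d`, `\w`, `\s`, `+`, `*`, `()` and `|`, leaves out `[]` for brevity, and matches the whole string, so `.*` is needed to match "anywhere". The runner keeps a list of active program counters (a Thompson-style simulation) instead of backtracking.

```python
CLASSES = {
    "d": str.isdigit,
    "w": lambda c: c.isalnum() or c == "_",
    "s": str.isspace,
}


def literal(ch):
    return lambda c: c == ch


def compile_regex(pattern):
    """Compile pattern to a list of instructions with relative jump offsets."""
    pos = 0

    def peek():
        return pattern[pos] if pos < len(pattern) else None

    def parse_alt():
        nonlocal pos
        left = parse_concat()
        while peek() == "|":
            pos += 1
            right = parse_concat()
            left = ([("split", 1, len(left) + 2)] + left
                    + [("jmp", len(right) + 1)] + right)
        return left

    def parse_concat():
        code = []
        while peek() is not None and peek() not in "|)":
            code += parse_repeat()
        return code

    def parse_repeat():
        nonlocal pos
        atom = parse_atom()
        while peek() in ("*", "+"):
            op = pattern[pos]
            pos += 1
            if op == "*":
                atom = ([("split", 1, len(atom) + 2)] + atom
                        + [("jmp", -(len(atom) + 1))])
            else:
                atom = atom + [("split", -len(atom), 1)]
        return atom

    def parse_atom():
        nonlocal pos
        ch = pattern[pos]
        pos += 1
        if ch == "(":
            inner = parse_alt()
            pos += 1  # skip the closing ")"
            return inner
        if ch == ".":
            return [("char", lambda c: True)]
        if ch == "\\":
            esc = pattern[pos]
            pos += 1
            return [("char", CLASSES.get(esc) or literal(esc))]
        return [("char", literal(ch))]

    prog = parse_alt() + [("match",)]
    if pos != len(pattern):
        raise ValueError("unbalanced parenthesis in pattern")
    return prog


def add_thread(prog, threads, pc, seen):
    """Follow split/jmp so the thread list holds only char/match entries."""
    if pc in seen:
        return
    seen.add(pc)
    op = prog[pc]
    if op[0] == "jmp":
        add_thread(prog, threads, pc + op[1], seen)
    elif op[0] == "split":
        add_thread(prog, threads, pc + op[1], seen)
        add_thread(prog, threads, pc + op[2], seen)
    else:
        threads.append(pc)


def run(prog, text):
    """Return True if prog matches all of text."""
    threads = []
    add_thread(prog, threads, 0, set())
    for ch in text:
        nxt, seen = [], set()
        for pc in threads:
            op = prog[pc]
            if op[0] == "char" and op[1](ch):
                add_thread(prog, nxt, pc + 1, seen)
        threads = nxt
    return any(prog[pc][0] == "match" for pc in threads)


prog = compile_regex(r"(AB|CD)+\d*")
print(run(prog, "ABCD42"))
```

Because every program counter appears at most once per input character, the running time stays proportional to pattern size times input length, with no backtracking stack to manage.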

One thing I haven't figured out yet is .*A.* to match an A anywhere in
the string. It matches the A anywhere except at the end of the
string, which is very strange because .*A matches it at the end
without issue. It has to do with the backtracking, but I'm not sure
where yet.
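One place a symptom like this can hide in a backtracking matcher is the end-of-input test. This is a guess at a common pitfall, not a diagnosis of Marshall's code, which is not shown in the thread. If the matcher gives up as soon as the subject is exhausted, the trailing `.*` never gets the chance to take its "match zero characters" branch and reach the final match instruction. The sketch below uses a hand-assembled program for `.*A.*` with a hypothetical instruction layout and absolute jump targets, and checks for end of input only inside the instructions that consume a character.

```python
# Hand-assembled program for .*A.* (hypothetical layout, absolute targets).
PROG = [
    ("split", 1, 3),  # 0: take another "." or move on to "A"
    ("any",),         # 1: consume any one character
    ("jmp", 0),       # 2: loop back for the first .*
    ("char", "A"),    # 3: consume a literal A
    ("split", 5, 7),  # 4: take another "." or finish
    ("any",),         # 5: consume any one character
    ("jmp", 4),       # 6: loop back for the second .*
    ("match",),       # 7: succeed if the whole subject was consumed
]


def backtrack(prog, text, pc=0, sp=0):
    """Return True if prog, started at pc, matches text[sp:] completely."""
    op = prog[pc]
    if op[0] == "match":
        return sp == len(text)
    if op[0] == "jmp":
        return backtrack(prog, text, op[1], sp)
    if op[0] == "split":
        # Try the first branch; fall back to the second if it fails.
        return (backtrack(prog, text, op[1], sp)
                or backtrack(prog, text, op[2], sp))
    # Only the consuming instructions test for end of input. split, jmp
    # and match above must still run when sp == len(text).
    if sp == len(text):
        return False
    if op[0] == "any" or text[sp] == op[1]:
        return backtrack(prog, text, pc + 1, sp + 1)
    return False


for subject in ("Axx", "xAx", "xxA", "xxx"):
    print(subject, backtrack(PROG, subject))
```

With the check placed this way, an `A` in the last position is accepted: after `A` is consumed the subject is exhausted, the `split` at index 4 fails on its first branch, and its second branch lands on `match`.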

Oh well, it's all fun until someone gets hurt...

M

