Re: [exim-dev] ${index}

Top Page
Delete this message
Reply to this message
Author: Mike Cardwell
Date:  
To: exim-dev
Subject: Re: [exim-dev] ${index}
On 20/10/2010 02:48, Phil Pennock wrote:

>> http://github.com/mikecardwell/Exim/commit/b4ef92d69ac58a46cbb0babb93d688267fbed443
>>
>> Provides a ${index{string}{substring}} function to return the location
>> of a string within a string. Examples:
>
> Generally like it. My main commentary is at the bikeshed-painting
> level, so I'll throw it out there, but am not going to protest if you
> ignore me.


I don't know C. It's the (IT related) skill I regret not having, most. I
figured if I start hacking at the Exim code, I'll start to pick it up.
So with that in mind, any criticism, no matter how small, is useful at
this point.

> Note that Exim doesn't really do much sequential work with variables
> holding results, although it can of course be so abused, so I'm curious
> about the desire to use an expansion operator instead of an expansion
> condition for the basic check. What's the use-case?


Can you clarify the above please. How would it work as an expansion
condition? Can you show me an example of what the Exim config would look
like? I did initially think of writing a "contains" condition, ie:

condition = ${if contains{string}{substring}{true}{false}}

But I figured a general "index" operator would have more use cases. I'm
still unsure though... "contains" might be enough.

> At a code-review level: multiple case statements inside a switch which
> fall through should usually have a comment stating that this is explicit
> and intentional. Looking for prior art in expand.c, I see both /* Fall
> through */ and, as counter-example, ECOND_ISIP/ECOND_ISIP4/ECOND_ISIP6.
> I'd argue that it's clear that the ECOND_ISIP* cases are being handled
> in common, whereas in this case, I had to go check the documentation for
> read_subs() to see the return values, to confirm that the 2 and 3 values
> could reasonably be treated the same and this was not accidental code
> omission.


Fair point. I'll add a comment.

> Is %d cross-platform safe formatting for a ptrdiff_t type?


I don't know. I'm used to programming in languages that don't have this
sort of issue. How would I find out if that's cross-platform safe? Or is
the only way testing and experience? Is there a better way of doing that
part that doesn't involve string_sprintf?

> I'd be
> "impressed" if an Exim string ever holds enough data to have to worry
> about wrap-around or overflow of 32-bit here, but it might be safest to
> use %ld and a (long int) cast. FreeBSD uses 64-bit signed ptrdiff_t and
> that's a 'long'. My vague recollection is that this is normal for
> 64-bit Unix. 'long long' looks like being more portably correct.


So I'd just need to use %ld instead of %d in that sprintf? You said a
"long long" would be more portable but I came across this which makes me
wonder: http://track.sipfoundry.org/browse/XCL-92

> And there's no PRIi* macro for ptrdiff_t. *sigh*


Err, I've no idea what that means. PRIi*?

>> condition = ${if >{${index{$h_Subject:}{$sender_address}}}{-1}}
>>
>> You could use ${match} but then you'd have to worry about special regex
>> characters in the key such as '.'.
>
> Well, that's what \Q${untrusted}\E is for.


Cool. I did not know you could do that.

>> Comments?
>
> Index is a generic term, also applicable to databases, etc. There's no
> namespacing in Exim, so I'm ... "looking askew" at the name. Yeah,
> bikeshed painting, sorry. Exim's not perfect here, and since Exim is
> string-based, string tends to be the default data-type assumption.


> A number of the Exim expansion operators for pure-string issues contain
> 'str' as part of the name. But not enough to declare this a hard rule.


> Also, Python has both .find() and .index() string methods, where .find()
> returns -1 on failure and .index() raises an exception on failure.
> Whereas Perl return base_index-1, which is typically -1. Ruby returns
> nil. This is a mess. Perhaps I've just been doing too much Python
> recently and that's clouding my opinion of the naming.
>
> strindex(), strstr() [for the C programmers, but which then leads to
> strcasestr() vs the Exim-model indexi, which you rightly chose],
> something.
>
> Or freely discard what I've said about the naming. :) I just want the
> issue to have been considered, even if you decide to stick with the name
> you chose.


I spend most of my time in Perl, so that's where the naming and -1 came
from. It's not a perfect name though no. Javascript uses indexOf. I like
"find", but it wouldn't be clear if you're trying to find a string in a
string, or a string in a list, or something else. I think strindex is
the best name, but it doesn't look Eximy enough.

-- 
Mike Cardwell - Perl/Java/Web developer, Linux admin, Email admin
Read my tech Blog -              https://secure.grepular.com/
Follow me on Twitter -           http://twitter.com/mickeyc
Hire me - http://cardwellit.com/ http://uk.linkedin.com/in/mikecardwell