[lug] Regex Help

George Sexton georges at mhsoftware.com
Sat Jul 9 07:32:38 MDT 2011


Thanks. This was exactly what I was looking for. I don't do Perl, or use
Regexes a lot, so the subtleties escape me.

 

George Sexton
MH Software, Inc.
303 438-9585
www.mhsoftware.com

 

From: lug-bounces at lug.boulder.co.us [mailto:lug-bounces at lug.boulder.co.us]
On Behalf Of Chris McDermott
Sent: Friday, July 08, 2011 3:40 PM
To: Boulder (Colorado) Linux Users Group -- General Mailing List
Subject: Re: [lug] Regex Help

 

You could try this:

/https?:\/\/[^\/]+\/specificpath/


https?:\/\/ - this will match either "http://" or "https://"
[^\/]+ - this will match one or more characters, anything *except* a "/" (in other words, the host name)
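
For anyone not using Perl, here is a minimal sketch of the same pattern in
Python. The choice of Python and the helper name matches() are assumptions
for illustration only; the thread does not say which regex engine the
application uses. Python has no / delimiters around the pattern, so the
forward slashes need no backslash escapes.

```python
import re

# Same pattern as above, minus the Perl-style / delimiters.
PATTERN = re.compile(r"https?://[^/]+/specificpath")

def matches(url):
    """Return True if the pattern is found anywhere in url."""
    return PATTERN.search(url) is not None

if __name__ == "__main__":
    for url in (
        "http://www.example.com/specificpath",
        "https://www.example.com/blah/specificpath",
    ):
        print(url, "Yay!" if matches(url) else "Boo!")
```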



It worked for at least some preliminary testing:

[chris@bull sandbox]$ ./test.pl http://www.example.com/specificpath
Yay!
[chris@bull sandbox]$ ./test.pl https://www.example.com/specificpath
Yay!
[chris@bull sandbox]$ ./test.pl https://192.158.282.12/specificpath
Yay!
[chris@bull sandbox]$ ./test.pl https://192.158.282.12/blah/specificpath
Boo!
[chris@bull sandbox]$ ./test.pl https://192.158.282.12/blah/balh/specificpath
Boo!
[chris@bull sandbox]$ ./test.pl https://www.cnn.com/blah/balh/specificpath
Boo!
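
One caveat: the pattern is not anchored and does not require anything after
the path, so it would also accept a URL such as
http://www.example.com/specificpathology. A stricter variant might look like
the sketch below. It is written in Python purely as an assumption, and the
name is_specific_path() is hypothetical; the idea is to anchor at the start
of the string and require either a "/" or the end of the string right after
the path.

```python
import re

# Anchored variant: scheme, host (no slashes), then the path as the
# first path segment, followed by "/" or the end of the string.
STRICT = re.compile(r"^https?://[^/]+/specificpath(?:/|$)")

def is_specific_path(url):
    """Return True only if specificpath is the first path segment."""
    return STRICT.match(url) is not None

if __name__ == "__main__":
    for url in (
        "http://some.host/specificpath/",
        "http://some.host/otherjunk/specificpath/",
        "http://some.host/specificpathology",
    ):
        print(url, "Yay!" if is_specific_path(url) else "Boo!")
```

Whether a URL with no trailing slash should count as a match depends on the
application, so the (?:/|$) part may need adjusting.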


Chris

On Fri, Jul 8, 2011 at 3:13 PM, Davide Del Vento
<davide.del.vento at gmail.com> wrote:

I think your requirements are not clear yet.

Let's start with English first, regex later.

You have a string, which is an (already validated) URL.

You want to match ANY site (not a specific one), right?

Then you want a specific path, which you state explicitly. Does this
path have a fixed number of slashes, or can the number vary?

Then you have "whatever" for the rest of the URL, *including* further
path components, possibly with many more slashes, as well as file names.

Is this correct or not?

Last, but not least, which language do you need this in? Perl, Python,
Unix grep, GNU grep, you name it, all have regexps. They are *not* 100%
compatible with each other.

Dav


On Fri, Jul 8, 2011 at 15:03, George Sexton <georges at mhsoftware.com> wrote:
> Not to be stupid or anything, but if I understood regular expressions well
> enough to use this, I wouldn't have asked for help.
>
> I'm using an application that matches regular expressions in URLs.
>
> I'd like it to match
>
> /somepath/*
>
> But not
>
> /somethingelse/somepath/*
>
> I can write an expression to match /somepath/*. The problem is that it
> also matches the second form, which I don't want.
>
> I don't get to write a lot of code.
>
> I don't know what the host name will be. It might be an FQDN, or it
> might be an IP address.
>
> The input has the full URL syntax:
>
> scheme://hostname/path/
>
>
>
> George Sexton
> MH Software, Inc.
> 303 438-9585
> www.mhsoftware.com
>
>
>> -----Original Message-----
>> From: lug-bounces at lug.boulder.co.us [mailto:lug-bounces at lug.boulder.co.us]
>> On Behalf Of Chip Atkinson
>> Sent: Friday, July 08, 2011 2:44 PM
>> To: Boulder (Colorado) Linux Users Group -- General Mailing List
>> Subject: Re: [lug] Regex Help
>>
>> How about this:
>>
>> http://txt2re.com/
>>
>>
>>
>> On Fri, 8 Jul 2011, George Sexton wrote:
>>
>> > I'm just dying on a regular expression here. I'm always rotten. If someone
>> > could help me out I would appreciate it.
>> >
>> > I'm looking for a regex that will match:
>> >
>> > http://some.host/specificpath/
>> >
>> > but not
>> >
>> > http://some.host/otherjunk/specificpath/
>> >
>> > I'd really appreciate any help I can get.
>> >
>> > George Sexton
>> > MH Software, Inc.
>> > 303 438-9585
>> > www.mhsoftware.com
>> >
>>
>> _______________________________________________
>> Web Page:  http://lug.boulder.co.us
>> Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug
>> Join us on IRC: irc.hackingsociety.org port=6667 channel=#hackingsociety
>
>

 
