Just to clarify (this is not well documented; at least I can't find any
documentation about it): does Squid's regex support only the POSIX Basic grammar?

--
**************************
* C++: Bug to the future *
**************************
_______________________________________________
squid-users mailing list
[hidden email]
http://lists.squid-cache.org/listinfo/squid-users
On 27/10/17 13:06, Yuri wrote:
> Just for clarify (it is not well-documented. At least I can't find any
> documentation about):
>
> Squid's regex supports only POSIX Basic grammar?

The specific grammar depends on the regex library used to build Squid, so YMMV.

Basic POSIX is the only portable grammar that *all* regex libraries can be
expected to support. So Squid does not officially support other grammars (yet),
even if they work in your particular build.

Amos
On 27.10.2017 12:01, Amos Jeffries wrote:
> The specific grammar depends on your regex library used to build
> Squid, so YMMV.
>
> Basic POSIX is the only portable grammar that *all* regex libraries
> can be expected to support. So Squid does not officially support other
> grammars (yet) even if they work in your particular build.

That is why I'm asking: POSIX Extended does not work in Squid, and this is not
well documented anywhere. There is also no easy way to check which regex library
Squid uses.

I tried to find it in the configure options and found this:

root @ cthulhu /patch/squid-5.0.0-patched-v2.26 # ./configure --help|grep regex
  --enable-gnuregex       Compile GNUregex. Unless you have reason to use this
                          Unix boxes which do not have their own regex library

Then I looked at ldd:

root @ cthulhu /patch/squid-5.0.0-patched-v2.26 # ldd /usr/local/squid/sbin/squid
        libpthread.so.1 => /lib/64/libpthread.so.1
        libnettle.so.6 => /opt/csw/lib/amd64/libnettle.so.6
        libmd5.so.1 => /lib/64/libmd5.so.1
        libecap.so.3 => /usr/local/lib/libecap.so.3
        libatomic.so.1 => /opt/csw/lib/amd64/libatomic.so.1
        libssl.so.1.0.0 => /opt/csw/lib/amd64/libssl.so.1.0.0
        libcrypto.so.1.0.0 => /opt/csw/lib/amd64/libcrypto.so.1.0.0
        libkrb5.so.1 => /usr/lib/64/libkrb5.so.1
        libstdc++.so.6 => /opt/csw/lib/amd64/libstdc++.so.6
        libsocket.so.1 => /lib/64/libsocket.so.1
        libresolv.so.2 => /lib/64/libresolv.so.2
        libnsl.so.1 => /lib/64/libnsl.so.1
        libltdl.so.7 => /opt/csw/lib/amd64/libltdl.so.7
        libm.so.2 => /lib/64/libm.so.2
        librt.so.1 => /lib/64/librt.so.1
        libgcc_s.so.1 => /opt/csw/lib/amd64/libgcc_s.so.1
        libc.so.1 => /lib/64/libc.so.1
        libmp.so.2 => /lib/64/libmp.so.2
        libmd.so.1 => /lib/64/libmd.so.1
        libscf.so.1 => /lib/64/libscf.so.1
        libaio.so.1 => /lib/64/libaio.so.1
        libdoor.so.1 => /lib/64/libdoor.so.1
        libuutil.so.1 => /lib/64/libuutil.so.1
        libgen.so.1 => /lib/64/libgen.so.1
        mech_krb5.so.1 => /usr/lib/64/gss/mech_krb5.so.1
        libgss.so.1 => /usr/lib/64/libgss.so.1
        libpkcs11.so.1 => /usr/lib/64/libpkcs11.so.1
        libcmd.so.1 => /lib/64/libcmd.so.1
        libcryptoutil.so.1 => /usr/lib/64/libcryptoutil.so.1

From this output, you cannot determine which regular expression library is
being used, although maybe I'm just not looking in the right place.

Experimentally, I found that the POSIX Extended grammar does not work in any
case.

However, I believe such things should be well documented; otherwise the regular
expression is simply silently ignored, and that is extremely difficult to detect.
On 28/10/17 02:59, Yuri wrote:
> I'm trying to find it in configuration, found that:
> root @ cthulhu /patch/squid-5.0.0-patched-v2.26 # ./configure --help|grep regex
>   --enable-gnuregex       Compile GNUregex. Unless you have reason to use this
>                           Unix boxes which do not have their own regex library

The full text there is:
"
  --enable-gnuregex

      Compile GNUregex. Unless you have reason to use this
      option, you should not enable it. This library file
      is usually only required on Windows and very old
      Unix boxes which do not have their own regex library
      built in.
"

If you *don't* override the local environment by setting that build option,
Squid uses whatever your build tools link to with "-lregex".

> From this output, you can not determine the regular expression library
> that is being used. Although maybe I'm just not looking there.

I believe the -lregex ABI is presented by libstdc++ nowadays, since regex was
made part of the C++11 standard library. So it is quite difficult to see.

Or, if that Squid was built with the GNUregex setting, it will show up in the
"squid -v" output rather than in the ldd dependency list.

> Experimentally, I was able to find out that the grammar of POSIX
> Extended does not work in any case.
>
> However, I believe that such things should be well documented, otherwise
> the regular expression is simply silently ignored and it is extremely
> difficult to detect.

That sounds like a library problem. If Squid receives a regex error code from
the library when compiling any regex from your squid.conf, it logs the relevant
error to cache.log.

Amos
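A minimal sketch of the "squid -v" check mentioned above. It assumes that the "squid -v" output lists the configure options used for the build; the helper name uses_gnuregex is hypothetical and only scans text on stdin, so it can be tried without Squid installed. It does not identify which system library is used when the option is absent.

```shell
#!/bin/sh
# Report whether a "squid -v" style text mentions the bundled GNUregex option.
# Hypothetical helper: reads the text on stdin.
uses_gnuregex() {
    # -e is needed because the pattern itself starts with dashes.
    if grep -q -e '--enable-gnuregex'; then
        echo "bundled GNUregex"
    else
        echo "system regex library"
    fi
}

# On a real host (assuming squid is on PATH):
#   squid -v | uses_gnuregex
# Demonstration with a made-up sample line:
printf '%s\n' "configure options:  '--prefix=/usr/local/squid' '--enable-gnuregex'" | uses_gnuregex
```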
On 27.10.2017 20:32, Amos Jeffries wrote:
> I believe the -lregex ABI is presented by libstdc++ nowdays since
> regex was made part of the C++11 standard library. So quite difficult
> to see.

But the regex behaviour of ACLs is POSIX Basic behaviour. This is simple to
check: the \w and \d metacharacters do not work in regex ACLs.

> OR, if that Squid was built with the GNUregex setting it will show up
> in "squid -v" output rather than the ldd dependency list.

> That sounds like a library problem. If Squid receives a regex error
> code from the library when compiling any regex from your squid.conf it
> logs the relevant error to cache.log.

[...] configuration/version/configs. And I do not see any regex error in
cache.log.

I want to clarify: I asked the question not because there are errors, but
because regular expressions in ECMAScript syntax do not work in ACLs, without
any errors. Squid simply ignores the ACL parts that use ECMAScript grammar
constructions.
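Since \d and \w are not part of the POSIX grammars, the portable way to express them is a bracket expression with a character class. The sketch below uses grep as a stand-in for the regex library (an assumption; it is not Squid itself), and the helper names has_digit and has_word_char are hypothetical.

```shell
#!/bin/sh
# Portable POSIX bracket expressions standing in for \d and \w.
# grep without -E compiles the pattern as a POSIX basic regex.
has_digit()     { printf '%s\n' "$1" | grep -q '[[:digit:]]'; }
has_word_char() { printf '%s\n' "$1" | grep -q '[[:alnum:]_]'; }

has_digit 'page42'  && echo "page42: contains a digit"
has_digit 'index'   || echo "index: no digit"
has_word_char '---' || echo "---: no word character"
```

The same bracket expressions are valid in both the basic and the extended POSIX grammar, so a pattern written this way does not depend on which of the two the library compiles.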
In reply to this post by Amos Jeffries
On 10/27/2017 08:32 AM, Amos Jeffries wrote:
> On 28/10/17 02:59, Yuri wrote:
>> the regular expression is simply silently ignored and it is extremely
>> difficult to detect.

> That sounds like a library problem. If Squid receives a regex error code
> from the library when compiling any regex from your squid.conf it logs
> the relevant error to cache.log.

When a regular expression uses extended features, the basic regular
expression compiler often (or even always?!) does not fail, because it
views the extended features as ordinary plain characters. Thus, Squid
cannot tell that something went wrong.

I cannot give a Squid-based example quickly, but here is a related
illustration using grep (which is not exactly the same as what happens
inside Squid, but I suspect it is similar enough for illustration
purposes in this context):

> $ echo "foobar" | grep --basic-regexp 'foo|bar'
> $ echo "foobar" | grep --extended-regexp 'foo|bar'
> foobar

As you can see, the basic compiler is silent about the "|" character
that it does not support. Here is a similar example where a malformed
extended regular expression is silently accepted by the basic compiler:

> $ echo "foobar" | grep --basic-regexp 'foo(bar'
> $ echo "foobar" | grep --extended-regexp 'foo(bar'
> grep: Unmatched ( or \(

In theory, Squid itself could detect special characters unsupported by
the current regex library, but doing so correctly without breaking many
existing working configurations may be impossible. On the other hand,
this validation could become an optional feature that admins can control.

The best strategy for a Squid admin working with complex regex ACLs may
be to add external test cases that validate ACL matching expectations,
but doing so requires a significant amount of work and discipline.

Alex.
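The "external test cases" idea can be sketched as a small expectation checker. This is only an illustration under the same assumption as the grep examples above: grep stands in for the regex library, and the function name check and its arguments are hypothetical, not part of Squid.

```shell
#!/bin/sh
# check FLAVOR PATTERN INPUT EXPECT
#   FLAVOR: basic | extended      EXPECT: match | nomatch | error
# Prints one result line and returns non-zero when the expectation is not met.
check() {
    flavor=$1; pattern=$2; input=$3; expect=$4
    if [ "$flavor" = extended ]; then
        printf '%s\n' "$input" | grep -E -q -e "$pattern" 2>/dev/null
    else
        printf '%s\n' "$input" | grep -q -e "$pattern" 2>/dev/null
    fi
    # grep exits 0 on a match, 1 on no match, and >1 on a pattern error.
    case $? in
        0) got=match ;;
        1) got=nomatch ;;
        *) got=error ;;
    esac
    if [ "$got" = "$expect" ]; then
        echo "ok   $flavor '$pattern' on '$input' -> $got"
    else
        echo "FAIL $flavor '$pattern' on '$input' -> $got (expected $expect)"
        return 1
    fi
}

# The cases from the illustration above, written as expectations.
check basic    'foo|bar' 'foobar'  nomatch
check extended 'foo|bar' 'foobar'  match
check basic    'foo(bar' 'foo(bar' match
check extended 'foo(bar' 'foobar'  error
```

Keeping one such line per behaviour that the ACLs rely on makes a silent change of grammar visible as a FAIL line instead of as a rule that never matches.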
On 27.10.2017 20:55, Alex Rousskov wrote:
> The best strategy for a Squid admin working with complex regex ACLs may
> be to add external test cases that validate ACL matching expectations,
> but doing so requires significant amount of work and discipline.

[...] thousands of regular expressions - this approach does not seem very
acceptable.

Therefore, I would like the grammars in use to be clearly documented. A simple
configuration check in Squid often shows nothing (unless there are obvious
errors, such as an incomplete regex), and in a production configuration it is
extremely difficult to detect the non-working parts of an access control list.
There are also thousands of websites. So I would like either clear
documentation or a tool for checking whether a regular expression is correct
from the point of view of the library Squid currently uses. The existing
options seem completely unsatisfactory.
On Friday 27 October 2017 at 17:06:01, Yuri wrote:
> On 27.10.2017 20:55, Alex Rousskov wrote:
> > When a regular expression is using extended features, the basic regular
> > expression compiler often (or even always?!) does not fail because it
> > views the extended features as ordinary plain characters. Thus, Squid
> > cannot tell that something went wrong.
> >
> >> $ echo "foobar" | grep --basic-regexp 'foo|bar'
> >> $ echo "foobar" | grep --extended-regexp 'foo|bar'
> >> foobar
> >
> > As you can see, the basic compiler is silent about the "|" character
> > that it does not support. Here is a similar example where a malformed
> > extended regular expression is silently accepted by the basic compiler:
> >
> >> $ echo "foobar" | grep --basic-regexp 'foo(bar'
> >> $ echo "foobar" | grep --extended-regexp 'foo(bar'
> >> grep: Unmatched ( or \(

> I would like either a clear documentation

That sounds entirely reasonable - a statement something like "Squid is
guaranteed to use basic POSIX grammar, but extended grammar may be available
on different systems; the sysadmin should check"?

> or some tool for checking whether the regular expression is correct from the
> point of view of the current library used by Squid or not.

What does "correct" mean?

As Alex's examples above demonstrate, both are "correct" regexes from the
basic POSIX point of view; they just don't do what the admin might have wanted
or expected.

How could Squid know whether you expect ( in a regex to be a literal character
or a meta-character?

> The existing opportunities seem completely unsatisfactory.

Nothing documents that Squid uses anything other than basic POSIX grammar, so
why would you assume that it does?

Antony.

--
It is also possible that putting the birds in a laboratory setting
inadvertently renders them relatively incompetent.

 - Daniel C Dennett

Please reply to the list; please *don't* CC me.
On 27.10.2017 21:17, Antony Stone wrote:
> What does "correct" mean?

"Correct" means "this will work correctly in Squid and not be silently
ignored". This is simple and obvious, isn't it?

> How could Squid know whether you expect ( in a regex to be a literal character
> or a meta-character?

I expect it to follow known, documented behaviour, not a box with a surprise
inside that has to be investigated in each specific configuration. Adherence to
standards provides interoperability - a familiar word?

> Nothing documents that Squid uses other than basic POSIX grammar, so why would
> you assume that it does?

Antony, the problem is that this is not documented either. Maybe someone will
take the trouble to describe the behaviour clearly in the documentation?
Because, as I said, I did not find a direct mention of the default grammar. Am
I expressing my thoughts clearly?

I asked a simple question and wanted a simple answer, not reasoning about what
can be and what cannot. Interoperability is a simple thing.
On Friday 27 October 2017 at 17:26:18, Yuri wrote:
> On 27.10.2017 20:55, Alex Rousskov wrote:
> >>>> $ echo "foobar" | grep --basic-regexp 'foo|bar'
> >>>> $ echo "foobar" | grep --extended-regexp 'foo|bar'
> >>>> foobar

> > What does "correct" mean?
>
> "correct" mean "this will correctly works in Squid, not silently
> ignored". This is simple and obvious, isn't it?

No.

Suppose I write a | character (as per Alex's first example above) in my regex.

Basic POSIX will match that literally.

Extended grep will not.

Judging purely from what is written in my regex, did I mean the character to
be matched literally, or not?

Squid cannot tell.

> Adherence to standards provides interoperability - a familiar word?

Indeed.

> I asked a simple question. And wanted a simple answer.

Maybe there isn't one.

> And not reasoning, what can be, and what can not.

Then I apologise for trying to explain.

> Interoperability is a simple thing.

Er, no, it isn't.

Antony.

--
If the human brain were so simple that we could understand it, we'd be so
simple that we couldn't.

Please reply to the list; please *don't* CC me.
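The ambiguity described above can be made concrete by counting which inputs one and the same pattern accepts under each grammar. This is a sketch with grep as a stand-in for the regex library; the variable inputs and the helpers count_basic and count_extended are hypothetical names.

```shell
#!/bin/sh
# Three candidate inputs, one per line.
inputs='foo
bar
foo|bar'

# Count how many input lines a pattern matches under each POSIX grammar.
count_basic()    { printf '%s\n' "$inputs" | grep -c -e "$1"; }
count_extended() { printf '%s\n' "$inputs" | grep -E -c -e "$1"; }

echo "basic:    $(count_basic 'foo|bar') line(s) match"
echo "extended: $(count_extended 'foo|bar') line(s) match"
```

Under the basic grammar only the line containing the literal text foo|bar is counted; under the extended grammar every line containing foo or bar is counted. Both compilations succeed, so nothing in the pattern alone reveals which result the author intended.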
On 27.10.2017 21:33, Antony Stone wrote:
> Judging purely from what is written in my regex, did I mean the character to
> be matched literally, or not?
>
> Squid cannot tell.

[...] can expect POSIX Basic behaviour and only that. Agree?

But the point is: we don't know, and can't know, what library functionality
exists and what will or will not work. So in each separate case we have to
write a test case for EACH regex in an ACL to make sure whether it will work or
not.

Generally speaking, with thousands of regular expressions and thousands of
sites, that sounds pretty dumb, right? Many-to-many relations, thousands of
tests, etc.

>> I asked a simple question. And wanted a simple answer.
> Maybe there isn't one.

Noooooooo. What could be simpler than clearly documenting the following: "Never
use anything other than POSIX Basic in regular expressions because we do not
guarantee and can not guarantee it will work"?

>> And not reasoning, what can be, and what can not.
> Then I apologise for trying to explain.

Yes, I understand everything, Antony. It's easier to unsubscribe - "Test every
regular expression yourself."

>> Interoperability is a simple thing.
> Er, no, it isn't.

Simple. You just have to follow standards and standard *documented* behaviour.
As soon as the dances around self-made interpretations of the standard begin,
problems begin.
In reply to this post by Antony Stone
On 27.10.2017 21:33, Antony Stone wrote:
>>>>>> $ echo "foobar" | grep --basic-regexp 'foo(bar'
>>>>>> $ echo "foobar" | grep --extended-regexp 'foo(bar'
>>>>>> grep: Unmatched ( or \(

As for me personally, I would like ECMAScript syntax to be supported in
regular expressions of access control lists by default. :)
In reply to this post by Yuri Voinov
On 10/27/2017 09:43 AM, Yuri wrote:
> So, in each separate case we're should make testcase for EACH regex in
> acl to make sure it will or not will work.
>
> Generally speaking, with thousands of regular expressions and thousands
> of sites - it sounds pretty dumb, right? Many to many relasions,
> thousands tests etc.

What an admin has to do is onerous, but not as bad as you make it sound:

* A handful of test cases is sufficient to validate whether Squid
  instance X supports all extended regular expressions used by its ACLs.
  In fact, Squid can be easily modified to run such test cases on startup!

* If you want to test that each of the 10K ACLs matches what you want it
  to match, then you have to write a lot more test cases, of course, but
  such deployment-specific functionality testing is a completely different
  topic, outside the scope of this thread.

> What could be simpler is to clearly document the following: "Never use
> anything other than POSIX Basic in regular expressions because we do not
> guarantee and can not guarantee it will work"?

You are right: adding the above text to squid.conf.documented is fairly
simple. If Squid actually supports extended regular expressions in some
environments, then such a text will also be a bit misleading/misguiding.
Wiki updates or pull requests improving Squid documentation are always
welcome!

Personally, I cannot volunteer to add this documentation because I do
not know whether Squid can support extended regular expressions in some
environments, and I do not want to spend time adding potentially
misleading/misguiding documentation.

Long-term, we should introduce a configuration option that specifies the
exact regex flavor an admin wants _and_ forces Squid to quit if that
exact flavor is not supported by the running Squid instance.

Alex.
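The "handful of test cases" and the "quit if the flavor is not supported" ideas can be sketched together as a startup probe. This is an illustration only: grep stands in for the regex library, and the names matches, supports_extended and require_extended are hypothetical. Nothing here is an existing Squid feature or option.

```shell
#!/bin/sh
# matches FLAVOR PATTERN INPUT: succeed if PATTERN matches INPUT when
# compiled as a basic or extended POSIX regex (grep is the stand-in matcher).
matches() {
    if [ "$1" = extended ]; then
        printf '%s\n' "$3" | grep -E -q -e "$2" 2>/dev/null
    else
        printf '%s\n' "$3" | grep -q -e "$2" 2>/dev/null
    fi
}

# Probes that tell the two grammars apart:
#   'a|b' matches "a"  only when | is alternation
#   'ab+' matches "ab" only when + is a repetition operator
supports_extended() {
    matches "$1" 'a|b' 'a' && matches "$1" 'ab+' 'ab'
}

# Refuse to continue when the requested behaviour is not available.
require_extended() {
    if supports_extended "$1"; then
        echo "extended regex syntax is available"
    else
        echo "extended regex syntax is NOT available" >&2
        return 1
    fi
}

require_extended extended
require_extended basic || echo "startup would be aborted here"
```

Two probes are enough here because each relies on an operator that the basic grammar treats as a plain character; a real check would need one probe per extended feature that the ACLs actually use.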
In reply to this post by Yuri Voinov
On 10/27/2017 09:52 AM, Yuri wrote:
> As for me personally, I would like ECMAScript syntax to be supported in
> regular expressions of access control lists by default. :)

I think it is pointless to argue whether regex flavor X should be
supported. Once the necessary infrastructure is in place, the cost of
adding support for one more popular flavor is negligible compared to the
benefits it offers. The admin should be able to select the regex flavor
they want (even if they want a flavor that cannot look behind or match
an arbitrary character with a dot :-).

Alex.
In reply to this post by Alex Rousskov
On 27.10.2017 22:01, Alex Rousskov wrote:
> You are right: Adding the above text to squid.cond.documented is fairly
> simple. If Squid actually supports extended regular expressions in some
> environments, then such a text will also be a bit misleading/misguiding.
> Wiki updates or pull requests improving Squid documentation are always
> welcomed!

[...] environment to another (especially when it is slightly different).

> Long-term, we should introduce a configuration option that specifies the
> exact regex flavor an admin wants _and_ forces Squid to quit if that
> exact flavor is not supported by the running Squid instance.

Or put a warning in cache.log. That would be ideal.