I want to ban certain URLs, for example those containing the word ‘archive’ or ‘comment’ or ‘label’ or ‘tag’. How can I do that?
I tried to apply a Request Filter like this “\S*(comment|archive|tag|label)\S*” Match Success: Reject. But the files keep on being downloaded.
As you mentioned, the best way to reject URLs which contain certain words is to create a Request Filter and set its Match Success flag to ‘Reject’.
Strangely, the Regular Expression you mentioned worked as expected in my tests, as you can see in the following image:
There could be a few reasons why this might not work for your case:
- The Regular Expression does not actually match the URL. To debug this, use the Darcy built-in Regular Expression Editor (“Menu Bar -> Utilities -> Regular Expression Tester”) and test the URL against your Regular Expression;
- There are other filters set that may override yours. Try to create a new Job and use only one Regular Expression and see how things work out;
- The Job might not be valid. I was not able to reproduce this, but there were some cases in the past where after editing a Job, the changes were not being used right away. Make sure that the Job uses the filters. Close the Job and re-open it and see if your filters are there.
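If it helps to double-check the first point outside of Darcy, here is a minimal sketch in Python that runs the same Regular Expression against a few URLs. Please treat it as a rough check only: the sample URLs and the is_rejected helper are made up for illustration, and Python's regex engine is an assumption here, so Darcy's own matching (for example case sensitivity, or whole-URL versus partial matching) may behave differently.

```python
import re

# The pattern from the question, unchanged.
PATTERN = re.compile(r"\S*(comment|archive|tag|label)\S*")


def is_rejected(url: str) -> bool:
    """Return True if the whole URL matches the pattern (hypothetical helper)."""
    return PATTERN.fullmatch(url) is not None


# Hypothetical sample URLs, not taken from any real crawl.
samples = [
    "http://example.com/tag/news",
    "http://example.com/2010/05/post.html#comments",
    "http://example.com/about",
    "http://example.com/Archive/2010",   # capital A: the pattern is case-sensitive here
    "http://example.com/vintage-cars",   # 'vintage' contains 'tag'
]

for url in samples:
    print("reject" if is_rejected(url) else "keep  ", url)
```

Two things to watch for in the output: a URL with ‘Archive’ in capitals is not matched by this pattern as written, and words such as ‘vintage’ are matched because they contain ‘tag’. If the URLs that keep being downloaded look like the first kind, that could explain why the filter appears to be ignored.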
Hope this helps. Let me know how this goes.