Tech Note
PAN-OS 4.1
Revision C
Overview
This document describes the URL categorization components and resolution process used in PAN-OS.
Platform    Entries
PA-5060     100,000
PA-5050     100,000
PA-5020     40,000
PA-4060     100,000
PA-4050     100,000
PA-4020     40,000
PA-2050     40,000
PA-2020     40,000
PA-500      10,000
PA-200      5,000
Entries in this cache never age out, but they are pushed out when they are no longer among the most recently accessed.
Upon reboot, the DP URL cache is cleared.
The cache can be cleared manually using the CLI command clear url-cache.
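The eviction behavior described above (no aging, but the least recently accessed entries are pushed out once the cache is full) resembles a least-recently-used cache. The following is a minimal Python sketch of that idea, not the PAN-OS implementation; the class name LruUrlCache and its methods are hypothetical, and the capacity would correspond to the per-platform entry counts.

```python
from collections import OrderedDict


class LruUrlCache:
    """Hypothetical fixed-capacity cache that evicts the least recently accessed entry."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._entries = OrderedDict()  # url -> category, oldest access first

    def get(self, url):
        """Return the cached category, or None on a miss."""
        if url not in self._entries:
            return None
        self._entries.move_to_end(url)  # mark as most recently accessed
        return self._entries[url]

    def put(self, url, category):
        """Insert or refresh an entry, evicting the oldest one if the cache is full."""
        self._entries[url] = category
        self._entries.move_to_end(url)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the least recently accessed

    def clear(self):
        """Empty the cache, as a reboot or a manual clear would."""
        self._entries.clear()

    def __len__(self):
        return len(self._entries)
```

Entries are never removed because of their age; they leave only when a newer entry needs the slot or when the cache is cleared.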
Entries in this cache never age out, but they are pushed out when they are no longer among the most recently accessed.
The cache is persistent: it is written to disk every hour to protect against power failure.
When a user attempts to access a URL and the URL category needs to be determined, the Palo Alto Networks device will
compare the URL with the following components and will stop when a match is found:
1. Block list of the matching URL profile
2. Allow list of the matching URL profile
3. Custom categories that have been defined
4. DP URL cache
5. MP URL database
6. MP dynamic URL cache (if dynamic URL filtering is enabled on the URL profile)
7. Cloud servers (if dynamic URL filtering is enabled on the URL profile)
If there is no response from the MP URL DB within 5 seconds (configurable timeout), the URL will be categorized as not-resolved.
If there is no response from the cloud servers within 5 seconds (configurable timeout), the URL will be categorized as not-resolved.
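The first-match-wins order and the timeout rule above can be sketched in Python as follows. This is an illustrative model only: the function resolve_category, the (name, lookup) stage pairs, and the category labels in the usage lines are hypothetical, and a lookup is assumed to signal a timeout by raising TimeoutError. The source does not say what happens when no component matches, so the sketch returns None in that case.

```python
NOT_RESOLVED = "not-resolved"


def resolve_category(url, stages):
    """Return (category, stage_name) from the first stage that matches.

    `stages` is an ordered list of (name, lookup) pairs. Each lookup takes
    a URL and returns a category string, or None when it has no match.
    A lookup that raises TimeoutError (assumed convention) yields not-resolved.
    """
    for name, lookup in stages:
        try:
            category = lookup(url)
        except TimeoutError:
            return NOT_RESOLVED, name
        if category is not None:
            return category, name  # stop at the first match
    return None, None  # no component matched; behavior not described in the source


# Usage with hypothetical data; dict.get returns None on a miss.
stages = [
    ("block list", {"bad.example.com": "blocked"}.get),
    ("allow list", {"intranet.example.com": "allowed"}.get),
    ("custom categories", {"partner.example.com": "partners"}.get),
    ("DP URL cache", {"news.example.com": "news"}.get),
    ("MP URL database", {"shop.example.com": "shopping"}.get),
]
print(resolve_category("news.example.com", stages))
```

Stages that apply only when dynamic URL filtering is enabled can simply be left out of the list when it is disabled.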
If the URL is categorized as not-resolved, PAN-OS will take the action configured in the URL profile for not-resolved,
but will continue to attempt resolution of the category. When the final category match is resolved, the result will be entered
into the appropriate cache(s). If the action for the not-resolved category is allow or alert, the URL requests are
allowed and forwarded, but the response from the server will be discarded. Typically the client will retry the request. Since
PAN-OS continued with resolution for the original request, the category is likely to have been resolved and entered into the
appropriate cache(s) by then. In this case, the retry will typically match a cache entry.
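The interaction between a not-resolved answer, continued resolution, and the client retry can be illustrated with a small Python model. It is a sketch under simplifying assumptions: the class DeferredResolver and its methods are hypothetical, and background resolution is modeled as an explicit finish_pending() call rather than a real asynchronous lookup.

```python
NOT_RESOLVED = "not-resolved"


class DeferredResolver:
    """Toy model: answer from cache, otherwise report not-resolved and keep resolving."""

    def __init__(self, slow_lookup):
        self.cache = {}        # url -> category
        self.pending = set()   # URLs whose resolution is still in progress
        self.slow_lookup = slow_lookup

    def categorize(self, url):
        """Return the cached category, or not-resolved while resolution continues."""
        if url in self.cache:
            return self.cache[url]
        self.pending.add(url)
        return NOT_RESOLVED

    def finish_pending(self):
        """Complete outstanding lookups and enter the results into the cache."""
        for url in sorted(self.pending):
            self.cache[url] = self.slow_lookup(url)
        self.pending.clear()


resolver = DeferredResolver(lambda url: "news")
first = resolver.categorize("news.example.com")   # original request
resolver.finish_pending()                         # resolution completes meanwhile
retry = resolver.categorize("news.example.com")   # client retry hits the cache
```

The first request is handled with the not-resolved action, while the retry is served from the cache entry created by the continued resolution.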
Entries marked as unknown are truly unknown by BrightCloud; they are likely new sites that have never been classified.
URL Category Resolution Process: High Performance (PAN-OS 3.1.6 and higher)
In terms of lookup speed, URLs that match the DP cache are resolved more quickly than URLs that match the MP caches. A
match in memory is quicker than a match on disk. Each of these methods is faster than querying the cloud
servers. In high-performance environments, the default resolution methods may need to be improved upon.
There are two optional components that can be enabled in order to reduce the time it takes to resolve URLs: the Bloom
filter and MP URL cache. These two components should be enabled in environments that require a combination of
high/new session rates, high URL lookup rates and high logging rates (5,000+ logs/sec). When these two components are
enabled, URLs are compared in the following order:
1. Block list of the matching URL profile
2. Allow list of the matching URL profile
3. Custom categories that have been defined
4. DP URL cache
5. MP Bloom filter hash table:
   If there is a match, the URL is in the on-disk database. Check the following:
      6. MP URL cache
      7. MP URL database
   If there is NOT a match, the URL is not in the on-disk database. Check the following:
      6. MP dynamic URL cache (if dynamic URL filtering is enabled on the URL profile)
      7. Cloud servers (if dynamic URL filtering is enabled on the URL profile)
Note: Since the Bloom filter and MP URL cache use additional MP memory, it is recommended that you implement
these features only where high-performance URL filtering is required.
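To illustrate why a Bloom filter speeds up this sequence, the Python sketch below builds a small filter over the keys of an on-disk database stand-in and uses it to choose between the two branches of step 5. It is a generic Bloom filter, not the PAN-OS data structure: the names BloomFilter and lookup_after_dp_miss, the bit and hash counts, and the use of SHA-256 are all assumptions. A Bloom filter never misses a stored key but can occasionally report a match for an absent one; the source does not describe that case, so the sketch assumes the lookup falls through to the dynamic path. Populating the MP URL cache on a database hit is likewise an assumption.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: no false negatives, rare false positives."""

    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self._bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        return all(self._bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def lookup_after_dp_miss(url, bloom, mp_cache, mp_database,
                         dynamic_cache, query_cloud, dynamic_enabled=True):
    """Steps 5 to 7 of the high-performance order, after a DP URL cache miss."""
    if bloom.might_contain(url):
        # Match: the URL is probably in the on-disk database.
        category = mp_cache.get(url)
        if category is None:
            category = mp_database.get(url)
            if category is not None:
                mp_cache[url] = category  # assumption: cache database hits
        if category is not None:
            return category
        # Bloom false positive: fall through to the dynamic path (assumption).
    if not dynamic_enabled:
        return None
    category = dynamic_cache.get(url)
    if category is None:
        category = query_cloud(url)
        if category is not None:
            dynamic_cache[url] = category
    return category
```

The benefit is that a URL absent from the on-disk database is ruled out with a few in-memory bit tests, so the slower database lookup is skipped entirely.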
The following diagram shows the sequence that includes the Bloom filter and MP URL cache:
The CLI commands to enable the Bloom filter and MP URL cache are:
admin@PAN(active)> set system setting url-filtering-feature filter true
admin@PAN(active)> set system setting url-filtering-feature cache true
To activate these settings a restart of the device-server is required:
admin@PAN(active)> debug software restart device-server
Confirm that these settings took effect:
admin@PAN(active)> show system setting url-filtering-feature
cfg.url-feature.basedb-cache: True
cfg.url-feature.bloom-filter: True
These two settings are persistent; they will survive a reboot. These commands must be executed on each device
in an HA pair.
You can examine the cache hit rate using the following command:
debug device-server bc-url-db show-stats
Example output of that command is shown below:
A second use case where wildcard usage frequently becomes a point of confusion is the coverage of subdomains. A
wildcard matches only a single subdomain token. Administrators often expect a wildcard in the leftmost token position to
cover any number of subdomains. In other words, they expect that *.domain.com would cover sub1.domain.com,
sub2.sub1.domain.com, and so on. This is not the case; an entry must be included in the custom list/category for each
subdomain token. Typically, covering two levels of subdomains will suffice.
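The single-token rule can be made concrete with a short Python sketch. It is a simplified model, not the PAN-OS matcher: the function wildcard_matches is hypothetical, it splits host names on dots only, and it ignores paths and other separators.

```python
def wildcard_matches(pattern, hostname):
    """Return True if each '*' in the pattern lines up with exactly one token."""
    pattern_tokens = pattern.lower().split(".")
    host_tokens = hostname.lower().split(".")
    if len(pattern_tokens) != len(host_tokens):
        return False  # a wildcard never absorbs more than one token
    return all(p == "*" or p == h
               for p, h in zip(pattern_tokens, host_tokens))


# One entry per subdomain level is needed to cover deeper names.
entries = ["*.domain.com", "*.*.domain.com"]
for host in ["sub1.domain.com", "sub2.sub1.domain.com"]:
    covered_by = [e for e in entries if wildcard_matches(e, host)]
    print(host, "->", covered_by)
```

In this model, *.domain.com covers sub1.domain.com but not sub2.sub1.domain.com, which needs its own entry with two wildcard tokens.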
Revision History
Date        Revision    Comment
7/11/12     C           Added URL Lookup and Matching section and paragraph on handling of not-resolved.
1/31/12     B           Added command to check status of MP cache.
12/22/11    A           First release of this document.