NAME
    SquidClamav v6.0 - Antivirus redirector for Squid based on ClamAv.

DESCRIPTION
    SquidClamav is an antivirus redirector for the Squid proxy based on the
    award-winning ClamAv anti-virus toolkit. Using it will help you secure
    your home or enterprise network web traffic. SquidClamav is the most
    efficient Squid redirector antivirus tool for HTTP traffic available
    for free. It is written in C and can handle thousands of connections.

USAGE
  Generic Program Information
    -h, --help
        Print a usage message briefly summarizing these command-line options
        and the bug-reporting address, then exit.

    -v, --version
        Print the version number of squidclamav to the standard output
        stream. This version number should be included in all bug reports.

    -d, --debug level
        The debug level can be set from 1 up to 3. At level 3 SquidClamav
        will dump the HTTP headers of the files it downloads.

    -c, --config filename
        Use this option if you want to change the default configuration file
        path that SquidClamav will look for.

  Setting SquidClamav as Squid redirector
   Squid 2.5 configuration
    To integrate squidclamav into your squid cache, edit squid.conf and
    set the following.

    In the ACL definitions you should have declared:

            acl localhost src 127.0.0.1/255.255.255.255
            acl to_localhost dst 127.0.0.0/8
            acl purge method PURGE

    In the http_access definitions you should declare the following:

            http_access deny to_localhost
            http_access allow localhost
            http_access allow purge localhost
            http_access deny purge
            redirector_access deny localhost

    and in the redirector section the following:

            redirect_program /usr/local/bin/squidclamav
            redirect_children 15

    If you have heavy traffic and enough memory, set redirect_children to
    a higher value.

    Note that the purge acl is only required if you enable the recommended
    'trust_cache' option.

   Squid 2.6, 2.7, 3.x configuration
    Squid 2.6+ significantly changed the redirector directives in the
    configuration file. To integrate SquidClamav into your squid cache,
    edit squid.conf and set the following.

    In the ACL definitions you should have declared:

            acl localhost src 127.0.0.1/255.255.255.255
            acl to_localhost dst 127.0.0.0/8
            acl purge method PURGE

    In the http_access definitions you should declare the following:

            http_access deny to_localhost
            http_access allow localhost
            http_access allow purge localhost
            http_access deny purge
            url_rewrite_access deny localhost

    and in the redirector section the following:

            url_rewrite_program /usr/local/bin/squidclamav
            url_rewrite_children 15

    If you have heavy traffic and enough system resources, set
    url_rewrite_children to a higher value.

    Note that the purge acl is only required if you enable the highly
    recommended 'trust_cache' option.

SIGNALS
    To force SquidClamav to reread its configuration file you have to
    reconfigure Squid. To do that, send the 'reconfigure' signal to Squid:

            squid -k reconfigure

    Squid will reread its configuration file and restart all redirectors.

CONFIGURATION
    By default, the configuration file is located at /etc/squidclamav.conf;
    the exact location depends on the install configuration prefix. If you
    need another path, set it with the command line argument as follows:

            /usr/local/bin/squidclamav -c /usr/local/etc/squidclamav.conf

    SquidClamav installation will create a default file optimized for speed.
    Feel free to modify it to match your security level.

    The configuration file consists of lower case directive names, each
    followed by a value. The name and the value must be separated by a
    single space character. Comments are lines starting with a '#'
    character.
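
    The file format described above is simple enough to check with a few
    lines of script. The following is only a minimal sketch in Python, not
    the parser SquidClamav itself uses: the function name parse_config is
    hypothetical, and returning a list of pairs (rather than a dictionary)
    is a choice made here because directives such as 'abort' may appear
    several times.

```python
def parse_config(text):
    """Parse squidclamav.conf-style text into a list of (name, value) pairs."""
    directives = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines starting with '#'.
        if not line or line.startswith("#"):
            continue
        # Name and value are separated by a single space; the value may
        # itself contain spaces (for example a 'useragent' string).
        name, sep, value = line.partition(" ")
        if not sep or not value:
            raise ValueError(f"directive without a value: {raw!r}")
        directives.append((name, value))
    return directives


sample = r"""# Squid connection
squid_ip 127.0.0.1
squid_port 3128
#clamd_ip 192.168.1.5
abort \.squid-cache\.org
abort .*\.(png|gif|jpg)$
"""
config = parse_config(sample)
```

    Here the commented-out clamd_ip line is ignored and both 'abort'
    patterns are kept, in file order.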

  Global configuration
   Squid ip address and port
    SquidClamav uses Squid to download files for virus scanning, so you
    have to set the ip address and port where SquidClamav can connect to
    Squid. Here are the default values for these directives.

            squid_ip 127.0.0.1
            squid_port 3128

    squid_ip
        Should always be the localhost ip address, as squidclamav and squid
        run on the same host.

    squid_port
        By default squid listens on port 3128; you can set it to any other
        listening port.

   Log file and debug
    By default the SquidClamav log file is /var/log/squid/squidclamav.log,
    because the directory '/var/log/squid/' should already be writable by
    the user running squid, and SquidClamav is executed by this user.

            logfile /var/log/squid/squidclamav.log
            debug 0
            stat 0

    Debug and statistics output are disabled by default. Do not enable them
    on a production server, as they cost a lot of system performance. The
    debug level can be set from 1 up to 3. At level 3 SquidClamav will dump
    the HTTP headers of the files it downloads.

    The 'stat' directive logs timing statistics for SquidClamav. Set it to
    1 to see SquidClamav performance and analyze where SquidClamav spends
    its time.

   Clamd daemon
    SquidClamav needs to know where to contact clamd, the ClamAv daemon,
    for on-stream virus scanning.

            clamd_local /tmp/clamd
            #clamd_ip 192.168.1.5
            #clamd_port 3310

    By default SquidClamav will contact clamd locally on the /tmp/clamd unix
    socket (clamd_local). If your clamd daemon uses an INET socket or runs
    on a remote server, you have to set the ip address and the port with
    clamd_ip and clamd_port.

    If you use an INET socket the 'clamd_local' directive must be commented
    out, otherwise SquidClamav will always use the clamd_local directive.
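
    Before pointing SquidClamav at a clamd socket it can be useful to check
    that clamd answers on it. The sketch below, in Python, is an
    illustration and not part of SquidClamav: the helper names clamd_ping
    and connect_clamd are hypothetical. It relies on clamd's PING command,
    which clamd answers with PONG, and it mirrors the rule above that the
    local unix socket is used whenever it is set.

```python
import socket


def clamd_ping(sock):
    """Send clamd's PING command on a connected socket; True if it answers PONG."""
    sock.sendall(b"nPING\n")  # 'n' prefix: newline-terminated command
    reply = sock.recv(64)
    return reply.strip() == b"PONG"


def connect_clamd(local_path=None, ip=None, port=3310, timeout=1.0):
    """Connect through the unix socket if given (like clamd_local), else TCP."""
    if local_path:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(timeout)
        sock.connect(local_path)
        return sock
    return socket.create_connection((ip, port), timeout=timeout)
```

    For example, clamd_ping(connect_clamd(local_path="/tmp/clamd")) checks
    the default local socket, and clamd_ping(connect_clamd(ip="192.168.1.5"))
    checks a remote daemon on port 3310.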

   Clamd failover
    If you have multiple ClamAv servers, SquidClamav is able to do failover
    between them. You just have to set 'clamd_ip' to a list of ip addresses
    separated by commas. Do not insert any space character in this list, as
    it would break the parsing. For example:

            clamd_ip 192.168.1.5,192.168.1.13,192.168.1.9
            clamd_port 3310

    You can set up to 5 clamd servers. The clamd port must be the same for
    all these servers as 'clamd_port' only accepts one value.

    SquidClamav will always connect to the first available ip address. If it
    cannot connect after 1 second it will try the next defined ip address.
    When a connection can be established SquidClamav will reuse this last
    "working" ip address first, so that the next request is not slowed down.

    If you think 1 second is too high, you can change the connect timeout
    with the 'clamd_timeout' configuration directive in squidclamav.conf.
    The value must be set in milliseconds; the default is 1000 (one
    second). This directive is not included in the default configuration
    file. Note that this directive also sets the timeout used to purge the
    squid cache.
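
    The failover behaviour described above can be sketched as follows. This
    is a Python illustration under stated assumptions, not the C code of
    SquidClamav: the class name ClamdFailover and its attributes are
    hypothetical, and wrapping around to the start of the list after the
    last address is an assumption the text does not confirm.

```python
import socket


class ClamdFailover:
    """Try clamd servers in order, starting with the last one that worked."""

    def __init__(self, clamd_ip, clamd_port=3310, clamd_timeout=1000,
                 connect=socket.create_connection):
        # clamd_ip is the comma-separated list from the config (no spaces).
        self.hosts = clamd_ip.split(",")
        self.port = clamd_port
        self.timeout = clamd_timeout / 1000.0  # milliseconds -> seconds
        self.connect = connect                 # injectable for testing
        self.last_working = 0                  # index into self.hosts

    def open(self):
        n = len(self.hosts)
        for step in range(n):
            idx = (self.last_working + step) % n
            try:
                conn = self.connect((self.hosts[idx], self.port),
                                    timeout=self.timeout)
            except OSError:
                continue                       # try the next address
            self.last_working = idx            # reuse this one first next time
            return conn
        raise ConnectionError("no clamd server reachable")
```

    With clamd_ip set to 192.168.1.5,192.168.1.13,192.168.1.9 and the first
    server down, the first call to open() tries 192.168.1.5 and then
    192.168.1.13; the next call goes straight to 192.168.1.13.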

   Redirection
    When a virus is detected SquidClamav needs to redirect the client to a
    warning page. The SquidClamav distribution includes a set of perl CGI
    scripts in different languages that you can use. To specify this
    redirection use the 'redirect' directive as follows:

            redirect http://proxy.samse.fr/cgi-bin/clwarn.cgi

    Take a look in the cgi-bin directory to see all translations of this
    cgi script.

    Squidclamav will pass to this CGI the following parameters:

            url=ORIGINAL_HTTP_REQUEST
            virus=NAME_OF_THE_VIRUS
            source=DOWNLOADER_IP_ADDRESS
            user=DOWNLOADER_IDENT
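
    If you prefer to write your own warning page instead of using the perl
    scripts from the distribution, the four parameters can be read from the
    query string. The following Python sketch is only an illustration: the
    function name read_warning_params is hypothetical, and the query string
    is the one from the sample redirection shown in the testing section of
    this document.

```python
from urllib.parse import parse_qs


def read_warning_params(query_string):
    """Return the four parameters as a plain dict (missing ones become '')."""
    fields = parse_qs(query_string, keep_blank_values=True)
    return {name: fields.get(name, [""])[0]
            for name in ("url", "virus", "source", "user")}


params = read_warning_params(
    "url=http://www.eicar.org/download/eicar.com"
    "&source=192.168.1.3&user=mylog"
    "&virus=stream:+Eicar-Test-Signature+FOUND"
)
```

    Note that in this sample the original URL is passed as is. If that URL
    itself contained a '&' character, a generic query string parser like
    this one would cut it short, so treat the 'url' value with care.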

   Chained redirector
    SquidClamav allows you to chain another redirector with the
    'squidguard' directive. You just have to give the path to the program.
    The directive is named squidguard but the program can be any other
    redirector.

            squidguard /usr/local/squidGuard/bin/squidGuard

    The chained program is called before the virus scan and any other
    SquidClamav operation. The call to this program can be disabled with
    the 'whitelist', 'trustuser' and 'trustclient' directives; see
    "Controlling SquidClamav behaviour" for more information.

    To log every chained program redirection enable the 'logredir'
    configuration directive as follows:

            logredir 1

    By default it is disabled.

   Override User Agent
    Some ugly sites require an IE-like client browser. You can force these
    sites to accept you by using the 'useragent' directive. The default
    setting should be enough. Without it SquidClamav will present the
    libCurl user agent.

            useragent Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

   Site redirection
    When SquidClamav is downloading a file with libCurl it will not follow
    site redirections indefinitely. By default it allows 30 redirections
    for a single file.

            maxredir 30

   Download timeout
    The entire process of file download and virus scan must end before the
    client's internal timeout. Most clients have a timeout of 1 minute
    before displaying an error message to the user. The 'timeout' directive
    sets this limit (60 matches the usual one-minute client timeout).

            timeout 60

   Trust your cache
    One of the main configuration directives for performance improvement is
    'trust_cache'. SquidClamav detects whether the file to download is
    already stored in the Squid cache. If you activate 'trust_cache',
    SquidClamav will not scan a file coming from the Squid cache, as it
    should already have been scanned during the first download. If
    trust_cache is disabled, SquidClamav will rescan the same file at each
    client request, whether or not it is stored in the cache. I really
    recommend that you activate this directive.

            trust_cache 0

    Trusted cache is disabled by default as you may want to start with a
    fresh cache.

   Maxsize
    This directive allows you to completely disable virus scanning for
    files bigger than the given value in bytes. Default is 0, no size limit.

            maxsize 2000000

    If you want to abort the virus scan after a certain amount of data, take
    a look at the clamd configuration directive 'StreamMaxLength', which
    closes the stream when the given size is reached.

  Controlling SquidClamav behaviour
    Since v5.x SquidClamav scans all downloaded files by default. You have
    five directives to control the way things must work. For lower versions
    read the README file in the tarball to learn how they work and which
    configuration directives they use, but I really recommend that you
    upgrade to v5.x.

    All these directives use extended regex pattern matching and are case
    insensitive.

   Control both chained program and virus scan
    There are 3 configuration directives that allow you to disable both the
    virus scan and the call to a chained redirector like SquidGuard. These
    patterns are checked as soon as a Squid entry is received.

    whitelist
        The 'whitelist' configuration directive allows you to disable the
        chained program and the virus scan at URL level. When the given
        pattern matches the URL, SquidClamav falls back to Squid instantly.

        For example:

                whitelist \.clamav\.net

        will deliver any file from hosts on the clamav.net domain directly.

    trustuser
        The 'trustuser' directive allows you to disable the chained program
        and the virus scan when an ident matches the search pattern. When
        the regex matches, SquidClamav falls back to Squid instantly. Of
        course you must have a Squid authentication helper enabled.

        For example:

                trustuser administrator

        will spare the user logged in as administrator from the chained
        program and the virus scan.

    trustclient
        The 'trustclient' directive allows you to disable the chained
        program and the virus scan if the client source ip address or DNS
        name matches the search pattern. The source ip address can be a
        single ip or a network, depending on the given regex pattern.

        For example:

                trustclient ^192\.168\.1\.1$
                trustclient ^192\.168\.1\..*$
                trustclient ^mypc\.domain\.dom$

        The first and the last entries disable the chained program and the
        virus scan for a single computer, and the second does so for an
        entire class C network.
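
        Patterns like these are easy to get subtly wrong, so it can help
        to try them against a few addresses first. The Python sketch below
        is only an approximation: Python's re module is not the extended
        regex engine SquidClamav uses, although these simple patterns
        behave the same in both, and the function name is_trusted_client is
        hypothetical.

```python
import re

# Patterns copied from the trustclient examples above.
TRUSTCLIENT = [
    r"^192\.168\.1\.1$",
    r"^192\.168\.1\..*$",
    r"^mypc\.domain\.dom$",
]


def is_trusted_client(client, patterns=TRUSTCLIENT):
    """True if the client ip address or DNS name matches any pattern."""
    # Matching is case insensitive, as stated for all these directives.
    return any(re.search(p, client, re.IGNORECASE) for p in patterns)


# Without the ^ and $ anchors a "single host" pattern matches too much:
unanchored_hit = re.search(r"192\.168\.1\.1", "192.168.1.10") is not None
```

        The last line shows why the anchors matter: the unanchored pattern
        for 192.168.1.1 also matches 192.168.1.10.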

   Control virus scan
    There are 2 configuration directives that allow you to disable the
    virus scan for downloaded files.

    abort
        The 'abort' directive lets you disable virus scanning at URL level
        (but not the chained program). When the URL matches the regex
        pattern, SquidClamav falls back to Squid immediately after the call
        to the chained program, if one is defined.

        For example:

                abort \.squid-cache\.org
                abort .*\.(png|gif|jpg)$

        The first regexp will exclude from virus scanning any file hosted on
        the squid-cache.org domain; the last one will exclude all PNG, GIF
        and JPEG images from scanning.

    abortcontent
        The 'abortcontent' directive allows you to exclude from virus
        scanning any file whose Content-Type matches the regex pattern. This
        directive costs more time because SquidClamav needs to fetch the
        HTTP headers for the file with a HEAD request. Note that some sites
        do not answer HEAD requests; their content type cannot be retrieved,
        so their files will be scanned.

        Example:

                abortcontent ^image\/.*$
                abortcontent ^video\/x-flv$

        The first directive complements the previous
        "abort .*\.(png|gif|jpg)$" directive by matching dynamic images
        and image URLs that end with parameters. The second will allow
        your users to view streamed video instantly.
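
    The order in which the five directives are applied, as described in
    this section, can be summarized by the following Python sketch. It is a
    simplified model and not the actual SquidClamav code: the function
    names matches and decide and the dictionary layout of cfg are
    hypothetical, and the sketch ignores the case where the chained
    program itself redirects the request.

```python
import re


def matches(patterns, value):
    """True if any case-insensitive pattern is found in value."""
    return any(re.search(p, value, re.IGNORECASE) for p in patterns)


def decide(cfg, url, client, ident, content_type=None):
    """Return the list of steps applied to one request, in order."""
    # whitelist / trustuser / trustclient: skip chained program AND scan.
    if (matches(cfg.get("whitelist", []), url)
            or matches(cfg.get("trustuser", []), ident)
            or matches(cfg.get("trustclient", []), client)):
        return []
    steps = ["chained program"] if cfg.get("squidguard") else []
    # abort / abortcontent: the chained program still runs, the scan does not.
    if matches(cfg.get("abort", []), url):
        return steps
    # An unknown content type (no answer to HEAD) means the file is scanned.
    if content_type and matches(cfg.get("abortcontent", []), content_type):
        return steps
    return steps + ["virus scan"]
```

    For instance, with 'abort .*\.(png|gif|jpg)$' configured and a chained
    program defined, a request for a PNG file still goes through the
    chained program but is not scanned.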

  Configuration file example
    Here is the configuration file I use for 1500 Internet users per day.

            squid_ip 127.0.0.1
            squid_port 3128
            logfile /var/log/squid/squidclamav.log
            debug 0
            stat 0
            clamd_local /tmp/clamd
            #clamd_ip 192.168.1.5
            #clamd_port 3310
            maxsize 5000000
            redirect http://proxy.mydom.dom/cgi-bin/clwarn.cgi
            squidguard /usr/local/squidGuard/bin/squidGuard
            maxredir 30
            timeout 60
            useragent Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
            trust_cache 1

            # Do not scan standard HTTP images
            abort ^.*\.(ico|gif|png|jpg)$
            abortcontent ^image\/.*$
            # Do not scan text and javascript files
            abort ^.*\.(css|xml|xsl|js|html|jsp)$
            abortcontent ^text\/.*$
            abortcontent ^application\/x-javascript$
            # Do not scan streaming videos
            abortcontent ^video\/mp4$
            abortcontent ^video\/x-flv$
            # Do not scan pdf and flash
            #abort ^.*\.(pdf|swf)$

            # Do not scan sequence of framed Microsoft Media Server (MMS)
            abortcontent ^.*application\/x-mms-framed.*$

            # White list some sites
            whitelist .*\.clamav\.net

  Testing SquidClamav
    Once you have installed and configured squidclamav and modified the
    Squid configuration, the best way to see if squidclamav is working well
    is to test it. If you want to see detailed output set the debug option
    to 1 in the squidclamav.conf file. If you want more debug traces set
    the debug option to 2.

    Open a terminal on your proxy server and run squidclamav; this will
    give you this kind of output:

            root@theproxy# squidclamav 
            SquidClamav running as UID 0: writing logs to stderr
            Thu ... 2008 LOG Reading configuration from /etc/squidclamav.conf
            Thu ... 2008 LOG Chaining with /usr/local/squidGuard/bin/squidGuard
            Thu ... 2008 LOG SquidClamav (PID 7012) started
            Thu ... 2008 bidirectional pipe to squidGuard childs ready...

    At this point squidclamav is waiting for squid input. The input line
    consists of four fields:

            URL ip-address/fqdn ident method

    For example, let's check slashdot:

            http://www.slashdot.org/ 192.168.1.3/mypc.domain.dom mylog GET

    As this site doesn't contain any virus :-) squidclamav simply returns
    an empty line. Now to test the ClamAv antivirus let's type the
    following entry:

            http://www.eicar.org/download/eicar.com 192.168.1.3 mylog GET

    The result must be a redirection to clwarn.cgi as follows:

            Thu ... 2008 LOG Redirecting URL to: http://theproxy.com/cgi-bin/clwarn.cgi?url=http://www.eicar.org/download/eicar.com&source=192.168.1.3&user=mylog&virus=stream:+Eicar-Test-Signature+FOUND
            http://theproxy.com/cgi-bin/clwarn.cgi?url=http://www.eicar.org/download/eicar.com&source=192.168.1.3&user=mylog&virus=stream:+Eicar-Test-Signature+FOUND 192.168.1.3 mylog GET

    This last line is the request returned to squid.

    Type Ctrl+C to quit.
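
    The four-field input line and the reply used in this test session can
    be modelled with a short Python sketch. The names Request,
    parse_request and format_reply are hypothetical, and the reply layout
    is simply copied from the sample output above rather than taken from
    the SquidClamav source.

```python
from collections import namedtuple

Request = namedtuple("Request", "url client ip fqdn ident method")


def parse_request(line):
    """Split one redirector input line into its four fields."""
    url, client, ident, method = line.split()[:4]
    # The second field is 'ip-address/fqdn'; the fqdn part may be absent.
    ip, _, fqdn = client.partition("/")
    return Request(url, client, ip, fqdn or None, ident, method)


def format_reply(request, redirect_to=None):
    """An empty line leaves the request alone; otherwise send a new URL."""
    if redirect_to is None:
        return ""
    return f"{redirect_to} {request.client} {request.ident} {request.method}"
```

    Parsing the slashdot line gives ip 192.168.1.3 and fqdn
    mypc.domain.dom; a clean file yields an empty reply, and an infected
    one yields the warning URL followed by the original client, ident and
    method fields.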

SQUIDCLAMAV TUNING
   Trust your cache!
    Beginning with version 4.x squidclamav detects whether the file to
    download is already stored in the Squid cache. If you activate the
    'trust_cache' configuration option, squidclamav will no longer scan a
    file coming from the Squid cache, as it should already have been
    scanned during the first download. This saves some system load and
    improves speed a lot!

    If trust_cache is disabled, squidclamav will rescan the same file at
    each client request, whether or not it is stored in the cache. But if
    trust_cache is enabled squidclamav "thinks" this file has already been
    scanned, so it is delivered as is to the client without a new scan.

    What happens if a downloaded file contains a virus, since it is now
    stored in the cache? To prevent it from being served, squidclamav
    sends a PURGE request to squid to remove this file from the cache. This
    means that you MUST edit your acl to allow localhost to send the PURGE
    method.

    The trusted cache feature will be automatically disabled if the
    squidclient command fails or the PURGE method is forbidden.
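
    To check that your acl really lets localhost purge objects, you can
    send a PURGE request by hand. The sketch below uses Python's
    http.client instead of the squidclient command mentioned above, so it
    is a swapped-in technique for testing and not what SquidClamav runs.
    The function name purge_from_cache is hypothetical, and treating only
    a 200 answer as success is an assumption about Squid's reply codes.

```python
import http.client


def purge_from_cache(url, squid_ip="127.0.0.1", squid_port=3128, timeout=1.0):
    """Ask Squid to drop url from its cache; True if Squid answers 200."""
    conn = http.client.HTTPConnection(squid_ip, squid_port, timeout=timeout)
    try:
        # Proxy-style request: the full URL goes on the request line.
        conn.request("PURGE", url)
        return conn.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

    A False result from localhost for an object that is in the cache
    suggests that the purge acl or the http_access rules shown in the
    Squid configuration section are not in place.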

   Increase the number of listening processes
    Most of the time, if your cache is slow, it is because Squid has to
    wait for a free redirector before it can send the incoming request. In
    this case you will see messages about the redirector queue length in
    the squid cache.log. To fix that, edit your squid.conf file and
    increase the value of 'redirect_children' or 'url_rewrite_children',
    depending on your Squid version.

   My proxy is still slow after that!
    Verify that you have not enabled debug in squidclamav.conf. The
    'debug' directive must never be activated on a production server or it
    will perform like an old 486 computer.

    Do not scan images, as sites now use a lot of them; see
    squidclamav.conf to abort image scanning. Text files can also be
    excluded from the scan for better performance.

    If you set the maxsize limit to a high value, you may experience client
    timeouts and Squid may be very slow depending on the Internet usage.
    Try to lower this value in squidclamav.conf and 'StreamMaxLength' in
    clamd.conf.

    You can also try to move your clamd daemon to a dedicated server or
    upgrade your hardware.

    Still very slow? Maybe you think you can virus scan a gigabyte ISO on
    the fly. SquidClamav is not the tool for that!

   What hardware configuration should I use?
    It really depends on the number of users and their Internet usage. With
    1500+ users I use a dual-processor dual-core machine with 2 GB of
    memory. Maxsize is set to 10 MB and users cannot download files bigger
    than 20 MB. With 4000+ users a dual-processor quad-core machine with
    4 GB of memory may be enough.

BUGS
    Please report any bugs, patches, discussion, etc. to <gilles AT darold
    DOT net>.

FEATURE REQUEST
    If you need new features let me know at <gilles AT darold DOT net>. This
    helps a lot in developing a better and more useful tool.

HOW TO CONTRIBUTE?
    Any contribution to build a better tool is welcome. Just send me your
    ideas, feature requests or patches and they will be applied.

AUTHOR
    Gilles Darold <gilles AT darold DOT net>

ACKNOWLEDGEMENT
    Thanks to Squid-cache.org and Clamav.net for their great software.

    I must warmly thank all the great contributors:

            - Leonardo Humberto Liporati from www.ig.com.br
            - Dale Laushman from The Uptime Group
            - Rainer schoepf from Proteosys.com

    and all the others who helped me build a useful and reliable product.

LICENSE
    Copyright (c) 2002-2010 Gilles Darold - All rights reserved.

    This program is free software: you can redistribute it and/or modify it
    under the terms of the GNU General Public License as published by the
    Free Software Foundation, either version 3 of the License, or any later
    version.

    This program is distributed in the hope that it will be useful, but
    WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
    Public License for more details.

    You should have received a copy of the GNU General Public License along
    with this program. If not, see <http://www.gnu.org/licenses/>.

