<!--
     The FreeBSD Documentation Project

     $FreeBSD: doc/en_US.ISO8859-1/books/handbook/config/chapter.sgml,v 1.44 2002/02/28 22:38:15 tom Exp $
-->

<chapter id="config-tuning">
  <chapterinfo>
    <authorgroup>
      <author>
        <firstname>Chern</firstname>
	<surname>Lee</surname>
	<contrib>Written by </contrib>
      </author>
    </authorgroup>
    <authorgroup>
      <author>
        <firstname>Mike</firstname>
	<surname>Smith</surname>
	<contrib>Based on a tutorial written by </contrib>
      </author>
    </authorgroup>
    <authorgroup>
      <author>
        <firstname>Matt</firstname>
	<surname>Dillon</surname>
	<contrib>Also based on tuning(7) written by </contrib>
      </author>
    </authorgroup>
  </chapterinfo>

  <title>Configuration and Tuning</title>

  <sect1>
    <title>Synopsis</title>

    <indexterm><primary>system configuration/optimization</primary></indexterm>

    <para>Configuring a system correctly can substantially reduce the
      amount of work involved in maintaining and upgrading it in the
      future.  This chapter describes some of the aspects of
      administrative configuration of FreeBSD systems.</para>

    <para>This chapter will also describe some of the parameters that
      can be set to tune a FreeBSD system for optimum
      performance.</para>

    <para>After reading this chapter, you will know:</para>

    <itemizedlist>
      <listitem>
	<para>Why and how to efficiently size, lay out, and place
	  filesystems and swap partitions on your hard drive.</para>
      </listitem>
      <listitem>
	<para>The basics of the <filename>rc.conf</filename> configuration and
	  <filename>/usr/local/etc/rc.d</filename> startup systems.</para>
      </listitem>
      <listitem>
	<para>How to configure virtual hosts on your network devices.</para>
      </listitem>
      <listitem>
	<para>How to use the various configuration files in
	  <filename>/etc</filename>.</para>
      </listitem>
      <listitem>
        <para>How to tune FreeBSD using <command>sysctl</command>
          variables.</para>
      </listitem>
      <listitem>
	<para>How to tune disk performance and modify kernel
	  limitations.</para>
      </listitem>
    </itemizedlist>

    <para>Before reading this chapter, you should:</para>

    <itemizedlist>
      <listitem>
	<para>Understand the basics of Unix and FreeBSD (<xref
	    linkend="basics">).</para>
      </listitem>
      <listitem>
	<para>Be familiar with keeping FreeBSD sources up to date
	  (<xref linkend="cutting-edge">), and
	  the basics of kernel configuration/compilation
	  (<xref linkend="kernelconfig">).</para>
      </listitem>
    </itemizedlist>
  </sect1>

  <sect1 id="configtuning-initial">
    <title>Initial Configuration</title>

    <sect2>
      <title>Partition Layout</title>

      <indexterm><primary>Partition layout</primary></indexterm>
      <indexterm>
        <primary><filename>/etc</filename></primary>
      </indexterm>
      <indexterm>
        <primary><filename>/var</filename></primary>
      </indexterm>
      <indexterm>
        <primary><filename>/usr</filename></primary>
      </indexterm>

      <sect3>
	<title>Base Partitions</title>

	<para>When laying out your filesystem with &man.disklabel.8;
	  or &man.sysinstall.8;, it is important to remember that hard
	  drives can transfer data at a faster rate from the outer
	  tracks than the inner.  Knowing this, you should place your
	  smaller, heavily-accessed filesystems, such as root and
	  swap, closer to the outside of the drive, while placing
	  larger partitions, such as <filename>/usr</filename>,
	  towards the inner.  To do so, it is a good idea to create
	  partitions in a similar order: root, swap,
	  <filename>/var</filename>, <filename>/usr</filename>.</para>
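
	<para>As a rough sketch, a disk partitioned in that order might
	  end up with an <filename>/etc/fstab</filename> like the
	  following.  The device name <devicename>ad0s1</devicename>
	  and the partition letters are assumptions for a single IDE
	  disk, not values taken from a real installation:</para>

	<programlisting># Device        Mountpoint  FStype  Options  Dump  Pass#
/dev/ad0s1a     /           ufs     rw       1       1
/dev/ad0s1b     none        swap    sw       0       0
/dev/ad0s1e     /var        ufs     rw       2       2
/dev/ad0s1f     /usr        ufs     rw       2       2</programlisting>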

	<para>The size of your <filename>/var</filename> partition
	  reflects the intended use of your machine.
	  <filename>/var</filename> is primarily used to hold
	  mailboxes, log files, and printer spools.  Mailboxes and log
	  files, in particular, can grow to unexpected sizes based
	  upon how many users are on your system and how long your log
	  files are kept.  If you intend to run a mail server, a
	  <filename>/var</filename> partition of over a gigabyte can
	  be suitable.  Additionally, <filename>/var/tmp</filename>
	  must be large enough to contain any packages you may wish to
	  add.</para>

	<para>The <filename>/usr</filename> partition holds the bulk
	  of the files required to support the system and a
	  subdirectory within it called
	  <filename>/usr/local</filename> holds the bulk of the files
	  installed from the &man.ports.7; hierarchy.  If you do not
	  use ports all that much and do not intend to keep system
	  source (<filename>/usr/src</filename>) on the machine, you
	  can get away with a 1 gigabyte <filename>/usr</filename>
	  partition.  However, if you install a lot of ports
	  (especially window managers and Linux binaries), we
	  recommend at least a two gigabyte <filename>/usr</filename>
	  and if you also intend to keep system source on the machine,
	  we recommend a three gigabyte <filename>/usr</filename>.  Do
	  not underestimate the amount of space you will need in this
	  partition; it can creep up and surprise you!</para>

	<para>When sizing your partitions, keep in mind the space
	  requirements for your system to grow.  Running out of space in
	  one partition while having plenty in another can lead to much
	  frustration.</para>

	<note><para>Some users who have used &man.sysinstall.8;'s
	    <literal>Auto-defaults</literal> partition sizer have found
	    either their root or <filename>/var</filename> partitions too
	    small later on.  Partition wisely and
	    generously.</para></note>

      </sect3>

      <sect3 id="swap-design">
	<title>Swap Partition</title> 

	<indexterm><primary>swap sizing</primary></indexterm>
	<indexterm><primary>swap partition</primary></indexterm>

	<para>As a rule of thumb, your swap space should typically be
	  double the amount of main memory.  For example, if the machine
	  has 128 megabytes of memory, the swap partition should be 256
	  megabytes.  Systems with less memory may perform better with
	  a lot more swap.  It is not recommended that you configure any
	  less than 256 megabytes of swap on a system, and you should
	  keep in mind future memory expansion when sizing the swap
	  partition.  The kernel's VM paging algorithms are tuned to
	  perform best when the swap partition is at least two times the
	  size of main memory.  Configuring too little swap can lead to
	  inefficiencies in the VM page scanning code as well as create
	  issues later on if you add more memory to your machine.</para>

	<para>Finally, on larger systems with multiple SCSI disks (or
	  multiple IDE disks operating on different controllers), it is
	  strongly recommended that you configure swap on each drive (up
	  to four drives).  The swap partitions on the drives should be
	  approximately the same size.  The kernel can handle arbitrary
	  sizes but internal data structures scale to 4 times the
	  largest swap partition.  Keeping the swap partitions near the
	  same size will allow the kernel to optimally stripe swap space
	  across the disks.  Do not worry about overdoing it a little;
	  swap space is the saving grace of Unix.  Even if you do not
	  normally use much swap, it can give you more time to recover
	  from a runaway program before being forced to reboot.</para>
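
	<para>For illustration, a machine with two SCSI disks could list
	  one swap partition per drive in
	  <filename>/etc/fstab</filename>.  The device names below are
	  hypothetical; substitute the ones on your own system:</para>

	<programlisting># Device        Mountpoint  FStype  Options  Dump  Pass#
/dev/da0s1b     none        swap    sw       0       0
/dev/da1s1b     none        swap    sw       0       0</programlisting>

	<para>Running <command>swapinfo</command> afterwards lists each
	  configured swap device and how much of it is in use, which
	  makes it easy to confirm that both partitions are
	  active.</para>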
      </sect3>

      <sect3>
	<title>Why Partition?</title>

	<para>Why partition at all?  Why not create one big root
	  partition and be done with it?  Then I do not have to worry
	  about undersizing things!</para>

	<para>There are several reasons this is not a good idea.
	  First, each partition has different operational
	  characteristics and separating them allows the filesystem to
	  tune itself to those characteristics.  For example, the root
	  and <filename>/usr</filename> partitions are read-mostly, with
	  very little writing, while a lot of reading and writing could
	  occur in <filename>/var</filename> and
	  <filename>/var/tmp</filename>.</para>

	<para>By properly partitioning your system, fragmentation
	  introduced in the smaller, more heavily write-loaded partitions
	  will not bleed over into the mostly-read partitions.
	  Additionally, keeping the write-loaded partitions closer to
	  the edge of the disk, for example before the really big
	  partition instead of after in the partition table, will
	  increase I/O performance in the partitions where you need it
	  the most.  Now it is true that you might also need I/O
	  performance in the larger partitions, but they are so large
	  that shifting them more towards the edge of the disk will not
	  lead to a significant performance improvement whereas moving
	  <filename>/var</filename> to the edge can have a huge impact.
	  Finally, there are safety concerns.  Having a small, neat root
	  partition that is essentially read-only gives it a greater
	  chance of surviving a bad crash intact.</para>
      </sect3>
    </sect2>

  </sect1>

  <sect1 id="configtuning-core-configuration">
    <title>Core Configuration</title>

    <indexterm>
      <primary>rc files</primary>
      <secondary><filename>rc.conf</filename></secondary>
    </indexterm>

    <para>The principal location for system configuration information
      is within <filename>/etc/rc.conf</filename>.  This file
      contains a wide range of configuration information, principally
      used at system startup to configure the system. Its name
      directly implies this; it is configuration information for the
      <filename>rc*</filename> files.</para>

    <para>An administrator should make entries in the 
      <filename>rc.conf</filename> file to
      override the default settings from
      <filename>/etc/defaults/rc.conf</filename>.  The defaults file
      should not be copied verbatim to <filename>/etc</filename>; it
      contains default values, not examples.  All system-specific
      changes should be made in the <filename>rc.conf</filename>
      file itself.</para>

    <para>A number of strategies may be applied in clustered
      applications to separate site-wide configuration from
      system-specific configuration in order to keep administration
      overhead down.  The recommended approach is to place site-wide
      configuration into another file,
      such as <filename>/etc/rc.conf.site</filename>, and then include
      this file into <filename>/etc/rc.conf</filename>, which will
      contain only system-specific information.</para>

    <para>Because <filename>rc.conf</filename> is read by &man.sh.1;,
      this is trivial to achieve.  For example:</para>

    <itemizedlist>
      <listitem><para>rc.conf:</para>
<programlisting>	. /etc/rc.conf.site
	hostname="node15.example.com"
	network_interfaces="fxp0 lo0"
	ifconfig_fxp0="inet 10.1.1.1"</programlisting></listitem>
      <listitem><para>rc.conf.site:</para>
<programlisting>	defaultrouter="10.1.1.254"
	saver="daemon"
	blanktime="100"</programlisting></listitem>
    </itemizedlist>

    <para>The <filename>rc.conf.site</filename> file can then be
      distributed to every system using <command>rsync</command> or a
      similar program, while the <filename>rc.conf</filename> file
      remains unique.</para>
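
    <para>A minimal sketch of such a distribution step follows.  It
      assumes that <command>rsync</command> is installed from the
      ports collection on both machines and that
      <hostid>node16.example.com</hostid> is a hypothetical second
      node reachable over <command>ssh</command>:</para>

    <screen>&prompt.root; <userinput>rsync -a -e ssh /etc/rc.conf.site node16.example.com:/etc/rc.conf.site</userinput></screen>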

    <para>Upgrading the system using &man.sysinstall.8;
      or <command>make world</command> will not overwrite the 
      <filename>rc.conf</filename>
      file, so system configuration information will not be lost.</para>

  </sect1>

  <sect1 id="configtuning-appconfig">
    <title>Application Configuration</title>

    <para>Typically, installed applications have their own
      configuration files, with their own syntax, etc.  It is
      important that these files be kept separate from the base
      system, so that they may be easily located and managed by the
      package management tools.</para>

    <indexterm><primary>/usr/local/etc</primary></indexterm>

    <para>Typically, these files are installed in
      <filename>/usr/local/etc</filename>.  In the case where an
      application has a large number of configuration files, a
      subdirectory will be created to hold them.</para>

    <para>Normally, when a port or package is installed, sample
      configuration files are also installed.  These are usually
      identified with a <filename>.default</filename> suffix.  If
      there are no existing
      configuration files for the application, they will be created by
      copying the <filename>.default</filename> files.</para>

    <para>For example, consider the contents of the directory 
    <filename>/usr/local/etc/apache</filename>:</para>

<literallayout class="monospaced">-rw-r--r--  1 root  wheel   2184 May 20  1998 access.conf
-rw-r--r--  1 root  wheel   2184 May 20  1998 access.conf.default
-rw-r--r--  1 root  wheel   9555 May 20  1998 httpd.conf
-rw-r--r--  1 root  wheel   9555 May 20  1998 httpd.conf.default
-rw-r--r--  1 root  wheel  12205 May 20  1998 magic
-rw-r--r--  1 root  wheel  12205 May 20  1998 magic.default
-rw-r--r--  1 root  wheel   2700 May 20  1998 mime.types
-rw-r--r--  1 root  wheel   2700 May 20  1998 mime.types.default
-rw-r--r--  1 root  wheel   7980 May 20  1998 srm.conf
-rw-r--r--  1 root  wheel   7933 May 20  1998 srm.conf.default</literallayout>

    <para>The difference in file size shows that only the <filename>srm.conf</filename>
      file has been changed.  A later update of the Apache port would not
      overwrite this changed file.</para>
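
    <para>To see exactly what was changed, the installed file can be
      compared against its pristine copy.  This is only a sketch
      using the file names from the listing above:</para>

    <screen>&prompt.user; <userinput>diff -u /usr/local/etc/apache/srm.conf.default /usr/local/etc/apache/srm.conf</userinput></screen>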

  </sect1>

  <sect1 id="configtuning-starting-services">
    <title>Starting Services</title>

    <indexterm><primary>services</primary></indexterm>

    <para>It is common for a system to host a number of services.
      These may be started in several different fashions, each having
      different advantages.</para>

    <indexterm><primary>/usr/local/etc/rc.d</primary></indexterm>

    <para>Software installed from a port or the packages collection
      will often place a script in
      <filename>/usr/local/etc/rc.d</filename> which is invoked at
      system startup with a <option>start</option> argument, and at
      system shutdown with a <option>stop</option> argument.
      This is the recommended way for
      starting system-wide services that are to be run as
      <username>root</username>, or that
      expect to be started as <username>root</username>.
      These scripts are registered as
      part of the installation of the package, and will be removed
      when the package is removed.</para>

    <para>A generic startup script in 
      <filename>/usr/local/etc/rc.d</filename> looks like:</para>

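    <programlisting>#!/bin/sh
# Print the service name on the console without a trailing newline
echo -n ' FooBar'

case "$1" in
start)
        /usr/local/bin/foobar
        ;;
stop)
        # Send the default SIGTERM so the daemon can clean up;
        # SIGKILL (-9) cannot be caught and should be a last resort
        kill `cat /var/run/foobar.pid`
        ;;
*)
        echo "Usage: `basename $0` {start|stop}" >&2
        exit 64
        ;;
esac

exit 0</programlisting>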

    <para>This script is called with <option>start</option>
      at startup, and with <option>stop</option> at shutdown, to allow
      it to carry out its purpose.</para>

    <para>Some services expect to be invoked by &man.inetd.8; when a
      connection is received on a suitable port.  This is common for
      mail reader servers (POP and IMAP, etc.).  These services are
      enabled by editing the file <filename>/etc/inetd.conf</filename>.
      See &man.inetd.8; for details on editing this file.</para>
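
    <para>As an illustration, an <filename>/etc/inetd.conf</filename>
      entry for a POP3 server might look like the line below.  It
      assumes a POP daemon installed from ports as
      <filename>/usr/local/libexec/popper</filename>; adjust the path
      and program name for the server you actually use:</para>

    <programlisting>pop3    stream  tcp     nowait  root    /usr/local/libexec/popper       popper</programlisting>

    <para>After the file has been edited, &man.inetd.8; rereads it
      when sent a hangup signal:</para>

    <screen>&prompt.root; <userinput>kill -HUP `cat /var/run/inetd.pid`</userinput></screen>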

    <para>Some additional system services may not be covered by the
      toggles in <filename>/etc/rc.conf</filename>.  These are
      traditionally enabled by placing the command(s) to invoke them
      in <filename>/etc/rc.local</filename>.  As of FreeBSD 3.1 there
      is no default <filename>/etc/rc.local</filename>; if it is
      created by the administrator it will however be honored in the
      normal fashion.  Note that <filename>rc.local</filename> is
      generally regarded as the location of last resort; if there is a
      better place to start a service, do it there.</para>

    <note><para>Do <emphasis>not</emphasis> place any commands in 
      <filename>/etc/rc.conf</filename>.  To start daemons, or
      run any commands at boot time, place a script in 
      <filename>/usr/local/etc/rc.d</filename> instead.</para>
    </note>

    <para>It is also possible to use the &man.cron.8; daemon to start
      system services.  This approach has a number of advantages, not
      least being that because &man.cron.8; runs these processes as the
      owner of the <command>crontab</command>, services may be started
      and maintained by non-<username>root</username> users.</para>
    
    <para>This takes advantage of a feature of &man.cron.8;: the
      time specification may be replaced by <literal>@reboot</literal>,
      which will
      cause the job to be run when &man.cron.8; is started shortly after
      system boot.</para>
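
    <para>As a sketch, a non-<username>root</username> user could add
      a line like the following with <command>crontab -e</command>.
      The path <filename>/home/user/bin/foobar</filename> is a
      hypothetical program, not something installed by
      default:</para>

    <programlisting># Run once each time cron starts, which normally means once per boot
@reboot /home/user/bin/foobar</programlisting>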
  </sect1>

  <sect1 id="configtuning-virtual-hosts">
    <title>Virtual Hosts</title>

    <indexterm><primary>virtual hosts</primary></indexterm>
    <indexterm><primary>ip aliases</primary></indexterm>

    <para>A very common use of FreeBSD is virtual site hosting, where
      one server appears to the network as many servers.  This is
      achieved by assigning multiple network addresses to a single
      interface.</para>

    <para>A given network interface has one <quote>real</quote> address,
      and may have any number of <quote>alias</quote> addresses.
      These aliases are
      normally added by placing alias entries in
      <filename>/etc/rc.conf</filename>.</para>

    <para>An alias entry for the interface <devicename>fxp0</devicename>
      looks like:</para>

<programlisting>ifconfig_fxp0_alias0="inet xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx"</programlisting>

    <para>Note that alias entries must start with
      <literal>alias0</literal> and proceed upwards in order (for
      example, <literal>_alias1</literal>, <literal>_alias2</literal>,
      and so on).  The configuration process will stop at the first
      missing number.</para>

    <para>The calculation of alias netmasks is important, but
      fortunately quite simple.  For a given interface, there must be
      one address which correctly represents the network's netmask.
      Any other addresses which fall within this network must have a
      netmask of all 1's, that is, 255.255.255.255.</para>

    <para>For example, consider the case where the 
      <devicename>fxp0</devicename> interface is
      connected to two networks, the 10.1.1.0 network with a netmask
      of 255.255.255.0 and the 202.0.75.16 network with a netmask of
      255.255.255.240.  We want the system to appear at 10.1.1.1
      through 10.1.1.5 and at 202.0.75.17 through 202.0.75.20.</para>

    <para>The following entries configure the adapter correctly for
      this arrangement:</para>

<programlisting> ifconfig_fxp0="inet 10.1.1.1 netmask 255.255.255.0"
 ifconfig_fxp0_alias0="inet 10.1.1.2 netmask 255.255.255.255"
 ifconfig_fxp0_alias1="inet 10.1.1.3 netmask 255.255.255.255"
 ifconfig_fxp0_alias2="inet 10.1.1.4 netmask 255.255.255.255"
 ifconfig_fxp0_alias3="inet 10.1.1.5 netmask 255.255.255.255"
 ifconfig_fxp0_alias4="inet 202.0.75.17 netmask 255.255.255.240"
 ifconfig_fxp0_alias5="inet 202.0.75.18 netmask 255.255.255.255"
 ifconfig_fxp0_alias6="inet 202.0.75.19 netmask 255.255.255.255"
 ifconfig_fxp0_alias7="inet 202.0.75.20 netmask 255.255.255.255"</programlisting>
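
    <para>An alias can also be added to a running system without
      rebooting.  As a sketch, the first alias entry above
      corresponds to the following command:</para>

    <screen>&prompt.root; <userinput>ifconfig fxp0 inet 10.1.1.2 netmask 255.255.255.255 alias</userinput></screen>

    <para>An alias added this way lasts only until the next reboot,
      so the matching entry in <filename>/etc/rc.conf</filename> is
      still needed to make it permanent.</para>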

  </sect1>

  <sect1 id="configtuning-configfiles">
    <title>Configuration Files</title>

    <sect2>
      <title><filename>/etc</filename> Layout</title>
      <para>There are a number of directories in which configuration
	information is kept.  These include:</para>

      <informaltable frame="none">
	<tgroup cols="2">
	  <tbody>
	    <row>
	      <entry><filename>/etc</filename></entry>
	      <entry>Generic system configuration information; data here is 
		system-specific.</entry>
	    </row>
	    <row>
	      <entry><filename>/etc/defaults</filename></entry>
	      <entry>Default versions of system configuration files.</entry>
	    </row>
	    <row>
	      <entry><filename>/etc/mail</filename></entry>
	      <entry>Extra &man.sendmail.8; configuration, other 
		MTA configuration files.
	      </entry>
	    </row>
	    <row>
	      <entry><filename>/etc/ppp</filename></entry>
	      <entry>Configuration for both user- and kernel-ppp programs.
	      </entry>
	    </row>
	    <row>
	      <entry><filename>/etc/namedb</filename></entry>
	      <entry>Default location for &man.named.8; data.  Normally the
		boot file is located here, and contains a directive to
		refer to other data in <filename>/var/db</filename>.</entry>
	    </row>
	    <row>
	      <entry><filename>/usr/local/etc</filename></entry> 
	      <entry>Configuration files for installed applications.
		May contain per-application subdirectories.</entry>
	    </row>
	    <row>
	      <entry><filename>/usr/local/etc/rc.d</filename></entry>
	      <entry>Start/stop scripts for installed applications.</entry>
	    </row>
	    <row>
	      <entry><filename>/var/db</filename></entry>
	      <entry>Persistent system-specific data files, such as 
		&man.named.8; zone files, database files, and so on.</entry>
	    </row>
	  </tbody>
	</tgroup>
      </informaltable>
    </sect2>

    <sect2>
      <title>Hostnames</title>

      <indexterm><primary>hostname</primary></indexterm>
      <indexterm><primary>DNS</primary></indexterm>

      <sect3>
	<title><filename>/etc/resolv.conf</filename></title>

	<indexterm>
	  <primary><filename>resolv.conf</filename></primary>
	</indexterm>

	<para><filename>/etc/resolv.conf</filename> dictates how FreeBSD's
	  resolver accesses the Internet Domain Name System (DNS).</para>

	<para>The most common entries in <filename>resolv.conf</filename> are:
	</para>
      
	<informaltable frame="none">
	  <tgroup cols="2">
	    <tbody>
	      <row>
		<entry><literal>nameserver</literal></entry> 
		<entry>The IP address of a name server the resolver
		  should query.  The servers are queried in the order
		  listed with a maximum of three.</entry>
	      </row>
	      <row>
		<entry><literal>search</literal></entry>
		<entry>Search list for hostname lookup.  This is normally
		  determined by the domain of the local hostname.</entry>
	      </row>
	      <row>
		<entry><literal>domain</literal></entry>
		<entry>The local domain name.</entry>
	      </row>
	    </tbody>
	  </tgroup>
	</informaltable>

	<para>A typical <filename>resolv.conf</filename>:</para>

	<programlisting>search example.com
nameserver 147.11.1.11
nameserver 147.11.100.30</programlisting>

	<note><para>Only one of the <literal>search</literal> and
	  <literal>domain</literal> options should be used.</para></note>

	<para>If you are using DHCP, &man.dhclient.8; usually rewrites 
	  <filename>resolv.conf</filename> with information received from the 
	  DHCP server.</para>
      </sect3>

      <sect3>
	<title><filename>/etc/hosts</filename></title>

	<indexterm><primary>hosts</primary></indexterm>
	
	<para><filename>/etc/hosts</filename> is a simple text
	  database reminiscent of the old Internet.  It works in
	  conjunction with DNS and NIS providing name to IP address
	  mappings.  Local computers connected via a LAN can be placed
	  in here for simplistic naming purposes instead of setting up
	  a &man.named.8; server.  Additionally,
	  <filename>/etc/hosts</filename> can be used to provide a
	  local record of Internet names, reducing the need to query
	  externally for commonly accessed names.</para>

	<programlisting># &dollar;FreeBSD&dollar;
#
# Host Database
# This file should contain the addresses and aliases
# for local hosts that share this file.
# In the presence of the domain name service or NIS, this file may
# not be consulted at all; see /etc/nsswitch.conf for the resolution order.
#
#
::1                     localhost localhost.my.domain myname.my.domain
127.0.0.1               localhost localhost.my.domain myname.my.domain

#
# Imaginary network.
#10.0.0.2               myname.my.domain myname
#10.0.0.3               myfriend.my.domain myfriend
#
# According to RFC 1918, you can use the following IP networks for
# private nets which will never be connected to the Internet:
#
#       10.0.0.0        -   10.255.255.255
#       172.16.0.0      -   172.31.255.255
#       192.168.0.0     -   192.168.255.255
#
# In case you want to be able to connect to the Internet, you need
# real official assigned numbers.  PLEASE PLEASE PLEASE do not try
# to invent your own network numbers but instead get one from your
# network provider (if any) or from the Internet Registry (ftp to
# rs.internic.net, directory `/templates').
#</programlisting>

	<para><filename>/etc/hosts</filename> takes on the simple format
	  of:</para>
	
	<programlisting>[Internet address] [official hostname] [alias1] [alias2] ...</programlisting>

	<para>For example:</para>

	<programlisting>10.0.0.1 myRealHostname.example.com myRealHostname foobar1 foobar2</programlisting>

	<para>Consult &man.hosts.5; for more information.</para>
      </sect3>
    </sect2>

    <sect2>
      <title>Log File Configuration</title>
     
      <indexterm><primary>log files</primary></indexterm>
      
      <sect3>
	<title><filename>syslog.conf</filename></title>
	
	<indexterm><primary>syslog.conf</primary></indexterm>
	
	<para><filename>syslog.conf</filename> is the configuration file
	  for the &man.syslogd.8; program.  It indicates which types
	  of <command>syslog</command> messages are logged to particular
	  log files.</para>

	<programlisting># &dollar;FreeBSD&dollar;
#
#       Spaces ARE valid field separators in this file. However,
#       other *nix-like systems still insist on using tabs as field
#       separators. If you are sharing this file between systems, you
#       may want to use only tabs as field separators here.
#       Consult the syslog.conf(5) manual page.
*.err;kern.debug;auth.notice;mail.crit          /dev/console
*.notice;kern.debug;lpr.info;mail.crit;news.err /var/log/messages
security.*                                      /var/log/security
mail.info                                       /var/log/maillog
lpr.info                                        /var/log/lpd-errs
cron.*                                          /var/log/cron
*.err                                           root
*.notice;news.err                               root
*.alert                                         root
*.emerg                                         *
# uncomment this to log all writes to /dev/console to /var/log/console.log
#console.info                                   /var/log/console.log
# uncomment this to enable logging of all log messages to /var/log/all.log
#*.*                                            /var/log/all.log
# uncomment this to enable logging to a remote log host named loghost
#*.*                                            @loghost
# uncomment these if you're running inn
# news.crit                                     /var/log/news/news.crit
# news.err                                      /var/log/news/news.err
# news.notice                                   /var/log/news/news.notice
!startslip
*.*                                             /var/log/slip.log
!ppp
*.*                                             /var/log/ppp.log</programlisting>

	<para>Consult the &man.syslog.conf.5; manual page for more
	  information.</para>
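
	<para>As a sketch of adding an entry, the following line sends
	  every message logged to the <literal>local0</literal>
	  facility to its own file.  The file name
	  <filename>/var/log/foobar.log</filename> is hypothetical.  In
	  a file like the one above, the line should be placed before
	  the first line beginning with <literal>!</literal>, because
	  entries after such a line apply only to the named
	  program:</para>

	<programlisting>local0.*                                        /var/log/foobar.log</programlisting>

	<para>The log file must already exist, and &man.syslogd.8; must
	  be told to reread its configuration.  A test message can then
	  be sent with <command>logger</command>:</para>

	<screen>&prompt.root; <userinput>touch /var/log/foobar.log</userinput>
&prompt.root; <userinput>kill -HUP `cat /var/run/syslog.pid`</userinput>
&prompt.user; <userinput>logger -p local0.notice "test message"</userinput></screen>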
      </sect3>

      <sect3>
	<title><filename>newsyslog.conf</filename></title>

	<indexterm><primary>newsyslog.conf</primary></indexterm>
	
	<para><filename>newsyslog.conf</filename> is the configuration
	  file for &man.newsyslog.8;, a program that is normally scheduled
	  to run by &man.cron.8;.  &man.newsyslog.8; determines when log
	  files require archiving or rearranging.
	  <filename>logfile</filename> is moved to
	  <filename>logfile.0</filename>, <filename>logfile.0</filename>
	  is moved to <filename>logfile.1</filename>, and so on.
	  Alternatively, the log files may be archived in &man.gzip.1; format
	  causing them to be named: <filename>logfile.0.gz</filename>,
	  <filename>logfile.1.gz</filename>, and so on.</para>

	<para><filename>newsyslog.conf</filename> indicates which log
	  files are to be managed, how many are to be kept, and when
	  they are to be rotated.  Log files can be rotated and/or
	  archived when they have either reached a certain size, or at a
	  certain periodic time/date.</para>
	
	<programlisting># configuration file for newsyslog
# &dollar;FreeBSD&dollar;
#
# filename          [owner:group]    mode count size when [ZB] [/pid_file] [sig_num]
/var/log/cron                           600  3     100  *     Z
/var/log/amd.log                        644  7     100  *     Z
/var/log/kerberos.log                   644  7     100  *     Z
/var/log/lpd-errs                       644  7     100  *     Z
/var/log/maillog                        644  7     *    @T00  Z
/var/log/sendmail.st                    644  10    *    168   B
/var/log/messages                       644  5     100  *     Z
/var/log/all.log                        600  7     *    @T00  Z
/var/log/slip.log                       600  3     100  *     Z
/var/log/ppp.log                        600  3     100  *     Z
/var/log/security                       600  10    100  *     Z
/var/log/wtmp                           644  3     *    @01T05 B
/var/log/daily.log                      640  7     *    @T00  Z
/var/log/weekly.log                     640  5     1    $W6D0 Z
/var/log/monthly.log                    640  12    *    $M1D0 Z
/var/log/console.log                    640  5     100  *     Z</programlisting>
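
	<para>Reading one entry field by field may help.  In the
	  hypothetical line below
	  (<filename>/var/log/foobar.log</filename> is not a standard
	  log file), the log has mode 644, seven old copies are kept,
	  the file is rotated once it grows past 100 kilobytes, the
	  asterisk in the <literal>when</literal> column means there
	  is no time-based rotation, and <literal>Z</literal>
	  compresses the archived copies with &man.gzip.1;:</para>

	<programlisting>/var/log/foobar.log                     644  7     100  *     Z</programlisting>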

	<para>Consult the &man.newsyslog.8; manual page for more
	  information.</para>
      </sect3>
    </sect2>

    <sect2>
      <title><filename>sysctl.conf</filename></title>

      <indexterm><primary>sysctl.conf</primary></indexterm>
      <indexterm><primary>sysctl</primary></indexterm>

      <para><filename>sysctl.conf</filename> looks much like 
	<filename>rc.conf</filename>.  Values are set in a 
	<literal>variable=value</literal>
	form.  The specified values are set after the system goes into
	multi-user mode.  Not all variables are settable in this mode.</para>

      <para>The following sample <filename>sysctl.conf</filename> turns off logging
	of fatal signal exits and lets Linux programs know they are really
	running under FreeBSD:</para>

      <programlisting>kern.logsigexit=0       # Do not log fatal signal exits (e.g. sig 11)
compat.linux.osname=FreeBSD
compat.linux.osrelease=4.3-STABLE</programlisting>
    </sect2>
  </sect1>

  <sect1 id="configtuning-sysctl">
    <title>Tuning with sysctl</title>

    <indexterm><primary>sysctl</primary></indexterm>
    <indexterm><primary>Tuning with sysctl</primary></indexterm>
    
    <para>&man.sysctl.8; is an interface that allows you to make changes
      to a running FreeBSD system.  This includes many advanced
      options of the TCP/IP stack and virtual memory system that an
      experienced system administrator can use to dramatically improve
      performance.  Over five hundred system variables can be read
      and set using &man.sysctl.8;.</para>
    
    <para>At its core, &man.sysctl.8; serves two functions: to read and
      to modify system settings.</para>

    <para>To view all readable variables:</para>

    <screen>&prompt.user; <userinput>sysctl -a</userinput></screen>
    
    <para>To read a particular variable, for example,
      <varname>kern.maxproc</varname>:</para>
    
    <screen>&prompt.user; <userinput>sysctl kern.maxproc</userinput>
kern.maxproc: 1044</screen>

    <para>To set a particular variable, use the intuitive
      <replaceable>variable</replaceable>=<replaceable>value</replaceable>
      syntax:</para>
    
    <screen>&prompt.root; <userinput>sysctl kern.maxfiles=5000</userinput>
kern.maxfiles: 2088 -> 5000</screen>

    <para>Settings of sysctl variables are usually either strings,
      numbers, or booleans (a boolean being <literal>1</literal> for yes
      or <literal>0</literal> for no).</para>
  </sect1>

  <sect1 id="configtuning-disk">
    <title>Tuning Disks</title>

    <sect2>
      <title>Sysctl Variables</title>
      
      <sect3>
	<title><varname>vfs.vmiodirenable</varname></title>
     
	<indexterm>
	  <primary><varname>vfs.vmiodirenable</varname></primary>
	</indexterm>
	
	<para>The <varname>vfs.vmiodirenable</varname> sysctl variable
	  may be set to either 0 (off) or 1 (on); it is 1 by default.  This variable controls how
	  directories are cached by the system.  Most directories are
	  small, using just a single fragment (typically 1K) in the
	  filesystem and less (typically 512 bytes) in the buffer
	  cache.  However, with this sysctl turned off the buffer
	  cache will only cache a fixed number of directories even if
	  you have a huge amount of memory.  Turning on this sysctl
	  allows the buffer cache to use the VM Page Cache to cache the
	  directories, making all the memory available for caching
	  directories.  The trade-off is that
	  the minimum in-core memory used to cache a directory is the
	  physical page size (typically 4K) rather than 512 bytes.  We
	  recommend turning this option on if you are running any
	  services which manipulate large numbers of files.  Such
	  services can include web caches, large mail systems, and news
	  systems.  Turning on this option will generally not reduce
	  performance even with the wasted memory but you should
	  experiment to find out.</para>
      </sect3>
     
      <sect3>
	<title><varname>hw.ata.wc</varname></title>
      
	<indexterm>
	  <primary><varname>hw.ata.wc</varname></primary>
	</indexterm>

	<para>FreeBSD 4.3 flirted with turning off IDE write caching.
	  This reduced write bandwidth to IDE disks but was considered
	  necessary due to serious data consistency issues introduced
	  by hard drive vendors.  The problem is that IDE
	  drives lie about when a write completes.  With IDE write
	  caching turned on, IDE hard drives not only write data
	  to disk out of order, but will sometimes delay writing some
	  blocks indefinitely when under heavy disk loads.  A crash or
	  power failure may cause serious filesystem corruption.
	  FreeBSD's default was changed to be safe.  Unfortunately, the
	  result was such a huge performance loss that we changed
	  write caching back to on by default after the release.  You
	  should check the default on your system by observing the
	  <varname>hw.ata.wc</varname> sysctl variable.  If IDE write
	  caching is turned off, you can turn it back on by setting
	  the kernel variable back to 1.  This must be done from the
	  boot loader at boot time.  Attempting to do it after the
	  kernel boots will have no effect.</para>
	
	<para>For more information, please see &man.ata.4;.</para>
      </sect3>
    </sect2>

    <sect2>
      <title>Soft Updates</title>

      <indexterm><primary>Soft Updates</primary></indexterm>
      <indexterm><primary>tunefs</primary></indexterm>
      
      <para>The &man.tunefs.8; program can be used to fine-tune a
	filesystem.  This program has many different options, but for
	now we are only concerned with toggling Soft Updates on and
	off, which is done by:</para>

      <screen>&prompt.root; <userinput>tunefs -n enable /filesystem</userinput>
&prompt.root; <userinput>tunefs -n disable /filesystem</userinput></screen>

      <para>A filesystem cannot be modified with &man.tunefs.8; while
	it is mounted.  A good time to enable Soft Updates is before any
	partitions have been mounted, in single-user mode.</para>

      <note><para>As of FreeBSD 4.5, it is possible to enable Soft Updates
	at filesystem creation time, through use of the <literal>-U</literal>
	option to &man.newfs.8;.</para></note>

      <para>Soft Updates drastically improves meta-data performance, mainly 
        file creation and deletion, through the use of a memory cache.  We
        recommend enabling Soft Updates on all of your filesystems.  There
        are two downsides to Soft Updates that you should be aware of:  First,
        Soft Updates guarantees filesystem consistency in the case of a crash
        but could very easily be several seconds (even a minute!) behind
        updating the physical disk.  If your system crashes you may lose more
        work than otherwise.  Secondly, Soft Updates delays the freeing of
        filesystem blocks.  If you have a filesystem (such as the root
	filesystem) which is almost full, performing a major update, such as
        <command>make installworld</command>, can cause the filesystem to run
	out of space and the update to fail.</para>

      <sect3>
	<title>More details about Soft Updates</title>
	
	<indexterm><primary>Soft Updates (Details)</primary></indexterm>

	<para>There are two traditional approaches to writing a filesystem's meta-data
    	  back to disk.  (Meta-data updates are updates to
	  non-content data like inodes or directories.)</para>
	
	<para>Historically, the default behaviour was to write out
	  meta-data updates synchronously.  If a directory had been
	  changed, the system waited until the change was actually
	  written to disk.  The file data buffers (file contents) were
	  passed through the buffer cache and backed up
	  to disk later on asynchronously.  The advantage of this
	  implementation is that it operates safely.  If there is
	  a failure during an update, the meta-data are always in a
	  consistent state.  A file is either created completely
	  or not at all.  If the data blocks of a file did not find
	  their way out of the buffer cache onto the disk by the time
	  of the crash, &man.fsck.8; is able to recognize this and
	  repair the filesystem by setting the file length to
	  0.  Additionally, the implementation is clear and simple.
	  The disadvantage is that meta-data changes are slow.  An
	  <command>rm -r</command>, for instance, touches all the files in a
	  directory sequentially, but each directory
	  change (deletion of a file) will be written synchronously
	  to the disk.  This includes updates to the directory itself,
	  to the i-node table, and possibly to indirect blocks
	  allocated by the file.  Similar considerations apply for
	  unrolling large hierarchies (<command>tar -x</command>).</para>

	<para>The second case is asynchronous meta-data updates.  This
  	  is the default for Linux/ext2fs and
  	  <command>mount -o async</command> for *BSD ufs.  All
  	  meta-data updates are simply passed through the buffer
  	  cache too, that is, they will be intermixed with the updates
  	  of the file content data.  The advantage of this
  	  implementation is that there is no need to wait until each
  	  meta-data update has been written to disk, so all operations
  	  which cause huge amounts of meta-data updates work much
  	  faster than in the synchronous case.  Also, the
  	  implementation is still clear and simple, so there is a low
  	  risk for bugs creeping into the code.  The disadvantage is
  	  that there is no guarantee at all for a consistent state of
  	  the filesystem.  If there is a failure during an operation
  	  that updated large amounts of meta-data (like a power
  	  failure, or someone pressing the reset button),
	  the file system
  	  will be left in an unpredictable state.  There is no opportunity
  	  to examine the state of the file system when the system
  	  comes up again; the data blocks of a file could already have
  	  been written to the disk while the updates of the i-node
  	  table or the associated directory were not.  It is actually
  	  impossible to implement a <command>fsck</command> which is
  	  able to clean up the resulting chaos (because the necessary
  	  information is not available on the disk).  If the
	  filesystem has been damaged beyond repair, the only choice
	  is to <command>newfs</command> it and restore it from backup.
	  </para>

	<para>The usual solution for this problem was to implement
	  <emphasis>dirty region logging</emphasis>, which is also
	  referred to as <emphasis>journaling</emphasis>, although that
	  term is not used consistently and is occasionally applied
	  to other forms of transaction logging as well.  Meta-data
	  updates are still written synchronously, but only into a
	  small region of the disk.  Later on they will be moved
	  to their proper location.  Because the logging
	  area is a small, contiguous region on the disk, there
	  are no long distances for the disk heads to move, even
	  during heavy operations, so these operations are quicker
	  than synchronous updates.
	  Additionally the complexity of the implementation is fairly
	  limited, so the risk of bugs being present is low.  A disadvantage
	  is that all meta-data are written twice (once into the
	  logging region and once to the proper location) so for
	  normal work, a performance <quote>pessimization</quote>
	  might result.  On the other hand, in case of a crash, all
	  pending meta-data operations can be quickly either rolled back
	  or completed from the logging area after the system comes
	  up again, resulting in a fast filesystem startup.</para>
     
	<para>Kirk McKusick, the developer of Berkeley FFS,
	   solved this problem with Soft Updates: all pending
	   meta-data updates are kept in memory and written out to disk
	   in a sorted sequence (<quote>ordered meta-data
	   updates</quote>).  This has the effect that, in case of
	   heavy meta-data operations, later updates to an item
	   <quote>catch</quote> the earlier ones if the earlier ones are still in
	   memory and have not already been written to disk.  So all
	   operations on, say, a directory are generally performed in
	   memory before the update is written to disk (the data
	   blocks are sorted according to their position so
	   that they will not be on the disk ahead of their meta-data).
	   If the system crashes, this causes an implicit <quote>log
	   rewind</quote>: all operations which did not find their way
	   to the disk appear as if they had never happened.  A
	   consistent filesystem state is maintained that appears to
	   be the one from 30 to 60 seconds earlier.  The
	   algorithm used guarantees that all resources in use
	   are marked as such in their appropriate bitmaps: blocks and inodes.
	   After a crash, the only resource allocation error
	   that occurs is that resources are
	   marked as <quote>used</quote> which are actually <quote>free</quote>.
	   &man.fsck.8; recognizes this situation,
	   and frees the resources that are no longer used.  It is safe to
	   ignore the dirty state of the filesystem after a crash by
	   forcibly mounting it with <command>mount -f</command>.  In
	   order to free resources that may be unused, &man.fsck.8;
	   needs to be run at a later time.  This is the idea behind
	   the <emphasis>background fsck</emphasis>: at system startup
	   time, only a <emphasis>snapshot</emphasis> of the
	   filesystem is recorded.  The <command>fsck</command> can be
	   run later on.  All filesystems can then be mounted
	   <quote>dirty</quote>, so the system startup proceeds in
	   multiuser mode.  Then, background <command>fsck</command>s
	   will be scheduled for all filesystems where this is required, to free
	   resources that may be unused.  (Filesystems that do not use
	   Soft Updates still need the usual foreground
	   <command>fsck</command> though.)</para>

	 <para>The advantage is that meta-data operations are nearly as
	   fast as asynchronous updates (i.e. faster than with
	   <emphasis>logging</emphasis>, which has to write the
	   meta-data twice).  The disadvantages are the complexity of
	   the code (implying a higher risk for bugs in an area that
	   is highly sensitive regarding loss of user data), and a
	   higher memory consumption.  Additionally there are some
	   idiosyncrasies one has to get used to.
	   After a crash, the state of the filesystem appears to be
	   somewhat <quote>older</quote>.  In situations where
	   the standard synchronous approach would have caused some
	   zero-length files to remain after the
	   <command>fsck</command>, these files do not exist at all
	   with a Soft Updates filesystem because neither the meta-data
	   nor the file contents have ever been written to disk.
	   Disk space is not released until the updates have been
	   written to disk, which may take place some time after
	   running <command>rm</command>.  This may cause problems
	   when installing large amounts of data on a filesystem
	   that does not have enough free space to hold all the files
	   twice.</para>
      </sect3>
    </sect2>
  </sect1>

  <sect1 id="configtuning-kernel-limits">
    <title>Tuning Kernel Limits</title>

    <indexterm><primary>Tuning kernel limits</primary></indexterm>
    
    <sect2 id="file-process-limits">
      <title>File/Process Limits</title>

      <sect3 id="kern-maxfiles">
	<title><varname>kern.maxfiles</varname></title>

	<indexterm>
	  <primary><varname>kern.maxfiles</varname></primary>
	</indexterm>
	
	<para><varname>kern.maxfiles</varname> can be raised or
	  lowered based upon your system requirements.  This variable
	  indicates the maximum number of file descriptors on your
	  system.  When the file descriptor table is full,
	  <errorname>file: table is full</errorname> will show up repeatedly
	  in the system message buffer, which can be viewed with the
	  <command>dmesg</command> command.</para>

	<para>Each open file, socket, or fifo uses one file
	  descriptor.  A large-scale production server may easily
	  require many thousands of file descriptors, depending on the
	  kind and number of services running concurrently.</para>

	<para><varname>kern.maxfiles</varname>'s default value is
	  dictated by the <option>MAXUSERS</option> option in your
          kernel configuration file.  <varname>kern.maxfiles</varname> grows
          proportionally to the value of <option>MAXUSERS</option>.  When
          compiling a custom kernel, it is a good idea to set this kernel
          configuration option according to the uses of your system.  From
          this number, the kernel is given most of its pre-defined limits.
          Even though a production machine may not actually have 256 users
          connected at once, the resources needed may be similar to those of a
          high-scale webserver.</para>

	<note><para>As of FreeBSD 4.5, setting <option>MAXUSERS</option> to
	  <literal>0</literal> in your kernel configuration file will choose
	  a reasonable default value based on the amount of RAM present in
	  your system.</para></note>
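
	<para>To judge whether <varname>kern.maxfiles</varname> needs
	  raising, it can help to express current usage as a percentage
	  of the limit.  The following is a minimal sketch using shell
	  integer arithmetic; the function name
	  <literal>fd_usage_percent</literal> and the sample numbers are
	  made up for illustration.</para>

```shell
# Hypothetical helper: integer percentage of the file descriptor table in use.
# Arguments: number of descriptors currently open, value of kern.maxfiles.
fd_usage_percent() {
    echo $(( $1 * 100 / $2 ))
}

# Made-up example: 1566 descriptors open with kern.maxfiles set to 2088.
fd_usage_percent 1566 2088
```

	<para>On a running system the two arguments could be taken from
	  <command>sysctl -n kern.openfiles</command> and
	  <command>sysctl -n kern.maxfiles</command>, assuming the
	  <varname>kern.openfiles</varname> variable is available on your
	  release.</para>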

      </sect3>
    </sect2>
    <sect2>
      <title>Network Limits</title>

      <para>The <option>NMBCLUSTERS</option> kernel configuration
	option dictates the amount of network mbufs available to the
	system.  A heavily-trafficked server with too few mbufs
	will hinder FreeBSD's ability to handle network traffic.  Each
	cluster represents
	approximately 2K of memory, so a value of 1024 represents 2
	megabytes of kernel memory reserved for network buffers.  A
	simple calculation can be done to figure out how many are
	needed.  If you have a web server which maxes out at 1000
	simultaneous connections, and each connection eats a 16K receive
	and 16K send buffer, you need approximately 32MB worth of
	network buffers to cover the web server.  A good rule of thumb is
	to multiply by 2, so 32MB x 2 = 64MB, and 64MB / 2K = 32768
	clusters.</para>
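
      <para>The same estimate can be sketched in shell arithmetic.  The
	function name <literal>nmbclusters_estimate</literal> is
	hypothetical; it assumes 2K per cluster and the safety factor of 2
	used above.  Note that exact arithmetic on 1000 connections gives
	32000, while the figure of 32768 comes from rounding 32000K up to
	32MB before doubling.</para>

```shell
# Hypothetical helper: estimate a value for NMBCLUSTERS.
# Arguments: peak simultaneous connections,
#            receive buffer per connection (in KB),
#            send buffer per connection (in KB).
# Assumes 2 KB per cluster and a safety factor of 2.
nmbclusters_estimate() {
    conns=$1
    recv_kb=$2
    send_kb=$3
    total_kb=$(( conns * (recv_kb + send_kb) * 2 ))
    echo $(( total_kb / 2 ))
}

# The web server example: 1000 connections, 16 KB receive, 16 KB send.
nmbclusters_estimate 1000 16 16
```

      <para>Either figure is only a starting point; the buffer sizes
	actually in use on your system may differ from the 16K assumed
	here.</para>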
    </sect2>
  </sect1>

  <sect1 id="adding-swap-space">
    <title>Adding Swap Space</title>

    <para>No matter how well you plan, sometimes a system doesn't run
      as you expect.  If you find you need more swap space, it's
      simple enough to add.  You have three ways to increase swap
      space: adding a new hard drive, enabling swap over NFS, and
      creating a swap file on an existing partition.</para>

    <sect2 id="new-drive-swap">
      <title>Swap on a New Hard Drive</title>

      <para>The best way to add swap, of course, is to use this as an
	excuse to add another hard drive.  You can always use another
	hard drive, after all.  If you can do this, go reread the
	discussion of <ulink
	url="configtuning-initial.html#SWAP-DESIGN">swap space
	</ulink> from the <ulink
	url="configtuning-initial.html">Initial Configuration</ulink>
	section of the Handbook for some suggestions on how to best
	arrange your swap.</para>
    </sect2>

    <sect2 id="nfs-swap">
      <title>Swapping over NFS</title>

      <para>Swapping over NFS is only recommended if you do not have a
	local hard disk to swap to.  Swapping over NFS is slow and
	inefficient in versions of FreeBSD prior to 4.0.  It is
	reasonably fast and efficient in 4.0-RELEASE and newer.  Even
	with newer versions of FreeBSD, NFS swapping will be limited
	by the available network bandwidth and puts an additional
	burden on the NFS server.</para>
    </sect2>

    <sect2 id="create-swapfile">
      <title>Swapfiles</title>

      <para>You can create a file of a specified size to use as a swap
	file.  In our example here we will use a 64MB file called
	<filename>/usr/swap0</filename>.  You can use any name you
	want, of course.</para>

      <example>
        <title>Creating a Swapfile</title>

      <orderedlist>
        <listitem>
          <para>Be certain that your kernel configuration includes
            the vnode driver.  It is <emphasis>not</emphasis> in recent versions of
            <filename>GENERIC</filename>.</para>

          <programlisting>pseudo-device   vn 1   #Vnode driver (turns a file into a device)</programlisting>
        </listitem>

	<listitem>
	  <para>Create a vn-device:</para>
	  <screen>&prompt.root; <userinput>cd /dev</userinput>
&prompt.root; <userinput>sh MAKEDEV vn0</userinput></screen>
	</listitem>

	<listitem>
	  <para>Create a swapfile (<filename>/usr/swap0</filename>):</para>

	  <screen>&prompt.root; <userinput>dd if=/dev/zero of=/usr/swap0 bs=1024k count=64</userinput></screen>
	</listitem>

	<listitem>
	  <para>Set proper permissions on <filename>/usr/swap0</filename>:</para>

	  <screen>&prompt.root; <userinput>chmod 0600 /usr/swap0</userinput></screen>
	</listitem>

	<listitem>
	  <para>Enable the swap file in <filename>/etc/rc.conf</filename>:</para>

	  <programlisting>swapfile="/usr/swap0"   # Set to name of swapfile if aux swapfile desired.</programlisting>
	</listitem>

	<listitem>

          <para>Reboot the machine or, to enable the swap file immediately,
            type:</para>

          <screen>&prompt.root; <userinput>vnconfig -e /dev/vn0b /usr/swap0 swap</userinput></screen>
        </listitem>
      </orderedlist>

      </example>
    </sect2>
  </sect1>
</chapter>

<!--
     Local Variables:
     mode: sgml
     sgml-declaration: "../chapter.decl"
     sgml-indent-data: t
     sgml-omittag: nil
     sgml-always-quote-attributes: t
     sgml-parent-document: ("../book.sgml" "part" "chapter")
     End:
-->
