How Much Server do you Need?

Posted: March 1st, 2010 | Filed under: IT Management

When purchasing server hardware, do you tend to purchase more power than you need, or not enough? Specifying the correct server for your current need is a fine art, and it’s easy to get wrong. Here are some helpful hints and considerations to remember that will ensure you make the right server purchasing decision.

We’re going to focus on standalone (non-blade) servers for the moment, but many aspects are also applicable to blade servers. Blade servers are wonderful for centralized management of the hardware, but the specs of the individual server blades can vary tremendously.

Hardware Management

Want to avoid trudging down to the datacenter late at night, or even worse, across the world if something breaks? Then don’t skimp on the management controller, lights out manager, or whatever the vendor is calling it. Many vendors ship a simple version by default: it may allow serial console access only, for example. Make sure to get the full-featured controller, because even if the hardware is only a few doors down, getting up from your desk should never be necessary.

If you aren’t thinking of switching vendors any time soon, you might think that the management interface will always work the same as it has on all your other servers. Unfortunately, that’s not the case. Sun x86 hardware, for example, has many different hardware management controllers to choose from. The more expensive and feature-rich servers have the better controllers, but don’t make the mistake of thinking the interface never changes. The unfortunate part is that you never know how well it works until you get a server on-site.

Hardware management comes in two forms: IPMI, which most controllers support, and the vendor’s user interface. The user interface is, more often than not, a Web-based Java application that provides remote console access. Some are extremely buggy, and others work quite well from all Web browsers. We can’t make a recommendation, though, because these things change often.

Memory

Shucks, this one is a no-brainer: as much as you can afford. Within reason, that is. If you aren’t going to run virtual machines, and this server’s only job is to serve up some simple Web pages, then 16GB of RAM is likely overkill. Likewise, make sure you know what your application can support. Many Java applications, particularly those running on a 32-bit JVM, are limited to a heap size of 2 or 4GB.

It’s also overkill to purchase more than 4GB of RAM if you need to run a 32-bit operating system. Yes, Windows Server does some tricks and it can use more than 4GB, but it’s a huge performance hit.

If virtualization is in your future, load up as much as possible. You also want to pay attention to how many DIMM slots the server has. The 8GB DIMMs are horribly expensive now, so you’ll probably want to stick with 4GB sticks. Just remember, if you fill all the slots in the server, the only memory upgrade path is to buy higher capacity DIMMs.

CPU

Do you want to run many threads at an even pace or just a few threads as fast as possible? Sun’s T2 processors aren’t fast by any measure, but they can run many threads at the same speeds consistently. These are ideal for database servers, but not for Web servers.

Will this server be executing a wide variety of processes over and over again, as opposed to just running the same big application server constantly? If so, make sure you pay attention to the amount of cache each core of the CPU has.

For virtualization, you want the fastest multi-core processors available, with the largest amount of L2 cache. Cache is very important as it minimizes the number of times the CPU needs to fetch data from slower RAM. It makes a very noticeable difference on heavily used servers.

Disks, Controllers, and RAID

If you need local storage, do pay attention to the type of disks you’re ordering. A SATA disk is likely to disappoint if you have an IO-heavy workload. SAS and FC disks should perform equally well, since they are both SCSI disks underneath.

Even if you don’t need much local storage, you should always buy a server with a RAID controller that can mirror the operating system disks, unless you’re SAN booting of course. You don’t want the OS to crash just because of a failed disk. Likewise, if you’re keeping tons of local storage for some reason, make sure to get a RAID card that does RAID-5, so that you can at least lose one disk at a time without losing data. If performance is a concern you should really be using iSCSI or SAN storage, but you may also think about a RAID 0+1 configuration to avoid the slower RAID-5 parity calculations.

If you’re attaching to a SAN, make sure to include the correct HBA as well.

Networking

When servers started showing up with two or four gigabit NICs, I must admit I was confused. Why would someone need that many? Aside from large servers that do a lot of network IO, you might also want to separate out your iSCSI traffic from normal Ethernet. It’s also important these days to make sure that the network cards support TOE, or TCP Offload Engine. This will task the network card with TCP processing work such as computing checksums, freeing your CPUs for more important things.

In summary, most of these things may seem common sense, but you need to remember to ask all the right questions every time you spec a server. Here’s a good checklist:

  • Adequate hardware management controller
  • Enough (but not too much) RAM, that’s fast enough, but not faster than the CPU’s front-side bus
  • Enough memory slots for expansion, if that seems likely
  • Correct CPU for this server’s needs
  • RAID-1 for the OS, and (optionally) other RAID levels for other local storage
  • FC HBAs?
  • Multiple gigabit NICs with TOE capabilities


Networking 101: More Subnets, and IPv6

Posted: February 27th, 2010 | Filed under: Networking, Networking 101

What’s the point of creating subnets anyways? How do I remember those strange looking subnet masks? How the heck does this work with those crazy looking IPv6 addresses? This edition of Networking 101 will expand on the previous Subnets and CIDR article, in the interest of promoting a thorough understanding of subnetting.

An oft-asked question in networking classes is “why can’t we just put everyone on the same subnet and stop worrying about routing?” The reason is very simple. Every time someone needs to talk, be it to a router or another host, they have to send an ARP request. There are also broadcast packets that aren’t necessarily limited to ARP, which everyone hears. When there are at most 254 devices on a /24 subnet, the number of broadcast packets is fairly limited. It is important to keep this number low, because every time a packet destined for a specific host or a broadcast address is seen, the host must handle the packet. A hardware interrupt is created, and the kernel of the operating system must read enough of the packet to determine whether or not it cares about it.

Broadcast storms happen at times, mainly because of layer 2 topology loops. We’ll explain layer 2 topology issues in excruciating (actually, enlightening) detail in a future issue. When thousands of packets hit a computer at a time, slow and fast computers alike can become very slow. The kernel spends so much time handling interrupts that it doesn’t have much left for dealing with “trivial” things like making sure your web browser process gets a chance to run. So that, my friends, is why subnets are very important. A subnet is also known as a broadcast domain, because it limits the number of broadcasts that you will hear.

The natural follow-up question normally involves a host’s notion of a broadcast address and netmask. We hopefully understand that a host needs to understand what computers are on the same subnet. Those IP addresses can be spoken to directly, making a router unnecessary. When the netmask or broadcast address is incorrectly configured, you’ll quickly find that some hosts are unreachable.

The most common erroneous configuration happens when someone configures an IP address without specifying both the netmask and the broadcast address. For some reason, most operating systems don’t take the liberty of updating these things, even though one can be determined from the other. If you run ‘ifconfig eth0 130.211.0.1 netmask 255.255.255.0’ you might expect that everything is ready to go. Unfortunately, it’s very likely that your broadcast address was left at the classful default of 130.211.255.255 rather than 130.211.0.255. It largely depends on the router’s configuration, but normally this results in all broadcast packets being dropped. Conversely, if the netmask is configured incorrectly, the computer won’t know where the subnet starts and ends. If a computer thinks a host is on the same subnet when it actually isn’t, it will attempt to ARP for it instead of sending the packet to the router. Routers can be configured to handle this and pretend they are the host (called proxy ARP), but normally the result is unreachable hosts.
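The “one can be determined from the other” point is easy to check mechanically. Here is a minimal sketch using Python’s standard ipaddress module; the helper name broadcast_for is ours for illustration, not part of ifconfig or any other tool mentioned here.

```python
import ipaddress


def broadcast_for(address: str, netmask: str) -> str:
    """Derive the broadcast address implied by an address and its netmask."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    return str(iface.network.broadcast_address)


# The /24 mask from the ifconfig example:
print(broadcast_for("130.211.0.1", "255.255.255.0"))  # 130.211.0.255
# The classful (/16) mask for the same address:
print(broadcast_for("130.211.0.1", "255.255.0.0"))    # 130.211.255.255
```

If the broadcast address an interface reports doesn’t match what the netmask implies, that mismatch is the misconfiguration described above.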

Understand how the netmask is configured to avoid this problem. Figuring out the network and broadcast address isn’t very difficult when you remember that the netmask simply means “cover some bits,” but deciphering netmask representation can induce a double-take. The netmask for a /24 network is 255.255.255.0; that’s easy. But what does 255.255.240.0 mean? The best way to decipher it is to begin with the masked-off part. Comparing it to the /24, which had three octets masked, we see that 255.255.240.0 has two octets masked, and part of another. We know it’s between a /16 and a /24. We have to understand binary, and work out how many bits are masked. The first 16 bits are clearly part of the network portion. The third octet, 240, is 11110000 in binary: it leaves 16 values (256 - 240) beyond the mask, so four bits are left unmasked (2^4=16) and the other four bits of the octet are masked. Those four masked bits, plus the 16 bits used for the first two octets, mean that we’re dealing with a /20!

What about 1.0.0.0/255.255.255.248? We’re definitely in a land smaller than the /24 subnet. If we look at the remaining bits in the last octet, we can see that there are eight IP addresses available. Remember that only 2^3 can make eight, so we’re using all but three bits in the network portion. This is a /29 network. Of course, the easy ones are pretty clear: 255.255.255.128 allows half as many host addresses in the last octet compared to the /24 network, so it’s a /25.
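The mask-to-prefix conversions worked through above can be double-checked with a few lines of Python. This is only a sketch: prefix_length is a name we made up, and it leans on the standard ipaddress module to do the bit counting.

```python
import ipaddress


def prefix_length(netmask: str) -> int:
    """Return the CIDR prefix length for a dotted-quad netmask."""
    # 0.0.0.0 is a valid network address under any mask, so it serves
    # as a neutral placeholder while the module parses the netmask.
    return ipaddress.IPv4Network(f"0.0.0.0/{netmask}").prefixlen


for mask in ("255.255.255.0", "255.255.240.0",
             "255.255.255.248", "255.255.255.128"):
    host_bits = 32 - prefix_length(mask)
    print(f"{mask} -> /{prefix_length(mask)} ({2 ** host_bits} addresses)")
```

The address count printed for each mask is 2 raised to the number of unmasked bits, the same arithmetic used in the text.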

On the topic of confusing netmasks, IPv6 addresses certainly have a place. The netmask isn’t really an issue–the same concept applies, just with larger numbers to remember. The real problem lies within the address representation itself; the IETF seemed to take pride in creating confusion. Typically an IPv6 address is represented in hex, or base-16. Our old friend IPv4 could represent an IP address in hex too, which would look like B.B.B.B for the address 11.11.11.11. Unfortunately, IPv6 isn’t quite that nice looking. To represent 128 bits, IPv6 normally breaks up the address into eight 16-bit segments.

An IPv6 address looks like: 2013:4567:0000:CDEF:0000:0000:00AD:0000. It does get a bit easier. For example, leading zeros are not written, and contiguous quads of zeros get collapsed to ::. Trailing zeros, however, must be shown. This is a bit confusing, but the rules always allow for a non-ambiguous IP address. Leading zeros in each quad can always be removed, but the collapsing of contiguous blocks of zeros can only happen once per address. The above address with leading zeros removed and contiguous zeros collapsed will look like: 2013:4567:0:CDEF::AD:0. IPv6 provides 2^128 (about 3.4 x 10^38) addresses, which works out to roughly 6.7 x 10^23 addresses per square meter of the earth’s surface.
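To see the shortening rules applied mechanically, here is a small sketch using Python’s standard ipaddress module (the shorten and expand helper names are ours). Note that Python prints the hex digits in lowercase.

```python
import ipaddress


def shorten(text: str) -> str:
    """Drop leading zeros in each quad and collapse one run of zero quads."""
    return ipaddress.IPv6Address(text).compressed


def expand(text: str) -> str:
    """Write out all eight quads with their leading zeros restored."""
    return ipaddress.IPv6Address(text).exploded


full = "2013:4567:0000:CDEF:0000:0000:00AD:0000"
print(shorten(full))           # 2013:4567:0:cdef::ad:0
print(expand(shorten(full)))   # 2013:4567:0000:cdef:0000:0000:00ad:0000
```

Only the run of two zero quads is collapsed to ::, since the collapse can be used once per address; the single zero quads elsewhere are just written as 0.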

If you remember the rules of binary, the address representation rules with IPv6, and a few simple subnets for reference, you’ll be Master of Subnets – the one who everyone asks for help.


The Perils of Sudo With User Passwords

Posted: February 25th, 2010 | Filed under: IT Management, Linux / Unix

The consensus among new Unix and Linux users seems to be that sudo is more secure than using the root account, because it requires you to type your password to perform potentially harmful actions. In reality, a compromised user account, which is no big deal normally, is instantly root in most setups. This sudo thinking is flawed, but sudo is actually useful for what it was designed for.

The (wrong) idea is that you shouldn’t use the root account, because apparently it’s too “dangerous.” This argument usually comes from new Linux users and people that call themselves “network administrators,” but has no basis in reality. We’ll come back to that in a moment.

The concept behind sudo is to give non-root users access to perform specific tasks without giving away the root password. It can also be used to log activity, if desired. Role-based access control isn’t available in Linux, so sudo is a great alternative, if used properly. Solaris 10 has greatly improved RBAC capabilities, so there you can easily allow a junior admin access to web server restart scripts with the appropriate access levels, for example. Sudo is supposed to be configured to allow a certain set of people to run a very limited set of commands, as a different user.

Unfortunately, sysadmins and home users alike have begun using sudo for everything. Instead of running ‘su’ and becoming root, they believe that ‘sudo’ plus ‘command’ is a better alternative. Most of the time, sysadmins with full sudo access just end up running ‘sudo bash’ and doing all their work from that root shell. This is a problem.

Using a user account password to get a root shell is a bad idea.

Why is there a separate root account anyway? It isn’t to simply protect you from your own mistakes. If all sysadmins just become root using their user password (running: sudo bash), then why not just give them uid 0 (aka root) and be done with it? For a group of sysadmins, the only reason they should want to use sudo is for logging of commands. Unfortunately, this provides zero additional security or auditing, because an attacker would just run a shell. If sysadmins are un-trusted such that they need to be audited, they shouldn’t have root access in the first place.

Surprisingly, the home-user rationale makes its way into the workplace as well. The recurring argument is that running a root shell is dangerous. Partially to blame for this grave misunderstanding are X login managers, for allowing the root user to log in. New users are always scolded and told that running X as root is wrong. The same goes for many other applications, too. As time progressed, people started remembering that “running as root” is wrong, passing this ideology down to their children, but without any details. A genetic mutation may have occurred, but insufficient research has been done on that topic thus far. Now that Ubuntu Linux doesn’t enable a root account by default, but instead allows full root access to the user via sudo, the world will never be the same.

People praise sudo, while demeaning Windows at the same time for not having any separation of privileges by default. The answer to security clearly is a multi-user system with privilege separation, but sudo blurs these lines in its most common usage. The Ubuntu usage of sudo simply provides a hoop to jump through, requiring users to type their password more often than they’d like. Of course this will prevent a user’s web browser from running something as root, but it isn’t security.

We’d really like to focus on the Enterprise, where sudo has very little place.

The sudo purists, or sudoists, we’ll call them, would have you run sudo before every command that requires root. Apparently running ‘sudo vi /etc/resolv.conf’ is supposed to make you remember that you’re root, and prevent mistakes. Sudoists will also say that it protects against “accidentally left open root shells” as well. If there are accidental shells left on computers with public access, well that’s an HR action item.

Sudo atheists will quickly point out that using sudo without specifically defined commands in the configuration file is a security risk. Sudoists’ user account passwords have root access, so in essence, sudo has undone all security mechanisms in place. SSH doesn’t allow root to log in, but with sudo, a compromised user password removes that restriction.

In a true multi-user environment, every so often a root compromise will happen. If users can login, they can eventually become root, and that’s just a fact of life. The first thing any old-school cracker installs is a hacked SSH program, to log user passwords. Ideally, this single hacked machine doesn’t have any sort of trust relationship with other computers, because users are allowed access. The next time an administrator logs into the hacked machine, his user account is compromised. Generally this isn’t a big deal, but with sudo, this means a complete root compromise, probably for all machines. Of course SSH keys can help, as will requiring separate passwords for administrators on the more important (non user accessible) servers; but if they’re willing to allow their user account access to unrestricted root-level commands, then it’s unlikely that there’s any other security in place elsewhere.

As we mentioned, sudo has its place. Allowing a single command to be run with elevated privileges in an operating system that doesn’t support such things is quite useful. Still, be very careful about who gets this access, even for one item. As with all software, sudo isn’t without bugs.

For the love of security, please, we beg of you, do not use sudo for full root access. Administrators keep separate, non-UID 0 accounts for a reason, and it’s not for “limiting the mistakes.” Everything should be done from a root shell, and you should have to know an uber-secret root password to access anything as root.


Back to Basics: Unix System Stats Utilities

Posted: February 24th, 2010 | Filed under: Linux / Unix

Unix and Linux systems have forever been obtuse and mysterious for many people. They generally don’t have nice graphical utilities for displaying system performance information; you need to know how to coax the information you need. Furthermore, you need to know how to interpret the information you’re given. Let’s take a look at some common system tools that can provide tons of visibility into what the opaque OS is really doing.

Unfortunately, the same tools don’t exist universally across all Unix variants. A few commonly underused ones do, however, and that is what we’ll focus on first.

Disk Activity

A common source of “slowness” is disk I/O, or rather the lack of available I/O. On Linux especially, it may be a difficult diagnosis. Often the load average will climb quickly, but without any corresponding processes in top eating much CPU. That is because Linux counts processes blocked waiting on I/O, not just those waiting for CPU time, when calculating load average. I’ve seen load numbers in the tens of thousands, on more than one occasion.

The easiest way to see what’s happening to your disks is to run the ‘iostat’ program. Via iostat, you can see how many read and write operations are happening per device, how much CPU is being utilized, and how long each transaction takes. Many arguments are available for iostat, so do spend some time with the man page on your specific system. By default, running ‘iostat’ with no arguments produces a report about disk IO since boot. To get a snapshot of “now,” add a numerical interval argument last: iostat will then print a new report every time that number of seconds passes, and each report after the first covers only the preceding interval.

Linux will show number of blocks read or written per second, along with some useful CPU statistics. This is one particularly busy server:

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.36    0.07    5.21   23.80    0.00   69.57

Device:     tps   Blk_read/s   Blk_wrtn/s     Blk_read    Blk_wrtn
sda       18.22     15723.35       643.25  65474958946  2678596632

Notice that iowait is at 23.8%. This means that nearly a quarter of the time, the CPUs on this server were idle while waiting on disk I/O. Some Solaris iostat output shows a similar thing, just represented differently (iostat -xnz):

    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  295.3   79.7 5657.8  211.0  0.0 10.3    0.0   27.4   0 100 d101
  134.8   16.4 4069.8  116.0  0.0  3.5    0.0   23.3   0  90 d105

The %b (busy) column shows that device d101 is busy 100% of the time, meaning it always has at least one transaction in progress. The average service time isn’t good either: disk reads shouldn’t take 27.4ms. Arguably, Solaris’s output is friendlier to parse, since it gives the data read per second in kilobytes rather than blocks. We can quickly calculate that this server is reading about 19KB per read by dividing the number of KB read per second (5657.8) by the number of reads per second (295.3). In short: this disk array is being taxed by a large number of read requests.
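The per-read size calculation is simple enough to script when there are many devices to look at. A minimal sketch follows; the function name is ours, and the numbers are the kr/s and r/s values from the two devices in the Solaris output.

```python
def avg_kb_per_read(kr_per_s: float, r_per_s: float) -> float:
    """Average size of one read in KB: kilobytes read per second
    divided by read operations per second."""
    if r_per_s == 0:
        return 0.0  # an idle device has no meaningful read size
    return kr_per_s / r_per_s


print(round(avg_kb_per_read(5657.8, 295.3), 1))  # d101: 19.2
print(round(avg_kb_per_read(4069.8, 134.8), 1))  # d105: 30.2
```

The same division applied to kw/s and w/s gives the average write size.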

Vmstat

The ‘vmstat’ program is also universally available, and extremely useful. It, too, provides vastly different information among operating systems. The vmstat utility will show you statistics about the virtual memory subsystem, or to put it simply: swap space. It is much more complex than just swap, as nearly every IO operation involves the VM system when pages of memory are allocated. A disk write, network packet send, and the obvious “program allocates RAM” all impact what you see in vmstat.

On Linux, running vmstat with the -d argument will print out statistics about disk IO (-p followed by a partition name does the same for a single partition). In Solaris you get some disk information anyway, as seen below:

 kthr      memory            page             disk          faults       cpu
 r b w    swap   free  re   mf  pi po fr de sr m0 m1 m2 m7    in     sy    cs us sy id
 0 0 0 7856104 526824 386 2401   0  0  0  0  0  3  0  0  0 16586  22969 12576  8  9 83
 1 0 0 7851344 522016  18  678  32  0  0  0  0  2  0  0  0 13048  11737 10197  7  6 86
 0 0 0 7843584 514128  76 3330 197  0  0  0  0  2  0  0  0  4762 131492  4441 16  8 76

A subtle but important difference between Solaris and Linux is that Solaris will start scanning for pages of memory that can be freed before it will actually start swapping RAM to disk. The ‘sr’ column, scan rate, will start increasing right before swapping takes place, and continue until some RAM is available. The normal things are available in all operating systems; these include: swap space, free memory, pages in and out (careful, this doesn’t mean swapping is happening), page faults, context switches, and some CPU idle/system/user statistics. Once you know how to interpret these items you quickly learn to infer what they indicate about the usage of your system.

The two main programs for finding “slowness” are therefore iostat and vmstat. Before the obligatory tangent into “what Dtrace can do for you,” here are a few other tools that no Unix junkie should leave home without:
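If you collect vmstat output over time, watching the scan rate can be automated. The following is only a sketch under stated assumptions: the column list matches the Solaris sample shown earlier (the m0/m1/m2/m7 disk names are specific to that machine), the two ‘sy’ columns are renamed so they don’t collide, and parse_row and is_scanning are names we made up.

```python
# Column names for one row of the Solaris vmstat sample, left to right.
SOLARIS_COLUMNS = [
    "r", "b", "w", "swap", "free",
    "re", "mf", "pi", "po", "fr", "de", "sr",
    "m0", "m1", "m2", "m7",
    "in", "sy_calls", "cs",
    "us", "sy_cpu", "id",
]


def parse_row(line: str) -> dict:
    """Turn one whitespace-separated vmstat data row into a dict."""
    values = [int(v) for v in line.split()]
    if len(values) != len(SOLARIS_COLUMNS):
        raise ValueError("unexpected number of columns: %d" % len(values))
    return dict(zip(SOLARIS_COLUMNS, values))


def is_scanning(row: dict) -> bool:
    """A non-zero scan rate means the page scanner is hunting for memory."""
    return row["sr"] > 0


row = parse_row("0 0 0 7856104 526824 386 2401 0 0 0 0 0 3 0 0 0 "
                "16586 22969 12576 8 9 83")
print(row["free"], row["sr"], is_scanning(row))  # 526824 0 False
```

On a different machine the disk columns will have different names and possibly a different count, so the column list needs adjusting before this is of any use.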

- lsof: lists open files (including network ports) for all processes
- netstat: lists all sockets in use by the system
- mpstat: shows CPU statistics (including IO), per-processor

Dtrace

We cannot talk about system visibility without mentioning Dtrace. Invented by Sun, Dtrace provides dynamic tracing of everything about a system. Dtrace gives you the ability to ask any arbitrary question about the state of a system, which works by calling “probes” within the kernel. That sounds intimidating, doesn’t it?

Let’s say that we wanted to know what files were being read or written on our Linux server that has a high iowait percentage. There’s simply no way to know. Let’s ask the same question of Solaris, and instead of learning Dtrace, we’ll find something useful in the Dtrace ToolKit. In the kit, you’ll find a few neat programs like iosnoop and iotop, which will tell you which processes are doing all the disk IO operations. Neat, but we really want to know what files are being accessed so much. In the FS directory, the rfileio.d script will provide this information. Run it, and you’ll see every file that’s read or written, and cache hit statistics. There’s no way to get this information in other Unixes, and this is just one simple example of how Dtrace is invaluable.

The script itself is about 90 lines, inclusive of comments, but the bulk of it is dealing with cache statistics. An excellent way to start learning Dtrace is to simply read the Dtrace ToolKit scripts.

Don’t worry if you’re not a Solaris admin: Dtrace is coming soon to a FreeBSD near you. SystemTap, a replica of Dtrace, will be available for Linux soon as well. Until then, and even afterward, the above-mentioned tools will still be invaluable. If you can quickly get disk IO statistics and see whether you’re swapping, the majority of system performance problems are solved. Dtrace also provides amazing application tracing functionality, and if you’re looking at the application itself, you already know the slowness isn’t likely being caused by a system problem.

Soon, I’ll publish a few Dtrace tutorials.

Some things have surely been left out – discuss below!


Back To Basics: Unix Differences in Performing Tasks

Posted: February 23rd, 2010 | Filed under: Linux / Unix

It has often been said that a skilled sysadmin can quickly come up to speed with any Unix system in a matter of hours. After all, the underlying principles are all the same. Fortunately, this is somewhat correct. Unfortunately, this also leads to people making changes on systems they do not understand, oftentimes in suboptimal ways.

In this final Back to Basics With Unix piece, we’d like to spend some time talking about some common, routine sysadmin tasks and how they differ between Unix variants.

Sure, you can clunk around and change configuration files to mostly make something work on a foreign system. But will those changes remain after security patches get applied and stomp all over your work? Did you just change a file that was meant to never change, because there’s a separate file for local modifications? If you’re not familiar with “how it’s done” in that particular OS, it’s as likely as not.

GUIs

Yes, I make fun of GUI configuration utilities. People that don’t understand systems often use them and “get by,” but they cannot fix things when they break, unless the GUI tool can do it for them. That said, they do have their place. When learning a new system, it often makes sense to use the provided configuration utilities, as you know without a doubt they will adjust the necessary settings the way the OS wants it done. Here’s a list of some handy general administration GUIs:

- AIX: smitty (does pretty much everything)
- FreeBSD: sysinstall (not recommended for use after the initial install, but it works)
- HP-UX: sam (like AIX’s smitty)
- Linux: system-config, webmin and many others (distro-dependent)
- Solaris: admintool, wbem (use with caution)

Often, these tools still don’t do what you need. They certainly don’t help you learn a system unless you take the time to examine what the tool actually changed. Let’s start off with the basics: gathering system information and managing hardware. It can be a nightmare to add a disk to a foreign system, so hopefully this list will get you steered in the proper direction.

Show hardware configuration:
- AIX: lsdev, lscfg, prtconf
- FreeBSD: /var/run/dmesg.boot, pciconf
- HP-UX: ioscan, model, getconf, print_manifest
- Linux: dmesg, lspci, lshw, dmidecode
- Solaris: prtconf, prtdiag, psrinfo, cfgadm

Note that ‘dmesg’ is a circular kernel buffer on most systems, and after the machine has been up for a while the boot information listing devices gets overwritten. FreeBSD thoughtfully saves it in dmesg.boot for you, but in other systems you’re left relying on the above-mentioned exploratory tools.

Add a new device (have the OS discover it without a reboot):
- AIX: cfgmgr
- FreeBSD: atacontrol, camcontrol
- HP-UX: ioscan, insf
- Linux: udev, hotplug (automatic)
- Solaris: devfsadm, disks, devlinks (all a hardlink to the same binary now)

If you connect a new internal disk and need it recognized, you should not need to reboot in the Unix world. The above commands will discover new devices and make them available. If you’re talking about SAN disks, the utilities are mostly the same, but there are other programs that make the process much easier and also allow for multipathing configurations.

Label and partition a disk:
- AIX: mkvg then mklv
- FreeBSD: fdisk or sysinstall
- HP-UX: pvcreate then lvcreate, or sam
- Linux: fdisk or others
- Solaris: format or fmthard

Of course, you’ll also want to create a file system on your new disk. This is newfs or mkfs everywhere, with the exception of AIX, which forces you to use crfs. The filesystem tab file, which describes file systems and mount options, varies a bit as well. In Linux, FreeBSD, and HP-UX it is /etc/fstab, Solaris uses /etc/vfstab, and AIX references /etc/filesystems. We spent so much time on filesystems and hardware because that’s generally the biggest hurdle when learning a new system, and when you need to do it, you’re often in a hurry.

Other tasks may or may not be covered by GUI utilities in the various flavors of Unix, so here are a few more that we deem crucial to understand.

Display IP information and change IP address permanently:
- AIX: ifconfig/lsattr; smitty or chdev
- FreeBSD: ifconfig; /etc/rc.conf
- HP-UX: ifconfig/lanadmin; set_params
- Linux: ‘ip addr’; /etc/sysconfig/network or /etc/network/interfaces
- Solaris: ifconfig; edit /etc/hosts, /etc/hostname.*

Linux will of course vary, but those two files cover the most popular distros.

When taking over a foreign system, we frequently want to do two things: install missing software (like GNU utilities), and verify that the system is up-to-date on security patches. Where to get packages and where to get the latest security patches varies too much to cover here; you’ll likely need to search for the OS in question. The way you install packages and show installed patches, however, is extremely useful to know.

List installed patches:
- AIX: instfix, oslevel
- FreeBSD: uname
- HP-UX: swlist
- Linux: rpm, dpkg
- Solaris: showrev

Install packages:
- AIX: smitty, rpm, installp
- FreeBSD: pkg_add, portinstall, sysinstall
- HP-UX: swinstall
- Linux: rpm, yum, apt, yast, etc.
- Solaris: pkgadd

As you can see, things vary immensely between the Unix variants. Even within all of Linux you can easily find yourself lost. Google is a friend to all sysadmins, but too often the conceptual questions go unanswered. Here’s a general rule of thumb, and something I’ve seen done incorrectly too many times: if you see a configuration file in /etc/, say syslog.conf, and there is an accompanying syslog.d directory, you are not supposed to edit the syslog.conf file directly. The same goes for pam.conf and pam.d. Each service will have its own file within the .d directory, and that is where it is configured.

The .d directory example is mostly applicable to Linux, but be sure to pay attention when you see similar multi-config layouts anywhere else. Future sysadmins using the system will thank you if the OS’s conventions are followed and it’s easy to identify customizations. It also means that your changes aren’t likely to be stomped over by updates.
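The rule of thumb is easy to turn into a quick check before you edit anything. Here is a hedged sketch: dropin_dir_for is a hypothetical helper of ours, it only looks for the two naming patterns shown (name.d and name.conf.d), and the demo runs in a scratch directory rather than touching /etc.

```python
import os
import tempfile


def dropin_dir_for(conf_path):
    """Return the .d directory that accompanies conf_path, or None."""
    stem, _ext = os.path.splitext(conf_path)
    # Try "syslog.d" first, then "syslog.conf.d".
    for candidate in (stem + ".d", conf_path + ".d"):
        if os.path.isdir(candidate):
            return candidate
    return None


# Demo in a scratch directory so no real configuration is involved.
with tempfile.TemporaryDirectory() as scratch:
    conf = os.path.join(scratch, "syslog.conf")
    open(conf, "w").close()
    print(dropin_dir_for(conf))                # None: no syslog.d yet
    os.mkdir(os.path.join(scratch, "syslog.d"))
    print(dropin_dir_for(conf) is not None)    # True: edit files in there
```

If the function returns a directory, local changes belong in a file inside it rather than in the main configuration file.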



LDAP: Understand the Protocol and Work With Entries

Posted: February 22nd, 2010 | Filed under: Linux / Unix

Last week we explained how LDAP directories work, without really explaining how to use them. This week we’ll show how LDAP queries work, after explaining how the protocol works.

The LDAP protocol supports just a few fairly easy to understand operations. Knowing what’s available provides administrators with the ability to surmise how various applications are using LDAP, troubleshoot issues, and construct their own search queries and filters more effectively.

A client, be it a PHP script, command-line program like ldapsearch, or LDAP libraries for user authentication in Unix, will connect to a server on port 389 (or 636 with SSL), and send one of roughly a dozen operation requests. The following operations define how the LDAP protocol works:

Bind
Binding is the pivotal concept to understand. It is optional, depending on access control restrictions defined in the server. The act of binding is authentication: it sends a user’s DN and password. Binding anonymously may not allow access to all directory entries, or it may not be allowed at all, again depending on how the server is configured.

Search or Compare
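Search is used to both list entries and search for them. Compare is its simpler sibling: it checks whether a named entry holds a given attribute value, and returns only true or false. Searching supports a number of parameters, which define how the search is carried out.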

  • Base: object to start at
  • Scope: how much to search; one entry only, a single level below, or the entire subtree below
  • Filter: limit (optimize) search based on attribute/value or object filters
  • derefAliases: whether or not to follow alias entries
  • attributes: which attributes to return (none specified means return all)
  • sizeLimit, timeLimit: number of entries to return, and a time limit
  • typesOnly: just return the attribute types, not the actual values

Add, Delete, Modify (Update types)
Updating an LDAP entry can take the form of three operations: add, delete, or modify. Actually four, because modify can modify either an entry or a DN. As was explained last week, modifying the DN simply means moving an entry. Add and Delete do the obvious.

Extended Operations
Extended operations can be added at will. For example, many servers support the STARTTLS command, which tells the server to start a secure connection.

Abandon
An Abandon operation will abandon any operation, hopefully. There is no guarantee the server will honor an abandon request.

Unbind
Unbind abandons any outstanding operations and disconnects a client.

As mentioned before, LDAP is pretty simple. You can connect, search or update entries, and then disconnect. Nearly every LDAP communication follows those three steps.

So how does one connect? The majority of connections to an LDAP server are made by LDAP client programs on a Unix machine, in environments that use LDAP for server directory services. Web applications often gather and display directory information, or use LDAP to authenticate people. Aside from those, LDAP connections can also be made by Perl or even shell scripts to manage the information within. When you want to manually search or update information, you will generally use some common tools such as ldapsearch, ldapvi, or ldapmodify.

Searching an LDAP directory can be challenging if you’ve never done it before. The command-line utilities have a few arguments that aren’t optional. Let’s take a look at an ldapsearch example:
ldapsearch -h ldapserver.example.com -b ou=People,dc=example,dc=com uid=charlie

The ldapsearch program, in most Unix/Linux environments, takes the same arguments. You must specify a server (-h) and a base (-b) to begin searching at. The base can be as broad or as specific as you’d like. We’ve chosen to start searching at the ou (organizational unit) called People, within the domain components used to designate our portion of the tree. I could have left out the ou=People portion, but if there is anything else at the level below dc=example, then it would search through those too. It is faster to specify the subtree as close to the entry as possible, if you know it. Finally, the last argument was a search filter. I stated that I was interested in all entries where the value of the attribute uid was “charlie.”

The previous example used an anonymous bind, since a DN wasn’t specified. If you need to search information that is restricted to certain people, then specifying -D followed by a user DN will cause ldapsearch to bind as that user. With the OpenLDAP client, add -W to be prompted for the password.

Search filters can be quite complex. When you’re searching manually with ldapsearch, you probably won’t get very complex. When writing a script that could potentially be run very often, you want as optimal a search as possible. Search filters can specify many things, including what object classes to look for. It’s all about providing as many hints to the server as possible, so that it may make best use of its search indexes.

A search filter has a few basic operators, including “and” and “or” operators. The general syntax is prefix notation, with the operator written before its operands: math geeks know it as Polish notation (the mirror image of RPN), and programmers will recognize it from Lisp-style functional languages. If we want to search for a person whose given name is Bob, and whose mail attribute is also bob, we could use a search filter of:
(&(givenName=bob)(mail=bob))

If we wanted to return all entries where bob is either the givenName or the mail attribute, we could simply specify:
(|(givenName=bob)(mail=bob))
Notice the | symbol, followed by two or more attribute/value pairs. In reality, if this were used in a script, we would also want to specify what object class we’re looking for:
(&(objectClass=person)(|(givenName=bob)(mail=bob)))
The filter ensures that the objectClass is person, and that the nested statement is true. Again, we’re just trying to give as many hints to the server as possible.
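When a script builds filters from user input, it helps to assemble them with small helper functions instead of pasting strings together. The following is only a sketch: the function names (escape_value, equals, all_of, any_of) are made up for illustration, and the escaping covers the special characters listed in RFC 4515 for plain text values.

```python
def escape_value(value):
    """Escape characters that have special meaning inside a filter value."""
    # The backslash is handled first so the escapes added below are not re-escaped.
    replacements = (
        ("\\", r"\5c"),
        ("*", r"\2a"),
        ("(", r"\28"),
        (")", r"\29"),
        ("\0", r"\00"),
    )
    for char, escaped in replacements:
        value = value.replace(char, escaped)
    return value


def equals(attribute, value):
    """Build a single attribute/value test such as (uid=charlie)."""
    return "({}={})".format(attribute, escape_value(value))


def all_of(*filters):
    """Combine filters with the 'and' operator (&)."""
    return "(&{})".format("".join(filters))


def any_of(*filters):
    """Combine filters with the 'or' operator (|)."""
    return "(|{})".format("".join(filters))


script_filter = all_of(
    equals("objectClass", "person"),
    any_of(equals("givenName", "bob"), equals("mail", "bob")),
)
print(script_filter)
```

Because the operator always comes first, nesting the helper calls mirrors the nesting of the parentheses in the finished filter.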

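An LDAP URL is similar, but it contains all the information necessary to both identify a server and perform a search. URLs similar to this one, or portions of it, may be required to configure some LDAP clients: ldap://ldap.example.com/ou=People,dc=example,dc=com??one?(pod=evil)

The general format is: ldap://host:port/BaseDN?attributes?scope?filter
The fields are positional, so the example leaves the attributes field empty (hence the two question marks in a row) to ask for all attributes, then gives the scope and the filter.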
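To make the positional fields concrete, here is a rough sketch of pulling an LDAP URL apart by hand. The function name parse_ldap_url is hypothetical, the defaults (port 389 or 636, scope "base", filter "(objectClass=*)") follow RFC 4516, and the sketch assumes no percent-encoding, no extensions, and no IPv6 literal in the host.

```python
def parse_ldap_url(url):
    """Split ldap://host:port/BaseDN?attributes?scope?filter into its fields."""
    scheme, separator, rest = url.partition("://")
    if not separator or scheme not in ("ldap", "ldaps"):
        raise ValueError("not an LDAP URL: {!r}".format(url))

    host_and_port, _, tail = rest.partition("/")
    host, _, port = host_and_port.partition(":")

    # The fields after the base DN are positional and separated by '?'.
    fields = tail.split("?")
    fields += [""] * (4 - len(fields))
    base, attributes, scope, search_filter = fields[:4]

    return {
        "host": host,
        "port": int(port) if port else (389 if scheme == "ldap" else 636),
        "base": base,
        "attributes": attributes.split(",") if attributes else [],
        "scope": scope or "base",
        "filter": search_filter or "(objectClass=*)",
    }


parsed = parse_ldap_url(
    "ldap://ldap.example.com/ou=People,dc=example,dc=com??one?(pod=evil)"
)
for name, value in parsed.items():
    print(name, "=", value)
```

An empty attributes list here stands for "return everything," which is what the empty field in the URL means.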

LDAP is extremely powerful, and is certainly the best place for server-based directory information and people information. If you already live in an LDAP environment, hopefully you have a better understanding now. If you’re pondering an LDAP deployment, go and unleash the power now.



Working With Unix Variant Differences

Posted: February 20th, 2010 | Filed under: Linux / Unix

One thing is for certain: Unix is complicated. Linux does it one way, Solaris another, and all the BSDs, yet another. Fortunately there is some logic behind the differences. Some differences have to do with where the OS came from, and some were design choices, intended to improve usability. In this article we’ll talk about a few major differences between the Unix variants, and tell you what you need to know about various differences in command-line utilities.

Systems

First, recall that Unix started off in research labs, and two main flavors came about: System V (SysV), and BSD. SysV (that’s “five,” not “vee”) grew out of AT&T Unix and is best known by its fourth release, SVR4. BSD, from Berkeley, is the competing Unix variant. They both derived from the same Unix from Bell Labs, but quickly diverged. Despite POSIX efforts, there are still BSD and SysV systems today, and their functionality still diverges.

Most operating systems are pretty clearly associated with one or the other, and generalizations about BSD vs. SysV prove correct. FreeBSD is the main branch from the traditional BSD, soon followed by NetBSD and OpenBSD. Then OS X came about, which was loosely based on FreeBSD (but is very BSD-like). On the SysV side of the house, AIX, IRIX, and HP-UX were the main variants. In short: commercial entities focused on SysV, academics focused on BSD.

Linux, however, is an oddball. Linux certainly adopted many SysV methodologies, but these days it is also very BSD-like. Sun Solaris, too, is confusing. SunOS started off as BSD, but SunOS 4 was the last BSD version; SunOS 5.x (aka Solaris) is now SysV. The details are much crazier than I’ve alluded to here, and we probably don’t want yet another Unix history lesson. A fun place to start for further reading is the Wikipedia page on the Unix wars.

Fundamental Differences
It has been said that one can tell which system they are using based on two indicators: whether or not the system boots with inittab, and the format of their accounting file. Process accounting isn’t really used any longer, and most people don’t even know what it’s for, so that’s mostly moot. The boot system, however, is still critical to understand.

SysV booting means you use inittab. The init program, when run by the kernel, will check /etc/inittab for the initdefault entry, and then boots to the runlevel defined there. Entering a runlevel means that each startup script in the directory will be run in order. Sequentially, and slowly. Sun was so annoyed with this they implemented a mechanism to fire up services in parallel, among other things, with the Service Management Facility (SMF). Ubuntu Linux implemented Upstart, which basically works around the sequential nature of init scripts too.

BSD booting means that init simply runs /etc/rc, and that’s all. Well, it used to. Soon BSD systems implemented rc.local, so that software and sysadmins alike could implement changes without fear of harming the critical system startup routines. Then /etc/rc.d/ was implemented, so that each script could live separately, just like SysV init scripts. Traditionally, BSD-style scripts didn’t take arguments, because there are no runlevels, and they only run once: on startup. There are still no runlevels in BSD, but the startup scripts generally take “start” and “stop” arguments, to allow sysadmins and package management tools to restart services easily.

Command Arguments
The most frustrating, and quickest to surface differences between SysV and BSD, are in the traditional utilities. Some common commands take very different arguments, and even have some very different functionality. This isn’t so important if you’re in Linux now, as it generally supports both, but once you find yourself in BSD-land, you’re up for some confusion.

The first command people usually run into is ‘ps.’ The arguments differ:

  • SysV: ps -elf
  • BSD: ps aux

Linux supports both, BSD does not. Often we may want to list all processes owned by a particular user. In BSD, you must run, “ps aux | grep username” but in SysV you can run, “ps -u username.” Just plain ‘ps’ will list your own processes in both flavors.

Another commonly noticed difference is with the ‘df’ command. Not because some older systems don’t support the -h argument to provide human-readable output, but because they display different things.

  • SysV: shows the amount available in 512-byte blocks
  • BSD: nice output showing size in bytes and percentage used

Printing in BSD is always confusing for SysV users, and vice-versa. Again this isn’t as common, since newer OSes support both, but it’s noteworthy nonetheless. BSD systems traditionally used lpr, lpq, and lprm to administer print jobs, whereas SysV had lp, lpstat, and cancel. Most systems adopted the BSD style, since LPRng (next generation) provided these commands, and CUPS subsequently adopted the BSD variants.

Other programs, such as du, who, ln, tr and more will have slight differences between SysV and BSD. Heck, the differences between the various Unix standards are confusing enough that a single Unix variant may have multiple directories of utilities. Take a look at Solaris’s /usr/ucb, /usr/xpg4, and /usr/xpg6 directories. Each standard they support, which has differences from POSIX, is documented and implemented in a separate location. Too bad Linux doesn’t comply with any standards.

In the end, the differences outlined here are probably the only ones anyone would ever notice. The nuances between du, for example, may be applicable for people writing shell scripts for systems administration procedures. The differences do turn up often enough to be mentionable, but in reality this level of work requires reading manual pages so often that they’d figure it out quickly. User-level utilities are “similar enough” with the exception of ps.

There are so many other differences in system maintenance procedures that those are more frequently focused on. Once the ‘ps’ hurdle is out of the way, and you understand how the system boots, the main problems are more conceptual, as in “how do I add a user.” These vary by OS, and also by distribution of Linux.

Come back next week to learn about the different ways Unix-like operating systems facilitate systems administration tasks.



Networking 101: Subnetting – Slice Up 32-bits

Posted: February 20th, 2010 | Filed under: Networking 101

Welcome to networking 101, edition two. This time around we’ll learn about subnets and CIDR, hopefully in a more manageable manner than some books present it.

But first, let’s get one thing straight: there is no Class in subnetting. In the olden days, there were Class A, B, and C networks. These could only be divided up into equal parts, so VLSM, or Variable Length Subnet Masks, were introduced. The old Class C was a /24, B was a /16, and A was a /8. That’s all you need to know about Classes. They don’t exist anymore.

An IP address consists of a host and a network portion. Coupled with a subnet mask, you can determine which part is the subnet, how large the network is, and where the network begins. Operating systems need to know this information in order to determine what IP addresses are on the local subnet and which addresses belong to the outside world and require a router to reach. Neighboring routers also need to know how large the subnet is, so they can send only applicable traffic that direction. Divisions between host and network portions of an address are completely determined by the subnet mask.

Classless Inter-Domain Routing (CIDR), pronounced “cider,” represents addresses using the network/mask style. What this really means is that an IP address/mask combo tells you a lot of information:

network part / host part
0000000000000000/0000000000000000

The above string of 32-bits represents a /16 network, since 16 bits are masked.

Throughout these examples (and in the real world), certain subnet masks are referred to repeatedly. They are not special in any way; subnetting is a simple string of 32 bits, masked by any number of bits. It is, however, helpful for memorizing and visualizing things to start with a commonly used netmask, like the /24, and work from there.

Let’s take a look at a standard subnetting table, with a little bit different information:

Subnet mask bits   Subnets per /24   Number of addresses   Bits stolen
/24                1                 256                   0
/25                2                 128                   1
/26                4                 64                    2
/27                8                 32                    3
/28                16                16                    4
/29                32                8                     5
/30                64                4                     6
/31                128               2                     7

Because of the wonders of binary, it works out that a /31 has two IP addresses available. Imagine the subnet: 2.2.2.0/31. If we picture that in binary, it looks like:

00000010.00000010.00000010.00000000 (2.2.2.0)
11111111.11111111.11111111.11111110 (/31)

The mask is “masking” the used bits, meaning that the bits are used up for network identification. The number of host bits available for tweaking is equal to one. It can be a 0 or a 1. This results in two available IP addresses, just like the table shows. Also, for each additional bit used in the netmask (stolen from the host portion), you can see that the number of available addresses gets cut in half.
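The halving rule is just powers of two, which a few lines of Python can confirm. This is a sketch for checking the arithmetic; the function name addresses_in is made up here.

```python
def addresses_in(prefix_length):
    """Number of addresses in an IPv4 subnet with the given mask length."""
    host_bits = 32 - prefix_length  # bits left over after the mask
    return 2 ** host_bits


for prefix_length in range(24, 32):
    bits_stolen = prefix_length - 24  # bits taken from the last byte
    print("/{}: {} per /24, {} addresses, {} bits stolen".format(
        prefix_length, 2 ** bits_stolen, addresses_in(prefix_length), bits_stolen))
```

Each step up in mask length removes one host bit, so the address count halves while the number of subnets that fit in a /24 doubles.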

Let’s figure out the broadcast address, network address, and netmask for 192.168.0.200/26. The netmask is simple: that’s 255.255.255.192 (26 bits of mask means 6 bits for hosts, 2^6 is 64, and 256-64 is 192). You can find subnetting tables online that will list all of this information for you, but we’re more interested in teaching people how to understand what’s happening. The netmask tells you immediately that the only part of the address we need to worry about is the last byte: the broadcast address and network address will both start with 192.168.0.

Figuring out the last byte is a lot like subnetting a /24 network, but you don’t even need to think about that, if it doesn’t help you. Each /26 network has 64 addresses. The networks run from .0 to .63, .64 to .127, .128 to .191, and from .192 to .255. Our address, 192.168.0.200/26, falls into the .192 to .255 netblock. So the network address is 192.168.0.192/26. And the broadcast address is even simpler: 192 is 11000000 in binary. Take the last six bits (the bits turned “off” by the netmask), turn them “on”, and what do you get? 192.168.0.255. To see if you got this right, now compute the network address and broadcast address for 192.168.0.44/26. (Network address: 192.168.0.0/26; broadcast 192.168.0.63).
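If you want to check your answers while practicing, Python’s standard ipaddress module will do the same calculation. This is only a checking aid, not a substitute for counting bits; the helper name describe is made up for this sketch.

```python
import ipaddress


def describe(address_with_prefix):
    """Return (netmask, network address, broadcast address) as strings."""
    interface = ipaddress.ip_interface(address_with_prefix)
    network = interface.network  # the subnet this address lives in
    return (
        str(network.netmask),
        str(network.network_address),
        str(network.broadcast_address),
    )


print(describe("192.168.0.200/26"))
print(describe("192.168.0.44/26"))
```

The first call should agree with the worked example and the second with the practice question.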

It can be hard to visualize these things at first, and it helps to start by making a table. If you calculated that you wanted subnets with six hosts in each of them (eight addresses, including the network and broadcast addresses that can’t be used), then you can start making the table. The following covers 2.2.2.0/29, 2.2.2.8/29, 2.2.2.16/29 and the final subnet, 2.2.2.248/29.

Subnet Number   Network Address   First IP    Last IP     Broadcast Address
1               2.2.2.0           2.2.2.1     2.2.2.6     2.2.2.7
2               2.2.2.8           2.2.2.9     2.2.2.14    2.2.2.15
3               2.2.2.16          2.2.2.17    2.2.2.22    2.2.2.23
...
32              2.2.2.248         2.2.2.249   2.2.2.254   2.2.2.255

In reality, you’re much more likely to stumble upon a network where there are three /26s and the final /26 is divided up into two /27s. Being able to create the above table mentally will make things much easier.
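Until the mental version comes naturally, the rows for the /29 example can be generated with the standard ipaddress module. This is a sketch for checking your work; the variable names are arbitrary.

```python
import ipaddress

block = ipaddress.ip_network("2.2.2.0/24")

rows = []
for number, subnet in enumerate(block.subnets(new_prefix=29), start=1):
    hosts = list(subnet.hosts())  # usable addresses, without network/broadcast
    rows.append((
        number,
        str(subnet.network_address),
        str(hosts[0]),
        str(hosts[-1]),
        str(subnet.broadcast_address),
    ))

# Show the first three subnets and the last one.
for row in rows[:3] + rows[-1:]:
    print("{:<3} {:<10} {:<10} {:<10} {}".format(*row))
```

A /24 split into /29s gives 32 subnets of eight addresses each, six of them usable for hosts.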

That’s really all you need to know. It gets a little trickier with larger subnets in the /16 to /24 range, but the principle is the same. It’s 32 bits and a mask. Do, however, realize that there are certain restrictions governing the use of subnets. We cannot allocate a /26 starting with 10.1.0.32. If we utter the IP/mask of 10.1.0.32/26 to most operating systems, they will just assume we meant 10.1.0.0/26. This is because the /26 space requires 64 addresses, and they must start at a natural bit boundary for the given mask. In the above table, what would 2.2.2.3/29 mean? It means you meant to say 2.2.2.0/29.
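Python’s ipaddress module applies the same boundary rule, which makes it a handy way to test whether a network you wrote down is properly aligned. The helper names normalize and is_aligned are invented for this sketch.

```python
import ipaddress


def normalize(cidr):
    """Return the network that an address/mask pair actually belongs to."""
    # strict=False masks off the host bits instead of complaining about them.
    return str(ipaddress.ip_network(cidr, strict=False))


def is_aligned(cidr):
    """True if the address sits on a natural boundary for its mask."""
    try:
        ipaddress.ip_network(cidr, strict=True)
    except ValueError:
        return False
    return True


print(normalize("10.1.0.32/26"), is_aligned("10.1.0.32/26"))
print(normalize("2.2.2.3/29"), is_aligned("2.2.2.3/29"))
print(normalize("10.1.0.64/26"), is_aligned("10.1.0.64/26"))
```

In strict mode the module refuses any network whose host bits are not all zero, which is exactly the boundary restriction described above.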

Those tricky ones do demand a quick example. Remember how the number of IP addresses in a subnet gets halved when you take another bit from the host side to create a larger mask? The same concept works in reverse. If we have a /25 that holds 128 addresses, and hand a bit from the network (netmask) portion back to the hosts, we now have a /24 that holds 256. Google for a “subnet table” to see the relationship between netmasks and network sizes all at once. If a /16 holds 65536 addresses, a /17 holds half as many, and a /15 holds twice as many. It’s tremendously exciting! Practice, practice, practice. That’s what it takes to understand how this works. Don’t forget, you can always fall back to counting bits.

The next step, should you want to understand more about subnets, is to read up on some routing protocols. We’ll cover some of them soon, but in the next installment of Networking 101, we’re starting our trip up the OSI model.



An Introduction to LDAP

Posted: February 19th, 2010 | Filed under: Linux / Unix

LDAP directory services are nearly ubiquitous these days. Every sysadmin should know how to work with directories, understand how they are constructed, and have a certain level of familiarity with the LDAP protocol itself. In this, part one of two, we will introduce LDAP and explain how entries and schemas work. Next week, the second part will cover the LDAP protocol, working with LDAP entries, and searching and storing data.

LDAP is actually quite simple, even though it does make use of the ITU X.500 standard—a notoriously complex specification. X.500 directories were accessed via DAP, or Directory Access Protocol. It was large, complex, and unruly, so Lightweight DAP was created. That’s almost accurate; in fact, LDBP (Lightweight Directory Browsing Protocol) came first, because all you could do was search. When the functionality to modify entries was implemented, LDAP was born.

LDAP Structure
A directory can be defined as a set of objects with similar attributes, organized in a hierarchical manner. Sorry, but I must use the old phone book analogy now. In a phone book, an object is a person, and each person has a set of similar attributes: a phone number and perhaps an address. LDAP is the same, but you may make use of many other types of attributes.

LDAP directories are organized in a tree manner, and the design often will reflect organizational or geographic boundaries. X.500 tells us:

  • A directory is a tree of directory entries
  • An entry contains a set of attributes
  • An attribute has a name, and one or more values.

Attributes are defined in a schema, which specifies what types of things can be attributes and whether or not you can have multiple values per attribute.

Every entry in a directory has a unique identifier, called the Distinguished Name (DN). The Relative DN (RDN) is the part that names the entry itself, sort of like a relative path in Unix (./file). The DN, then, would be a full path (/var/lib/file). A sample directory entry’s DN, therefore, would look like: cn="john doe",dc=mytree. The RDN is cn="john doe", and the DN is the full path, ending at the top of the tree. A “cn” simply means the “common name” that the entry is referred to as, and “dc” is the “domain component.”
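The path analogy can be made concrete with a couple of string operations. This is a deliberately naive sketch: the function names are made up, and it assumes the DN contains no escaped or quoted commas, which real DNs are allowed to have.

```python
def split_dn(dn):
    """Split a DN into (rdn, parent_dn), like a file name and its directory."""
    # Naive: assumes no escaped or quoted commas inside attribute values.
    rdn, _, parent = dn.partition(",")
    return rdn, parent


def rdn_parts(rdn):
    """Split an RDN such as 'cn=John Doe' into (attribute, value)."""
    attribute, _, value = rdn.partition("=")
    return attribute, value


rdn, parent = split_dn("cn=John Doe,dc=example,dc=com")
print("RDN:   ", rdn)
print("Parent:", parent)
print("Parts: ", rdn_parts(rdn))
```

Unlike a Unix path, the most specific part of a DN comes first and the top of the tree comes last.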

You will often see examples of LDAP structures that use DNS names for the domain component, such as: dc=example,dc=com. This is not necessary, but since DNS itself often implies organizational boundaries, it usually makes sense to just use your existing naming structure. One final note about a DN; it changes over time. If you change a DN, you’re effectively moving an entry in the tree. Some LDAP servers support unique identifiers that will track the movement of entries, but you often don’t need to care. Just know that even though a DN is unique, it changes over time.

LDIF Example
A sample directory entry (of a person) looks like this:
dn: cn=John Doe,dc=myplace
cn: John Doe
givenName: John
sn: Doe
telephoneNumber: +1 555 555 1234
telephoneNumber: +1 555 555 5555
mail: john@example.com
manager: cn=Bob Smith,dc=example,dc=com
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: person
objectClass: top

All of the attributes listed above are associated with the DN; together they form a single directory entry. Attributes (givenName, sn, etc) are defined by schemas. Every entry must list the objectClasses that permit the attributes it uses. For example, inetOrgPerson is the class that allows an attribute called “manager.” If that objectClass wasn’t listed, the LDAP server wouldn’t know the attribute was allowed, so it wouldn’t let you define an attribute called manager.

The example above is an LDIF (LDAP Data Interchange Format) entry. That is the entire LDAP entry in text form. You could insert the data into a directory, and in fact, this is exactly what a backup of your directory looks like. It’s just text, and that’s all there is to an LDAP entry. Well, almost: most servers also support aliases and referrals. An LDAP alias can point to another local entry in the same directory, to avoid duplicating information. A referral tells an LDAP client to go ask another server instead. Some LDAP servers even support chaining, where the server will go get the answer and return it to the client; the client never knows a referral has taken place. Regardless, LDAP entries are quite simple.
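Since an LDIF entry is just lines of "name: value" text, reading one into a data structure takes very little code. The sketch below handles only that simple case: it ignores base64 values (written with a double colon) and wrapped continuation lines, both of which real LDIF files may contain. The names SAMPLE_LDIF and parse_entry are made up for the example.

```python
from collections import defaultdict

SAMPLE_LDIF = """\
dn: cn=John Doe,dc=myplace
cn: John Doe
givenName: John
sn: Doe
telephoneNumber: +1 555 555 1234
telephoneNumber: +1 555 555 5555
objectClass: inetOrgPerson
objectClass: person
"""


def parse_entry(ldif_text):
    """Return (dn, attributes) where attributes maps names to lists of values."""
    dn = None
    attributes = defaultdict(list)
    for line in ldif_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition(": ")
        if name == "dn":
            dn = value
        else:
            attributes[name].append(value)  # attributes may repeat
    return dn, dict(attributes)


dn, attributes = parse_entry(SAMPLE_LDIF)
print(dn)
for name, values in attributes.items():
    print(name, values)
```

Every value is kept in a list because any attribute may carry more than one value, as telephoneNumber and objectClass do here.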

LDAP Schemas
A schema defines the attribute types that entries can contain, as well as the format of their values. It will specify that: Mail contains a well-formed e-mail address, Photo contains a JPEG image, and uidNumber contains an integer, for example.

Here is an example schema we recently created:
attributeTypes: ( 1.1.1.2.1
NAME 'pod'
DESC 'A pod for people to belong in'
EQUALITY caseIgnoreMatch
SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
)
objectClasses: ( 1.1.1.1.1
NAME 'podPerson'
DESC 'A person who belongs in some pods'
SUP top
MUST cn
MAY pod
)

The objectClass is defined, as well as the allowed attributeTypes. Each schema definition must have a unique OID (object identifier), which is part of the way X.500 works (SNMP is the same way). We created an objectClass called podPerson, gave it a description, and said the entry must contain a ‘cn,’ and may contain a ‘pod.’ The pod attribute can contain any value, because the only restriction specified is that case doesn’t matter. After loading that schema into our LDAP server, we could then add a ‘pod’ attribute to each person entry.

Since LDAP is so lightweight and simple, it is not suitable for a few things. It’s very tempting to store tons of data in LDAP, since so many applications can reference LDAP. Unix machines can use LDAP for passwd, shadow, group, netgroup, protocols, and just about everything in nsswitch.conf. LDAP is a database, so print accounting programs, configuration management systems, and just about everything that stores data in a DB will support LDAP. It’s fine for most of these things, but LDAP is not ideal for replicating a relational database. The data in LDAP is not ordered, which means you could get results in any order. If your application is querying for only one result at a time, this is fine, but if multiple results are common and order is important, LDAP just won’t work.

Check back next week (i.e. follow me on Twitter and subscribe via RSS, links at top-right of this page) for a look at the protocol, and some practical examples of querying and using LDAP data.



Back to Basics: Unix File Permissions

Posted: February 17th, 2010 | Filed under: Linux / Unix

The most basic, yet important part of mastering Unix is to fully understand the nuances of file permissions. Tools exist to manage permissions easily, but true enlightenment and quick troubleshooting skills come to those who wholly master the concept. Remember, 80% of Unix problems are permissions issues.

The Concept
At the most basic level, there are three types of access:

  • Read – the ability to open a file and read it
  • Write – the ability to write the file
  • Execute – the ability to execute (run) the file

Directories, though similar, are subject to special rules. Write permissions on a directory imply that you can create new files and directories within. Execute permissions are required to ‘cd’ into the directory, and read permissions are required to list the contents (‘ls’).

You will generally see permissions represented as r, w, or x; for read, write, and execute. Running ‘ls -al’ on the command line will show three sets of these strung together.

For example: -rwxr-xr-x

The dash means that the permission is not set. The first place is always reserved for special identifiers, like ‘d’ for directories or ‘c’ for character devices. The next place begins the actual permissions, for the user, group, and other categories.

Every access control in Unix is based on “who you are.” The user is identified by the uid (user ID), as defined by a person’s user account. The third field in the /etc/passwd file, for example, specifies what a user’s uid is. Similarly, every user belongs to a default group, as identified by the fourth field in the passwd file. Users can belong to many groups, but they’re always a member of their default group.

The above example of -rwxr-xr-x means that the owner of the file may read, write and execute it, the group members may read or execute it, and everyone else on the system may also read or execute the file.

A full example, from the output of ‘ls -l’ is:
-rw-r--r--  1 charlie root        164 2006-12-10 23:51 test.js

The file named test.js is owned by me with read and write permissions, is set to the root group who can only read it, and also allows everyone else to read it.

How it Really Works
That’s basically enough to get by, but being able to understand the more advanced modes of file permissions, your umask, and the numeric representation demands a full understanding. In reality, there are three bits available for each category (user, group, and other), which gives eight possible values. Take a look at the table below and note that wherever you see a 1 in the binary column, a corresponding permission will exist.

Number  Permissions  Binary
0       ---          000
1       --x          001
2       -w-          010
3       -wx          011
4       r--          100
5       r-x          101
6       rw-          110
7       rwx          111

As you can see, if a “bit” in a certain position of the binary representation is set, the permissions in that space are activated. The number column is the octal representation, and the “Binary” column is how it really works, from the operating system’s perspective.
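The table translates directly into a few bit tests. Here is a small sketch of the conversion from an octal mode to the rwx string; the function names digit_to_rwx and mode_to_string are made up, and the special first character and the advanced modes are left out on purpose.

```python
def digit_to_rwx(digit):
    """Turn one octal digit (0-7) into its three-character rwx form."""
    return "".join(
        letter if digit & bit else "-"
        for letter, bit in (("r", 4), ("w", 2), ("x", 1))
    )


def mode_to_string(mode):
    """Turn a mode such as 0o755 into 'rwxr-xr-x' (user, group, other)."""
    # Shift by 6, 3 and 0 bits to pick out the user, group and other digits.
    return "".join(digit_to_rwx((mode >> shift) & 0o7) for shift in (6, 3, 0))


for mode in (0o755, 0o644, 0o600):
    print(oct(mode), mode_to_string(mode))
```

Each octal digit is handled on its own, which is why the three categories can be read off independently.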

Example time. Let’s say we wish to give ourselves read/write/execute permissions, the group read/execute, and everyone else read/execute permissions. The following commands both do the same thing:

  • chmod u=rwx .; chmod go=rx .
  • chmod 755 .

Since we know that setting ‘5’ means rx, we can simply say ‘5’ instead of ‘rx.’ (Note the = in the mnemonic form: it sets the permissions exactly, whereas + only adds to whatever is already there.) The real advantage to knowing the octal representation is that we can set any arbitrary permissions with a single short argument. The mnemonic form needs a separate clause for each set of permissions, although the clauses can be joined with commas, as in ‘chmod u=rwx,go=rx .’

Likewise, to set our umask, we must know how the permissions are numerically represented. The umask determines the default mode with which files and directories get created. It’s a mask, so it lists the bits to take away rather than the bits to keep. If we want new directories created with permissions like 755, simply subtract each digit from 7, and 022 reveals itself as the magic setting. Regular files normally start from 666 rather than 777, so the same umask gives them 644. See the umask man page for further details.
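Subtracting digits is a convenient shortcut, but what the system really does is clear the masked bits. The sketch below shows the difference; the function name apply_umask is made up, and 777 and 666 are the usual starting modes for directories and regular files.

```python
def apply_umask(requested_mode, umask):
    """Clear every bit of requested_mode that is set in umask."""
    return requested_mode & ~umask


directory_mode = apply_umask(0o777, 0o022)
file_mode = apply_umask(0o666, 0o022)
print(oct(directory_mode), oct(file_mode))

# A mask is not subtraction: 666 "minus" 033 would be 633,
# but clearing the masked bits from 666 actually leaves 644.
print(oct(apply_umask(0o666, 0o033)))
```

The last line is the case where the subtraction shortcut breaks down, because a bit that was never set cannot be removed.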

Advanced Modes
There are, in fact, three other modes you can set on a file or directory. All Unixes support the following:

  • 4000 set user id (suid) on execution
  • 2000 set group id on execution
  • 1000 the sticky bit

If suid is enabled, the permissions look like: -rws------
This means that when the file is executed, it will run with the permissions of the owner of the file. It’s dangerous, but sometimes necessary and quite useful. For example, a file suid and owned by root will always run as root.

When sgid is enabled, the permissions look like: -rwxrws---
When set on a directory, sgid means that all files created within the directory will have the gid set to the current directory’s gid. This is handy when sharing files with other people, who will often forget to give other members read or write permissions.

The sticky bit looks like: -rwx-----T
The sticky bit matters on directories. When it is enabled, a file in the directory can be deleted or renamed only by the file’s owner, the directory’s owner, or root. Without the sticky bit, anyone with write permission on the directory can delete or rename any file in it, no matter who owns the file (this is why /tmp is sticky). This one is also handy when sharing files with a group of people.
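To see how the three extra bits show up in a listing, Python’s standard stat module can render a numeric mode the same way ‘ls -l’ does. The modes below are arbitrary examples picked for this sketch, not recommendations.

```python
import stat

examples = {
    "setuid file": stat.S_IFREG | 0o4700,
    "setgid file": stat.S_IFREG | 0o2770,
    "sticky directory": stat.S_IFDIR | 0o1777,
    "sticky, no execute for other": stat.S_IFREG | 0o1700,
}

for label, mode in examples.items():
    # stat.filemode() builds the same ten-character string that ls -l prints.
    print("{:<30} {:o}  {}".format(label, stat.S_IMODE(mode), stat.filemode(mode)))
```

A lowercase s or t means the execute bit underneath is also set; an uppercase S or T means the special bit is set without it.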

There are other tidbits of information, once you get into the nuts and bolts of Unix file permissions too. For example, you can also set ACL attributes, which get horribly complex. Yes, you can give individual users access to your files, but it’s better not to. Creating a new group and sticking to general permissions can accomplish most things. Often the extended attributes aren’t necessary, and ACLs likely won’t work over NFS if you’re using Linux.

Spend some time with the chmod manual page to master tricky parts, if they still aren’t clear. It will also mention some implementation-specific limitations you may need to be aware of.

