psyphi.net blog - Another collection of braingunk and technolint

Active Directory + Linux account integration

First, a note of warning: I’ve done this mostly on CentOS, but there’s no reason it shouldn’t work just as well on other distributions. I gleaned most of this by scouring other resources around the internet (FAQs, newsgroups and so on), but as far as I can remember I never found a single coherent article describing all of the required pieces of the puzzle.

Second, the objective of this article is to have unified accounts across Windows & Linux, or at least as close as possible. We’re going to use Microsoft Active Directory, Kerberos, Samba, Winbind, PAM and nsswitch. We’ll also end up with consistent uids and gids across multiple Linux clients.

/etc/samba/smb.conf

[global]
	workgroup = PSYPHI
	realm = PSYPHI.LOCAL
	security = ADS
	allow trusted domains = No
	use kerberos keytab = Yes
	log level = 3
	log file = /var/log/samba/%m
	max log size = 50
	printcap name = cups
	idmap backend = idmap_rid:PSYPHI=600-20000
	idmap uid = 600-20000
	idmap gid = 600-20000
	template shell = /bin/bash
	winbind enum users = Yes
	winbind enum groups = Yes
	winbind use default domain = Yes
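
To see why the idmap_rid backend gives every client the same uid for a given user, here is a minimal Python sketch of the mapping as the idmap_rid manual page describes it (ID = RID - BASE_RID + LOW_RANGE_ID). The function name and the sample RID 1105 are made up for illustration; the 600-20000 range comes from the smb.conf above, and base_rid is assumed to be left at its default of 0.

```python
def rid_to_unix_id(rid, low=600, high=20000, base_rid=0):
    """Map a Windows RID to a Unix uid/gid using the documented idmap_rid formula.

    Returns None when the result falls outside the configured range,
    since accounts outside the range are not mapped.
    """
    unix_id = rid - base_rid + low
    if unix_id < low or unix_id > high:
        return None
    return unix_id


if __name__ == "__main__":
    # 1105 is a hypothetical RID, not taken from a real domain.
    print(rid_to_unix_id(1105))
```

Because the mapping is plain arithmetic with no per-host state, any two clients configured with the same range should arrive at the same id for the same account.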

/etc/krb5.conf

[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 default_realm = PSYPHI.LOCAL
 dns_lookup_realm = true
 dns_lookup_kdc = true
 ticket_lifetime = 24h
 forwardable = yes

[realms]
 EXAMPLE.COM = {
  kdc = kerberos.example.com:88
  admin_server = kerberos.example.com:749
  default_domain = example.com
 }

 PSYPHI.LOCAL = {
 }

[domain_realm]
 .example.com = EXAMPLE.COM
 example.com = EXAMPLE.COM

 psyphi.local = PSYPHI.LOCAL
 .psyphi.local = PSYPHI.LOCAL

[appdefaults]
 pam = {
   debug = false
   ticket_lifetime = 36000
   renew_lifetime = 36000
   forwardable = true
   krb4_convert = false
 }

Next we join the machine to the AD domain – it’s necessary to specify a user with the right privileges. It also prompts for a password.

net ads join -U administrator

We can check things are working so far by trying to create a kerberos ticket using an existing username. Again it prompts us for a password.

kinit (username)

Then klist gives us output something like this:

Ticket cache: FILE:/tmp/krb5cc_0
Default principal: username@PSYPHI.LOCAL

Valid starting     Expires            Service principal
04/28/10 10:57:32  04/28/10 20:57:34  krbtgt/PSYPHI.LOCAL@PSYPHI.LOCAL
	renew until 04/29/10 10:57:32


Kerberos 4 ticket cache: /tmp/tkt0
klist: You have no tickets cached
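
As a quick sanity check on ticket lifetime, the following Python sketch parses one ticket row of the klist output above and computes how long the ticket is valid. The function name is hypothetical, and the MM/DD/YY date format is an assumption based on the sample output; other klist versions or locales may print dates differently.

```python
from datetime import datetime

# Assumed from the sample klist output; not guaranteed for every klist build.
KLIST_TIME_FORMAT = "%m/%d/%y %H:%M:%S"


def ticket_lifetime(line):
    """Return (service principal, lifetime) for one klist ticket row."""
    fields = line.split()
    start = datetime.strptime(" ".join(fields[0:2]), KLIST_TIME_FORMAT)
    end = datetime.strptime(" ".join(fields[2:4]), KLIST_TIME_FORMAT)
    return fields[4], end - start


if __name__ == "__main__":
    sample = "04/28/10 10:57:32  04/28/10 20:57:34  krbtgt/PSYPHI.LOCAL@PSYPHI.LOCAL"
    principal, lifetime = ticket_lifetime(sample)
    print(principal, lifetime)
```

For the sample row this works out to ten hours and two seconds.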

Cool, so we have a machine joined to the domain and able to use kerberos tickets. Now we can tell the system to authenticate against Kerberos through PAM and to fetch account information from winbind:

/etc/pam.d/system-auth-ac

auth        required      pam_env.so
auth        sufficient    pam_unix.so nullok try_first_pass
auth        requisite     pam_succeed_if.so uid >= 500 quiet
auth        sufficient    pam_krb5.so use_first_pass
auth        required      pam_deny.so

account     required      pam_unix.so broken_shadow
account     sufficient    pam_localuser.so
account     sufficient    pam_succeed_if.so uid < 500 quiet
account     [default=bad success=ok user_unknown=ignore] pam_krb5.so
account     required      pam_permit.so

password    requisite     pam_cracklib.so try_first_pass retry=3
password    sufficient    pam_unix.so md5 shadow nullok try_first_pass use_authtok
password    sufficient    pam_krb5.so use_authtok
password    required      pam_deny.so

session     optional      pam_keyinit.so revoke
session     required      pam_limits.so
session     [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid
session     required      /lib/security/pam_mkhomedir.so 
session     required      pam_unix.so
session     optional      pam_krb5.so

If we’re on a 64-bit distribution, references to /lib need to be switched to /lib64, e.g. /lib64/security/pam_mkhomedir.so. The pam_mkhomedir line in this file also creates a home directory for a user at first log-in if one isn’t already present.

/etc/nsswitch.conf

passwd:     files winbind
shadow:     files winbind
group:      files winbind

hosts:      files dns

bootparams: nisplus [NOTFOUND=return] files

ethers:     files
netmasks:   files
networks:   files
protocols:  files
rpc:        files
services:   files

netgroup:   nisplus

publickey:  nisplus

automount:  files nisplus
aliases:    files nisplus

Now we need to tell a few services to start on boot

chkconfig smb on
chkconfig winbind on

and start a few services now

service smb start
service winbind start

The Winbind+pam configuration can sometimes take a few minutes to settle down – I occasionally find it’s necessary to wait 5 or 10 minutes before accounts are available. YMMV.
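
Rather than guessing how long to wait, a small poller can report when a domain account becomes resolvable. This is a minimal Python sketch, assuming Python's pwd module (which goes through the same nsswitch lookup as getent); the function name and the default timings are made up for illustration.

```python
import pwd
import time


def wait_for_account(name, timeout=600.0, interval=5.0):
    """Poll the system account database until `name` resolves.

    Returns True as soon as the lookup succeeds, or False once
    `timeout` seconds have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            pwd.getpwnam(name)
            return True
        except KeyError:
            pass  # not visible yet; retry until the deadline
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling wait_for_account with a real domain username after starting winbind would block for up to ten minutes with the defaults above.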

getent passwd

This should now list local accounts (which take precedence) followed by domain accounts. Logging in to the box over ssh as a domain user should create a new home directory in /home/PSYPHI/username. If you decide to migrate home directories from /home, make sure you change the uid and gid on the files to the new domain values for that user, then remove the old local account.
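
The ownership change for a migrated home directory can be sketched in Python as below. The function name is hypothetical, and the sketch assumes that everything under the directory should belong to the one user; files deliberately owned by someone else would need handling separately. Changing ownership to another user normally needs root.

```python
import os


def reown_tree(root, uid, gid):
    """Recursively set owner and group on `root` and everything below it.

    Uses lchown so that symlinks themselves are changed rather than
    whatever they point at. Returns the number of paths touched.
    """
    os.lchown(root, uid, gid)
    count = 1
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            os.lchown(os.path.join(dirpath, name), uid, gid)
            count += 1
    return count
```

The new uid and gid for a domain user can be read with pwd.getpwnam(username) once the account resolves, then passed to reown_tree along with the migrated directory.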

There are a handful of limitations of this approach –

Though usernames and group names map OK, Linux uids still don’t map to the Windows account identifiers (SIDs), so permissions don’t quite work across smb/cifs mounts
The standard linux tools for user & group modification don’t work for domain accounts (adduser/usermod/groupadd/… etc.)
Winbind seems unstable. On a lot of systems I’ve resorted to cronning a service winbind restart every 15 minutes, which seriously sucks
… and probably others too

For debugging /var/log/secure is very useful, as are the samba logs in /var/log/samba/.

A Little SQL “EXPLAIN” example

rosetta stone — http://www.flickr.com/photos/banyan_tree/2802823634/sizes/m/

Background: I have a big table called “channel” with a few hundred thousand rows in – nothing vast, but big enough to cause some queries to run slower than I want.

Today I was fixing something else and happened to run

show full processlist;

I noticed this taking too long:

SELECT payload FROM channel  LIMIT 398800,100;

This is a query used by some web-app paging code. Stupid really – payload isn’t indexed and there’s no use of any other keys in that query. Ok – how to improve it? First of all, see what EXPLAIN says:

mysql> explain SELECT payload FROM channel  LIMIT 398800,100;
+----+-------------+---------+------+---------------+------+---------+------+--------+-------+
| id | select_type | table   | type | possible_keys | key  | key_len | ref  | rows   | Extra |
+----+-------------+---------+------+---------------+------+---------+------+--------+-------+
|  1 | SIMPLE      | channel | ALL  | NULL          | NULL | NULL    | NULL | 721303 |       |
+----+-------------+---------+------+---------------+------+---------+------+--------+-------+
1 row in set (0.00 sec)

Ok. So the simplest way would be to limit a selection of id_channel (the primary key) then select payloads in that set. First I tried this:

SELECT payload
FROM channel
WHERE id_channel IN (
    SELECT id_channel FROM channel LIMIT 398800,100
);

Seems straightforward, right? No, not really.

ERROR 1235 (42000): This version of MySQL doesn't yet support
  'LIMIT & IN/ALL/ANY/SOME subquery'

Next!

Second attempt, using a temporary table, selecting and saving the id_channels I’m interested in then using those in the actual query:

CREATE TEMPORARY TABLE channel_tmp(
  id_channel BIGINT UNSIGNED NOT NULL PRIMARY KEY
) ENGINE=innodb;

INSERT INTO channel_tmp(id_channel)
  SELECT id_channel
  FROM channel LIMIT 398800,100;

SELECT payload
  FROM channel
  WHERE id_channel IN (
    SELECT id_channel FROM channel_tmp
  );

mysql> explain select id_channel from channel limit 398800,100;
+----+-------------+---------+-------+---------------+---------+---------+------+--------+-------------+
| id | select_type | table   | type  | possible_keys | key     | key_len | ref  | rows   | Extra       |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+-------------+
|  1 | SIMPLE      | channel | index | NULL          | PRIMARY | 8       | NULL | 722583 | Using index |
+----+-------------+---------+-------+---------------+---------+---------+------+--------+-------------+
1 row in set (0.00 sec)

mysql> explain select payload from channel where id_channel in (select id_channel from channel_tmp);
+----+--------------+-------------+-----------------+-----------+---------+---------+------+--------+-------------+
| id | select_type  | table       | type            | poss_keys | key     | key_len | ref  | rows   | Extra       |
+----+--------------+-------------+-----------------+-----------+---------+---------+------+--------+-------------+
|  1 | PRIMARY      | channel     | ALL             | NULL      | NULL    | NULL    | NULL | 722327 | Using where |
|  2 | DEP SUBQUERY | channel_tmp | unique_subquery | PRIMARY   | PRIMARY | 8       | func |      1 | Using index |
+----+--------------+-------------+-----------------+-----------+---------+---------+------+--------+-------------+
2 rows in set (0.00 sec)

Let’s try a self-join doing all of the above without explicitly making a temporary table. Self-joins can be pretty powerful – neat in the right places..

mysql> explain SELECT payload
  FROM channel c1,
       (SELECT id_channel FROM channel limit 398800,100) c2
  WHERE c1.id_channel=c2.id_channel;
+----+-------------+------------+--------+---------------+---------+---------+---------------+--------+-------------+
| id | select_type | table      | type   | possible_keys | key     | key_len | ref           | rows   | Extra       |
+----+-------------+------------+--------+---------------+---------+---------+---------------+--------+-------------+
|  1 | PRIMARY     | <derived2> | ALL    | NULL          | NULL    | NULL    | NULL          |    100 |             |
|  1 | PRIMARY     | c1         | eq_ref | PRIMARY       | PRIMARY | 8       | c2.id_channel |      1 |             |
|  2 | DERIVED     | channel    | index  | NULL          | PRIMARY | 8       | NULL          | 721559 | Using index |
+----+-------------+------------+--------+---------------+---------+---------+---------------+--------+-------------+
3 rows in set (0.21 sec)

This pulls out the right rows and even works around the unsupported “LIMIT in subquery” restriction, but the id_channel selection in c2 still isn’t quite doing the right thing: the rows estimate shows it walking the whole primary key index to reach the offset, and I don’t like that, even if everything is coming straight out of the index.
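
To check that the self-join really returns the same page as the plain query, here is a minimal Python sketch using SQLite as a stand-in for MySQL. The table contents and helper names are made up for illustration, and an ORDER BY id_channel is added (it is not in the queries above) so that the two results are comparable; query plans and timings from SQLite say nothing about MySQL.

```python
import sqlite3


def build_channel(conn, rows):
    """Create a toy channel table with `rows` sequentially numbered payloads."""
    conn.execute(
        "CREATE TABLE channel (id_channel INTEGER PRIMARY KEY, payload TEXT)"
    )
    conn.executemany(
        "INSERT INTO channel (id_channel, payload) VALUES (?, ?)",
        ((i, "payload-%d" % i) for i in range(1, rows + 1)),
    )


def page_plain(conn, offset, count):
    """Page straight off the table, as in the original slow query."""
    sql = "SELECT payload FROM channel ORDER BY id_channel LIMIT ? OFFSET ?"
    return [r[0] for r in conn.execute(sql, (count, offset))]


def page_self_join(conn, offset, count):
    """Page over the primary key only, then join back for the payloads."""
    sql = """
        SELECT c1.payload
        FROM channel c1
        JOIN (SELECT id_channel FROM channel
              ORDER BY id_channel LIMIT ? OFFSET ?) c2
          ON c1.id_channel = c2.id_channel
        ORDER BY c1.id_channel
    """
    return [r[0] for r in conn.execute(sql, (count, offset))]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_channel(conn, 5000)
    assert page_plain(conn, 3988, 100) == page_self_join(conn, 3988, 100)
    print(page_self_join(conn, 3988, 3))
```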

A little rudimentary benchmarking suggests that the self-join is the fastest, with the original query roughly an order of magnitude slower and the temporary table trailing a long way behind at around four times slower again. I’m not sure why the temporary table comes out slowest. The EXPLAIN output above hints at one cause: the outer query does a full scan of channel (type ALL) and runs the subquery against channel_tmp once per row. It could also be down to storage access, or more likely my lack of understanding. Some time I might try an in-memory table for comparison too.
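
For repeatable comparisons, a tiny timing harness helps. The Python sketch below is not the benchmark used above; it is a hypothetical helper that takes the best of several runs, demonstrated against an in-memory SQLite table as a stand-in. It relies on sqlite3's connection-level execute shortcut, so a MySQL driver would need the same loop written around a cursor.

```python
import sqlite3
import time


def best_time(conn, sql, params=(), repeats=5):
    """Run a query `repeats` times and return the fastest wall-clock time in seconds."""
    best = None
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(sql, params).fetchall()  # fetch everything so the work is done
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE channel (id_channel INTEGER PRIMARY KEY, payload TEXT)"
    )
    conn.executemany(
        "INSERT INTO channel VALUES (?, ?)",
        ((i, "x" * 200) for i in range(1, 20001)),
    )
    plain = "SELECT payload FROM channel LIMIT 100 OFFSET 19800"
    joined = (
        "SELECT c1.payload FROM channel c1 "
        "JOIN (SELECT id_channel FROM channel LIMIT 100 OFFSET 19800) c2 "
        "ON c1.id_channel = c2.id_channel"
    )
    for label, sql in (("plain", plain), ("self-join", joined)):
        print(label, best_time(conn, sql))
```

Taking the minimum rather than the mean is a deliberate choice here: it filters out runs disturbed by other activity on the machine.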

Bookmarks for April 24th from 14:14 to 21:25

These are my links for April 24th from 14:14 to 21:25:

Balloon homepage –
beginspace.co.uk –
Welcome | CamTechNet | The Cambridge Technology Community | –
Camrev –
CamCreative –

Bookmarks for April 22nd through April 24th

These are my links for April 22nd through April 24th:

Bookmarks for April 17th through April 20th

These are my links for April 17th through April 20th:

Chef – Opscode – looks useful. need to read more.
Font Squirrel | Handpicked free fonts for graphic designers with commercial-use licenses. –
Flowplayer – Open Source Flash Video Player for the Web –
Libel Reform Campaign – Free Speech Is Not For Sale –
BATC – Streaming Media –

Bookmarks for April 14th through April 16th

These are my links for April 14th through April 16th:

Scroll Sneak: maintaining scroll position between page loads – Peter Coles’ Personal Blog –
FogBugz – Bug & Issue Tracking, Project Management, Help Desk Software –
LPC2103-based Digital Photo Frame –
Pseudorandom notes –
Against Intellectual Monopoly – arguments against copyright & patent law

Bookmarks for March 17th through April 12th

These are my links for March 17th through April 12th:

myExperiment –
PechaKucha 20×20 –
SweetSpotter – automatic spatial audio correction
Flare | Apps | Dependency Graph –

What does Technology Monoculture really cost for SME?

Open, open, open. Yes, I sound like a stuck record but every time I hit this one it makes me really angry.

I regularly source equipment and software for small and medium enterprises (SMEs). Usually these are charities and obviously they want to save as much money as they can on hardware and software. Second-hand hardware is usually the order of the day. PCs around three years old are pretty easy to obtain and will usually run most current software.

But what about that software? On the surface the answer seems simple: To lower costs use free or Open Source software (OSS). The argument for Linux, OpenOffice and other groupware applications is pretty compelling. So what does it really mean on the ground?

Let’s take our example office:
Three PCs called “office1”, “office2” and “finance” connected together using powerline networking. There’s an ADSL broadband router which provides wireless for three laptops and also a small NAS with RAID1 for backups and shared files.

Okay, now the fun starts. The office has grown “organically” over the last 10 years. The current state is that Office1 runs XP 64-bit; Office2 runs Vista Ultimate and the once-per-week-use “finance” runs Windows 2000 for Sage and a Gift Aid returns package. All three use Windows Backup weekly to the NAS. Office1 & Office2 use Microsoft Office 2007. Office1 uses Exchange for mail and calendars, Office2 uses Windows Mail and Palm Desktop. Both RDP and VNC are also used to manage all machines.

So, what happens now is that the Gift Aid package is retired and the upgrade is to use web access but can’t run on MSIE 6. Okay. Upgrade to MSIE 8. Nope – won’t run on Win2k. How about MSIE 7? Nope, can’t download that any more (good!). Right, then an operating system upgrade is in order.

What do I use? Ubuntu of course. Well, is it that easy? I need to support the (probably antique) version of Sage Accounts on there. So how about Windows XP? Hmm – XP is looking a bit long in the tooth now. Vista? You must be joking – train-wreck! So Windows 7 is the only option. Can’t use Home Premium because it doesn’t support RDP without hacking it. So I’m forced to use Win 7 Pro. That’s £105 for the OEM version or £150 for the “full” version. All that and I’ll probably still have to upgrade Sage, AND the finance machine is only used once a week. What the hell?

Back to the drawing-board.

What else provides RDP? Most virtualisation systems do – Xen, virtualbox and the like. I use Virtualbox quite a lot and it comes with a great RDP service built in for whatever virtual machine is running. Cool – so I can virtualise the win2k instance using something like the VMWare P2V converter and upgrade the hardware and it’ll run everything, just faster (assuming the P2V works ok)…

No, wait – that still doesn’t upgrade the browser for the Gift Aid access. Ok, I could create a new WinXP virtual machine – that’s more recent than Win2k and bound to be cheaper – because Virtualbox gives me RDP I don’t need the professional version, “xp home” would do, as much as it makes me cringe. How much does that cost? Hell, about £75 for the OEM version. What??? For an O/S that’ll be retired in a couple of years? You have to be kidding! And I repeat, Vista is not an option, it’s a bad joke.

I’m fed up with this crap!

Okay, options, options, I need options. Virtualise the existing Win2k machine for Sage and leave the Ubuntu Firefox web browser installation for the updated Gift Aid. Reckon that’ll work? It’ll leave the poor techno-weenie guy who does the finances with a faster PC which is technically capable of doing everything he needs but with an unfamiliar interface.

If I were feeling particularly clever I could put Firefox on the Win2k VM, make the VM start on boot using VBoxHeadless; configure Ubuntu to auto-login and add a Win2k-VM-RDP session as a startup item for the auto-login user. Not a bad solution but pretty hacky, even for my standards (plus it would need to shut-down the domain0 host when the VM shuts down).

All this and it’s still only for one of the PCs. You know what I’d like to do? Virtualise everything and stick them all on a central server. Then replace all the desktop machines with thin clients and auto-login-RDP settings. There’s a lot to be said for that – centralised backups, VM snapshotting, simplified (one-off-cost) hardware investment, but again there’s a caveat – I don’t think that I’d want to do that over powerline networking. I’d say a minimum requirement of 100Mbps Ethernet, so new networking infrastructure would be required, together with the new server. *sigh*.

I bet you’re thinking what has all this got to do with technology monoculture? Well, imagine the same setup without any Microsoft involved.

All the same existing hardware, Ubuntu on each, OpenOffice, Evolution Mail & Calendar or something like Egroupware perhaps or even Google Apps (docs/calendar/mail etc. – though that’s another rant for another day). No need for much in the way of hardware upgrades. No need for anything special in the way of networking. Virtualise anything which absolutely has to be kept, e.g. Sage, without enforcing a change to the Linux version.

I don’t know what the answer is. What I do know is that I don’t want to spend up to £450 (or whatever it adds up to for upgrade or OEM versions) just to move three PCs to Windows 7. Then again with Windows 8, 9, 10, 2020 FOREVER. It turns out you simply cannot do Microsoft on a shoestring. Once you buy in you’re stuck and people like Microsoft (and they’re not the only ones) have a license to print money, straight out of your pocket into their coffers.

Of course that’s not news to me, and it’s probably not news to you, but if you’re in an SME office like this and willing to embrace a change to OSS you can save hundreds if not thousands of pounds on pointless, unnecessary software. Obviously the bigger your working environment is, the quicker these costs escalate. The sooner you make the change, the sooner you start reducing costs.

Remind me to write about the state of IT in the UK education system some time. It’s like lighting a vast bonfire made of cash, only with worse side-effects.

Bookmarks for March 9th through March 17th

These are my links for March 9th through March 17th:

OpenCL Hello World Example –
Introductory Tutorial to OpenCL™ –
Mac Dev Center: OpenCL Programming Guide for Mac OS X: Basic Programming Sample –
Mac Dev Center: OpenCL Programming Guide for Mac OS X: OpenCL on the Mac Platform –
OpenInkpot – Replacement firmware for some ebook readers
Quake-Catcher Network –
wmarow’s disk & disk array calculator –
UX London –

Bookmarks for February 17th through March 5th

These are my links for February 17th through March 5th:

http://www.essexcw.org.uk/ –
Search using a map – Gîtes de France –
Ion Torrent – Semiconductor Sequencing for Life –
lg’s murder at master – GitHub – large-scale code deployment with bittorrent
Virtuoso Open-Source Wiki – Virtuoso Open-Source Edition