Another non-ILM post today – but this is just something I’ve done recently, and I found I had to trawl around a fair bit for the information I needed, and I may as well jot it down. This post shows the steps I took to get the configuration working, after encountering various errors along the way. There may well be other ways.
The project was to install two replica instances of ADAM SP1 using NLB for failover, and to add some custom attributes to the user class.
I created a domain service account and:
- made it local Administrator on both servers,
- gave it the Generate security audits permission,
- gave it the “Create All Child Objects” AD permission on the server objects to allow SPN creation.
I also set the forceguest registry value on the servers.
The installation of ADAM SP1 was straightforward, as was creating the unique instance on the first server. Creating the replica on the second server was also straightforward. All of this is well covered in the ADAM step-by-step guide.
I hadn’t actually used NLB before and initially got distracted by this document and thought I had to set all the network settings myself. Dumb – all you need to do is run Network Load Balancing Manager from the Administrative Tools menu and it will do all the network settings for you.
It is a lot easier if you have two NICs though, like in a normal cluster – one for Public and one for Private.
The surprise was really how basic NLB is. NLB does no port monitoring and will forward traffic to the server as long as it is up. Taking the ADAM service down has no effect on it. So in the case of 2 servers with 50% loading, if you stop the ADAM instance on one server, half the requests will fail.
Others have obviously hit this problem too: on MSDN there’s this nice script that checks particular services on the cluster servers and stops NLB on a server if the service is stopped.
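To make the idea concrete, here is a minimal sketch of that kind of watchdog. It is not the MSDN script: instead of checking the Windows service state it simply probes the LDAP port, and the function names are made up for illustration. The port you probe is whatever port the ADAM instance was installed on.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable: treat them all as "not answering".
        return False


def nlb_action(service_up, node_in_cluster):
    """Decide what to do with the local NLB node.

    Returns "stop" to pull the node out of the cluster, "start" to put it
    back, or None when nothing needs to change.
    """
    if not service_up and node_in_cluster:
        return "stop"
    if service_up and not node_in_cluster:
        return "start"
    return None
```

On a real node you would call `port_is_open` on a schedule and, depending on what `nlb_action` returns, run `wlbs stop` or `wlbs start` locally. How you track whether the node is currently in the cluster is left open here.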
Another couple of gotchas:
- You have to register the shared network name and IP address of the NLB cluster in DNS yourself.
- When using ADSI Edit on one of the servers you can’t connect to the NLB network name, though you can connect to the IP address, and you can, of course, connect directly to the individual servers, either by name or IP. I have reproduced this in two completely separate environments, but I don’t know what causes it.
OK, call me dim, but I didn’t realise I was going to have to provide a unique X.500 OID for the purely local attributes I wanted to add to the user class. I ended up using this OID generator script, which produces a hideously long number that you can append your own attribute IDs to.
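For anyone wondering where the hideously long number comes from, here is a rough sketch of how a generator like that can work: take a random GUID, cut its hex digits into chunks, convert each chunk to decimal and hang the result off a fixed base arc. The base arc and the chunk sizes below are my recollection of what Microsoft’s script does, so treat both as assumptions. The function names are invented for this example.

```python
import uuid

# Assumption: base arc as recalled from Microsoft's OID generator script.
ASSUMED_PREFIX = "1.2.840.113556.1.8000.2554"


def oid_from_guid(guid, prefix=ASSUMED_PREFIX):
    """Build a base OID by converting chunks of a GUID's hex form to decimal."""
    hex_digits = guid.hex  # 32 hex characters, no dashes
    widths = (4, 4, 4, 4, 4, 6, 6)  # assumed chunk sizes; they add up to 32
    arcs = []
    pos = 0
    for width in widths:
        arcs.append(str(int(hex_digits[pos:pos + width], 16)))
        pos += width
    return prefix + "." + ".".join(arcs)


def attribute_oid(base_oid, attribute_number):
    """Append your own attribute ID to the generated base OID."""
    return f"{base_oid}.{attribute_number}"
```

You would generate the base once with `oid_from_guid(uuid.uuid4())`, write it down, and then number your attributes under it with `attribute_oid(base, 1)`, `attribute_oid(base, 2)` and so on.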
I also caused myself problems by connecting to “localhost” from the ADAM Schema plugin. You get all sorts of weird errors like “Could not connect to the current schema master server” and “The FSMO role ownership could not be verified” – though not straight away of course, only after you’ve gone through the whole process of entering the details of the new attribute. Things worked much better once I figured out I had to use the full DNS name of the primary ADAM server.
Once I had successfully added an attribute to the user class, I managed to get myself into further trouble by jumping onto the replica server and trying to use the attribute straight away. My reward was the disappearance of the replica from ADSI Edit! It came back after a service restart, but it was a little disconcerting. After that I found the “Update Schema Now” action in ADSI Edit, so I routinely updated all servers after any modification, and I didn’t see the problem again.
Again pretty simple, but it wasn’t immediately obvious to me, so: when you create a user object, you have to set a password and then set msDS-UserAccountDisabled to FALSE before you can use it.
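The order matters, so here is a small sketch that writes the three steps out as LDIF text you could adapt for import. It assumes the password is set through the userPassword attribute and that the DN and password contain only plain characters that need no LDIF encoding. The function name and the example DN are hypothetical. Depending on how the instance is configured, password changes may be rejected over an unsecured connection.

```python
def user_ldif(dn, cn, password):
    """Return LDIF that creates a user, sets its password, then enables it."""
    lines = [
        # Step 1: create the user object.
        f"dn: {dn}",
        "changetype: add",
        "objectClass: user",
        f"cn: {cn}",
        "",
        # Step 2: set a password (assumption: via userPassword).
        f"dn: {dn}",
        "changetype: modify",
        "replace: userPassword",
        f"userPassword: {password}",
        "-",
        "",
        # Step 3: only now enable the account.
        f"dn: {dn}",
        "changetype: modify",
        "replace: msDS-UserAccountDisabled",
        "msDS-UserAccountDisabled: FALSE",
        "-",
        "",
    ]
    return "\n".join(lines)
```

For example, `print(user_ldif("CN=Test User,OU=People,O=Example", "Test User", "Passw0rd!"))` prints the three records in the order described above.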
Checking Replication between Instances
You’re supposed to use dcdiag for troubleshooting; however, it gives misleading errors. According to this MS doc it should run error-free, but in trying to track down the errors I found this post from Lee Flight (who seems to be the man with all the ADAM answers on the newsgroups) saying that not all tests work with ADAM. So why include them, then?
The errors I see, even in a completely clean VM environment, are:
[servername] Directory Binding Error 1753:
Win32 Error 1753
This may limit some of the tests that can be performed.
[servername$instancename] DsBindWithSpnEx() failed with error 5,
Win32 Error 5.
Skipping all tests, because server servername$instancename is
not responding to directory service requests
It then, despite what it says about skipping all tests, proceeds to run three CrossRefValidation tests, all of which pass.
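For reference, the two Win32 codes in that output are standard Windows error codes: 1753 is EPT_S_NOT_REGISTERED and 5 is ERROR_ACCESS_DENIED. Here is a small sketch that pulls the codes out of saved tool output and names them. The lookup table covers only these two codes and the function names are invented for this example.

```python
import re

# Only the two codes seen in this post; anything else is reported as unknown.
WIN32_ERRORS = {
    5: "ERROR_ACCESS_DENIED (Access is denied)",
    1753: "EPT_S_NOT_REGISTERED (There are no more endpoints available "
          "from the endpoint mapper)",
}


def win32_codes(output):
    """Return the distinct 'Win32 Error N' codes in order of first appearance."""
    codes = []
    for match in re.finditer(r"Win32 Error (\d+)", output):
        code = int(match.group(1))
        if code not in codes:
            codes.append(code)
    return codes


def describe(code):
    """Look up a code in the small table above."""
    return WIN32_ERRORS.get(code, "unknown to this sketch")
```

Feeding it the output shown above gives the codes 1753 and 5, which at least tells you that the first failure is an RPC endpoint lookup and the second is a permissions problem on the bind.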
I can’t say for sure this is what should happen – all I can say is that, apart from these dcdiag errors, everything is working as expected.