When only direct db access will do

Over the years I've had a few "close calls" where a corrupted directory db was brought back from the abyss. Had one of those the last couple of days, when one of our newly installed Oracle Internet Directory (OID) instances started misbehaving after I screwed up the acl (access control list) while trying to use the aci (access control instruction) "wizard" in oidadmin to create a new set of permissions.

In seven years of working with LDAP directories. I've come to view the GUI administration consoles that ship with commercial products with deep suspicion. Ditto for the "automatic" configuration routines in almost all 3rd party apps that claim to be "LDAP compliant". In this case I should have known better, but the whole purpose of the work I've been doing the last few months is to learn as much as I can about the new application stack so I can provide useful input on how we go forward.

Anyway, the problem we (Oracle Support and I) were faced with was a cn=Users container that could no longer be searched, even by cn=orcladmin. Now "cn=orcladmin" is the root account for OID, just like "cn=Directory Manager" for the Netscape family of directories (Netscape, iPlanet, Sun, RedHat), or OpenLDAP's "cn=Manager". Of all the login accounts on the directory, this is the one that is supposed to not be subject to access controls. In our case, every time we tried to search the container over LDAP (using ldapsearch) the directory process would die and get restarted by Oracle's monitor process.

After going through a series of diagnostic procedures (stopping and starting different pieces of the stack and then reading the logs), it was finally determined that the way to deal with this was to use Oracle's "bulk" utilities to access the underlying database and restore it to health.

We first issued an "opmnctl shutdown" to kill all running processes, and then used ldifwrite to dump the entire cn=Users container directly from the datatabase to an LDIF file. Then we used bulkdelete to remove the container, and all it's subentries, from the database. I then edited the LDIF file to remove the corrupt aci from the container base. Finally, bulkload was used to re-load the corrected LDIF back into the database. During this process the data was reindexed and subjected to rigorous checking.

A "opmnctl startall" brought the directory and it's associated processes back on line, and after a few simple tests using ldapsearch and ldapmodify we were satisfied all was well again.

Here's the syntax for the commands used during this recovery (during each operation you will be prompted for the root directory user (cn=orcladmin) password, which should be the same as that for the ODS database user):

[In these examples my directory root is "dc=mycompany,dc=com", "testinf" is the $ORACLE_SID for the infrastructure database where directory data is stored and "mydata.ldif" is my LDIF data file. Where "\" appears at the end of a line, it is a "continuation character" indicating that the line should not be broken by a carriage return]

Dumping the container data from the database
$ORACLE_HOME/ldap/bin/ldifwrite connect=testinf basedn="cn=Users, \
dc=mycompany,dc=com" ldiffile=/tmp/mydata.ldif>


Removing the container from the database
$ORACLE_HOME/ldap/bin/bulkdelete connect=testinf basedn="cn=Users, \
dc=mycompany,dc=com"


Reloading the LDIF data directly into the database
$ORACLE_HOME/ldap/bin/bulkload connect=testinf check="TRUE" \
generate="TRUE" append="TRUE" file=/tmp/mydata.ldif


Lest it appear that Oracle is alone in making such lifesaving "direct to database" methods available, Sun/RedHat and OpenLDAP all have similar tools used to directly manipulated their underlying BerkeleyDB hash table managers (BerkeleyDB is now, ironically, owned by Oracle).

For OpenLDAP these are slapcat for dumping data from the db, and slapadd to load data directly into the db. With OpenLDAP there is no way to directly delete just a portion of the directory tree. You've got to dump and reload the whole thing (making whatever changes you need to the LDIF file in between).

Sun/RedHat have a set of command line scripts, db2ldif and ldif2db that you to dump and reload data from the db, but still require you to in essence reload everything if you want to clean it up. For practical purposes you'd want to do this on a Netscape family directory anyway, to make sure all indexes are regenerated correctly (not always a given with their architecture).

OID: Spit install = Splitting Headache

Alot of people who read this are going to disagree with what I'm about to say. But here it is.

When building an Oracle 10g AS infrastructure most enterprises will do a split install, putting the application server(s) on one box and the metadata repository (Oracle database) on another. In many cases the applications may even be split up, with Oracle Internet Directory (OID) running on one box, SSO another and so on.

This is insane.

OID is the foundation for any Oracle 10g infrastructure. If OID is down, you're hosed.

Restart lsnrctl, or lose your connection with the database either because the db or it's host is being recycled, or because of an errant firewall rule or some other network connectivity transient, and OID will go out to lunch.

As much as I admire and respect all DBA's, in the six months I've worked with OID the only times it's hung or toppled over is when, for some reason, the database sitting on a remote server became unavailable. Most of the time because the instance I needed was down and no one realized it's significance. Actually, make that 2 months. Up until that time all the test installs I used had the database on the same host as the infrastructure and I made damn well sure it was up.

There's nothing more disheartening that having to jump into a VNC session to demonstrate to an experienced DBA that the problem really is a database issue (tnsping is useless for this, the best method is to log on as the system user, make sure your environment is set and fire up sqlplus as ODS).

Many years ago I made a similar argument about locating Microsoft Active Directory database files on a SAN. Given that AD is the heart and soul of the server operating system itself, putting the edb files off on remote storage is clearly asking for trouble.

While using a remote db could have some benefits, like allowing for a high availability configuration such as multiple infrastructure servers connected to a RAC, given the critical place that OID has in the environment I still think it would be better to have the metadata db on the same box as OID. If the environment needs to be highly available, you can still RAC at the app tier or use OID's robust multi-master replication feature.

how do i unlock an EBS account?

Make the value for END_DATE in FND_USER null, of course.

SQL>UPDATE FND_USER
SQL>SET END_DATE = ''
SQL>WHERE USER_NAME = '[Your User Name]';
...
SQL>COMMIT;


To do it in Perl just use the code I've got below (in mod_ebsdb) and substitute the above SQL for the value of $sql (you might want to change what you print to console in order to prevent confusion). Like this:

my $sql= "UPDATE FND_USER
SET END_DATE = ''
WHERE USER_NAME = '$user_name'";

The aforementioned code will commit the transaction, assuming it passes the simple "Does this USER_NAME exist?" test.

indexing attributes on oracle internet directory

Basic task that we all have to do, no matter what vendor's directory we use.

For OID, like almost everything else, it requires a special tool to reindex the database. The tool is called catalog. It's found under $ORACLE_HOME/ldap/bin.
$ORACLE_HOME/ldap/bin/catalog connect=[oiddbsid] add="TRUE" \
attribute="[attributename]" [or file="filename"]

You can do a bunch of attributes with the optional "file=" parameter instead of "attribute=", where the value would be the path to a text file containing a simple list of the attributes to be indexed, one attribute name per line.

Sometimes the change will be effective right after running the command (Oracle actually recommends shutting down OID before indexing -- which makes sense on a heavily used production system). Other times you'll have to restart OID to see the change take place.

Using Perl to Re-Link EBS GUIDS

If you've been following the last few posts you've probably guessed that I'm now in the middle of a pretty intense effort to get control of user management in an Oracle Internet Directory (OID) + Enterprise Business Suite (EBS) environment. What's happened is that the consultants, developers and business teams have been out creating accounts on the EBS database ahead of the security team's effort to create users on OID. Now that we've finally begun to integrate EBS with the new Oracle 10g Application Server Single Sign-On Infrastructure, we need to make sure that the user's OID entries are properly linked with their EBS accounts. This happens "automagically" in most cases, but in our situation there was a last-minute decision to change our DIT (Directory Information Tree) structure that required the deletion and re-creation of all our new OID entries AFTER they'd been linked to their corresponding EBS account.

To avoid having to use the tedious procedure Oracle recommends to fix this that I outlined earlier, I decided to put together a script to automate the process. Below is the result, which still contains some (commented) debugging code that I think others may find useful.
#!/usr/bin/perl
# updateguid.pl Updates USER_GUID on EBS FND_USER from orclguid value
# in corresponding OID user entry.

use strict;
use Net::LDAP;
use Net::LDAP::Entry;
use Net::LDAP::LDIF;
use DBI;


our($OIDhost,$OIDusr,$OIDpw,$ebssid,$ebsappusr,$ebsapppw);

my $HOME = $ENV{'HOME'};

$ENV{'TNS_ADMIN'} = "/etc/oracle";

require "$HOME/etc/orclapp.conf";

my $usrbase = "cn=users,dc=mycorp,dc=com";
my $query = "(givenname=*)"; # Just hit the "real" people
my @attrs = qw(uid orclguid);


fix_guids();

sub fix_guids {

my $ldap = Net::LDAP->new($OIDhost);
my $mesg = $ldap->bind($OIDusr, password =>$OIDpw);
die ("failed to bind with ",$mesg->code(),"\n") if $mesg->code();

$mesg = $ldap->search( base =>$usrbase,
filter =>$query,
scope =>'sub',
attrs =>\@attrs

);

die "Failed to search with ",$mesg->error(),"\n" if $mesg->code();

while (my $entry = $mesg->shift_entry()) {

my $dn = $entry->dn();
my $uid = $entry->get_value('uid');
my $orclguid = $entry->get_value('orclguid');

## Use this code block for testing guid retrieval
# my $user_name = read_ebsdb($uid);
#print $user_name, "\n";
# print $orclguid, " \n";
# my $user_guid = read_ebsdb($uid);
# print $user_guid, "\n";

mod_ebsdb($uid,$orclguid);


}

}



# This subroutine is for testing guid retrieval
sub read_ebsdb {

my $user_name = @_[0];

my $dbh = DBI->connect("dbi:Oracle:$ebssid",
"$ebsappusr",
"$ebsapppw",

) or die "Database not connected: $DBI::errstr";

my $sql = "SELECT USER_GUID
FROM FND_USER
WHERE USER_NAME = '$user_name'";

my $sth = $dbh->prepare($sql);

$sth->execute();

while (my ($user_guid) = $sth->fetchrow()) {

return $user_guid;

}

$sth->finish;
$dbh->disconnect;

}

# This is the subroutine that commits changes to EBS FND_USER
sub mod_ebsdb {

my $user_name = @_[0];
my $orclguid = @_[1];

# Use AutoCommit =>0 to allow rollback in case of error
my $dbh = DBI->connect("dbi:Oracle:$ebssid",
"$ebsappusr",
"$ebsapppw",
{AutoCommit => 0}

) or die "Database not connected: $DBI::errstr";

my $sql = "UPDATE FND_USER
SET USER_GUID = '$orclguid'
WHERE USER_NAME = '$user_name'";

my $sth = $dbh->prepare($sql);

my $rows_affected = $sth->execute();

if ($rows_affected > 1) {

$dbh->rollback();
print "There are $rows_affected users with ";
print "the user name $user_name. Transaction cancelled!\n";

}
else {

$dbh->commit();
print "$user_name USER_GUID is now $orclguid\n";

}

}

__END__;

Perl Script to List Accounts in EBS Database

Just when you thought it was safe to go in the water!

Here's a quick DBD::Oracle script I cooked up to do what I did using sqlplus in the previous article. I use an external config file to store the Oracle SID (i.e. service name), user (in this case 'APPS') and password. Notice how straightforward it is to plug a standard Oracle query into the code and get results back. The only shortcoming with this script is that the sqlplus page formatting doesn't come through, so if you wanted to do a report you'd need to insert your own.

#!/usr/bin/perl

use strict;
use DBI;

our($orclsid,$orclusr,$orclpw);

my $HOME = $ENV{'HOME'};

$ENV{'TNS_ADMIN'} = "/etc/oracle";

require "$HOME/etc/orclapp.conf";


my $dbh = DBI->connect("dbi:Oracle:$orclsid",
"$orclusr",
"$orclpw",

) or die "Database not connected: $DBI::errstr";

my $sql = qq[ SELECT USER_NAME FROM FND_USER ];

my $sth = $dbh->prepare($sql);

$sth->execute();

while (my($user_id) = $sth->fetchrow()) {

print "$user_id\n";

}

$sth->finish;
$dbh->disconnect;

__END__;

My next task will be to script the linking of OID orclguids to EBS user accounts in FND_USER. Should be fun.

Some resources I used to do this were An Introduction to DBD::Oracle on the Pythian site. I also used my own Oracle Instantclient and DBD::Oracle to properly make and install the module.

Note: Important addition! The line "$ENV{'TNS_ADMIN'} = "/etc/oracle" sets the $TNS_ADMIN environment variable so the script can use an external tnsnames.ora file -- which is handy if you're working in a multiple database environment and want to be able to test your connect strings using sqlplus (or listener status with tnsping).