Overcoming multiple personality disorder in the source data

This post is going to describe an approach I’ve used a couple of times this year, in situations which looked quite different at the outset, but actually had one particular characteristic in common: the source data contained multiple records relating to individual people, and I needed access to all of them.

One case involves “contract” records from an HR database – a person who has moved around the company may have several. The second is an LDAP data source for a university where an individual may have both staff and student profiles simultaneously and, as a student, may have multiple active “enrollments”.

In an ideal situation

There is a key best practice at stake here, which I believe becomes even more significant when you put the FIM Portal into the mix:

Each individual should only be represented once in the Metaverse.

In an ideal situation you will project each person only once from your source data. Changes to contract, enrollment etc would be treated simply as attribute updates to the core Metaverse object, which remains the same.

If your source data lists people multiple times, as in my examples above, you might do some automated manipulation prior to presenting the data to the Sync Service – selecting the best representation of each person and merging data where appropriate. Then you can have a nice one-to-one relationship between your projection source and your Metaverse.

But what if you need access to those individual contracts?

This was my problem. In the case of the HR database contract records, the local IT staff only had access to the contract number. For them the important information related to the person’s job at that site. But for us, centrally, looking to automate moves and changes across the organisation, we needed a way to link those separate contract numbers to one individual.

At first, and against my better judgement, we started off projecting the contracts into the Metaverse as separate people. I was assured that, when people moved jobs, all their old accounts were deleted, including their mailbox. Pretty quickly this turned out not to be the case at all, and I started on the messy gymnastics needed to reconnect existing accounts to the newer, better contract in the Metaverse, without actually deleting anything.

Step One in solving data source MPD: Split the data into “Persons” and “Contracts”

When it became clear that a redesign was necessary, I started by splitting the data into “persons” and “contracts”. In fact for the university this was already the case, but with the HR database I had to use some SQL queries in an SSIS package to split the data into “objectType=person” and “objectType=contract” rows. Clearly a unique identifier is needed for both persons and contracts.

My persons have the following basic information:

  • objectType (=person)
  • personID
  • Data that should be independent of the contracts, like name, gender, date of birth.

My contracts must include the personID so I can join them correctly:

  • objectType (=contract)
  • contractID
  • personID
  • Data that is specific to the contract, like department, job title, start and end dates.
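
As a rough illustration of the split (I did mine with SQL in an SSIS package, so this is not that code), here is a minimal Python sketch. The flat row layout, the field names other than personID and contractID, and the sample values are all made up for the example.

```python
# Hypothetical flat HR export: one row per contract, person details repeated.
hr_rows = [
    {"personID": "P1", "name": "Ann Lee", "contractID": "C10", "department": "Sales", "jobTitle": "Rep"},
    {"personID": "P1", "name": "Ann Lee", "contractID": "C11", "department": "Marketing", "jobTitle": "Manager"},
    {"personID": "P2", "name": "Bo Chan", "contractID": "C20", "department": "IT", "jobTitle": "Engineer"},
]

# Assumed split of columns between the two object types.
PERSON_FIELDS = ("personID", "name")
CONTRACT_FIELDS = ("contractID", "personID", "department", "jobTitle")


def split_rows(rows):
    """Return (persons, contracts): one person per personID, one contract per input row."""
    persons = {}
    contracts = []
    for row in rows:
        person = {"objectType": "person"}
        person.update({f: row[f] for f in PERSON_FIELDS})
        # Keep the first set of person details seen for each personID.
        persons.setdefault(row["personID"], person)

        contract = {"objectType": "contract"}
        contract.update({f: row[f] for f in CONTRACT_FIELDS})
        contracts.append(contract)
    return list(persons.values()), contracts


persons, contracts = split_rows(hr_rows)
for obj in persons + contracts:
    print(obj)
```

Each contract row keeps its personID, which is what the join to the single projected person relies on later.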

Create Separate MAs for Persons and Contracts

Once the data was separated I created different MAs for persons and contracts. The aim is to project the person only once into the Metaverse, and then join the changeable contract records from the second MA.

But what about the Import Flows?

The big problem here is that you want to flow data into the metaverse from the Contracts MA. If multiple contracts are joined you’ll get the ambiguous-import-flow-from-multiple-connectors error. Besides which you probably don’t want to be importing from multiple contracts anyway. In my case (both cases in fact) I had to try and choose the “best” contract and then just flow the data from that one. As the “best” could change I have to re-evaluate the selection each time.

Why not just join one contract at a time?

The cleanest setup would be to only have one contract joined to the Metaverse object. The IAF rules are then very simple.

But, unless you can block all except one contract per person using Connector Filters, you may find this tricky to achieve.

Partly the problem is the way the Sync Service does its sync’ing – that is, it works on one connector space object at a time. Additionally, the only other connector space objects accessible during this sync operation are those that can be found by following joins.

So, while sync’ing this connector space object we can only know about other connector space objects where a direct join relationship exists.

This essentially means I can only figure out if the current CS object is really the best choice if all possible contracts are joined to the Metaverse person at the same time.

Blocking Import Flows

So here’s where I ended up.

  1. I let the multiple contracts be joined to the Metaverse person, but I only allow import flows from one of the contracts.
  2. I write the “preferred contract” number into an attribute on the person object.
  3. When sync’ing a CS object I compare it to the current “preferred contract”. If this one is better I replace the contract number on the Metaverse object.
  4. All the import flows (which must all be Advanced rules) check if this CS object is the preferred contract before allowing (or skipping) the flow.

While my IAF rules are now more complex, there have been definite advantages to this approach. The non-preferred contracts are still accessible from the Metaverse object and this has allowed us to more easily identify errors in the source data. Sometimes we have forced a particular contract to be preferred by disconnecting the currently preferred contract and syncing another one.

There is also the benefit of having some information as opposed to none. If I was, for example, blocking all expired contracts from entering the connector space in the first place I would be losing valuable deprovisioning information. If the best pick contract for a person is an expired one, I still want to see it.
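
What counts as the “better” contract will depend on your data. Purely as an illustration, here is a small Python sketch of one possible rule: an active contract beats an inactive one, otherwise the later start date wins. The rule, the field names startDate and endDate, and the function names are assumptions for the example, not what either of my projects necessarily used.

```python
from datetime import date


def is_active(contract, today):
    """Active means today falls between the start and end dates (no end date = open-ended)."""
    end = contract.get("endDate")
    return contract["startDate"] <= today and (end is None or end >= today)


def is_better(candidate, current, today):
    """Hypothetical rule: active beats inactive; otherwise the later start date wins."""
    if current is None:
        return True
    cand_active = is_active(candidate, today)
    curr_active = is_active(current, today)
    if cand_active != curr_active:
        return cand_active
    return candidate["startDate"] > current["startDate"]


def pick_preferred(contracts, today):
    """Walk the contracts one at a time, keeping whichever is best so far."""
    preferred = None
    for contract in contracts:
        if is_better(contract, preferred, today):
            preferred = contract
    return preferred


today = date(2024, 6, 1)
expired = {"contractID": "C10", "startDate": date(2020, 1, 1), "endDate": date(2023, 12, 31)}
current = {"contractID": "C11", "startDate": date(2024, 1, 1), "endDate": None}

print(pick_preferred([expired, current], today)["contractID"])
print(pick_preferred([expired], today)["contractID"])
```

Note that when the only contract is an expired one it is still selected, which matches the point above about keeping deprovisioning information visible.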

Flow Blocking Method 1: The Double Sync

I have tried out two different ways of blocking IAFs. The first is simpler to implement but the downside is you have to Sync twice. Because the evaluation of the contract is done as part of the “import_contractID” rule, and this might happen at any point in the set of flow rules, to ensure all IAFs run for the new preferred contract you have to sync twice.

Case "import_contractID"
  If mventry("contractID").Value <> csentry("contractID").Value Then
      'Whatever comparisons you need to do to see if this contract is preferable to the values
      'already imported to the Metaverse object. If it's better, replace the contractID.
  End If

Case "import_otherAttribute"
  'Replicates a Direct IAF
  If mventry("contractID").IsPresent AndAlso mventry("contractID").Value = csentry("contractID").Value
    If csentry("otherAttribute").IsPresent Then
      mventry("otherAttribute").Value = csentry("otherAttribute").Value
    End If
  End If
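
To see why the second sync is needed, here is a toy Python model of the two rules above. It is not FIM code: the Metaverse and connector space objects are plain dictionaries, the rule order is passed in explicitly, and the comparison is replaced by a stand-in that always prefers the new contract.

```python
def sync_once(mv, cs, rule_order, is_better):
    """Run each import rule once against one joined CS object, in the given order."""
    for rule in rule_order:
        if rule == "import_contractID":
            # Re-evaluate the preferred contract.
            if mv.get("contractID") != cs["contractID"] and is_better(cs, mv):
                mv["contractID"] = cs["contractID"]
        elif rule == "import_otherAttribute":
            # Only flows if this CS object is already the preferred contract.
            if mv.get("contractID") == cs["contractID"] and "otherAttribute" in cs:
                mv["otherAttribute"] = cs["otherAttribute"]


def always_better(cs, mv):
    """Stand-in for the real comparison tests."""
    return True


mv = {"contractID": "C10", "otherAttribute": "Sales"}
new_contract = {"contractID": "C11", "otherAttribute": "Marketing"}

# Worst case: the attribute rule happens to run before the contractID rule.
order = ["import_otherAttribute", "import_contractID"]

sync_once(mv, new_contract, order, always_better)
after_first = dict(mv)
sync_once(mv, new_contract, order, always_better)
after_second = dict(mv)

print(after_first)
print(after_second)
```

After the first pass the contractID has changed but otherAttribute still holds the old contract's value. Only the second pass brings the two into line.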


Flow Blocking Method 2: The First Touch

In an effort to avoid the double-sync I have started using another method where I run my comparison tests above the Case statement, and then set a Utils.TransactionProperties value to show the test has run.

Utils.TransactionProperties is the way to pass values between different bits of code run as part of the same object’s Sync operation.

While I now only need to sync once, I still don’t know which IAF rule will get called first, so I have to pass all comparison CS attributes to each IAF rule.

Here’s what the MapAttributesForImport Sub looks like now:

Public Sub MapAttributesForImport(ByVal FlowRuleName As String, ByVal csentry As CSEntry, ByVal mventry As MVEntry) Implements IMASynchronization.MapAttributesForImport

  '' Evaluate this CS object to see if it is a better source for attribute flows to the joined Metaverse object.
  '' Using the Utils.TransactionProperties flag will ensure this test is made only at the start of sync'ing the CS object.

  If Not Utils.TransactionProperties.Contains("FlowEnabled") OrElse Not mventry("contractID").IsPresent Then

    Dim BestMatch As Boolean = False
    If Not mventry("contractID").IsPresent Then
      '' Assumed: with no preferred contract recorded yet, this one wins by default.
      BestMatch = True
    ElseIf mventry("contractID").Value = csentry("contractID").Value Then
      '' This CS object is already the preferred contract.
      BestMatch = True
    Else
      '' Comparison tests - set BestMatch = True if this is a better match.
    End If

    If BestMatch Then
      '' Note: I set this to the contractID to avoid unexpected behaviour following a disconnection.
      Utils.TransactionProperties("FlowEnabled") = csentry("contractID").Value
    Else
      Utils.TransactionProperties("FlowEnabled") = "false"
    End If

  End If

  '' Do not continue with flow rules if this CS object has a different id to the best match.
  If Not Utils.TransactionProperties("FlowEnabled").Equals(csentry("contractID").Value) Then
    Exit Sub
  End If

  Select Case FlowRuleName

    Case "Import_contractID"
      mventry("contractID").Value = csentry("contractID").Value

    Case "Import_otherAttribute"
      If csentry("otherAttribute").IsPresent Then
        mventry("otherAttribute").Value = csentry("otherAttribute").Value
      End If

  End Select
End Sub
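
The same toy-model treatment shows why one sync is now enough. This Python sketch is only an approximation of the Sub above: a plain dictionary created fresh for each CS object stands in for Utils.TransactionProperties, and is_better is a placeholder for the real comparison tests.

```python
def map_attributes_for_import(rule, cs, mv, txn, is_better):
    """Model of one IAF rule call; txn plays the role of Utils.TransactionProperties."""
    if "FlowEnabled" not in txn or "contractID" not in mv:
        current = mv.get("contractID")
        best = current is None or current == cs["contractID"] or is_better(cs, mv)
        # Store the contractID itself rather than a simple True flag.
        txn["FlowEnabled"] = cs["contractID"] if best else "false"

    # Skip every flow if this CS object is not the preferred contract.
    if txn["FlowEnabled"] != cs["contractID"]:
        return

    if rule == "Import_contractID":
        mv["contractID"] = cs["contractID"]
    elif rule == "Import_otherAttribute" and "otherAttribute" in cs:
        mv["otherAttribute"] = cs["otherAttribute"]


def sync_object(cs, mv, rule_order, is_better):
    """Sync one CS object: the property bag lives only for this operation."""
    txn = {}
    for rule in rule_order:
        map_attributes_for_import(rule, cs, mv, txn, is_better)


mv = {"contractID": "C10", "otherAttribute": "Sales"}
better = {"contractID": "C11", "otherAttribute": "Marketing"}
worse = {"contractID": "C09", "otherAttribute": "Archive"}
order = ["Import_otherAttribute", "Import_contractID"]

sync_object(worse, mv, order, lambda cs, mv: False)
print(mv)  # unchanged: the worse contract is blocked
sync_object(better, mv, order, lambda cs, mv: True)
print(mv)  # both attributes updated in a single pass
```

Because the decision is made on the first rule call, whichever rule that happens to be, the order of the rules no longer matters.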

Importing a reference

There is one final advantage I can extract out of separating my data into persons and contracts – I can, if I need to, also import “contract” objects into the Metaverse and, with the addition of a multivalue table (for database MAs), I can construct a reference attribute that directly links the person to their contract objects. This would be very useful if I wanted a person’s contracts to be accessible in the FIM Portal.
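
As a sketch of what the rows in such a multivalue table might contain, here is a short Python example that derives one row per person-to-contract link. The column names (personID, attributeName, value) and the attribute name “contracts” are placeholders; the actual layout depends on how the database MA is configured.

```python
def build_multivalue_rows(contracts, attribute_name="contracts"):
    """One row per person/contract pair, sorted for a stable output."""
    rows = [
        {"personID": c["personID"], "attributeName": attribute_name, "value": c["contractID"]}
        for c in contracts
    ]
    return sorted(rows, key=lambda r: (r["personID"], r["value"]))


# Hypothetical contract rows, as produced by the person/contract split.
contract_rows = [
    {"contractID": "C11", "personID": "P1"},
    {"contractID": "C20", "personID": "P2"},
    {"contractID": "C10", "personID": "P1"},
]

multivalue_rows = build_multivalue_rows(contract_rows)
for row in multivalue_rows:
    print(row)
```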