Requirements Gathering for Integration Projects

Please note: This post originally appeared on the EXTOL blog (EXTOL has since been acquired by Cleo).

Requirements Gathering is a complex process with the purpose of defining a list of capabilities that we expect the project to meet once it is completed.  Given that very “business-y” definition, let’s explore the process of Requirements Gathering and how it relates to Integration projects. In this blog, we will outline the approach we use to gather requirements; in subsequent blogs, we’ll apply that approach to specific use cases.

Simply creating a punch-list of features/capabilities/behaviors is not enough to fully define the requirements of an Integration project. Meeting the defined expectations of A, B, C, and so on requires a much deeper dive into what is truly going on, both within the organization and externally with other integration partners (customers, vendors, and industry consortiums).

The Requirements Gathering process usually starts when a project is created.  In formal organizations, this begins with the approval of a project charter identifying a project Sponsor. In smaller organizations, it can be either a tactical (reactive) or a strategic (forward-thinking) effort directed by a department head or line-of-business manager.  Once the project is initiated, the scope of the project is usually defined and documented. In addition to the Scope document, several input documents are needed to properly create a Requirements document.  First, pull out your current system process documentation. Don’t have that, huh? Then document the current processes to use as a baseline against the proposed implementation.  Second, find all Service-Level Agreements (SLAs) impacted by the systems/applications that may be affected by the proposed project. Third, find any corporate documentation, such as lessons learned from previous implementation projects and approved corporate policies.  We refer to these as the “rules of engagement”.

Once the supporting “rules of engagement” documents are available, identify all of the systems that are “touched” in the integration. This should be a subset (possibly all) of the systems in your current system documentation. Using a simple block diagram, draw the systems (both internal and external) and the connections between systems that share data.  Number the connections for identification, then annotate each connection with its communications method.  Finally, inventory all of the different communication methods in use.
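The numbered-connection diagram described above can also be captured as a simple machine-readable inventory. Here is a minimal sketch in Python; the system names, connection numbers, and communication methods are purely illustrative, not a real topology:

```python
from dataclasses import dataclass

@dataclass
class Connection:
    number: int   # identifier annotated on the block diagram
    source: str   # system sending the data
    target: str   # system receiving the data
    method: str   # communications method for this connection

# Hypothetical connections -- replace with your own systems.
connections = [
    Connection(1, "ERP", "EDI Translator", "flat file"),
    Connection(2, "EDI Translator", "Trading Partner A", "AS2"),
    Connection(3, "Warehouse System", "ERP", "database-to-database"),
    Connection(4, "Trading Partner B", "EDI Translator", "SFTP"),
]

# Inventory all of the distinct communication methods in use.
methods = sorted({c.method for c in connections})
print(methods)  # ['AS2', 'SFTP', 'database-to-database', 'flat file']
```

Keeping the inventory in a structured form like this makes it easy to cross-check against the block diagram and to spot communication methods that appear only once.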

The next area to focus on is the data flow: the types of data and formats (such as XML, EDI, or spreadsheets), how much data moves, and the volume-pattern of the data.  This list can get lengthy and encompasses both B2B and A2A transactions: Inbound/Outbound EDI, ERP/legacy interface files, and A2A inter-system data such as flat files or database-to-database data movements.  Additionally, it’s important to capture any data movement, such as spreadsheets or web-form content, that is brought into or sent from the systems.

Next, focus on the volume-pattern for each data flow.  This is a critical piece.  Volume-patterns have a direct impact on throughput capabilities, affecting processor loading, disk activity, and communications throughput. The volume-patterns help identify potential constraints in the current or proposed system. It’s unusual for businesses with even modest data volumes to receive the data spread out evenly over a 24-hour period.  It is very typical to see bursts in activity at different times during the day.  For example, a company can receive 1,000 invoices a day.  At a rate of roughly 42 per hour over a 24-hour period, this doesn’t seem too disconcerting.  But let’s think about a typical business day in the US. What if most of the invoices are received near the end of the business day? That changes the maximum system load to perhaps 350 invoices an hour for 2 hours, with the remaining 300 invoices scattered across the other 22 hours of the day.  As you can see, just asking “How many documents per day?” is not sufficient to truly understand the requirements to process the information.
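The back-of-the-envelope arithmetic behind the invoice example above can be sketched in a few lines. The 700-invoice, 2-hour burst is the illustrative figure from the text:

```python
# Average vs. burst load for the 1,000-invoices-per-day example.
daily_invoices = 1000

average_per_hour = daily_invoices / 24
print(round(average_per_hour, 1))  # 41.7 -- looks comfortable

# But if ~700 invoices arrive in a 2-hour end-of-day burst:
burst_invoices, burst_hours = 700, 2
peak_per_hour = burst_invoices / burst_hours
print(peak_per_hour)  # 350.0 -- the rate the system must actually sustain

# Sizing to the daily average would undershoot the real peak requirement.
print(round(peak_per_hour / average_per_hour, 1))  # 8.4
```

The takeaway: a system sized for the daily average would need to handle roughly 8x that rate during the burst window.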

Once the data formats and flow patterns are defined, we can start looking at the business processes that control the flow and see how they are impacted by the data volumes.  A batch interface may surface that causes downstream systems to block, or wait, until it is finished.  An example of this is a batch Invoice generation process that takes 2 hours to run.  Invoices cannot be sent until they are generated, so our Outbound EDI invoicing process is “blocked” until the generation process is complete.  We can mitigate this bottleneck using database triggers to initiate generation of the EDI data as each invoice is written for a specific customer.  That brings some alignment to the Invoice generation process and allows a level of real-time processing.
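The difference between the two scheduling models can be illustrated with a small sketch. This is not an implementation of the trigger mechanism itself (that lives in the database); it only models the per-invoice send delay under each approach, using the 2-hour batch run from the example and an assumed even spread of per-invoice work:

```python
# Illustrative timings only: batch run = 2 hours for 1,000 invoices.
BATCH_RUN_HOURS = 2.0
INVOICES_PER_RUN = 1000

def batch_send_delay() -> float:
    """Batch model: every invoice waits for the full run to finish,
    regardless of when it was generated within the run."""
    return BATCH_RUN_HOURS

def triggered_send_delay() -> float:
    """Trigger model: each invoice is handed to EDI generation as it
    is written, so it waits only for its own share of the work."""
    return BATCH_RUN_HOURS / INVOICES_PER_RUN

print(batch_send_delay())      # 2.0 hours before the first invoice ships
print(triggered_send_delay())  # 0.002 hours (~7 seconds) per invoice
```

The total work is the same in both models; the trigger-driven approach simply spreads it out, so no single invoice sits behind the entire batch.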

An often-missed aspect of gathering Integration requirements is failing to look far enough upstream or downstream from the proposed integration to identify its impact. It is important to see how the proposed integration will fit into existing processes: Is the integration part of a long-running process that spans departmental boundaries?  How about company boundaries?  What if the communications method that transports the outbound data uses a single-threaded blocking model?

As you can see, you need a genuine understanding of the business needs, performance requirements, upstream/downstream processes, and a profile of the data volumes.