Throughout integration data pipelines, small amounts of data are stored for re-use in computer memory.  These are typically called variables – in Jitterbit, also known as Data Elements.  This paper discusses variable concepts with the Jitterbit integration platform.

Variables have a type and a scope.  Because variables appear throughout data integration designs, it’s important to name them with a convention that makes their purpose and use easier to understand.  Note that Jitterbit variable names are case-insensitive.  With a naming convention for variable type or purpose, the Formula Builder’s filtering and alphabetic sorting features let you find and manage your relevant variable names “by type” more quickly.

We begin by discussing some critical variable types.

Data Types

A variable value typically has a data type.  These include integers, strings, etc., but can also include more complex data types like time, arrays and dictionaries (key-value pairs).

Date/Time

Date variables hold date values that are convertible into various formats.  They typically hold start times and end times and are used to calculate durations.  I like to use the colon (:) within the name to identify the value as a date/time.  For example, $:name, where name is some indication of the variable’s purpose, like $:startTime.

CSV (list)

Comma-separated values (CSVs) are a common string format used in files and database queries.  They are typically assigned values rather than returned from functions (but Scripts can return CSVs!).  In many cases, they are grown incrementally within a While() loop.  I like to begin all CSV-type variable names with L or L_ (for List), for any variable with even the potential of holding 2 or more items in the list.  For example, I might use a CSV list name of Lwherein for a parameterized SQL query.

Array

Arrays are like lists but can have 2 or more dimensions.  Jitterbit functions DBLookupAll() and SfLookupAll() return 2-dimensional arrays.  I like to begin all Array variable names with A or A_.  Conversion from an Array type to a CSV (string) is typically done within a While() loop by using an Array index to access an individual Array cell.  For example,

Lname=Null(); // start with an empty list
i=0;While(i<Length(Avalues),
  If(IsNull(Lname),
    Lname=Avalues[i++]
  ,
    Lname+=","+Avalues[i++]
  )
)

Note the use of the name – Avalues versus Avalue.  This is so one can write “value=Avalues[i]” and some clarity is achieved.

Conversion from a CSV to an Array is as simple as Avalues=Split(Lname,",").

There is a Jitterbit function called ReadArrayString() intended to construct n-dimensional Arrays from a string, but it is rarely used because it is not reliable today for arrays of 2 or more dimensions.

Finally, if using DBLookupAll() or SfLookupAll(), take advantage of the column/field name used in the query to access the returned array’s 2nd dimension.  For example, companyPhone = ACompanies[i]["phone"] where "phone" is the name of a column/field queried from a database or Salesforce.
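
The two ideas above (a CSV list feeding a query, and named access to the returned array’s 2nd dimension) can be combined roughly as follows.  This is only a sketch: the database source name, table, columns and id values are hypothetical, and the source path format is an assumption about how the project is laid out.

```
// Hypothetical id list and database source; adjust to your own project.
Lwherein = "101,102,103";
ACompanies = DBLookupAll("<TAG>Sources/CompanyDB</TAG>",
  "SELECT name, phone FROM companies WHERE id IN (" + Lwherein + ")");

// Build a CSV of phone numbers from the 2-dimensional result.
LPhones = Null();
i = 0;
While(i < Length(ACompanies),
  companyPhone = ACompanies[i]["phone"]; // access the column by its queried name
  If(IsNull(LPhones),
    LPhones = companyPhone
  ,
    LPhones += "," + companyPhone
  );
  i++
);
```

Here the L and A prefixes make it easy to see which names hold a CSV string and which hold an Array.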

Arrays are nice when the order is relevant or multiple dimensions are needed.  But Arrays have no easy way to determine whether a value already exists in the Array (that is, they are not readily searchable).  Have a look at the Script below.  With an additional “helper” array, you can use the FindValue() function to search arrays.

search4="c";
Set(Arr,"a",-1); Set(Arr,"b",-1); Set(Arr,"c",-1); // Create array Arr: {a,b,c}
Ahlpr=Array(); i=0;While(i<Length(Arr), Set(Ahlpr,i++,-1)); // Create the helper array Ahlpr: {0,1,2}
//FindValue("c",Arr,Ahlpr); // Find "c" in Arr; returns index of "c" in array Arr
If(IsNull(idx=FindValue(search4,Arr,Ahlpr)), // If search4 not found
  Set(Arr,search4,-1); Set(Ahlpr,Length(Arr)-1,-1) // add search4 to Arr
, //else
  idx // return its index
);

Note, the “helper” array above can hold unique values (not necessarily consecutive and numeric) so you can actually “map” values using just these two arrays.

Dictionary

Dictionaries hold key-value pairs.  That is, given a key, the value can be quickly read from or written to the dictionary.  I use a more interesting naming convention for Dictionaries – “valuesByKey” where values is a root name of the Dictionary and Key is a descriptive name of the Dictionary’s key.  When I see “By” in a variable name, I know it’s a Dictionary.  Again, note the use of plural values vs value so one can read value=valuesByKey["akey"].

Dictionaries can contain complex types like Arrays or CSVs (or another Dictionary).  If so, I will use LvaluesByKey if I’m using a Dictionary of CSVs or AvaluesByKey for a Dictionary of Arrays.  I’ll leave it to you to define a naming convention for a Dictionary of dictionaries.
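
As a minimal sketch of the “valuesByKey” convention, the Script below builds and reads a small Dictionary using Dict(), AddToDict() and HasKey().  The company names and phone numbers are made up for illustration.

```
// Dictionary of phone numbers keyed by company name (hypothetical data).
phonesByCompany = Dict();
AddToDict(phonesByCompany, "Acme", "555-0100");
AddToDict(phonesByCompany, "Globex", "555-0199");

// Check for the key before reading so a missing company gets a default.
company = "Acme";
If(HasKey(phonesByCompany, company),
  phone = phonesByCompany[company]
,
  phone = "UNAVAILABLE"
);
phone // the Script returns the phone number found, or the default
```

The plural root (phones) plus “By” plus the key name (Company) lets the lookup line read almost like a sentence.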

Variable Scope

Variable Scope is a term used to identify the “lifetime” of a variable.  There are two formal variable scopes within Jitterbit – local and global.  Remember, all Jitterbit variable names are case-insensitive.

Local

Local variables exist (their lifetime) only within a Jitterbit Script or Transformation field map.  They are visible only within that Script or Transformation field map and should be used, if only to minimize global variables (a best practice!).  They are often distinguished from global variables by a name that does not begin with a dollar-sign ($), although one can also read/write global variables using Get()/Set() without using a dollar-sign.  More on global variables below.

I like to use a naming convention for Script parameters.  By default, Script parameters are known by _1, _2, etc. (Today, Scripts accept only strings as parameters and can only return one string as a result.)  I like to keep the leading underscore and use ArgumentList() to reassign to more useful names.  For example, if the Script parameters are _1 (a name string) and _2 (an age integer), I will quickly reassign –at the beginning of the Script– using ArgumentList(_name, _age) so I know these are Script input parameters and their purpose.

Many times, I like to send “optional” parameters to Scripts.  In this case, rather than using ArgumentList(), I will reassign each with a default using IfEmpty().  For example,

_name=IfEmpty(_1,"UNAVAILABLE");
_age=IfEmpty(_2,0);

Global

Ideally, you want to minimize your use of Global variables as you would in any programming environment.  One could argue that data integration is unique in that there are good reasons for using Global variables.  In many cases you have no choice, and in others it is a best practice.

One very real value of Global variables is seen when provided to Component field values.  In most cases (there are exceptions, so confirmation is necessary), one can provide a Global variable in the following format for various field values.

[gvarName{defaultVal}] where gvarName is the global variable name (without the $ sign) and defaultVal is an optional default value within the braces ({}).  I believe the dollar-sign is ignored in this context, but I’ve been known to keep the dollar-sign in Salesforce SOQL statements, as the variables will then appear in the Formula Builder with the leading $.  I’m not sure this is an intentional feature, but it does call out this unique global variable use in the Formula Builder Data Elements tab.

For example, along with an Email Text, the Email subject line could vary depending upon the message content.  One can use the above format for Email’s To, From, Subject and Text fields.
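
As a rough illustration of the format above, the Email fields might be filled in like this.  The variable names and default values are hypothetical, and the field labels are shown only to indicate where each value would go.

```
To:      [emailTo{ops@example.com}]
From:    [emailFrom{noreply@example.com}]
Subject: [emailSubject{Integration notification}]
Text:    [emailText{No details were provided.}]
```

If a Script sets emailSubject before the Email is sent, that value is used; otherwise the default within the braces applies.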

Remember, Global variables exist for the lifetime of an Operation and values are available to read or change in any chained or called Operations.

The above is an example of using Global variables for Project Configuration.  This and other contexts and conventions are discussed below.  The suggested naming conventions make global variable management far more convenient.  Below are some common use cases for Global variables and some suggested name management in Jitterbit.  (If you’re familiar with Evernote tag naming conventions, this is very similar.)

Metric

Metrics are performance indicators.  Two common metrics are the processing duration (timer) and the number of records processed.  Since these are metrics I always like to collect, I use scripts to manage specific variables to read and write these values as needed.  The Set() and Get() functions allow me to declare global variables “names” at run-time.

For example, a timer Script might be

_timer_name = IfEmpty(_1,"DEFAULT_TIMER_NAME");
_reset = Bool(IfEmpty(_2, false)); //optional
_seconds = Bool(IfEmpty(_3, false)); // optional, return seconds
time = Get(_timer_name);
If(IsNull(time) || Length(time)==0 || _reset, //initialize timer
  Set(_timer_name,Now()) //return Now()
,
  If(_seconds,
    Now()-time //return duration in seconds
  ,
    FormatDate(Now()-time,"HH:MM:SS") //or HH:MM:SS
  )
);

This way, I can initialize my timer with RunScript(timer,"exampleTimer"), reset my timer, or return the duration since initialization in seconds or HH:MM:SS format.  The _seconds parameter is useful if you want to provide an average time per record.  For that, you also need a count value.
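
A sketch of how the timer Script above might be called to get an average time per record.  It assumes the Script is saved under the hypothetical path Scripts/timer, and the record count is a stand-in value.

```
// Reset (or start) the timer before processing begins.
RunScript("<TAG>Scripts/timer</TAG>", "exampleTimer", true);

// ... process records here ...
recordCount = 250; // hypothetical count of records processed

// Ask for the elapsed time in seconds (4th argument true).
elapsed = RunScript("<TAG>Scripts/timer</TAG>", "exampleTimer", false, true);

// Scripts return a string, so convert before dividing.
avgSecondsPerRecord = If(recordCount > 0, Double(elapsed) / recordCount, 0);
```

Passing a different timer name as the first parameter gives you an independent timer, since the name is resolved to a global variable at run-time with Get()/Set().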

Another Script could be developed for a counter or you might try a naming convention of $#counter so you always know this is an integer value to increment or read the current count value.  It’s a best practice to initialize your (thread-safe) counters with InitCounter().

Status

Status values are used for various reporting cases or “whether to proceed”.  I like to use $?status for my status variables and it typically contains a boolean or some small integer value with a specific meaning.

Sometimes a “simple” status is not enough and you might want to augment the status with additional information (perhaps a message).  In cases like this, I use a $[0-9]MsgName naming convention.  The use of an integer value at the front of the global variable name tells me it is a message string (and perhaps a message severity).   I might also vary the integer in the name if I “bubble-up” messages to log child messages into a “parent Operation” log.

Transformations, Operations, and Functions

One case where you have to use a Global variable is within a Transformation if you want the value from another, previously transformed (but same Transformation) field value.  SetInstances() / GetInstance() are good examples.  In these cases, I like to use $@tranVarName for naming these types of variables.  This way, I will know these variables are only used in Transformations directly or indirectly if Scripts are used.  It’s best to avoid using this naming convention outside of transformations and you should reassign to another global if the value is needed elsewhere.

Another case where a global variable is required appears when attempting to re-use one or more utility Operations and you want to parameterize and return information.  In this case, I’ll use $_OpIname as input parameters (similar to Script parameters), $%OpOname for Operation output, and $`OpMsgName for any returned messaging.  By using this naming convention for Utility/Parameterized Operations, I know these are specific to Operation I/O and won’t confuse them with $?status and $0MsgName named variables.

Finally, there are Jitterbit Functions that require global variables to return values.  One example is RegExMatch().  These globals may never be used elsewhere so I’ll use $.funcOname to indicate these are really temporary (but required) global variables.
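
A small sketch of the $.funcOname convention with RegExMatch(), which writes each marked sub-expression to the global variable named by the corresponding string argument.  The input value and variable names are hypothetical, and this assumes dot-prefixed names behave as described above.

```
// Hypothetical input: an order reference such as "PO-12345".
orderRef = "PO-12345";

// The two marked sub-expressions are stored in the temporary globals
// $.funcOprefix and $.funcOnumber (names passed without the $).
matchCount = RegExMatch(orderRef, "([A-Z]+)-([0-9]+)", ".funcOprefix", ".funcOnumber");

// Copy the value out of the temporary global right away.
If(matchCount > 0,
  orderNumber = Get(".funcOnumber")
);
```

Copying the result into a local variable immediately keeps the temporary globals from being relied on elsewhere.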

Configuration

Variables are an excellent place to record and use configuration details of your data pipelines.  A lot of customization goes into each pipeline so being able to “configure” your pipelines adds more value to each.

This is the motivation behind the Jitterbit Project Variable.

Project

Project Variables –a dedicated Jitterbit component– are a good place for recording your project’s “configuration”.  (Project variables appear bolded within Scripts.)  It’s unfortunate but should be noted: Project variable values are not available when using test modes like “Test Connection”, “Test the transformation”, “Test an operation that uses the transformation” or “Test this web service call”.  Here you might want to use [ProjectVar{defaultVal}] syntax so your defaultVal is used for the test mode.

That said, hierarchical naming conventions using a “dot” convention can work well here for the configuration variable list organization.  (Don’t forget to use Component folders as well.) For example, with the Email Message component, one could use Project variables like the following.  I like to use uppercase within Project variable names (mostly out of habit) as I sometimes use the “dot” convention in other global variable names too.

MAIL.smtp
MAIL.to
MAIL.from
MAIL.auth.account
MAIL.auth.password
MAIL.subject
MAIL.text

Finally, Project Variables can be viewed and edited from within Jitterbit’s Web Management Console.  This is advantageous for managing these “configuration” variables without using Jitterbit Studio.

Agent (environmental)

Migration from one environment to another will likely require some “re-configuration”.  One simple example is the need to reassign FTP or File Share paths (or Servers) depending upon whether you are in a Development environment or a Production environment.  When you “migrate” to your Production environment, you want to minimize, if not eliminate, manual modification because of the risk of breaking something.  This is where Agent Variables can be useful.

Each Jitterbit Agent has a configuration file called “jitterbit.conf”.  Within this configuration file, you can define Global variables available to all Projects running on that Agent.  Within jitterbit.conf, see the section titled “[PredefinedServerGlobalDataElement]”.

The Agent will need a restart, but you can define Global variables here for use with all your Projects running on this Agent.  Since this is a “file”, you’ll want to be careful with your use of the backslash character (\) as this could be interpreted as an escape character.  In the case of Windows File Share paths, for example, don’t use the backslash character in the last position as this will escape the trailing single-quote (').

As for a naming convention, I like to lead with a tilde (~) character and use hyphens within the variable name, for two reasons.  First, the tilde identifies the variable as an Agent variable and, because of alphanumeric sorting, it will appear just above the Project Variables in the Formula Builder (see screenshot below).  Second, because of the hyphens, one cannot simply display the variable’s value as with other naming conventions.  For this reason, you can put “credentials” in jitterbit.conf, as these typically change between environments.  I don’t suggest this approach for credentials in Production environments, though, as the values are not really secure and can still be acquired in less-orthodox ways.

Summary

Although minimizing global variables (aka Data Elements) is a best practice, there are situations where this is unavoidable.  By using a variable naming convention, you can manage your local and global variables depending upon their data type and/or purpose.  As seen, there are four interesting types of global variables – Agent, Project, Default and a required global for variable re-use within various Jitterbit components or functions.  By using a naming convention for various variable types or purpose, the alphabetic sorting and filtering when examining Jitterbit Data Elements in the Formula Builder will allow you to quickly find and manage your relevant variable names by their type/purpose.  I’ve offered some Local and Global variable naming conventions that can help with understanding the variable type and/or purpose, but you don’t have to follow or use these specific conventions.  Here are the few suggestions above summarized in tables.

Scope \ Type   CSV        Array      Dictionary      DateTime
Local          Lvalues    Avalues    valuesByKey
Global         $Lvalues   $Avalues   $valuesByKey    $:timeStamp

Some contexts where suggested variables are used by purpose.

Purpose \ Context Script Transform Map Indirect
Operation I/O
Status $?status $?status $?status
Counter $#counter $#counter
Input _name $_OpIname
Output $%OpOname
Message $`OpMsgName $`OpMsgName $`OpMsgName
  $[0-9]MsgName $[0-9]MsgName $[0-9]MsgName
Function Out Param $.funcOname $.funcOname
Transform Scope $@tranVarName

Finally, the Configuration variable examples.

Configuration \ Example
Project MAIL.to (replace . with _ if the variable is used within JS)
Agent ~FTP-path

On the right is a screenshot of the above Global variables within the Data Elements tab of Jitterbit’s Formula Builder.  Notice the alphanumeric sort order.  You can use the Filter field to quickly find or identify your relevant variable purpose.  To see where and how they are used, right-click on each and select View References.

You can define your own naming convention, but I encourage using a convention for Transformation-only globals (where I used the @-sign).  The goal is to organize the values listed in Formula Builder’s Data Elements list for easier selection as your Global variable list grows.
