Five Zero-Code ways to Import Excel into SAS

Your data is in Excel and you need to import it into SAS.  You googled, and discovered 500 different methods.  Which to choose?  We compare and contrast 5 approaches to this perennial problem.  Before we get onto that though – why is it such a problem?

The crux of the issue is:

Flexibility vs Scalability

Excel, as you probably know, is incredibly flexible.  Data can spread in all directions, move around, be positioned anywhere, on any cell, of any worksheet in a workbook.  That workbook can have different names, exist in different locations, be of different types (xls, xlsx, xlsm).  That’s before we get down to whether the data arrives as values, formatted values, formulas, or other dragons .  And the fact that, as it is typically stored on a shared filesystem, it can be changed by anyone, at any time.

SAS is far less flexible in this regard.  Data is nearly always structured in a table, with fixed columns, of fixed data types, in a fixed library / location, with a fixed name (or naming convention).  That table usually lives on a server, perhaps in a database.  This rigidity is actually a Very. Good. Thing.  It provides consistency, which is the basis for scalability.  And the basis of the consistency is how the data is modelled.

The Data Model

Every table in SAS contains some kind of metadata about how the data is structured – the column names, types (character vs numeric), formats (dates, currency), lengths, encoding (UTF8 vs WLATIN1) and more.  The first question you need to ask yourself, when loading Excel data into SAS, is – do I take the model from Excel?  Or am I targeting an existing model in SAS?

Speaking of SAS.  The word “SAS” can mean so many things – do we mean the language?  The platform?  The company?  A specific product?  Let’s break this down, as the choice of tool will depend on the type of “SAS” you have.

Which Flavour of SAS do I have?

The world of SAS can be broken into 3 major platforms:

  • Base SAS.  Traditional SAS, typically installed on your desktop.
  • SAS Meta. An enterprise deployment with mid-tier and metadata server.
  • SAS Viya.  Cloud native, API driven microservices architecture.

The options available to you for importing Excel will vary depending on the flavour you are using.  How do you know which one you have?  Try running the following code in SAS:

filename mc url "https://raw.githubusercontent.com/sasjs/core/main/all.sas";
%inc mc;
%put %mf_getplatform();

The entry in the log will tell you if your environment is BASESAS, SASMETA or SASVIYA.

Importing Excel into SAS

Without further ado, let’s explore the options available!

1 – Import Excel with Data Controller for SAS

It’s super easy to import an arbitrary Excel file to an existing table using Data Controller for SAS.  You simply choose the table you’d like to modify, then drag your Excel file into the browser. 

Data Controller will scan every worksheet in your Excel file to find a range that matches the target table.  How does it do that?  The top of the range is identified by simply checking for a row that contains all columns as per the target table definition, whereas the bottom of the range is simply the first blank primary key value.

Once the range is found, Data Controller will perform a series of automatic checks and validations, and import the data and a copy of the workbook.  If all the checks pass, it’s one button click to load the data (and corresponding workbook) to the SAS server and notify the approver that a change request has been submitted.

Once the approver approves the change, the table is updated.  There is a full audit history and you can even retrieve the original excel file that was submitted.  The tool does NOT let you load new tables to SAS, nor does it let you modify the columns.  The model is therefore protected by IT, whilst the values are managed by the business

Data Controller is available for SAS Meta and SAS Viya.   A Base SAS version is in development.  Like what you see?  Don’t hesitate to book a demo session and meet the team!

2 – Import Excel using SAS Studio

Upload File to SAS Studio

This approach can be used whether you are using SAS Meta, SAS Viya, or even Base SAS (eg with University Edition).  Just open [YOURSERVER]/SASStudio in a browser (on Viya, it will be /SASStudioV).

The first step will be to get the Excel file to a location where it can be accessed by SAS.

On Viya, that will require opening the Explorer menu, expanding the server dropdown, and right clicking on the directory within. You can then choose the “Upload files” option to import your spreadsheet.  Your upload is limited to 100mb (default system setting), and the target directory is typically the unix home directory for your user.

The “home directory” part is a key point – as it means that other users will not necessarily be able to access that source file.  To easily load to other areas on the SAS filesystem you may need to ask your admin to create a symlink, or use an alternative upload mechanism (such as this one).

Import Excel to SAS Studio

Once your file is available on the server, you can begin the import process.  Simply click on the Start Page, “New Import” and follow the steps in the wizard.

The generated SAS code will be shown in the window below, and the output can be directed to either to WORK or a permanent library as desired.

You may need to rename the default target dataset (eg from IMPORT to IMPORT2) in order to run the code.

3 – How to Import Excel using the SAS Add-In for Microsoft Office

Unlike the other examples presented, this one allows you to load data from directly within your Excel workbook!  You must have the requisite permission to make data write-back to the target table selected.  You can also modify column properties and specify an ‘inactivity timeout’ before ‘edit mode’ is closed.

To use, simply open your desired table and click the “Begin Edit” button in the SAS Ribbon.

The SAS Add-In for Microsoft Office is available only for SAS Meta deployments. 

4 – Import Excel using the SAS Enterprise Guide Wizard

SAS Enterprise Guide

To import a spreadsheet using Enterprise Guide you can simply click “File” then “Import Data” and select your Excel file to proceed through the wizard.  An excellent guide to this process is available here.  This process will load an Excel table into your SAS project, where you can run further analyses.

This approach will work for both SAS Meta and Base SAS deployments, the key difference being that for SAS Meta your tables will be on the SAS Server as opposed to the local desktop.

5 – Import Excel using the SAS Data Integration Studio

This option is more for SAS 9 ETL developers building enterprise data flows from stable data sources.  Many ETL teams are forced to build flows Excel, despite it’s (deserved) reputation as an “unstable data source”.

One way to perform this task is to set up a library using the EXCEL engine, then register the tables within it.  This involves a number of steps, the screenshots for which are below!

Import Excel to SAS DI Studio

Comparison of Methods

With so many methods, how do you choose the one that is right for you?  This depends on the volume, velocity, variety, and purpose of the data you are loading.  If your Excel is large, has a static structure, and arrives directly from a source system on a regular basis in a fixed location, then you would probably want to build an automated flow using Data Integration Studio.  For ad-hoc data, prepared by technical analysts for departmental reporting then either Enterprise Guide, SAS Studio or the SAS Addin are potential choices.
 
For businesssourced data (such as model parameter sets, reference data, actuarial assumptions) that need to be updated in an IT-secured environment then Data Controller is an ideal choice.  Particularly given that it eliminates the need for a shared directory and reduces the risk of downstream batch incidents due to ‘validate on load’ features.
 
Data Controller works well as a zero-code option for Excel imports, in the following scenarios:
  • The extraction process must be dynamic, as data can sometimes have additional columns or differently named worksheets
  • You need to upload data rapidly and don’t have time for a fully automated ETL solution to be built & deployed
  • You must retain the original Excel, along with change metadata, for audit purposes
  • Your SAS Admin does not have capacity for ad-hoc data modification requests
  • You would like to separate the role of Data Submitter and Data Approver
  • Your data model needs protecting from accidental corruption
  • You need automatic Data Quality rules applied at source
Below is a further comparison of the different options:
Data Controller compared
 

If you’d like to discuss potential use cases for Data Controller, or to get a deep dive into any of it’s features, you can begin the process right now by requesting a demo session!

EUC Management Systems need these 12 Attributes

End User Computing (EUC) applications are unavoidable – the challenge is not to erase them, but to embrace automated approaches to EUC management that will identify, clean, secure, backup, and integrate EUC data with full auditability, ownership, and approval.

The Much-Maligned EUC

EUC applications such as Excel, Access Databases, and locally executed programs, are often targeted as the source of a myriad of risks – such as financial misstatements, internal fraud, incorrect models, and potential for business process disruption.  The rationale being that business developed / owned applications are not subject to the same access controls, development & testing standards, documentation and release management processes as can be found over the “IT Fence”.  Whilst this is probably true, the inherent flexibility of EUCs that can be quickly updated without service desk requests, project codes, or lost arms & legs – means that EUCs are, regardless, here to stay.

The challenge is to find a way to shine a light onto this “Shadow IT”, and provide a framework by which EUC data can be extracted in a simple, safe, secure, scalable, and auditable fashion.

EUC Use Case Diagram

EUCs can be Controlled

The ‘war on EUCs’ cannot be won – it simply isn’t practical to ban them, or to migrate / redevelop every closely held and highly complex legacy VBA application. Until alternative solutions for Citizen Developers to build Enterprise Apps (such as SASjs) become mainstream, simple measures / controls on the EUCs themselves must be implemented – such as version control, readonly attributes, embedded documentation, peer review etc.

In the meantime, a management system for EUCs is the ideal place for capturing the requisite metadata needed to monitor, audit, and secure the data therein. Such a management system should have, as a minimum, the following attributes:

EUC Data Quality at Source

The ability to run data quality routines at the point of data upload (from EUC to secure IT environment) provides instant feedback to EUC operators that will allow them to make corrections and avoid costly post-upload investigations, re-runs, or worse – incorrect results. As part of this process, it should be easy to create and update those Data Quality rules. A longer discussion of Data Quality can be found here.

EUC Data Review (4 eyes)

After EUC data is submitted, it should be reviewed before the target database is updated.  It should be possible (but not mandatory) for this check to be performed by a different individual. When performing that check, it should only be necessary to review new / changed / deleted records. For changed records, the reviewer should also be able to see the original values. If the data is approved, the target table is updated. If rejected, the staged data can simply be archived.

Roles & Responsibilities (RACI)

By capturing who is actually submitting the data, we can see who is responsible for each EUC. By reviewing who is signing off on that data, we have an indication of who is accountable. And by seeing who is being notified of changes to that data, we can deduce who are being consulted / informed. It will then be unnecessary to conduct time-consuming interviews or audits to produce instantly out of date and error-prone EUC ownership documentation!

EUC Data Security

EUCs are often present on network shares, with opaque access policies and few (if any) controls to prevent unintentional deletion or corruption of data. An EUC management system should ensure data protection from the point of EUC integration right through to the loading of the data to the target table(s). End users should not require write access to the target databases! Neither should individuals in IT be regularly relied upon to run manual scripts for loading business critical data.  Finally, it should be possible to restrict (at least to table level) which groups are given permission to edit or approve data.

Ease of Use

Adding new tables / EUCs to the system should be a BAU (configuration) task, and possible without needing to secure IT development resource.  The process should be so well defined, that new EUC operators can safely integrate their processes with minimum (if any) engagement from IT.

EUC Traceability

Understanding the flow of data into regulatory reports is essential for ensuring the accuracy of the figures they contain. Whilst this can be done automatically in some IT systems (eg SAS Metadata or Prophet Diagram View) the lineage breaks down when data flow crosses system borders. An EUC management system therefore should keep a full history to enable traceback of data items, right back to a copy of the EUC from where the data arrived.

EUC Data Integration

Any “system” worth it’s salt will enable easy integration and flexible workflows to ensure that subsequent processes can be triggered on relevant events (such as EUC submission, or data approval). There should be no manual steps other than the act of submitting the data, and reviewing / approving the data.

Version control / automated testing

This should really go without saying, however the reality is that there are still many teams (yes, even in IT) who work without source control. Don’t even think about building a complex data management system without solid source control and a comprehensive test harness. Not to mention automated build and deployment.  When it comes to a system that is responsible for maintenance of business data, it is imperative that it is robust, performant, and filled with checks and controls.

Documentation

Whilst a decent system should be intuitive enough to operate without a manual, when it comes to maintaining, extending, or using advanced features – documentation is essential, and should be updated regularly. New feature? Write the test, make the fix, build & deploy, pass the test, update the documentation, release.  Documentation should be useful for users, developers, and administrators – with diagrams, screenshots, and process flows.

Scalability

During month end, temperatures are high and the pressure is on. The last thing you need on BD2 is system failure, especially when it’s 4:30 on a Friday and 150 users are affected. Be sure your platform of choice is proven, supported, and highly available.

EUC Auditability

One of the biggest business benefits of an EUC Management System is the ability to trace data directly back to a locked down copy of the EUC that it came from.  The system should therefore make it easy to identify and locate that copy, to see who submitted it, who signed it off, and what the precise changes were (adds, updates, deletes).

Data Controller for SAS

Before you go ahead and build / maintain your own ‘black box’ bespoke EUC reporting solution, take a look at what the Data Controller has to offer (in addition to everything described above):

  • A proven methodology for EUC capture
  • A smooth and performant review and approve experience
  • Ability to run bespoke SAS programs before / after every edit or approve
  • Secured by SAS
  • Extensive documentation
  • Roadmap (version restore, data access reports, data profiling)
  • Flexible licence (one-time fee / optional support agreement)
  • Easy / simple deployment (entirely within your existing SAS platform)

We can also provide an on-site consultant to perform the deployment and user training.  Get in touch to learn more!