Your Alteryx Server is the heart of your organisation's analytic operation. It enables an explosion of value, transforming proof-of-concept work into production applications.
The challenge is that this explosion of value can come at the cost of an eruption of analytic chaos, with the value buried deep among the test, draft, and half-evolved workflows published to your Alteryx Server.
While Alteryx has provided some best practices on deploying your Alteryx Server hardware, I haven’t found any guidance on managing your server’s content.
I want to collate my thoughts and integrate the community feedback on building a great Alteryx server environment.
This post will act as the content synopsis for (future) posts with greater detail on implementing the practices.
Managing users and groups
The first decision when deploying an Alteryx Server is how to onboard users. There are four authentication methods, each with its own set of advantages.
The four methods are:
- Local built-in authentication
- Integrated Windows Authentication (IWA)
- Integrated Windows Authentication with Kerberos
- SAML Authentication
Alteryx recommends either of the IWA options as they leverage the file-system permissions, support run-as-user permissions, and allow Active Directory groups to be imported. Unfortunately, IWA only provides single sign-on across the systems that Windows Active Directory can communicate with. Most Software as a Service (SaaS) applications expect the simplicity of SAML, and many organisations are moving to AD systems like Azure Active Directory for their authentication.
Within the authentication system, you can set users' roles when they first log on to the server. These roles define whether a user can publish content or only interact with content created by others. Default roles can be assigned based on Active Directory groups, assigned manually, or left at the server-wide default applied to all users.
Once your authentication process has been established, you need a mechanism for assigning and managing permissions throughout your server. This is where groups are used. Groups allow the server admins and content owners to manage who can access the assets published to the server.
Sharing content around Alteryx Server
Collaborating with colleagues and customers is one of the major benefits of having an Alteryx Server. Making that collaboration possible requires the effective use of collections and groups.
Collections are the primary sharing mechanism for your Alteryx server. For a new Alteryx Server, this is the only way you should share assets (workflows, apps, schedules and the associated results) with other users.
Inside a collection, you can add individual users or Alteryx groups to grant your users access. Unfortunately, you can't add an AD group to a collection directly, even if you are using IWA. The workaround is to create an Alteryx group (potentially with the same name as the AD group of interest) and populate that group from the Active Directory group. I will detail this workaround in a future post.
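As a sketch of how this workaround could be scripted rather than clicked through, the snippet below builds the request bodies an automation client would send to create the custom group and attach the AD group to it. The endpoint paths and field names here are assumptions for illustration only, not the documented API contract; check the Swagger page on your own server for the exact schema.

```python
# Sketch: automating the "Alteryx group backed by an AD group" workaround.
# Endpoint paths and field names are ASSUMPTIONS for illustration only --
# verify them against your server's Swagger documentation before use.

def build_group_payload(name: str, role: str = "Default") -> dict:
    """Request body for creating a custom Alteryx group (assumed schema)."""
    return {"name": name, "role": role}

def build_ad_member_payload(ad_group_sid: str) -> dict:
    """Request body for adding an AD group to the custom group (assumed schema)."""
    return {"securityId": ad_group_sid}

def plan_requests(base_url: str, group_name: str, ad_group_sid: str) -> list:
    """Return the (method, url, body) calls an HTTP client would then make."""
    return [
        ("POST", f"{base_url}/v3/usergroups", build_group_payload(group_name)),
        # {id} would be filled in from the first response; left as a placeholder.
        ("POST", f"{base_url}/v3/usergroups/{{id}}/activedirectorygroups",
         build_ad_member_payload(ad_group_sid)),
    ]

if __name__ == "__main__":
    for method, url, body in plan_requests(
            "https://server/webapi", "Finance-Analysts", "S-1-5-21-0"):
        print(method, url, body)
```

Keeping the payload-building separate from the HTTP calls makes the plan easy to log and review before anything touches the server.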
Automating Alteryx Server processes
Building a sustainable Alteryx Server requires automating as many processes as possible. That ranges from scheduling regular workflow runs (and identifying ad hoc workflows that run often enough to deserve a schedule) to automating the administrative processes: onboarding new users, managing collection access, scheduling permissions, API access, and more.
These systems can be automated by combining the built-in scheduler with the Alteryx Server APIs. A full process will be described in a future post, but consider the following example:
A new user joins your Alteryx Server using the SAML authentication option. The user needs to be assigned to their team's group (using the group API endpoint). They can request that workflows be scheduled, but do not have permission to enable the schedule themselves; a helpdesk ticket can trigger the schedule creation once authorisation is received. They also have a complex workflow pipeline with conditional logic, implemented with Apache Airflow using the workflow schedule API endpoint.
This is just a small sample of the automation processes that could be employed, but the important point is that none of these processes should require human intervention aside from authorising any particular request. Once any requests have been accepted, an automatic system (e.g. a scheduled Alteryx workflow or custom PowerShell script) should implement the request without further intervention.
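As a minimal sketch of that hands-off pattern, the function below maps an approved helpdesk request to the single Server API call an automation script would execute. The request-ticket shape, the endpoint templates, and the payloads are all assumptions for illustration, not the documented Alteryx API.

```python
# Sketch: dispatch authorised requests to Alteryx Server API calls without
# further human intervention. Endpoints and payload shapes are illustrative
# assumptions -- substitute your server's actual API contract.

APPROVED_ACTIONS = {
    # request type -> (HTTP method, assumed endpoint template)
    "add_to_group":    ("POST", "/v3/usergroups/{group_id}/users"),
    "enable_schedule": ("PUT",  "/v3/schedules/{schedule_id}"),
}

def dispatch(request: dict) -> tuple:
    """Translate one authorised ticket into the API call to execute.

    `request` is an assumed ticket shape, e.g.
    {"type": "add_to_group", "group_id": "abc", "payload": {"userId": "u1"}}
    """
    action = request["type"]
    if action not in APPROVED_ACTIONS:
        raise ValueError(f"Unrecognised request type: {action}")
    method, template = APPROVED_ACTIONS[action]
    url = template.format(**request)   # fills e.g. {group_id} from the ticket
    return method, url, request.get("payload", {})

if __name__ == "__main__":
    ticket = {"type": "add_to_group", "group_id": "abc123",
              "payload": {"userId": "u-42"}}
    print(dispatch(ticket))
```

The whitelist of action types is the important design choice: the automation can only ever perform the operations a human has pre-approved, which keeps the "authorise, then hands off" boundary explicit.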
Managing the workflow lifecycle
As your Alteryx Server usage grows and demands on server processing time increase, maintaining workflow quality and structuring workflow improvements become increasingly important. The only way to achieve this is to deploy workflows to your production server systematically, leveraging metadata standards, testing systems, and workflow promotion acceptance. In short, this requires a Software Development Life Cycle (SDLC) process.
Deploying this system needs an Alteryx sandbox server: a separate server environment where workflow testing can occur outside production. The sandbox allows you to test workflows in a controlled environment, ensuring connections work, file permissions are correct, and the deployment options are valid. Once this validation is complete, you can promote the workflow to the production environment.
Part of the SDLC also includes ensuring your workflows meet your organisation's standards. This could include checking that the workflow description has been populated, or that a "DocString" comment box provides additional information and context for workflow reviewers.
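A standards check like that can be automated in the promotion pipeline. Alteryx workflow files (.yxmd) are XML, so a sketch of the "DocString" check might look like the following; the element and attribute names assumed here (a `GuiSettings` node with a `TextBox` plugin, text under `Properties/Configuration/Text`) should be verified against workflows exported from your own environment.

```python
# Sketch: pre-promotion check that a workflow (.yxmd, an XML document)
# contains a "DocString" comment box. The element/attribute names are
# assumptions -- confirm them against a real workflow file.
import xml.etree.ElementTree as ET

def has_docstring(yxmd_xml: str) -> bool:
    """True if any comment (TextBox) tool's text starts with 'DocString'."""
    root = ET.fromstring(yxmd_xml)
    for node in root.iter("Node"):
        gui = node.find("GuiSettings")
        if gui is None or "TextBox" not in gui.get("Plugin", ""):
            continue  # not a comment box tool
        text = node.findtext("Properties/Configuration/Text", default="")
        if text.strip().startswith("DocString"):
            return True
    return False

if __name__ == "__main__":
    sample = """
    <AlteryxDocument yxmdVer="2021.4">
      <Nodes>
        <Node ToolID="1">
          <GuiSettings Plugin="AlteryxGuiToolkit.TextBox.TextBox"/>
          <Properties><Configuration>
            <Text>DocString: loads sales data and publishes the weekly extract.</Text>
          </Configuration></Properties>
        </Node>
      </Nodes>
    </AlteryxDocument>"""
    print(has_docstring(sample))
```

A check like this can gate the sandbox-to-production promotion step: workflows that fail it are bounced back to the author instead of reaching the production server.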
Monitoring your Alteryx Server
The final Alteryx Server best practice is monitoring your server environment. This covers two areas: server activity monitoring and hardware system monitoring.
Server activity monitoring requires extracting the workflow run information (e.g. how often a workflow runs, who runs it, and how long it takes) and the user activity information from the MongoDB instance that supports the Alteryx Server. This extract can then feed an analytic database, like Snowflake or Azure Synapse, for ongoing monitoring in the visualisation application of your choice. It is also a good place to add exception monitoring for key workflows, ensuring workflow runs don't generate errors or warnings, and to watch workflow runtimes for anomalies.
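As a minimal sketch of the runtime-anomaly piece, the function below flags a run whose duration deviates from the workflow's recent history by more than a few standard deviations. The input shape is assumed to come from whatever run-history extract you build, and the 3-sigma threshold is an arbitrary starting point.

```python
# Sketch: flag anomalous workflow runtimes from an extracted run history.
# The data shape and the 3-sigma threshold are illustrative assumptions.
from statistics import mean, stdev

def is_anomalous(history_secs: list, latest_secs: float,
                 sigmas: float = 3.0) -> bool:
    """True if the latest runtime deviates from history by > `sigmas` std devs."""
    if len(history_secs) < 5:          # too little history to judge
        return False
    mu, sd = mean(history_secs), stdev(history_secs)
    if sd == 0:
        return latest_secs != mu       # identical history: any change is odd
    return abs(latest_secs - mu) > sigmas * sd

if __name__ == "__main__":
    history = [61.0, 59.5, 62.2, 60.1, 58.8, 61.4]   # typical ~1-minute runs
    print(is_anomalous(history, 60.7))    # a normal run
    print(is_anomalous(history, 600.0))   # a ten-minute run
```

In practice you would run this per workflow over the MongoDB extract and raise an alert (email, ticket, chat message) for any flagged run.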
Hardware monitoring needs metrics to be collected by a separate agent. This agent collects the server hardware operating metrics, including (but not limited to) CPU usage, RAM usage, page file size, and network activity. Monitoring these metrics can give an early indication of server capacity issues or changes in operational patterns, letting your server administrator flag potential issues before they become problems. It also allows workflow schedules to be tuned for more optimal use of your server hardware.
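To illustrate the early-warning idea, here is a minimal sketch that evaluates one sample of agent-collected metrics against warning thresholds. The metric names and threshold values are assumptions you would tune to your own environment and agent.

```python
# Sketch: evaluate one sample of agent-collected hardware metrics against
# warning thresholds. Metric names and limits are illustrative assumptions.

THRESHOLDS = {          # metric -> warn when the value (in %) exceeds this
    "cpu_pct": 85.0,
    "ram_pct": 90.0,
    "pagefile_pct": 75.0,
}

def warnings_for(sample: dict) -> list:
    """Return a warning line for each metric breaching its threshold."""
    return [
        f"{metric} at {sample[metric]:.0f}% exceeds {limit:.0f}% threshold"
        for metric, limit in THRESHOLDS.items()
        if sample.get(metric, 0.0) > limit
    ]

if __name__ == "__main__":
    sample = {"cpu_pct": 97.0, "ram_pct": 40.0, "pagefile_pct": 10.0}
    for line in warnings_for(sample):
        print(line)
```

Single-sample thresholds are a starting point; comparing each sample against the same hour last week catches the "change in operational patterns" case as well.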
A well-managed Alteryx Server can transform proof-of-concept work into production applications and deliver significant value to your organisation. While Alteryx provides best practices for deploying the server hardware, managing the content on the server is equally important, and guidance there is scarce. Authentication methods, group management, content sharing, workflow automation, and workflow lifecycle management are all essential to a great Alteryx Server environment. A systematic Software Development Life Cycle process for deploying workflows is necessary, and monitoring both server activity and the hardware is critical. Share your thoughts on the best practices for building a great Alteryx Server environment by joining the discussion on LinkedIn or in the Alteryx Community.