ML Model Workflow Management
Introducing ML model setup in BMC Edge as a technical enhancement and a strategic enabler that keeps BMC a leader in the Operations Management industry.
Goal
Design an intuitive workflow for technical (and less technical) users to configure, train, and deploy ML models within the IoT Edge space.
Challenge
Should we re-think the entire experience or make incremental improvements? What will give us the biggest user-value return on our resource investment?
My Role
Lead designer, Strategy alignment, End-to-end UX & interaction design, Design system contribution
Team
8+ teams: UX Design, Product, Research, Design Systems, Data Science, MLOps, IoT Edge & Branch monitoring, Engineering
Lesson
Even highly technical users benefit from layered information design, where complexity is revealed only when needed, reducing cognitive load.
Results & impact
60% Faster
ML model setup
55% Adoption
Across Edge customers
~20 Minutes
Anomaly model setup
Mapped the detailed process & identified missing features
After customer interviews, I mapped the detailed process, including all necessary features for each step. This informed the design requirements and helped the cross-functional team plan development more effectively.

Background
The BMC Edge ML (Machine Learning) Workflow is important because it supports intelligent, real-time decision-making at the edge of enterprise IT environments, especially in complex, distributed, and hybrid infrastructure landscapes.
Why it matters
Introducing ML workflows in BMC Edge is not just a technical enhancement; it's a strategic enabler that keeps BMC a leader in the Operations Management industry.
Identifying the right problem to solve
Step 1: Create new ML model
Process
I broke the design process into 4 steps and started with the low-hanging-fruit item that had the fewest technical dependencies. I leveraged existing design patterns for step 1 to stay aligned with the release deadline.
Navigating resource constraints is a key UX skill. With limited time and budget restricting direct user outreach, I strategically leveraged stakeholders who were also the actual target users. I facilitated workshops to co-create solutions that balanced user needs, business goals, and technical feasibility.
Workflow sequence for creating a new model (illustrated in the sketch after the list)
1. Name
2. Description
3. Task type
4. Algorithm type (select from existing)
5. Algorithm (select from existing)
6. Framework
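To make the sequence concrete, here is a minimal sketch of how the resulting model definition could be represented; the dataclass, field names, and values are illustrative assumptions, not the product's actual schema.

```python
# Illustrative sketch only: field names mirror the six creation steps above.
from dataclasses import dataclass

@dataclass
class ModelDefinition:
    name: str            # 1. Name
    description: str     # 2. Description
    task_type: str       # 3. Task type
    algorithm_type: str  # 4. Algorithm type (selected from existing)
    algorithm: str       # 5. Algorithm (selected from existing)
    framework: str       # 6. Framework

new_model = ModelDefinition(
    name="branch-temp-anomaly",
    description="Detects temperature anomalies on branch IoT sensors",
    task_type="anomaly_detection",
    algorithm_type="unsupervised",
    algorithm="isolation_forest",
    framework="scikit-learn",
)
```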
Defined where the workflow sits in the IoT Edge information architecture

Identified capability gaps for the workflow

Flow exploration
How might we design simplified ML model lifecycle management that enables technical users to create, train, and deploy models to speed up anomaly detection?
01 - Create New Model
Need: New model creation connecting to algorithms
Users needed a clear and intuitive way to start creating a new ML model, ensuring the entry point isn't buried and doesn't require prior knowledge of where to begin.
Action
Introduced a prominent “New Model” entry point on the ML dashboard, followed by a connection to the related algorithm.
Impact
More users initiated model builds confidently, and the workflow began with better-prepared inputs, improving downstream success rates.

Step 2: Configure model
Process
Based on the research and workflow mapping I had done earlier, I ensured the design reflects the right sequence of tasks users need to perform.
Through working collaboratively with MLOps experts and iterative testing, we refined the workflow’s usability and uncovered key insights that shaped the direction of our next milestone: Training and deploying models.
Configuration steps (illustrated in the sketch after the list):
1. Define device profile
2. Customize data collection intervals
3. Define data filters
4. Collect health scores
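A hedged sketch of what the resulting configuration might look like, with keys mirroring the four steps above; the structure and names are assumptions, not the actual IoT Edge schema.

```python
# Hypothetical configuration object for a single model.
model_config = {
    "device_profile": {                 # 1. Define device profile
        "device_groups": ["branch-routers", "pos-terminals"],
        "metrics": ["cpu_utilization", "temperature", "packet_loss"],
        "business_attributes": {"region": "EMEA", "criticality": "high"},
    },
    "collection_interval_seconds": 60,  # 2. Customize data collection intervals
    "data_filters": [                   # 3. Define data filters
        {"metric": "temperature", "exclude_below": -40.0},
    ],
    "collect_health_scores": True,      # 4. Opt in to health score reporting
}
```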
02 - Configure Input Data
Need: Define 'Device Profile'
Users needed a structured way to define which device types, sensor metrics, and operational behaviors should be included in the model.
Action
Designed a Device Profile definition step where users select device groups and specify metrics and business attributes. The UI includes presets for common device classes and inline validation to flag missing or incompatible data sources.
Impact
This ensured the ML model was grounded in the specific characteristics of the devices being monitored, improving input consistency and reducing false positives in anomaly detection.

02 - Set Data Collection Rules
Need: Set Data Collection Rules
Users needed a way to define how device data is collected and normalized across distributed edge environments, since inconsistent sampling intervals and unmerged sensor streams led to noisy inputs and unreliable anomaly detection. Without control over these parameters, ML models struggled to reflect real operational behavior.
Action
Added a granular data collection step with clear controls for selecting data collection intervals, choosing which signals to merge or compare, and opting in to health score reporting that impacts business ROI.
Impact
Time-synchronized data flowing into the model increased users' confidence that the model was learning from representative device behavior, leading to more stable and actionable insights in production.


Step 3: Train model
Process
Training and deployment go hand in hand, and similar workflows combine the two into a single step. We strategically broke them into two separate steps to keep the workflow scalable for future feature additions and branching.
This also lifted some cognitive load for less technical users, giving them more confidence in the tasks they need to perform at each step without overwhelming them.
Training model steps (illustrated in the sketch after the list):
1. Set up/define events
2. Run training
3. Monitor training
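Conceptually, running and monitoring training reduces to a kick-off-and-poll loop, as in the sketch below; train_model and get_training_status are hypothetical stand-ins, stubbed so the example runs.

```python
# Run-and-monitor sketch; both functions are assumptions, not the real API.
import time

def train_model(model_id: str, dataset_path: str) -> str:
    """Kick off a training job and return a job id (stubbed)."""
    return f"job-{model_id}"

def get_training_status(job_id: str) -> str:
    """Return 'running', 'succeeded', or 'failed' (stubbed)."""
    return "succeeded"

job_id = train_model("branch-temp-anomaly", "uploads/training-data.csv")
while (status := get_training_status(job_id)) == "running":
    time.sleep(30)  # poll training progress at a fixed interval
print(f"Training finished: {status}")
```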
03 - Setup & Run Training
Need: Setup & observe training progress
Users needed a straightforward way to provide training data for the model, but data often came from different systems, formats, and time ranges.
Action
Designed a quick way for users to create a new training model and upload custom data for training.
Impact
This reduced setup errors and rework, ensuring models started from clean, compatible data inputs.

03 - Event Setup
Need: Define training parameters
Users needed a clear way to define which events the model should flag, how critical each condition is, and how long the system should observe behavior before triggering alerts. Without this, models produce noisy or premature alerts, causing alert fatigue.
Action
Introduced a setup panel where users specify threshold conditions, assign severity levels, and set a stabilization period to prevent early false positives during model warm-up (sketched below).
Impact
This improved alert relevance, allowing the model to mature before influencing operational workflows. Users gained confidence that alerts reflect meaningful and stable deviations.
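The output of the setup panel can be pictured as a list of event definitions, each pairing a threshold condition with a severity level and a stabilization period; the structure below is a hypothetical sketch, not the actual data model.

```python
# Hypothetical event definitions mirroring the setup panel.
events = [
    {
        "name": "high-temperature",
        "condition": {"metric": "temperature", "operator": ">", "threshold": 75.0},
        "severity": "critical",
        "stabilization_minutes": 30,  # suppress alerts during model warm-up
    },
    {
        "name": "elevated-packet-loss",
        "condition": {"metric": "packet_loss", "operator": ">", "threshold": 0.05},
        "severity": "warning",
        "stabilization_minutes": 15,
    },
]
```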

Step 4: Deploy model to node(s)
Process
This step is specific to IoT Edge, where models are deployed to nodes. I designed batch deployment and undeployment to nodes for faster task completion and to provide extra control, based on the needs we heard from users.
Deployment tasks (illustrated in the sketch after the list):
1. Deploy model (nodes)
2. Troubleshoot & undeploy model (nodes)
3. Monitor deployment status
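Batch deployment with per-node error surfacing can be sketched as below; deploy_to_node, batch_deploy, and the result shape are assumptions, stubbed to show how failing nodes would be collected for the UI.

```python
# Illustrative batch deploy helpers; not the actual IoT Edge API.
def deploy_to_node(model_id: str, node_id: str) -> dict:
    """Deploy one model to one node (stubbed to succeed)."""
    return {"node": node_id, "status": "deployed", "error": None}

def batch_deploy(model_id: str, node_ids: list[str]) -> list[dict]:
    """Deploy to many nodes at once, collecting per-node results."""
    return [deploy_to_node(model_id, n) for n in node_ids]

results = batch_deploy("branch-temp-anomaly", ["edge-node-01", "edge-node-02"])
failed = [r["node"] for r in results if r["status"] != "deployed"]
print(f"Failed nodes: {failed or 'none'}")
```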
04 - Deploy model
Need: Clear path to deploy models
Users needed a clear and reliable way to deploy models into production environments. The process must show live metrics, including latency, error rate, and uptime, and indicate errors and failures from previous deployments.
Action
Designed an interface that includes environment selection and real-time metrics for monitoring.
Impact
This made deployment accessible to less-technical users, reduced dependency on engineering teams, and decreased deployment time from days to minutes.

04 - Manage Deployments
Need: Deploy/undeploy to node
Users needed the ability to deploy or undeploy a specific model on a node.
Action
Introduced a panel to manage deployments, with indicators showing which node(s) are causing errors.
Impact
This improved operational efficiency and surfaced precisely which nodes were causing models to fail.

Reflection
Bridging technical depth with usability
Learned how to translate highly technical MLOps workflows into guided experiences that reduce complexity without oversimplifying critical details.
Cross-domain collaboration & design
Gained experience aligning data scientists, ML engineers, and product stakeholders, each with different mental models, around a unified design framework.
Want to see more?







