Hi, I’m Scott Arthur.
Staff SRE — Reliability / Observability / Agentic Engineering
I’m a passionate technologist whose background spans web and software development, infrastructure automation, cloud platforms, Kubernetes, CI/CD, observability, and systems administration. Recently, my work has increasingly focused on agentic engineering: using LLMs, coding agents, structured context, and automation to improve how engineers build, review, support, and operate software.
Engaging with people to understand the challenges they’re facing, and devising an iterative approach to solve them, is my favourite part of what I do. Sometimes this involves a complex technological strategy, but often a simple, pragmatic approach can be just as effective.
I take pride in thinking about user experience in everything I build, whether the end-user is a customer, a less technical stakeholder, or a highly skilled developer. I believe we all enjoy working with tools and systems that reduce cognitive load, make feedback loops clear, and get out of our way as much as possible.
Contact Me
scott@scottatron.com
github.com/scottatron
linkedin.com/in/scottatron
Skills
AI Engineering & Agentic Systems
I work hands-on with LLMs, coding agents, and AI-augmented engineering workflows. My recent work includes building and operating personal and work-adjacent agent systems, evaluating agent frameworks and MCP-style tool integrations, using AI agents for code review and delivery support, and designing workflows where scripted capture, durable memory, and reasoning agents each have clear responsibilities.
I’m especially interested in the production shape of AI systems: evaluation, observability, privacy boundaries, prompt and context management, tool permissions, human-in-the-loop control, and the operational patterns needed to make AI useful without making systems opaque.
LLMOps & AI Operationalisation
My SRE and observability background gives me a practical lens on LLMOps. For AI systems, a successful request is not just a 200 response: it also needs to be grounded, useful, safe, cost-aware, and auditable. I think about AI services in terms of quality signals, evaluation loops, traces, cost controls, user trust, and the handoff points where a human needs to come back into the loop.
Staff SRE / Platform Engineering
At Wesfarmers OneDigital, I focus on whatever is creating the most operational noise: helping teams understand their applications, performance, operations, and failure modes. This includes observability strategy, SLOs, incident response, cloud migration, CI/CD, platform support, and connecting reliability work to customer and business outcomes across OnePass, OneData, and the broader Wesfarmers retail ecosystem.
Technical Communication, Storytelling & Enablement
I have recently been developing a stronger public and internal technical communication practice. At OneDigital, this has included Engineering Community Talks, Klaxon Therapy / on-call workshops, team knowledge-sharing sessions, Agentic Coding Working Group contributions, and the Agentic Engineering All-In Day, where I presented on agentic code reviews and AWS/GCP platform guidance.
Most recently, I was invited to be a panelist at the Datadog Live Melbourne event, speaking about observability, SLOs, customer journeys, signal-vs-noise-vs-cost trade-offs, and how observability needs to evolve for AI and LLM systems. That speaking work reflects the kind of enablement I value: not just building patterns, but explaining them clearly enough that other teams can adopt them safely.
Software Development
These days I write most of my code in Go, Python, and TypeScript — Go and Python for platform tooling, automation, and AI-agent workflows, and TypeScript for web frontends and Node-based tooling. I came up through Ruby, which is still the language I’m most fluent in and where my instincts for clean, expressive code were formed.
I value simple, maintainable code, fast feedback loops, and using tests to both inform design and give confidence that a system does what users and the business need it to do.
Cloud Infrastructure & Platform Engineering
I’m a strong advocate for infrastructure as code and GitOps-style delivery: source-controlled, declarative, idempotent changes with good audit trails. I’ve worked extensively with Terraform, Kubernetes, and container platforms — from championing containers and Kubernetes adoption at IOOF through to running modern workloads on AWS and GCP at Wesfarmers OneDigital.
My focus is on the practices that make platforms genuinely usable by delivery teams day to day: CI/CD pipelines, reusable automation, sensible guardrails, and fast feedback loops that let teams ship and operate software reliably and efficiently.
Observability
In my experience, observability is more than a buzzword—it’s a critical part of the software delivery lifecycle. It’s not just about gathering data, but transforming that data into actionable insights.
Observability has been key in my work, closing feedback loops and guiding decisions. It has been instrumental in tracking user journeys, meeting business service-level objectives, and helping teams move from “is the system up?” to “is it supporting the customer and business outcome?”
I’ve had the opportunity to work with a range of tools, including Datadog, the Elastic platform, Splunk, New Relic, Prometheus, and Grafana. But it’s not just about the tools—it’s about how we use them to understand our systems, spot patterns, and make data-driven decisions that improve software quality and user satisfaction.
To me, observability is about proactively maintaining the health and performance of our software, reducing cognitive load during incidents, and constantly striving for improvement.
Experience
2024–Present Wesfarmers OneDigital — Staff SRE
At Wesfarmers OneDigital, I work across platform, reliability, observability, and developer enablement for systems supporting OnePass, OneData, and the broader Wesfarmers digital retail ecosystem.
Key contributions include:
- Observability strategy and operational maturity: helping teams connect telemetry, customer journeys, incident response, and business outcomes rather than treating observability as only dashboards and infrastructure metrics.
- Datadog platform evolution: contributing to Datadog adoption and optimisation, PagerDuty to Datadog On-Call migration planning and communications, RUM rollout and cost modelling, and training teams on trace search, logging, dashboards, and incident workflows.
- SLO and customer-journey framing: drafting and publishing SLA/SLO material for OnePass Mobile SSO and working through async event SLO questions for OneShare-style event flows, including latency definition, missing SLIs, and business impact.
- Cloud and platform architecture: supporting GCP Foundations and OneData migration planning, including Cloud Run / GKE / Composer evaluation, artifact registry strategy, package governance, and secure container workflows.
- CI/CD and delivery systems: working on Buildkite migration and hosted-agent options, with a focus on reusable patterns, feedback loops, and reducing delivery friction.
- AI-augmented engineering practice: using LLMs and agentic coding tools extensively for research, code iteration, review support, documentation, and workflow automation; maintaining a structured knowledge base of AI-agent patterns, memory, sandboxing, MCP, and agent orchestration approaches.
- Internal and external speaking: presenting and facilitating Engineering Community Talks, Klaxon Therapy / Datadog On-Call workshops, team knowledge-sharing sessions, Agentic Coding Working Group discussions, and the Agentic Engineering All-In Day; externally representing OneDigital on the Datadog Live Melbourne panel.
2022–2024 LimePoint — Senior Software Engineer
At LimePoint, I worked as a software engineer contributing to the OpsChain product.
This work involved Ruby on Rails, Linux containers, Kubernetes, web frontend development with TypeScript and React, and close collaboration around product direction and user experience.
During my time at LimePoint, I also developed my product management skills—helping to focus the goals of the product and ensure we were delivering a cohesive and attractive solution to potential users.
2018–2021 IOOF — Platform Infrastructure Team Lead
In 2018 I was given the opportunity to take over the leadership role for my team at IOOF.
During this time, I continued to be involved in hands-on technical leadership within the team, while also developing my skills in team management, team roadmap planning, stakeholder communication, and vendor management.
Notable work completed under my leadership included:
- Migrating the core of our internal container platform to Kubernetes, deployed on-premises using Rancher.
This involved keeping the existing developer experience largely the same while replacing the core of the platform. We developed simple CRDs that helped deploy and operate our internal representation of an environment.
- Initial phase of increasing our AWS adoption in preparation for onboarding workloads that were part of the merger with MLC.
This involved building out a multi-account organisation structure with guard rails and best practices to satisfy stringent compliance and security requirements while still giving application teams as much autonomy as possible.
2014–2018 IOOF — Infrastructure Programmer
During my first four years at IOOF, I developed my skills in infrastructure automation and software delivery tools, while contributing to our team’s efforts to help delivery teams ship software for the business efficiently.
While I was involved in all aspects of our team’s infrastructure automation work, I gave particular focus to our internal platform-as-a-service and the underlying Linux container ecosystem it was based on. This platform aimed to provide the tools and techniques our delivery teams needed to efficiently deploy, operate, and observe their applications.
I was often in the de-facto technical lead position within the team, providing technical leadership as well as mentoring less experienced members of the team.
2006–2014 Self-Employed — Web Developer
I worked as a self-employed web developer for eight years, building websites and web applications for small and medium-sized businesses in New Zealand and Australia.
2002–2006 Working In — Web Administrator
During my time at Working In, I was a member of the web team and was responsible for email newsletter development, coordinating work on our websites with our external web development company, and search engine optimisation of our website’s content.