0:00
Everybody wants to talk about what AI can build.
0:03
I'm a lot more interested in what gets cut, what
0:06
gets exposed, and who gets paged when it goes
0:09
sideways. Because once you get past the demos,
0:12
that's where the real story starts. Hey, I'm
0:32
Brian Teller. I work in DevOps and SRE, and I
0:35
run Teller's Tech. This is Ship It Weekly, where
0:37
I filter the noise and focus on what actually
0:40
changes how we run infrastructure and own reliability.
0:43
Show notes and links are on shipitweekly.fm.
0:47
If the show's been useful, follow it wherever
0:49
you listen. Ratings help way more than they should.
0:52
If you want more signal between episodes, check
0:55
out oncallbrief.com. Got five main stories today,
0:58
then the lightning round, and we'll wrap with
1:00
the human closer. First, a follow-up to that Block
1:03
layoff story, because the AI angle is starting
1:06
to look a whole lot messier than the original
1:09
headline. Then Meta buying Moltbook, which sounds
1:12
ridiculous until you look at what it says about
1:15
the agent story and the security problems already
1:18
showing up around it. After that, Atlassian is
1:21
making a pretty similar workforce move. Then,
1:24
GitHub gave one of the more honest breakdowns
1:27
I've seen lately of what actually went wrong
1:29
in a real platform outage. And finally, one AI
1:33
story that actually feels grounded, with Claude
1:37
helping Mozilla find real Firefox bugs. Let's
1:45
start with Block. Because we talked about the
1:48
original headline before, and the follow-up
1:50
makes that story a lot more interesting. When
1:53
Block announced the cuts, the broad framing was
1:56
basically that AI changed what it means to build
1:59
and run a company. And to be fair, Jack Dorsey
2:01
really did say that. In Block's Q4 2025 shareholder
2:05
letter, he said the company was going from over
2:08
10,000 people to just under 6,000, and that
2:12
intelligence tools have changed what it means
2:14
to build and run a company. He also said Block
2:17
believed a much smaller team using those tools
2:20
could do more and do it better. The thing that
2:23
makes this more interesting is that the same
2:25
shareholder letter also says 2025 was a strong
2:29
year, with Q4 gross profit at $2.87 billion,
2:34
up 24% year over year. So this was not framed
2:37
like emergency surgery. It was framed like a
2:40
strategic AI-native reset. Now, some of the follow-up
2:44
reporting is catching up to that framing.
2:46
The Guardian talked to current and former Block
2:49
workers who basically said, yeah, AI can help
2:52
in some places, but no, it cannot just replace
2:55
large chunks of the actual work, especially in
2:59
areas that need judgment, strategy, domain context,
3:02
or any kind of regulated decision making. That's
3:05
the part I think matters for listeners. I don't
3:08
think the right take here is AI is fake. That's
3:11
lazy. The better take is that executives are
3:14
starting to use AI as the language for explaining
3:17
changes that are also about headcount, efficiency,
3:20
investor expectations, and management philosophy.
3:24
And those are not the same thing. From an ops
3:26
angle, this is the question I'd ask leadership
3:29
every single time. If output is supposed to go
3:32
up because of AI, what exactly is scaling the
3:35
safety net? Because more generated output plus
3:38
fewer humans does not magically equal better
3:40
operations. It usually means thinner on-call,
3:44
less tribal knowledge, fewer reviewers, and more
3:47
pressure on the systems that are supposed to
3:50
catch bad changes before customers do. Small
3:53
fast teams are great. I like small fast teams.
3:55
But small fast teams only work if the brakes
3:58
are real. Do this Monday. If your company is
4:02
in the AI productivity mode, look at the guardrails
4:05
like they actually matter. Is rollback clean?
4:08
Are deploy approvals matched to risk? Are you
4:11
tracking on-call pain, MTTR, and pages per week
4:15
while headcount shifts around? Because if leadership
4:18
says velocity is going up, but the human metrics
4:21
get uglier, that tells you a lot faster than
4:23
the slide deck will. That's why this feels like
4:26
more than just a layoff story. Now on to Meta
4:28
and Moltbook. On the surface, this sounds like
4:35
internet nonsense. Meta bought a social network
4:38
for AI agents. Okay, weird. But once you get
4:41
past how absurd that sounds, it is actually a
4:44
pretty useful signal. Reuters and AP both reported
4:47
that Meta is acquiring Moltbook and bringing
4:50
its co-founders, Matt and Ben, into Meta's AI
4:53
efforts. Moltbook is basically a Reddit-like
4:56
place where AI agents post, comment, and interact
4:59
with each other. So Meta is not buying a normal
5:02
social network here. It's buying a piece of infrastructure
5:05
around agent-to-agent interaction. And that
5:08
would already be interesting on its own, but
5:11
the security context is what makes this a real
5:14
ops story. Wiz disclosed in February that Moltbook
5:18
had an exposed database that revealed private
5:21
messages, user emails, and around 1.5 million
5:25
API keys. Reuters separately reported the issue
5:29
was fixed after disclosure. So the bigger lesson
5:32
is not just "haha, weird AI bot town got hacked."
5:35
The lesson is that agent ecosystems are showing
5:38
up before identity, trust, permissions and blast
5:42
radius controls are actually mature. This is
5:44
the part I'd hit on the mic. If agents are going
5:47
to do anything meaningful on behalf of users
5:50
or companies, then identity stops being a product
5:53
detail and becomes a control plane problem. Who
5:56
is the agent? What can it do? What secrets can
5:59
it touch? What instructions can influence it?
6:02
What logs exist when it does something dumb?
6:06
Moltbook is a goofy story, but it's also kind
6:08
of a preview of the actual mess we're walking
6:11
into. And that's part of why the Atlassian story
6:14
matters too, because now this starts to feel
6:17
less isolated. Atlassian is where this starts
6:24
to feel less like a one-off. Atlassian said
6:27
on March 11th that it is cutting about 10% of
6:30
its workforce, roughly 1,600 employees. In its
6:35
own announcement, the company said it wants to
6:37
self-fund more investment in AI and enterprise,
6:41
move faster, and adapt to the fact that AI is changing
6:45
the skills and roles it needs. The phrasing is
6:48
softer than Block's. Atlassian is not saying AI
6:51
replaces people. But they are very clearly saying
6:54
that AI changes the shape of the company, and
6:57
that headcount decisions are following from that.
6:59
That matters, because once you have multiple
7:01
large tech companies making moves like this in
7:04
a short window, it starts to look less like one
7:07
eccentric CEO and more like a real executive
7:10
playbook. AI is now getting used to not just
7:13
sell tools, but to justify restructuring. And
7:16
maybe in some cases that'll be right. Maybe some
7:18
teams really do get more leverage. But I think
7:21
it's way too early to pretend most orgs have
7:25
actually re-architected their workflows, controls,
7:28
and incentives well enough to deserve the headcount
7:31
assumptions they are making. So from the DevOps
7:34
and SRE seat, the question is pretty blunt. Are
7:37
we redesigning the operating model too, or just
7:40
the org chart? Because if you cut staff and say
7:43
AI makes everyone faster, but you don't also
7:46
tighten ownership, change management, release
7:49
boundaries, and service accountability, then
7:51
what you really did was also increase ambiguity
7:54
and call it strategy. That's not transformation.
7:57
That's debt with nicer branding. All right, enough
8:00
org chart AI talk for a second. Back to regular
8:03
infrastructure pain, GitHub. GitHub published
8:09
a pretty candid post on March 11th about the
8:12
recent availability issues. They called out three
8:15
major incidents on February 2nd, February 9th,
8:18
and March 5th, and said the core problems were
8:21
rapid load growth, architectural coupling that
8:25
let localized failures cascade, and weak ability
8:28
to shed load from misbehaving clients. For GitHub
8:32
Actions specifically, a February 2nd hosted runner
8:35
outage was caused by a loss of telemetry that
8:38
led security policies to get applied to backend
8:42
storage accounts, which then blocked access to
8:45
critical VM metadata. And on March 5th, one
8:48
of the Actions incidents involved a Redis failover
8:51
problem that left a cluster without a writable
8:54
primary. Honestly, I like this story because
8:57
it feels real. This is not a fluffy, we take
9:00
reliability seriously post. This is GitHub saying,
9:03
yeah, growth plus coupling plus not enough load
9:07
shedding discipline can absolutely hurt you,
9:09
even when you are GitHub. They also said 12.5%
9:13
of GitHub traffic is now being served from
9:16
Azure Central US, and they are aiming for 50%
9:19
by July as part of a broader resilience push.
9:23
So there is real architectural movement happening
9:26
behind the scenes, not just PR language. And
9:28
I think this is a nice grounding story for the
9:31
whole episode. While leadership teams are talking
9:33
about AI changing everything, the pager is still
9:36
going off for the usual reasons. Cascading dependencies,
9:40
failover assumptions that don't hold, misbehaving
9:44
clients, operational blind spots, load growth
9:47
outrunning architecture. That stuff did not go
9:50
away. If anything, the faster the rest of the
9:53
industry moves, the more punishing those fundamentals
9:55
become. Do this Monday. Pick one critical internal
9:59
platform you own and ask a very boring question.
10:03
What's our equivalent of GitHub's coupling problem?
10:06
Where would one localized failure spread further
10:09
than it should? And if one client or one workload
10:12
goes bad, can you actually protect the rest of
10:15
the system? Or are you just hoping rate limits
10:18
and dashboards save you? And to balance that
10:20
out, here's one AI story that actually feels
10:23
practical. This one comes from Anthropic and
10:31
Mozilla. Anthropic published that Claude Opus
10:34
4.6 found 22 Firefox vulnerabilities over the
10:38
course of two weeks. And Mozilla said those reports
10:42
were real enough and actionable enough that fixes
10:45
shipped in Firefox 148. Anthropic said 14
10:50
of the findings were high severity. Mozilla's
10:53
own write-up said the bug reports were useful
10:56
because they included minimal test cases that
10:59
Firefox engineers could reproduce and validate
11:02
quickly. That's important because a lot of the
11:05
AI security chatter still collapses under contact
11:09
with reality. This one didn't. This is where
11:11
I think the current value story for AI is more
11:14
credible. Bug hunting. Security triage. Review
11:18
assistance. Broader coverage. Faster surfacing
11:21
of things humans still need to validate. That
11:24
feels a lot more real to me right now than giant
11:27
sweeping claims that you can just wipe out huge
11:29
chunks of a company because the models got better.
11:32
Anthropic also published a separate labor market
11:35
report last week saying they found no measurable
11:39
unemployment impact yet in the most AI-exposed
11:42
occupations. Though there is tentative evidence
11:45
that hiring into those roles has slowed a bit
11:48
for workers age 22 to 25. That's a useful reality
11:52
check. The labor story is still messy and early,
11:55
even while the tooling story is clearly moving
11:58
fast. Do this Monday. If your team is evaluating
12:01
AI for security work, start in suggestion mode,
12:05
not autonomy mode. Let it find stuff. Let it
12:08
propose patches. But keep human approval, audit
12:11
trails, and normal review pressure in place.
12:14
Treat it like CI, not magic. If it is opening
12:17
PRs, touching code, or influencing release flow,
12:21
it needs the same boundaries you would expect
12:23
from any overconfident junior engineer with way
12:26
too much access. All right, a few quick ones
12:30
before we wrap. AWS announced that Policy in
12:40
Bedrock AgentCore is now generally available.
12:43
The reason I like this story is simple. It lets
12:46
teams define centralized controls for agent-tool
12:50
interactions outside the agent code itself, with
12:53
natural language authoring that converts to Cedar.
12:56
That is a very loud signal that even AWS knows
13:00
agent behavior needs externalized policy and
13:03
governance, not just trust the prompt. Cloudflare
13:07
dropped its 2026 threat report. And the interesting
13:11
frame there is attacker measure of effectiveness.
13:14
Their point is basically that attackers are optimizing
13:18
for throughput and results, not elegance, and
13:21
they are increasingly abusing trusted platforms
13:24
and cloud tooling to get there. That fits the
13:27
broader theme of the episode really well. The
13:30
future attack surface is not just malware in
13:33
a zip file. It's automation, trust chains, and
13:37
systems that look normal until they really don't.
13:40
GitHub added native Dependabot support for pre-commit
13:43
hooks, which is a smaller story, but
13:46
honestly a nice one for teams that care about
13:49
supply chain hygiene and don't want pre-commit
13:52
configs quietly rotting in repos forever. It
13:55
is one of those changes that won't get a huge
13:57
headline, but it will save some teams from carrying
14:01
stale tooling longer than they realize. And AWS
14:04
also added stateful MCP server support in Bedrock
14:08
AgentCore Runtime. That matters because it makes
14:12
the agent stack more real, more persistent, and
14:15
more likely to move into production-shaped workflows
14:18
instead of toy demos. Dedicated session micro-VMs,
14:22
session context, progress notifications,
14:26
multi-turn elicitation. This stuff is getting
14:29
infrastructure now, not just hype. So what's
14:32
the takeaway from all of this? I think the cleanest
14:43
takeaway here is that AI is no longer just a
14:46
feature story. It's a workforce story, a governance
14:50
story, a security story, and a reliability story
14:53
all at once. Block and Atlassian show how quickly
14:57
executives are willing to turn AI into staffing
15:01
logic. Meta buying Moltbook shows how fast people
15:04
are trying to build the agent layer before the
15:08
trust model is really settled. GitHub is the
15:11
reminder that even with all of that noise, the
15:14
real operational pain still comes from the very
15:17
normal systems problems. And Anthropic plus
15:20
Mozilla is the reminder that some of this stuff
15:23
really is useful right now, just not always in
15:27
the laziest version of the story. So the job
15:30
is still the same. Don't get hypnotized by the
15:33
loudest framing. Figure out where the value is
15:36
real, where the risk is moving, and what controls
15:39
you owe the humans who still have to clean up
15:42
the mess when one of these bets goes sideways.
15:46
Guardrails still matter. Ownership still matters.
15:50
Reliability still matters. Alright, that's it
15:54
for this week of Ship It Weekly. Quick recap.
15:57
Block's AI layoff story is getting messier. Meta
16:01
buying Moltbook, Atlassian making the same kind
16:04
of move in a different voice, GitHub explaining
16:07
the outages, and Claude actually helping find
16:10
real Firefox bugs. Links and show notes are on
16:14
shipitweekly.fm. You can also find the video
16:17
versions on YouTube. And if you want the DevOps
16:20
news before the show, check out oncallbrief.com.
16:23
If this episode was useful, follow or subscribe
16:26
wherever you listen. And send it to the person
16:29
on your team who keeps hearing AI will make us
16:32
faster while nobody wants to talk about what
16:35
that means for safety, staffing, or reliability.
16:38
I'm Brian, and I'll see you next week.
For this episode, the theme that kept showing up was pretty simple: AI is crossing out of the “tooling” bucket and into the parts of the stack that change how companies operate, how platforms fail, and how trust actually gets enforced.
Not just code suggestions. Not just faster PRs. Not just nicer demos.
Now it’s showing up in layoffs, org redesign, agent identity, security boundaries, and platform instability. Block tied a major workforce reset to “intelligence tools.” Atlassian said AI is changing the mix of skills and roles it needs. Meta bought Moltbook, which is basically a weird little lab experiment for agent-to-agent behavior that already came with a security stain on it. And GitHub had to come out and say, pretty directly, that they have not met their own availability standards lately.
That’s why I don’t think this episode is really “about AI” in the lazy sense.
It’s about what happens when AI stops being a side tool and starts becoming part of the operating model.
The Block story is the clearest example. In the shareholder letter, Jack Dorsey said “intelligence tools have changed what it means to build and run a company,” and argued that a significantly smaller team could do more and do it better. But the follow-up reporting immediately made the story messier, pointing to other pressures too, including crypto weakness, overstaffing, and stock pressure. That gap is the interesting part. Not whether AI helps, because obviously it does in some contexts. The interesting part is how fast “AI” is becoming a clean explanation for decisions that are also about cost, structure, expectations, and management philosophy.
And Atlassian matters because it makes Block feel less isolated.
Their March 11 update was explicit: about 10% of the company, around 1,600 people, while self-funding more investment in AI and enterprise sales and reorganizing to move faster. They also said, pretty plainly, that while their approach is not “AI replaces people,” it would be disingenuous to pretend AI doesn’t change the mix of skills needed or the number of roles required in certain areas. That’s a very different tone than Block, but it lands in a similar place. AI is no longer just being sold as leverage. It is being used as staffing logic.
From the DevOps and SRE seat, that creates a very practical question.
If leadership is going to claim more output from fewer people, what exactly is scaling the safety net?
Because generated output scales fast. Human review, operational context, on-call coverage, and rollback discipline usually do not. That part is my inference, obviously, but it’s the inference these stories keep pushing me toward. If AI becomes the reason to cut faster than you improve your controls, then the real result is not “transformation.” It’s just a thinner human layer sitting behind a more aggressive delivery system.
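If you want to put numbers behind that question, here is a minimal Python sketch of the two human metrics I mentioned on the show, MTTR and pages per week. The `Incident` record and both function names are assumptions I made up for illustration. They are not tied to any specific paging or incident tool, so you would feed them from whatever export your own system gives you.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical incident record: when it opened and when it was resolved."""
    opened: datetime
    resolved: datetime


def mttr(incidents):
    """Mean time to restore across resolved incidents (zero if there are none)."""
    if not incidents:
        return timedelta(0)
    total = sum((i.resolved - i.opened for i in incidents), timedelta(0))
    return total / len(incidents)


def pages_per_week(page_times):
    """Count pages per ISO (year, week) so week-over-week trends are visible."""
    counts = Counter()
    for t in page_times:
        iso = t.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[(iso[0], iso[1])] += 1
    return dict(counts)
```

The point is less the math and more the habit: snapshot these before a headcount change, keep tracking them after, and compare the trend to the velocity story leadership is telling.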
The Moltbook story is the other side of this.
On paper it sounds goofy. Meta bought a social network for AI agents. Fine. Weird internet headline. But Reuters is clear that this is not just a joke acquisition. Meta is bringing the founders into Superintelligence Labs, and the whole thing points at where the agent race is headed. At the same time, Reuters also notes that Moltbook’s rise came with security problems, including a flaw that exposed private messages, thousands of emails, and more than a million credentials before Wiz reported it and the issue was fixed. That’s why the story matters. Not because “robots posting on a forum” is inherently important, but because it previews the trust problem. Once agents start acting on behalf of users, teams, or companies, identity, permissioning, auditability, and blast radius stop being product details and start becoming platform concerns.
That’s also why the AWS Bedrock AgentCore Policy announcement was a good lightning-round item.
It is basically AWS saying, out loud, that agent-tool interactions need centralized, fine-grained controls that operate outside the agent code itself. Security, compliance, and operations teams need to define what agents are allowed to do without rewriting the agent every time. That feels like the grown-up version of this whole conversation. Not “trust the prompt.” Not “the model seemed fine in a demo.” Policy, validation, interception, governance. The same old boring words that always matter once software starts touching real systems.
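To make "policy outside the agent code" a little more concrete, here is a rough Python sketch of the general pattern: a deny-by-default gate that sits between an agent and its tools and writes an audit record for every decision. To be clear, this is not the AgentCore API and it is not Cedar. `PolicyGate`, `PolicyDenied`, and the agent and tool names are all hypothetical, and a real setup would load rules from a central store instead of an in-memory set.

```python
from dataclasses import dataclass, field


class PolicyDenied(Exception):
    """Raised when an agent tries a tool call the policy does not allow."""


@dataclass
class PolicyGate:
    # Explicitly allowed (agent_id, tool_name) pairs; everything else is denied.
    allowed: set
    # Every decision is recorded, including denials.
    audit_log: list = field(default_factory=list)

    def call(self, agent_id, tool_name, tool, **kwargs):
        """Check policy, record the decision, then run the tool if allowed."""
        decision = "allow" if (agent_id, tool_name) in self.allowed else "deny"
        self.audit_log.append(
            {"agent": agent_id, "tool": tool_name, "args": kwargs, "decision": decision}
        )
        if decision == "deny":
            raise PolicyDenied(f"{agent_id} may not call {tool_name}")
        return tool(**kwargs)


# Hypothetical usage: the triage agent can read tickets but cannot rotate secrets.
gate = PolicyGate(allowed={("triage-bot", "read_ticket")})
ticket = gate.call("triage-bot", "read_ticket", lambda ticket_id: f"ticket {ticket_id}", ticket_id=42)
```

The design choice that matters is that the agent never holds the rule set. Security or ops can change what is allowed without touching the agent, and the log answers "what did it try to do" even when the answer was no.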
Then there’s GitHub, which was honestly one of the most useful stories in the bunch because it brought the whole episode back to reality.
GitHub said the most significant incidents happened on February 2, February 9, and March 5, and tied the instability to rapid load growth, architectural coupling, and a weak ability to shed load from misbehaving clients. On the Actions side, one outage came from a telemetry gap that caused security policies to hit key internal storage accounts and block VM metadata access. Another came from a Redis failover that left a cluster with no writable primary. That is just real platform engineering pain. No fluff. No fake confidence. Just growth, dependency coupling, failover assumptions, and systems that turned out to be less isolated than they needed to be.
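On the "shed load from misbehaving clients" piece, a per-client token bucket is one common way to do it. The sketch below is a generic Python illustration of that technique, not a description of what GitHub actually runs. `ClientLoadShedder`, its parameters, and the injectable clock are assumptions for the example.

```python
import time


class ClientLoadShedder:
    """Per-client token bucket: one noisy client runs dry without starving others."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec   # tokens refilled per second, per client
        self.burst = burst         # maximum tokens a client can hold
        self.clock = clock         # injectable so behavior can be tested
        self._buckets = {}         # client_id -> (tokens, last_seen)

    def allow(self, client_id):
        """Return True to serve the request, False to shed it."""
        now = self.clock()
        tokens, last = self._buckets.get(client_id, (self.burst, now))
        # Refill based on elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._buckets[client_id] = (tokens - 1, now)
            return True
        self._buckets[client_id] = (tokens, now)
        return False
```

Because each client gets its own bucket, one runaway integration hits its limit while everyone else keeps getting served, which is the isolation property the post said was too weak. In production you would also need to expire idle buckets and decide what a shed request returns to the caller.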
And that part connects directly to stuff we’ve already talked about on the show.
We were already on the Block layoff angle in a previous week’s episode, AWS Bahrain/UAE Data Center Issues Amid Iran Strikes, ArgoCD vs Flux GitOps Failures, GitHub Actions Hackerbot-Claw Attacks (Trivy), RoguePilot Codespaces Prompt Injection, Block “AI Remake” Layoffs, Claude Code Security.
And on the GitHub outage side, we’ve hit that theme more than once already in Special: When the Cloud Has a Bad Day: Cloudflare, AWS us-east-1, GitHub outages and When guardrails break prod: GitHub “Too Many Requests” from legacy defenses, Kubernetes nodes/proxy GET RCE, HCP Vault resilience in an AWS regional outage, and PCI DSS scope creep. So this episode is less a brand-new theme and more the next step in the same pattern: AI is changing the pressure on the system, but the failures still show up in trust boundaries, control planes, and operational weak points.
That’s why I liked ending the main stories with Anthropic and Mozilla.
Because it keeps the episode from collapsing into “AI hype bad” or “AI layoffs bad” and pretending that’s the whole picture. Anthropic said Claude Opus 4.6 found 22 Firefox vulnerabilities in two weeks, 14 of them high severity, and Mozilla shipped fixes in Firefox 148. That’s a much more grounded version of the value story. Bug hunting, security review, broader coverage, more signal for humans to validate and act on. That feels way more real to me right now than the giant hand-wave of “smaller teams can just do more now, trust us.”
If I had to boil the whole thing down, I think the real divide is this:
There’s the AI story companies want to tell, and then there’s the AI story operators actually have to live with.
The company story is leverage, speed, restructuring, transformation, and the future.
The operator story is guardrails, permissions, blast radius, audit trails, outage recovery, and who still has to wake up when the system behaves in a way nobody modeled.
That’s where this episode lived for me.
Not “is AI good or bad.”
More like: where is it actually useful, where is it being used as cover language, and what new control points do platform teams need to care about before the hype gets translated into production reality?
Past Ship It Weekly references
Block layoff episode:
AWS Bahrain/UAE Data Center Issues Amid Iran Strikes, ArgoCD vs Flux GitOps Failures, GitHub Actions Hackerbot-Claw Attacks (Trivy), RoguePilot Codespaces Prompt Injection, Block “AI Remake” Layoffs, Claude Code Security
GitHub outages episodes:
Special: When the Cloud Has a Bad Day: Cloudflare, AWS us-east-1, GitHub outages
When guardrails break prod: GitHub “Too Many Requests” from legacy defenses, Kubernetes nodes/proxy GET RCE, HCP Vault resilience in an AWS regional outage, and PCI DSS scope creep
Source links mentioned
Block Q4 2025 shareholder letter
What was really behind Jack Dorsey laying off nearly half of Block’s staff?
An important update on our team - Atlassian
Meta acquires AI agent social network Moltbook - Reuters
Wiz on the Moltbook exposure
Addressing GitHub’s recent availability issues - GitHub
Partnering with Mozilla to improve Firefox’s security - Anthropic
Policy in Amazon Bedrock AgentCore is now generally available - AWS